An Analysis System for Protein-Protein Interaction Data Based on Graph Theory

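  • Hee-Jeong Jin (Department of Computer Engineering, Pusan National University) ;
  • Ji-Hyun Yoon (Department of Computer Engineering, Pusan National University) ;
  • Hwan-Gue Cho (School of Computer Science and Engineering, Pusan National University)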
  • Published : 2006.06.01

Abstract

Protein-protein interaction (PPI) data carry information about the mechanisms by which an organism sustains life, and are therefore widely used in research on the causes and treatment of disease and in new drug development. The volume of PPI data has been growing at an accelerating rate since the development of high-throughput methods such as yeast two-hybrid, mass spectrometry, and correlated mRNA expression. Because of this volume and the complex structure of the data, it is no longer feasible for a person to manage and analyze PPI data by hand. Fortunately, PPI data can be abstracted as a graph in which proteins are nodes and interactions are edges, so the graph theory that has been studied extensively in computer science can be applied to it directly. In this paper, we introduce Proteinca (PROTEin INteraction CAbaret), a workbench system for easily managing, analyzing, and visualizing PPI data. Proteinca provides a range of graph-theoretic analysis functions for PPI data drawn from various databases, and visualizes the data as a graph so that users can understand it intuitively. It also provides a simplification method based on a gravity model, which gives users a visualization centered on the important proteins.
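The graph abstraction described in the abstract can be illustrated with a minimal sketch. This is not Proteinca's implementation: the function names, the protein identifiers, and the choice of analyses (node degree and connected components) are assumptions made only to show how an interaction list maps onto nodes and edges.

```python
from collections import defaultdict, deque


def build_ppi_graph(interactions):
    """Build an undirected graph: proteins are nodes, interactions are edges."""
    graph = defaultdict(set)
    for protein_a, protein_b in interactions:
        graph[protein_a].add(protein_b)
        graph[protein_b].add(protein_a)
    return dict(graph)


def degree(graph):
    """Return the number of distinct interaction partners of each protein."""
    return {protein: len(partners) for protein, partners in graph.items()}


def connected_components(graph):
    """Group proteins into connected components using breadth-first search."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical toy data: each pair is one reported interaction.
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P4", "P5")]
graph = build_ppi_graph(interactions)
degrees = degree(graph)
components = connected_components(graph)
```

Adjacency sets keep repeated reports of the same interaction from inflating a protein's degree, which matters when pairs are merged from several databases.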
