Discovering Protein Complexes in Dense Reliable Neighborhoods of Protein Interaction Networks

Xiao-Li Li*, Chuan-Sheng Foo, See-Kiong Ng

Knowledge Discovery Department, Institute for Infocomm Research, Heng Mui Keng Terrace, 119613, Singapore. xlli@i2r.a-star.edu.sg

Proc LSS Comput Syst Bioinform Conf. August, 2007. Vol. 6, p. 157-168.

*To whom correspondence should be addressed.


Multiprotein complexes play central roles in many cellular pathways. Although high-throughput experimental techniques have already enabled systematic screening of pairwise protein-protein interactions en masse, experimentally determined protein complex data remain relatively scarce. As such, researchers have begun to exploit the vast amount of pairwise interaction data to help discover new protein complexes. However, mining for protein complexes in interaction networks is not an easy task, because the underlying protein-protein interaction data contain many artefacts arising from the limitations of current high-throughput screening methods. We propose a novel algorithm, DECAFF (Dense-neighborhood Extraction using Connectivity and conFidence Features), to mine for dense and reliable subgraphs in protein interaction networks. Our method is devised to address two major limitations of current high-throughput protein interaction data, namely incompleteness and high data noise. Experimental results with yeast protein interaction data show that the interaction subgraphs discovered by DECAFF matched actual protein complexes significantly better than those found by existing approaches. Our results demonstrate that pairwise protein interaction networks can be effectively mined to discover new protein complexes, provided that the data artefacts in the underlying interaction data are adequately taken into account.
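
To make the notion of a "dense and reliable" neighborhood concrete, the following is a minimal sketch and not the DECAFF algorithm itself, whose details are not given in this abstract. It assumes each interaction carries a confidence score in [0, 1], discards interactions below a threshold, and then measures the edge density of the subgraph induced by a protein and its remaining neighbors. The function name `neighborhood_density`, the `min_confidence` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations


def neighborhood_density(edges, protein, min_confidence=0.0):
    """Edge density of the subgraph induced by `protein` and its neighbors.

    `edges` maps an unordered protein pair (a, b) to a confidence score.
    Illustrative only: the thresholding and density measure are assumptions,
    not the scoring used by DECAFF.
    """
    # Build an adjacency map from interactions that meet the threshold.
    adjacency = {}
    for (a, b), confidence in edges.items():
        if a != b and confidence >= min_confidence:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    # The local neighborhood is the protein plus its direct neighbors.
    nodes = {protein} | adjacency.get(protein, set())
    n = len(nodes)
    if n < 2:
        return 0.0

    # Count the interactions present among the neighborhood's proteins.
    present = sum(
        1
        for u, v in combinations(sorted(nodes), 2)
        if v in adjacency.get(u, set())
    )
    # Density = observed edges / possible edges = 2|E| / (|V| (|V| - 1)).
    return 2.0 * present / (n * (n - 1))


# Hypothetical toy network: A-D is a low-confidence (possibly spurious) edge.
toy_edges = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "C"): 0.7,
    ("A", "D"): 0.2,
}
dense_all = neighborhood_density(toy_edges, "A")
dense_reliable = neighborhood_density(toy_edges, "A", min_confidence=0.5)
print(dense_all, dense_reliable)
```

In this toy network, removing the low-confidence A-D interaction leaves A, B and C fully connected, so the neighborhood of A becomes denser once noisy edges are filtered out. This illustrates why accounting for data noise can matter when looking for complex-like subgraphs.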

