CSB2010 Accurate Identification of Ortholog Groups Among Multiple Genomes

Accurate Identification of Ortholog Groups Among Multiple Genomes

Guanqun Shi*, Meng-Chih Peng, Tao Jiang

Department of Computer Science, University of California, Riverside, CA 92521, USA. gshi@cs.ucr.edu

Proc LSS Comput Syst Bioinform Conf. August, 2010. Vol. 9, p. 166-179. Full-Text PDF

*To whom correspondence should be addressed.


The identification of orthologous genes shared by multiple genomes plays an important role in evolutionary studies and gene functional analyses. Based on a recently developed accurate tool, called MSOAR 2.0, for ortholog assignment between a pair of closely related genomes based on genome rearrangement, we present a new system MultiMSOAR 2.0, to identify ortholog groups among multiple genomes in this paper. In the system, we construct gene families for all the genomes using sequence similarity search and clustering, run MSOAR 2.0 for all pairs of genomes to obtain the pairwise orthology relationship, and partition each gene family into a set of disjoint sets of orthologous genes (called super ortholog groups or SOGs) such that each SOG contains at most one gene from each genome. For each such SOG, we label the leaves of the species tree using 1 or 0 to indicate if the SOG contains a gene from the corresponding species or not. The resulting tree is called a tree of ortholog groups (or TOGs). We then label the internal nodes of each TOG based on the parsimony principle and some biological constraints. Ortholog groups are finally identified from each fully labeled TOG. In comparison with a popular tool MultiParanoid on simulated data, MultiMSOAR 2.0 shows significantly higher prediction accuracy. It also outperforms MultiParanoid and the Roundup multi-ortholog repository in real data experiments using gene symbols as a validation tool. In addition to ortholog group identification, MultiMSOAR 2.0 also provides information about gene births, duplications and losses in evolution, which may be of independent biological interest.


[ CSB2010 Conference Home Page ] .... [ CSB2010 Online Proceedings ] .... [ Life Sciences Society Home Page ]