A general graphical framework for detecting copy number variations

Xiao-Lin Yin, Jing Li*
Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH 44106, USA. jingli@case.edu

Proc LSS Comput Syst Bioinform Conf. August, 2009. Vol. 8, p. 47-58.

*To whom correspondence should be addressed.
Array comparative genomic hybridization (aCGH) allows identification of copy number alterations across genomes. The key computational challenge in analyzing copy number variations (CNVs) using aCGH data, or similar data generated by other array technologies, is detecting the segment boundaries of copy number changes and inferring the copy number state of each segment. We have developed a novel statistical model based on conditional random fields (CRFs) that combines data smoothing, segmentation and copy number state decoding in one unified framework. Our approach (termed CRF-CNV) provides great flexibility in defining meaningful feature functions and can therefore integrate local spatial information of arbitrary size into the model. For model parameter estimation, we have adopted the conjugate gradient (CG) method for likelihood optimization and developed efficient forward/backward algorithms within the CG framework. The method is evaluated on real data with known copy numbers as well as on simulated data generated under realistic assumptions, and is compared with two popular publicly available programs. Experimental results demonstrate that CRF-CNV outperforms a Bayesian Hidden Markov Model-based approach on both datasets in terms of copy number assignments. Compared with a non-parametric approach, CRF-CNV achieves much greater precision while maintaining the same level of recall on the real data, and the two methods perform comparably on the simulated data.
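To make the role of the forward algorithm concrete, the following is a minimal sketch of the forward recursion for a generic linear-chain CRF, which computes the log partition function needed by the likelihood. It is not the CRF-CNV implementation: the three-state layout (loss, neutral, gain), the score tables, and the function names `log_partition` and `log_partition_brute_force` are assumptions chosen for illustration only.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(node_scores, trans_scores):
    """Forward recursion for a linear-chain CRF.

    node_scores[t][s]  : score of state s at position t (hypothetical features)
    trans_scores[r][s] : score of moving from state r to state s
    Returns log Z, the log of the sum over all state paths of exp(path score).
    """
    n_states = len(trans_scores)
    alpha = list(node_scores[0])  # log forward variables at position 0
    for t in range(1, len(node_scores)):
        alpha = [
            node_scores[t][s]
            + logsumexp([alpha[r] + trans_scores[r][s] for r in range(n_states)])
            for s in range(n_states)
        ]
    return logsumexp(alpha)


def log_partition_brute_force(node_scores, trans_scores):
    """Same quantity by enumerating every path; only feasible for tiny inputs."""
    n_states = len(trans_scores)
    totals = []
    for path in itertools.product(range(n_states), repeat=len(node_scores)):
        score = node_scores[0][path[0]]
        for t in range(1, len(path)):
            score += trans_scores[path[t - 1]][path[t]] + node_scores[t][path[t]]
        totals.append(score)
    return logsumexp(totals)


if __name__ == "__main__":
    # Toy scores for 4 probes and 3 assumed states: loss, neutral, gain.
    node = [[0.1, 0.5, -0.2], [0.3, -0.1, 0.0], [0.2, 0.2, 0.4], [-0.5, 0.1, 0.3]]
    # Staying in the same state is rewarded, which encourages long segments.
    trans = [[1.0, -1.0, -2.0], [-1.0, 1.0, -1.0], [-2.0, -1.0, 1.0]]
    print(log_partition(node, trans))
    print(log_partition_brute_force(node, trans))
```

The recursion costs time linear in the number of probes, whereas the brute-force enumeration is exponential; the latter is included only as a correctness check on small inputs. In a gradient-based optimizer such as CG, a matching backward pass would supply the marginals needed for the gradient, which this sketch omits.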