Gene Myers, Ph.D.
"Whole Genome Sequencing, Comparative Genomics, and Systems Biology"
The whole-genome shotgun sequencing method with paired-end reads has proven rapid and economical,
producing high-quality reconstructions of Drosophila (2000), Human (2001) and Mouse (2001), in
quick succession. We discuss the overall algorithmic strategy, the results one can expect by
comparing the whole genome assembly of Drosophila against the recently finished sequence, and
advances such as high-density solid-state sequencing and single-molecule detection systems.
We anticipate having the euchromatic portions of the genomes of twelve species of Drosophila in the
next year. We discuss the current state of the art in comparative gene finding, cis-control module
finding, and possible improvements. The hope of these approaches is that we will be able to accurately
identify the “parts lists” of the D. melanogaster genome, a basic prerequisite for systems biology.
We conclude with a segment on the possibility of a program of high-throughput in-situ image analysis
in Drosophila embryos. We describe what information we might collect and what we might be able to
infer from it. It is our contention that this may be the best way to understand development from a systems perspective.
Ron Shamir, Ph.D.
"Computational Dissection of Regulatory Networks Using Diverse High-throughput Data"
The maturation of high-throughput technologies and the availability of whole genome sequences make
it possible to apply holistic computational approaches to the study of biological systems. The use
of high-throughput technologies requires the development of advanced computational methods and tools
that would enable the elicitation of significant biological knowledge from the vast amounts of data
generated by these methods. Our group has been developing a battery of such methodologies and has incorporated
some of them into several tools:
- CLICK (CLuster Identification via Connectivity Kernels): a clustering algorithm that combines
graph-theoretic approaches and statistical considerations to yield solutions that balance intra-cluster
homogeneity and inter-cluster separation.
- PRIMA (PRomoter Integration in Microarray Analysis): a promoter sequence analysis tool that aims at the
identification of transcription factors whose binding sites are significantly over-represented in promoters
of co-expressed genes. Using microarrays to compare the transcriptional response in wild-type and Atm-deficient
mice, we used CLICK and PRIMA to identify, on a genomic scale, a DNA damage transcriptional response that is
dependent on the ATM protein kinase, and dissected this response network into two major arms that are mediated
by the p53 and NF-κB transcriptional regulators.
- SAMBA (Statistical-Algorithmic Method for Bicluster Analysis): a method for finding subsets of genes that
manifest a significant co-expression within particular subsets of the conditions. The method is graph-theoretic
and based on a statistical model of the data generation. We demonstrated the utility of SAMBA in mining biological
knowledge out of large and highly heterogeneous genome-wide yeast datasets. These included gene expression profiles,
and data on protein-protein interactions, phenotypes and transcription factor binding locations. Our approach analyzes
such heterogeneous data sets in an inherently integrative manner. SAMBA dissected the yeast system into modules, each
comprising a set of genes that share common features over diverse data sources. Using these modules, we were able to
predict the function of over 800 unknown genes, and validated some predictions experimentally. We were also able to
obtain broad perspectives on the interaction of transcription factors and modules, and on the hierarchical organization
of modules in yeast.
- EXPANDER (EXPression ANalyzer and DisplayER): an integrative platform for the analysis of gene expression data,
providing multiple analysis algorithms including CLICK, PRIMA and SAMBA, along with a variety of data normalization
and visualization utilities.
- SHARP (SHowcase for ATM Related Pathways): an interactive software environment that graphically displays biological
interaction networks, allows dynamic layout of and navigation through these networks, and supports the superposition of
DNA microarray data on interaction maps.
- Binding Site Evolution: a novel genome-wide analysis method for detecting binding sites in aligned promoters of
related species, based primarily on identifying selection forces rather than mere conservation. We demonstrate
the method on data from several yeast species, and the analysis reveals novel details of the evolution
of transcription factor binding sites.
- MetaReg: a methodology for the representation and analysis of heterogeneous biological networks. The network elements
include mRNAs, proteins and metabolites, and cycles are allowed. We developed methods for comparing the model
predictions with actual measurements, and for generating hypotheses where the discrepancy between predictions and
observations is large. We demonstrate the approach on the lysine biosynthesis pathway in yeast.
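As an illustration of the kind of over-representation test mentioned under PRIMA above, here is a minimal sketch in Python. It assumes a plain hypergeometric tail probability; the abstract does not state which statistic PRIMA actually uses, and the function name, parameters and example numbers below are hypothetical.

```python
from math import comb


def enrichment_pvalue(population, hits_in_population, sample, hits_in_sample):
    """Hypergeometric upper-tail probability P(X >= hits_in_sample).

    population          -- number of background promoters (assumed known)
    hits_in_population  -- background promoters containing the binding site
    sample              -- promoters in the co-expressed gene cluster
    hits_in_sample      -- cluster promoters containing the binding site
    """
    if not 0 <= hits_in_population <= population:
        raise ValueError("hits_in_population must lie in [0, population]")
    if not 0 <= sample <= population:
        raise ValueError("sample must lie in [0, population]")
    if hits_in_sample < 0:
        raise ValueError("hits_in_sample must be non-negative")

    # Largest number of hits the cluster could possibly contain.
    upper = min(sample, hits_in_population)
    misses = population - hits_in_population

    # Count the ways to draw k hits and (sample - k) misses, for every
    # k at least as large as the observed count.
    tail = sum(
        comb(hits_in_population, k) * comb(misses, sample - k)
        for k in range(hits_in_sample, upper + 1)
    )
    return tail / comb(population, sample)


if __name__ == "__main__":
    # Hypothetical numbers: 6000 background promoters, 300 with the site;
    # a cluster of 50 co-expressed genes, 12 of which carry the site.
    p = enrichment_pvalue(6000, 300, 50, 12)
    print(f"enrichment p-value: {p:.3g}")
```

A small p-value suggests the site occurs in the cluster more often than chance would predict. In practice one would also correct for testing many transcription factors at once, which this sketch omits.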
Our (more mature) tools are available at http://www.cs.tau.ac.il/~rshamir
Joint work, in parts, with Amos Tanay[2], Irit Gat-Viks[1], Ran Elkon[1], Roded Sharan[1][4], Chaim Linhart[1],
Adi Maron[1], Nir Orlev[2], Giora Sternberg[2], Martin Kupiec[3], Sharon Rashi-Elkeles[2], Yosef Shiloh[2]
[1] School of Computer Science, Sackler Faculty of Exact Sciences
[2] Department of Human Genetics, Sackler School of Medicine
[3] George S. Wise Faculty of Life Sciences, Tel Aviv University, Israel
[4] Current address: International Computer Science Institute, Berkeley, CA