Tuesday, 6 June 2017

Thoughts and questions after 10 years working in computational biology

This is my 10th year working in the field of computational biology, and it is probably the right time to ask myself: where should my research go next?

Looking Back

I started in 2007, developing software modules and pipelines for the quantitative analysis of biological systems. Together with Stefan Wiemann, my Ph.D. supervisor, I developed the KEGGgraph software, which translates biological pathways in KEGG, previously used mostly as visual diagrams, into graph models that can be analyzed formally. That led to my very first peer-reviewed publication in the field.

Following that, I spent three years trying to understand how microRNAs regulate gene expression in human breast cancer and gastrointestinal tumors. There I had the opportunity to work with outstanding colleagues such as Florian Haller, Stefan Uhlmann, Heiko Mannsperger, Özgür Sahin, Agnes Hovrat, Katherina Zweig, and others, studying microRNAs with the latest technologies of the time, such as reverse phase protein arrays and network analysis. That was when I became fascinated by systems biology.

In 2011 I joined Roche. I am fortunate to work with Clemens Broger and Martin Ebeling and to spend my time, besides regular project support activities, on large-scale analysis of gene expression data and on the development of novel platforms to support early drug discovery. In 2014, by mining the TG-GATEs database, we characterized an early induced network of four genes that is predictive of toxicity in vitro and in vivo. Early this year, we published a manuscript describing the BioQC software, which detects tissue heterogeneity in gene expression data using knowledge derived from a compendium of gene expression profiles that we collected. A few weeks ago, together with Faye Drawnel, Martin Ebeling, and Marco Prunotto, we published a proof-of-concept study of molecular phenotyping and its application in early drug discovery. The results suggest that by integrating molecular phenotyping, i.e. digital quantification of pre-selected pathway reporter genes shortly after compound perturbation, we can gain insights into both the pathways associated with disease-relevant phenotypes and the compounds that induce desired phenotypic changes.

Looking Forward

What comes next? I have only a few vague ideas and am open to new ones:
  1. How to build software for data integration and interpretation in order to empower both disease understanding and drug discovery? In particular, how can we systematically and formally integrate genomic, transcriptomic, proteomic, and chemoinformatic data to inform the drug discovery process?
  2. How to formally generate and test hypotheses about genetic and pharmacological perturbations in silico?
  3. How to utilize single-cell and single-mutation level information for drug discovery?
I sense a tension between the ever-increasing amount of information available to us and the limited time we have to digest it and to connect its pieces. In addition, project support activities and research into the questions above need constant balancing, although in the ideal case they do not conflict with but rather benefit from each other. As Yuri Lazebnik put it in his legendary essay "Can a biologist fix a radio? Or, what I learned while studying apoptosis", it's time to make good tools and to keep your mind clear under adverse circumstances.

Just search and ask, until the next 10 years are gone.