Applications of Machine Learning to genome biology
Gerton Lunter studied mathematics in the Netherlands and briefly worked at Philips Laboratories before moving to Oxford and into computational biology in 2002, where he has worked ever since. He is one of four co-founders of Genomics plc, a company that analyzes genetic data to find new drug targets. He currently divides his time between Genomics plc and the Centre for Computational Biology at the MRC Weatherall Institute of Molecular Medicine in Oxford, where he is Group Leader in Computational Biology and Artificial Intelligence.
Since the sequencing of the human genome in 2001, biology has become an increasingly data-rich science. In parallel, the field of Machine Learning (ML) has made remarkable progress in modeling large and complex data sets. This makes it natural to apply ML techniques to problems in biology. In this talk I will give recent examples, including some from our group, of successful applications of ML methods that predict various intermediate phenotypes, such as splicing and chromatin state, from sequence. These models may be used to develop new scientific hypotheses; more immediately, they can be used to assess the potential phenotypic impact of polymorphisms or sporadic mutations. The underlying problems often exhibit special structure, such as reverse-complement symmetry, and I will show that accounting for this structure improves the quality of the model. Finally, I will show a surprising recent application of transfer learning.
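
As a small illustration of what reverse-complement symmetry means for a sequence model, the sketch below one-hot encodes DNA and makes an arbitrary scoring function give the same output for a sequence and its reverse complement by averaging over both strands. This is only one simple way to respect the symmetry and is not necessarily the approach presented in the talk; the names `one_hot`, `reverse_complement`, `toy_model` and `rc_invariant` are hypothetical and chosen for this example.

```python
import numpy as np

BASES = "ACGT"  # channel order; complement pairs (A-T, C-G) mirror each other


def one_hot(seq):
    """Encode a DNA string as a (length, 4) array with columns A, C, G, T."""
    idx = np.array([BASES.index(b) for b in seq.upper()])
    x = np.zeros((len(seq), 4))
    x[np.arange(len(seq)), idx] = 1.0
    return x


def reverse_complement(x):
    """Reverse-complement a one-hot sequence.

    Reversing the rows reverses the sequence; reversing the columns swaps
    A<->T and C<->G, which works because of the ACGT channel order.
    """
    return x[::-1, ::-1]


def toy_model(x, weights):
    """Hypothetical stand-in for a trained model: a position-specific linear score."""
    return float(np.sum(x * weights))


def rc_invariant(model, x, *args):
    """Average the model output over both strands so that the prediction
    is identical for a sequence and its reverse complement."""
    return 0.5 * (model(x, *args) + model(reverse_complement(x), *args))


if __name__ == "__main__":
    x = one_hot("AACG")
    weights = np.random.default_rng(0).normal(size=x.shape)
    print(rc_invariant(toy_model, x, weights))
    print(rc_invariant(toy_model, reverse_complement(x), weights))
```

Averaging at prediction time leaves the model itself unchanged; the symmetry can instead be built into the architecture, for example by sharing weights between a filter and its reverse complement.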