I am currently a senior research scientist at DeepMind, where I work on deciphering the human genome with machine learning. My previous work involved modeling RNA splicing and degradation, as well as predicting the effects of coding and non-coding variants. I did my PhD in computational biology at the Technical University of Munich (TUM) with Julien Gagneur. Please refer to my Google Scholar profile for a complete list of my publications.

Email: s6juncheng [at] gmail [dot] com


Publications

Genetic variant interpretation

Biological Discoveries

Computational Immunology

Bioinformatics & Machine learning


Software

Here is a list of open-source software that I developed or made major contributions to. These tools are typically implementations of machine learning models that originated in research projects.


MMSplice & MTSplice

Predicts the effects of variants on splicing. MMSplice is the winning model of the CAGI5 splicing challenge and is also integrated into CADD, the popular general-purpose variant effect predictor. MTSplice extends MMSplice by predicting tissue-specific variant effects. Muhammed Hasan Çelik and I currently maintain the tool.

ggpval

An R package to add statistical test and P-value annotations to ggplot2 plots. The user community and I currently maintain the tool.

BERTMHC

A Python package to retrain and predict with the BERTMHC model, a transformer model that predicts binding and presentation of peptides by MHC class II.

DCC

A Python package to detect circRNAs from next-generation sequencing data. The Dieterich lab currently maintains the tool.


I contributed to the following projects:

kipoi

Kipoi (pronounced kípi; from the Greek κήποι: gardens) is an API and a repository of ready-to-use trained models for genomics. It currently contains 2133 different models, covering canonical predictive tasks in transcriptional and post-transcriptional gene regulation.