Thompson Rivers University

Data Science Seminar Series

Join TRU’s Faculty of Science for the Data Science Seminar Series.

This Wednesday, Feb. 24, at 5:30 p.m., join TRU faculty member Yan Yan as she presents “Graph theoretic based computational methods for de novo peptide sequencing.”

Click here to join the event through BlueJeans.


Machine learning has been widely applied to big biological data analytics such as peptide sequencing. Proteins are crucial entities of biological organisms, and to understand proteins and their functions it is important to identify their sequences. To do so, a typical strategy is to break proteins into smaller parts, called peptides, and infer the peptide sequences first. Tandem mass spectrometry (MS/MS) has emerged as a major technology for peptide sequencing because of its high throughput and exceptional sensitivity. However, MS/MS data is usually very noisy and incomplete, so machine learning has been used to improve sequence prediction, for example to filter out noise and select features for the prediction models. In this talk, I’m going to introduce the peptide sequencing problem, some challenges and achievements in this area, and how machine learning and graph-theoretical modelling have been applied to tackle the problem.


TRU faculty member Yan Yan is an assistant professor in TRU’s Department of Computing Science. She received her PhD in Biomedical Engineering from the University of Saskatchewan (UofS). Before joining TRU, she worked as a post-doctoral fellow in the Department of Computer Science at the University of Western Ontario and the UofS. Her primary research interests are in bioinformatics, including proteomics with a focus on peptide sequencing, population genomics on large-scale genome-wide data analysis, and machine learning in computational biology.