IJCAI 2020

Improving Tandem Mass Spectra Analysis with Hierarchical Learning

Abstract

Tandem mass spectrometry is the most widely used technology for identifying proteins in a complex biological sample; it produces a large number of spectra, each representative of a protein subsequence called a peptide. In this paper, we propose a hierarchical multi-stage framework, referred to as DeepTag, to identify the peptide sequence for each given spectrum. In contrast to traditional one-stage generation, our sequencing model starts inference from a selected high-confidence guiding tag and produces the complete sequence based on this guiding tag. In addition, we introduce a cross-modality refining module to help the decoder focus on effective peaks, and we fine-tune the model with a reinforcement learning technique. Experiments on different public datasets demonstrate that our method achieves new state-of-the-art performance on the peptide identification task, with a marked improvement in both precision and recall.

Authors