UROP Proceedings 2022-23

School of Engineering
Department of Chemical and Biological Engineering

Improving Data Analysis Methods for Shotgun Proteomics
Supervisor: LAM, Henry Hei Ning / CBE
Student: CHUNG, Chun Kit / BIEN
Course: UROP1100, Fall; UROP2100, Spring

As the application of spectral library searching continues to expand, it becomes necessary to establish the confidence level of the results it produces. Estimating false discovery rates through target-decoy searching is a commonly employed method, but creating satisfactory decoys remains a challenging task. To address this, our research group introduces the "predicted decoy" approach, which shuffles peptide sequences to create decoy sequences and then uses a pre-trained prediction model to generate artificial decoy spectra. Our objective in this article is to assess the current performance of the prediction model by comparing it with the shuffle-and-reposition method, thereby evaluating the effectiveness of the proposed approach.

Improving Data Analysis Methods for Shotgun Proteomics
Supervisor: LAM, Henry Hei Ning / CBE
Student: HOQUE, Afnan / BIEN
Course: UROP1100, Summer

Shotgun proteomics is the investigation of different forms of proteins using their respective peptides as proxies, obtained by enzyme-catalyzed hydrolysis of entire proteomes. These peptides are typically identified using liquid chromatography coupled with tandem mass spectrometry. The technique can be used for various purposes, including characterizing complex protein samples, which allows researchers to gain an understanding of cellular processes and the mechanisms behind specific diseases. However, the resulting datasets are complex and vast, which makes it difficult to extract relevant data and put it to use. This report focuses on organizing the spectra obtained from these peptides into clusters, which are visually represented in a graphical user interface.

Improving Data Analysis Methods for Shotgun Proteomics
Supervisor: LAM, Henry Hei Ning / CBE
Student: SAMI, Sufyan / SENG
Course: UROP1000, Summer

This research project aims to enhance the in-house peptide spectra prediction model by incorporating Normalised Collision Energy (NCE) values. Peptides exhibit distinct spectra under different NCE conditions, and the existing model's limitation is its inability to accurately predict spectra for varying NCE values. The code was modified to include NCE values during model training and prediction. The NIST training dataset was utilised for initial training, and the model was evaluated against randomly generated peptides with assigned NCE values. Results will be compared to previous models, Prosit and pDeep, which also integrate NCE values. The modified model's performance was analysed in predicting mass spectra for selected peptides, demonstrating the potential benefits of considering NCE values.
