
Signal processing and analysis of PTR-TOF-MS data from exhaled breath for biomarker discovery

Abstract : The analysis of Volatile Organic Compounds (VOCs) in exhaled breath is a promising non-invasive approach in medicine for early diagnosis, phenotyping, disease and treatment monitoring, and large-scale screening. Proton Transfer Reaction Time-Of-Flight Mass Spectrometry (PTR-TOF-MS) is of major interest for the real-time analysis of VOCs and the discovery of new biomarkers in the clinic. However, there is currently a lack of methods and software tools for processing PTR-TOF-MS data from cohorts.

We therefore developed a suite of algorithms that processes raw data from patient acquisitions and builds the table of feature intensities through expiration and peak detection, quantification, alignment between samples, and missing-value imputation. Notably, we developed an innovative 2D peak deconvolution model based on penalized splines signal regression, and a method to specifically select the VOCs from exhaled breath. The full workflow is implemented in the freely available ptairMS R/Bioconductor package. Our approach was validated both on experimental data (mixture of VOCs at standardized concentrations) and on simulations, which showed that the sensitivity for the identification of VOCs from exhaled breath reached 99 %. A graphical interface was also developed to facilitate data analysis and result interpretation by experimenters (e.g., clinicians).

We applied our methodology to the characterization of exhaled breath from mechanically ventilated adults with COVID-19 infection. Exhaled breath from 28 patients with acute respiratory distress syndrome (ARDS) and COVID-19 infection, and from 12 patients with non-COVID-19 ARDS, was analyzed daily from hospital admission to discharge. First, classification models were built to predict the status of the infection, using the available acquisition closest to hospital admission, and achieved high prediction accuracy (93 %). Then, all the available data acquired during the hospital stay were used for the longitudinal analysis, by mixed-effects modeling, of VOC evolution as a function of hospitalization time. Following feature ranking and selection, four biomarkers of COVID-19 infection were identified. Altogether, these results highlight the value of PTR-TOF-MS data and the ptairMS software for biomarker discovery in exhaled breath.
Contributor : ABES STAR
Submitted on : Monday, May 9, 2022 - 12:38:13 PM
Last modification on : Tuesday, May 10, 2022 - 3:52:55 AM


Version validated by the jury (STAR)


  • HAL Id : tel-03662449, version 1


Camille Roquencourt. Signal processing and analysis of PTR-TOF-MS data from exhaled breath for biomarker discovery. Statistics [math.ST]. Université Paris-Saclay, 2022. English. ⟨NNT : 2022UPASG024⟩. ⟨tel-03662449⟩


