Geoscientific Model Development An interactive open-access journal of the European Geosciences Union
doi:10.5194/gmd-2016-312
© Author(s) 2017. This work is distributed
under the Creative Commons Attribution 3.0 License.
Methods for assessment of models
17 Jan 2017
Review status
A revision of this discussion paper was accepted for the journal Geoscientific Model Development (GMD) and is expected to appear here in due course.
STRAPS v1.0: Evaluating a methodology for predicting electron impact ionisation mass spectra for the aerosol mass spectrometer
David O. Topping1,2, James Allan1,2, M. Rami Alfarra1,2, and Bernard Aumont3
1School of Earth and Environmental Science, University of Manchester, Manchester, M13 9PL, UK
2National Centre for Atmospheric Science, University of Manchester, Manchester, M13 9PL, UK
3LISA, UMR CNRS 7583, Universite Paris Est Creteil et Universite Paris Diderot, Creteil, France
Abstract. Our ability to model the chemical and thermodynamic processes that lead to secondary organic aerosol (SOA) formation is thought to be hampered by the complexity of the system. While fundamental models are now available that can simulate the tens of thousands of reactions thought to take place, validation against experiments is highly challenging. Techniques capable of identifying individual molecules, such as chromatography, can generally quantify only a subset of the material present, making them unsuitable for a carbon budget analysis. Integrative analytical methods such as the Aerosol Mass Spectrometer (AMS) can quantify all mass, but because they cannot isolate individual molecules, comparisons have been limited to simple data products such as total organic mass and O:C ratio. More detailed comparisons could be made if more of the mass spectral information could be used, but because a discrete inversion of AMS data is not possible, this requires a system for predicting mass spectra from molecular composition.

In this proof-of-concept study, the ability to train supervised methods to predict electron impact ionisation (EI) mass spectra for the AMS is evaluated. The method, Supervised Training Regression for the Arbitrary Prediction of Spectra (STRAPS), is not built from first principles. Instead, a methodology is constructed whereby the presence of specific mass-to-charge ratio (m/z) channels is fit as a function of molecular structure, before the relative peak height for each channel is similarly fit using a range of regression methods. The widely used AMS mass spectral database serves as the basis for this, using unit mass resolution spectra of laboratory standards.
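The two-stage structure described above (first decide which m/z channels are present, then fit the relative peak height of each) can be sketched as follows. This is a minimal illustration and not the STRAPS code: a k-nearest-neighbour vote and average stand in for the regression methods evaluated in the study, so that the sketch needs only the Python standard library. The fingerprint representation (a set of "on" bit indices), the majority-vote threshold, the function names, and the training values are all hypothetical.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def k_nearest(fingerprint, training, k):
    """Return the k training records whose fingerprints are most similar."""
    ranked = sorted(training, key=lambda rec: tanimoto(fingerprint, rec[0]),
                    reverse=True)
    return ranked[:k]


def predict_spectrum(fingerprint, training, k=3):
    """Predict a unit mass resolution spectrum as {m/z: relative height}."""
    neighbours = k_nearest(fingerprint, training, k)

    # Stage 1 (classification): a channel is predicted present if more than
    # half of the neighbours show a peak there (threshold is an assumption).
    votes = defaultdict(int)
    for _, spectrum in neighbours:
        for mz in spectrum:
            votes[mz] += 1
    present = [mz for mz, n in votes.items() if 2 * n > len(neighbours)]

    # Stage 2 (regression): for each present channel, estimate the peak height
    # as the mean height among the neighbours that have that channel.
    heights = {}
    for mz in present:
        values = [spectrum[mz] for _, spectrum in neighbours if mz in spectrum]
        heights[mz] = sum(values) / len(values)

    # Normalise so that the largest predicted peak has height 1.
    top = max(heights.values(), default=0.0)
    return {mz: h / top for mz, h in sorted(heights.items())} if top else {}


# Hypothetical training set: (fingerprint bits, {m/z: relative peak height}).
TRAINING = [
    ({0, 1}, {43: 1.0, 44: 0.2}),
    ({0, 1, 2}, {43: 0.8, 44: 0.4}),
    ({2, 3}, {44: 1.0, 55: 0.5}),
]

predicted = predict_spectrum({0, 1}, TRAINING, k=3)
```

Separating presence from height means a channel that is absent for most similar compounds is dropped outright, and is not assigned a small non-zero height by the second stage.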

Key to the fitting process is the choice of structural information, or molecular fingerprint. Our approach relies on using supervised methods to automatically optimise the relationship between spectral characteristics and these molecular fingerprints. Therefore, any internal mechanisms or instrument features that affect fragmentation are implicitly accounted for in the fitted model. Whilst one might expect a collection of keys specifically designed according to EI fragmentation principles to offer a robust basis, the suitability of a range of commonly available fingerprints is evaluated.

Initial results suggest that the generic public 'MACCS' fingerprints provide the most accurate trained model when combined with decision trees or random forests, with median cosine angles of 0.94–0.97 between modelled and measured spectra. There is some sensitivity to the choice of fingerprint, but the greatest sensitivity is to the choice of regression technique. Support Vector Machines perform the worst, with median values of 0.78–0.85 and lower ranges approaching 0.4, depending on the fingerprint used. More detailed analysis of modelled versus measured mass spectra demonstrates important composition-dependent sensitivities on a compound-by-compound basis. This is further demonstrated when the trained methods are applied to a model α-pinene SOA system, using output from the GECKO-A model. This shows that a generic fingerprint referred to as 'FP4' and one designed for vapour pressure predictions ('Nanoolal') give plausible mass spectra, whilst the MACCS keys perform poorly in this application, demonstrating the need to evaluate model performance against other SOA systems rather than existing laboratory databases of single compounds.
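The comparison metric quoted above can be illustrated with a short sketch. It assumes that the reported values are the normalised dot product of the modelled and measured spectra, that is, the cosine of the angle between them, where 1 indicates identical relative peak heights. The function name and the example spectra are hypothetical and are not taken from the AMS database.

```python
import math
import statistics


def cosine_similarity(modelled, measured):
    """Cosine of the angle between two spectra given as {m/z: intensity}."""
    channels = set(modelled) | set(measured)
    dot = sum(modelled.get(mz, 0.0) * measured.get(mz, 0.0) for mz in channels)
    norm_modelled = math.sqrt(sum(v * v for v in modelled.values()))
    norm_measured = math.sqrt(sum(v * v for v in measured.values()))
    if norm_modelled == 0.0 or norm_measured == 0.0:
        raise ValueError("a spectrum with no signal has no defined direction")
    return dot / (norm_modelled * norm_measured)


# Hypothetical (modelled, measured) pairs for three compounds.
PAIRS = [
    ({43: 1.0, 44: 0.3}, {43: 1.0, 44: 0.4}),
    ({29: 0.5, 44: 1.0}, {29: 0.6, 44: 1.0, 55: 0.1}),
    ({41: 1.0, 55: 0.8}, {41: 0.2, 55: 1.0}),
]

# One score per compound, summarised by the median across compounds.
scores = [cosine_similarity(modelled, measured) for modelled, measured in PAIRS]
median_score = statistics.median(scores)
```

Because the measure is normalised, it is insensitive to the absolute scaling of either spectrum and depends only on the relative peak heights across channels.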

Given the limited number of compounds used within the AMS training dataset, it is difficult to prescribe which combination of approaches would lead to a robust generic model across all expected compositions. Nonetheless, the study demonstrates the use of a methodology that would be improved with more training data and data from simple mixed systems for further validation. To facilitate further development of the method, including application to other instruments, the model code for re-training is provided via a public GitHub and Zenodo software repository.


Citation: Topping, D. O., Allan, J., Alfarra, M. R., and Aumont, B.: STRAPS v1.0: Evaluating a methodology for predicting electron impact ionisation mass spectra for the aerosol mass spectrometer, Geosci. Model Dev. Discuss., doi:10.5194/gmd-2016-312, in review, 2017.
Interactive discussion
Status: closed
AC: Author comment | RC: Referee comment | SC: Short comment | EC: Editor comment

RC1: 'Promising concept.', Anonymous Referee #1, 17 Feb 2017
AC1: 'Author response', David Topping, 11 Apr 2017

RC2: 'Referee Review: Topping et al., 2017', Anonymous Referee #2, 18 Feb 2017
AC2: 'Author response', David Topping, 11 Apr 2017

Latest update: 23 May 2017
Short summary
Our ability to model the chemical and thermodynamic processes that lead to secondary organic aerosol (SOA) formation is thought to be hampered by the complexity of the system. In this proof of concept study, the ability to train supervised methods to predict electron impact ionisation (EI) mass spectra for the AMS is evaluated to facilitate improved model evaluation. The study demonstrates the use of a methodology that would be improved with more training data and data from simple mixed systems.