Geoscientific Model Development An interactive open-access journal of the European Geosciences Union
© Author(s) 2016. This work is distributed
under the Creative Commons Attribution 3.0 License.
Development and technical paper
07 Nov 2016
Review status
A revision of this discussion paper is under review for the journal Geoscientific Model Development (GMD).
Reverse engineering model structures for soil and ecosystem respiration: the potential of gene expression programming
Iulia Ilie1, Peter Dittrich2,3, Nuno Carvalhais1,4, Martin Jung1, Andreas Heinemeyer5, Mirco Migliavacca1, James I. L. Morison8, Sebastian Sippel1, Jens-Arne Subke6, Matthew Wilkinson8, and Miguel D. Mahecha1,3,7
1Max Planck Institute for Biogeochemistry, Department Biogeochemical Integration, Hans-Knoell-Str. 10, 07745 Jena, Germany
2Bio Systems Analysis Group, Institute of Computer Science, Jena Centre for Bioinformatics and Friedrich Schiller University, 07745 Jena, Germany
3Michael Stifel Center Jena for Data-Driven and Simulation Science, 07745 Jena, Germany
4CENSE, Departamento de Ciências e Engenharia do Ambiente, Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa, Caparica, Portugal
5Department of Environment, Stockholm Environment Institute, University of York, York YO105NG, UK
6Biological and Environmental Sciences, School of Natural Sciences, University of Stirling, Stirling, UK
7German Centre for Integrative Biodiversity Research (iDiv), Deutscher Platz 5e, 04103 Leipzig, Germany
8Forest Research, Alice Holt Lodge, Farnham, Surrey, GU10 4LH, UK
Abstract. Accurate modelling of land-atmosphere carbon fluxes is essential for future climate projections. However, the exact responses of carbon cycle processes to climatic drivers often remain uncertain. Presently, knowledge derived from experiments, complemented by a steadily evolving body of mechanistic theory, provides the main basis for developing the respective models. The strongly increasing availability of measurements may complicate the traditional hypothesis-driven path to developing mechanistic models, but it may also open new ways of identifying suitable model structures using machine learning. Here we explore the potential to derive model formulations automatically from data based on gene expression programming (GEP). GEP automatically (re)combines various mathematical operators into model formulations that are further evolved, eventually identifying the most suitable structures. In contrast to most other machine learning regression techniques, the GEP approach generates models that allow for prediction and possibly for interpretation. Our study is based on two cases: artificially generated data and real observations. Simulations based on artificial data show that GEP successfully identifies prescribed functions, with the predictive capacity of the resulting models comparable to that of four state-of-the-art machine learning methods (Random Forests, Support Vector Machines, Artificial Neural Networks, and Kernel Ridge Regressions). The case of real observations explores different components of terrestrial respiration at an oak forest in south-east England. We find that GEP-retrieved models often predict better than established respiration models. Furthermore, the structure of the GEP models offers new insights into driver selection and interactions. We find previously unconsidered exponential dependencies of respiration on seasonal ecosystem carbon assimilation and water dynamics.
However, we also noticed that the GEP models are only partly portable across respiration components; equifinality issues may prevent the identification of a "general" terrestrial respiration model. Overall, GEP is a promising tool to uncover new model structures for terrestrial ecology in the data-rich era, complementing the traditional approach of model building.
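
To illustrate the idea of recombining operators into evolving model formulations, the following is a minimal sketch of a GEP-style search, not the configuration used in the study. It assumes the standard GEP encoding (fixed-length genes with a head and a tail, decoded breadth-first into an expression). The function set, terminal set, gene length, mutation rate, and selection scheme are hypothetical choices made only for this example.

```python
import math
import random

# Hypothetical function set: symbol -> (arity, implementation).
FUNCS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "E": (1, lambda a: math.exp(max(-20.0, min(20.0, a)))),  # clamped exp
}
TERMS = ["x", "1"]          # hypothetical terminals: one driver and a constant
HEAD = 5                    # head may hold functions or terminals
TAIL = HEAD * (2 - 1) + 1   # tail = head * (max_arity - 1) + 1, terminals only


def random_gene(rng):
    head = [rng.choice(list(FUNCS) + TERMS) for _ in range(HEAD)]
    tail = [rng.choice(TERMS) for _ in range(TAIL)]
    return head + tail


def evaluate(gene, x):
    """Decode the gene breadth-first into an expression tree and evaluate it."""
    children, next_free, i = [], 1, 0
    while i < next_free:  # next_free is also the number of symbols in use
        arity = FUNCS[gene[i]][0] if gene[i] in FUNCS else 0
        children.append(range(next_free, next_free + arity))
        next_free += arity
        i += 1
    values = [0.0] * next_free
    for i in reversed(range(next_free)):  # children always follow their parent
        sym = gene[i]
        if sym == "x":
            values[i] = x
        elif sym == "1":
            values[i] = 1.0
        else:
            values[i] = FUNCS[sym][1](*(values[c] for c in children[i]))
    return values[0]


def mse(gene, data):
    return sum((evaluate(gene, x) - y) ** 2 for x, y in data) / len(data)


def mutate(gene, rng, rate=0.1):
    out = list(gene)
    for i in range(len(out)):
        if rng.random() < rate:
            # Functions are only allowed in the head, so genes stay decodable.
            out[i] = rng.choice(list(FUNCS) + TERMS if i < HEAD else TERMS)
    return out


def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(data, generations=50, pop_size=40, seed=0):
    rng = random.Random(seed)
    pop = [random_gene(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: mse(g, data))
        elite = pop[: pop_size // 4]  # keep the best quarter unchanged
        offspring = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + offspring
    return min(pop, key=lambda g: mse(g, data))


# Artificial data from a prescribed function, y = exp(x) + x.
DATA = [(k / 10, math.exp(k / 10) + k / 10) for k in range(-20, 21)]

if __name__ == "__main__":
    best = evolve(DATA)
    print("best gene:", "".join(best), "MSE:", mse(best, DATA))
```

Because functions are confined to the head and the tail supplies enough terminals, every gene produced by mutation or crossover decodes to a valid expression, which is what lets the search recombine structures freely. Whether a run recovers the prescribed function depends on the seed and settings.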

Citation: Ilie, I., Dittrich, P., Carvalhais, N., Jung, M., Heinemeyer, A., Migliavacca, M., Morison, J. I. L., Sippel, S., Subke, J.-A., Wilkinson, M., and Mahecha, M. D.: Reverse engineering model structures for soil and ecosystem respiration: the potential of gene expression programming, Geosci. Model Dev. Discuss., doi:10.5194/gmd-2016-242, in review, 2016.
Iulia Ilie et al.



Latest update: 23 May 2017
Short summary
Accurate representation of land-atmosphere carbon fluxes is essential for future climate projections, yet some responses of CO2 fluxes to climate remain uncertain. The increase in available data allows for new approaches to modelling these fluxes. We automatically developed models for ecosystem and soil carbon respiration using a machine learning approach. Compared with established respiration models, we found that they predict better and offer new insights.