1Deutsches Zentrum für Luft- und Raumfahrt, Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany
2Department of Mechanical Engineering of the Aristotle University Thessaloniki, Thessaloniki, Greece
3TNO Environment and Geosciences, Utrecht, The Netherlands
4Institut für Meteorologie, Freie Universität Berlin, Germany
5Department of Environment and Planning, University of Aveiro, Portugal
6Center for International Climate and Environmental Research (CICERO), Oslo, Norway
7Institut für Energie- und Klimaforschung: Troposphäre, Forschungszentrum Jülich, Germany
Abstract. We summarise results from a workshop on "Model Benchmarking and Quality Assurance" of the EU-Network of Excellence ACCENT, including results from other activities (e.g. COST Action 732) and publications. We present a formalised evaluation protocol, i.e. a generic formalism describing how to perform a model evaluation. The protocol comprises eight steps, which we illustrate with examples from global model applications. The first step, and an important one, concerns the purpose of the model application, i.e. the underlying scientific or political question being addressed. We give examples to demonstrate that there is no model evaluation per se, i.e. without a focused purpose: model evaluation tests whether a model is fit for its purpose. The subsequent steps are derived from the purpose and cover model requirements, input data, key processes and quantities, benchmark data, quality indicators, sensitivities, as well as benchmarking and grading. We define "benchmarking" as the process of comparing the model output against benchmark data, i.e. either observational data or high-fidelity model data. Special focus is given to uncertainties, e.g. in observational data, which can lead to wrong conclusions in the model evaluation if they are not considered carefully.
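As an illustration of the benchmarking step as defined above, the following minimal Python sketch compares model output with benchmark data and reports how often the model falls within the stated benchmark uncertainty. The function name `quality_indicators`, the choice of mean bias and root-mean-square error as quality indicators, and the representation of uncertainty as one absolute value per data point are assumptions made for illustration only; they are not prescribed by the protocol.

```python
import math


def quality_indicators(model, benchmark, benchmark_sigma):
    """Compare model output with benchmark data (hypothetical helper).

    model, benchmark: equally long sequences of co-located values.
    benchmark_sigma:  assumed absolute uncertainty of each benchmark value.
    """
    if not model or not (len(model) == len(benchmark) == len(benchmark_sigma)):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(model)
    diffs = [m - b for m, b in zip(model, benchmark)]

    # Mean bias: average signed difference between model and benchmark.
    bias = sum(diffs) / n
    # Root-mean-square error: typical magnitude of the difference.
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Fraction of points where the difference does not exceed the
    # benchmark uncertainty, i.e. where no disagreement can be claimed.
    within = sum(1 for d, s in zip(diffs, benchmark_sigma) if abs(d) <= s) / n

    return {"bias": bias, "rmse": rmse, "fraction_within_uncertainty": within}


# Illustrative numbers only, not taken from any model or observation.
result = quality_indicators(
    model=[1.0, 2.0, 3.0, 4.0],
    benchmark=[1.5, 2.0, 2.5, 5.0],
    benchmark_sigma=[0.5, 0.5, 0.5, 0.5],
)
print(result)
```

Reporting the fraction of points within the benchmark uncertainty alongside bias and RMSE reflects the point that differences smaller than the uncertainty of the benchmark data should not be read as model errors.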