Joint CO 2 state and flux estimation with the 4 D-Var system EURAD-IM

This idealized regional atmospheric inversion study assesses the potential of the 4-dimensional variational (4D-Var) method to estimate CO2 fluxes and the atmospheric CO2 concentration state jointly. In order to distinguish and quantify the surface-atmosphere CO2 fluxes, combining anthropogenic CO2 emissions, photosynthesis, and respiration, we include uncertainties of initial values, which arise from highly uncertain surface fluxes and night-time transport. Therefor a new calculation of the background error standard deviation for the CO2 fluxes was developed. To suppress spurious wiggles occurring from 5 advection, an absolute monotone advection scheme with low numeric diffusion and its adjoint has been implemented. The inversion by the EURopean Air pollution Dispersion-Inverse Model (EURAD-IM) with 5 km resolution in Central Europe is validated by synthetic half hourly measurements from eleven concentration towers. A significant improvement of the analysis is shown if initial values and CO2 fluxes are optimised jointly, compared to optimising CO2 fluxes alone, without estimating uncertainty of atmospheric concentration. We find that joint estimation of carbon fluxes and initial states requires a careful 10 balance of the background error covariance matrices but enables a more detailed analysis of atmospheric CO2 and the surfaceatmosphere fluxes.


Introduction
The quantification of terrestrial CO 2 sinks and sources is important in order to understand the carbon cycle and its fundamental role in the global climate system.On a global scale, approximately 30% of the anthropogenic emissions are removed by oceans (Wanninkhof et al., 2013) as well as 30% by terrestrial biosphere (Sitch et al., 2013), the latter one showing much larger temporal and spatial variability (Jung et al., 2011).
One widely used method for the estimation of the surface-atmosphere CO 2 exchange is atmospheric inverse modelling, using CO 2 concentration measurements and atmospheric transport models.On a global scale atmospheric inversions improve the quantification of natural and anthropogenic carbon fluxes (Tans et al., 1990;Enting et al., 1993;Rödenbeck et al., 2003;Rayner et al., 2008), although the uncertainty between studies is large, especially between continents (Gurney et al., 2002;Baker et al., 2006;Peters et al., 2010).
Locally the increase of spatio-temporal model resolution leads to even higher uncertainties of atmospheric inversion studies (Lauvaux et al., 2008;Broquet et al., 2011), mainly for two reasons: (1) A derivation of appropriate a priori surface carbon 1 Geosci.Model Dev. Discuss., doi:10.5194/gmd-2016-132, 2016 Manuscript under review for journal Geosci.Model Dev.Published: 19 August 2016 c Author(s) 2016.CC-BY 3.0 License.fluxes is difficult due to spatial heterogeneity.In particular European land use is highly variable since large parts of the terrestrial biosphere are actively managed by humans and are often embedded in densely populated areas, including cities and industrial areas (Peters et al., 2010).( 2) Atmospheric transport modelling introduces uncertainty to atmospheric CO 2 mixing ratio of several ppm due to advection (Lin and Gerbig, 2005) and vertical mixing (Gerbig et al., 2008).Although uncertainty from transport model errors is considerably lower than uncertainty from CO 2 surface fluxes (Tolk et al., 2009), the skills of the transport model are essential for CO 2 inversions.
Additionally to the mentioned difficulties of proper transport models and surface-atmosphere CO 2 fluxes one major obstacle for CO 2 inversions is the sparsity of representative concentration observations, e.g.Wu et al. (2016) addresses the question of optimal location of observations in inversion systems.To address uncertainty appropriately to the mentioned model processes and input data is among the most challenging issues for atmospheric CO 2 inversions.Tolk et al. (2011) compare the uncertainty reduction for different strategies of inverse carbon modelling, suggesting an optimisation of biosphere model parameters or a pixel inversion of respiration and photosynthesis on a regional scale.
To reduce uncertainty of a priori fluxes, additional carbon flux measurements from eddy covariance towers can be used .Lauvaux et al. (2008), Cooley et al. (2013), and Zhu et al. (2014) combine this so-called bottom-up approach with top-down atmospheric inversion in order to quantify regional carbon fluxes.
To reduce errors in atmospheric transport processes, different strategies are investigated.For example additional information from radon measurements (Broquet et al., 2011) or the energy balance (Tolk et al., 2009) are used to gain information of atmospheric CO 2 dispersion.Studies with an horizontal resolution of a few km (Ahmadov et al., 2007(Ahmadov et al., , 2009;;Pillai et al., 2011) increase confidence in modelled CO 2 mixing ratios compared to coarser resolution, although comparisons of mesoscale models reveal discrepancies for the vertical boundary layer (Sarrat et al., 2007;Geels et al., 2007).Uncertainty of vertical model transport is considerably higher during night due to uncertain mixing height (Lauvaux et al., 2008;Dolman et al., 2009;Kretschmer et al., 2014), leading to a poor representativity of concentration during morning hours.Still, optimisation of initial conditions is often mitigated by long spin-up periods or is investigated at a coarse subspace (Peylin et al., 2005;Lauvaux et al., 2008).
This study investigates how CO 2 flux uncertainty from inversions can be reduced by a proper accounting of uncertainty of atmospheric concentration.The feasibility and limitations of a comprehensive 4D-Var inversion system to analyse anthropogenic emissions, photosynthesis, ecosystem respiration, and initial values at each grid cell jointly, are validated in this study.This paper is organised as follows.Section 2 delivers the theoretical background of the joint optimisation of fluxes and initial values, Sect. 3 describes the implementation into the EURAD-IM.Numerical experiments validating the impact of erroneous initial values are presented and discussed in Sect. 4. Section 5 summarises the results.

Theory for joint variational assimilation
In this study we want to optimise two parameters with the 4D-Var data assimilation method: initial values and flux factors jointly.While the proceeding of optimising initial values is well-known, we want to visualise the basic concept of optimising  Memmesheimer et al. (1995) and spatial averaged biogenic fluxes calculated with WRF-CLM at 24 July 2012 (absolute weights sum up to 1).An example with an analysed flux factor of 0.75 for photosynthesis is shown, decreasing the total amount but preserving the profile.flux factors and joint optimisation.The main idea (Elbern et al., 2000) is to reduce the degree of freedom of the flux rate space by not optimising the fluxes at each time step t i .Rather it is pointed out that, due to the better knowledge of the diurnal cycle of fluxes, one efficient parameter to optimise is their diurnal amplitude, as simulated by the forward model.
This study optimises three fluxes, anthropogenic emissions, photosynthesis, and respiration.For simplicity we will define the flux space to be of the same dimension as the state space (R n ).The flux rate vector f ∈ R n scales the a priori knowledge of the background flux U b by an optimisation factor per grid point and per flux type, while the relative diurnal profile variation remains unchanged.For notational convenience we define U b ∈ R n×n to be a diagonal matrix.
The actual fluxes are thus (diag(a) designates a diagonal matrix with entries of the vector a) with U i+1/2 denoting the fluxes in the time interval [t i , t i+1 ] introduced instantaneously at t i+1/2 .For notational convenience subindices in this paper will be used exclusively to identify discrete time.The simulated time period is defined as Figure 1 shows the diurnal cycle of the three CO 2 fluxes, anthropogenic emissions, photosynthesis, and respiration, where the photosynthesis is exemplarily rescaled.The actual parameter of optimisation will be g := ln(f ) due to two reasons: (1) The transformation results in analog partial costs for flux factors g and 1/g.(2) Since f > 0 it can be described by a log-normal distribution, resulting in a Gaussian probability density function for g.Fletcher and Zupanski (2006) and Fletcher (2010) describe 4D-Var systems with hybrid Gaussian and log-normal distribution in general.
The idea for the main advantage of joint optimisation of initial values and flux factors is depicted in Fig. 2. While the influence Optimising only initial values leads to strong forcing at the start of the assimilation window.Optimising only flux rates can not change the initial values and may force the analysis to give more weight at the starting time than desirable.
of optimised initial values decreases with longer forecast period, optimised flux factors can hardly adjust a mismatch between model and observation at the beginning of a forecast.
In the following we briefly present the theoretical background and the formulation of the cost function J , following Elbern et al. (2000) and Elbern et al. (2007).We assume a background field x b 0 ∈ R n , which is the initial concentration value of the passive tracer CO 2 , and g b = (0, . . ., 0) ∈ R n , the background logarithmic flux factors, are given.Our a priori knowledge about the fluxes is actually represented by the flux strength U b t 1/2 ,...,t N −1/2 given in µmol CO2 m 2 s .If [t 0 , . . ., t N ] denotes the assimilation window considered, we set further observations y i ∈ R pi at any time step t i .
In the following subscripts i always refer to time t i .The background field x b 0 evolves from time t 0 to t i by the use of the model M 0,i : (2) The observation operator H i projects the model state and the flux factor from R 2n to the observation space R pi .This mapping H i allows to compare observation y i with the model equivalent H i (x b i , g b ), by the innovation vector d i (3) Following Courtier et al. (1994) we define the cost function J in dependence of the increments, which is the difference between the optimisation parameters and the a priori knowledge of initial values and flux factors, B ∈ R n×n , K ∈ R n×n , and R i ∈ R pi×pi respectively.The cost function J can now be defined as The operators H i ∈ R 2n×pi , M 0,i ∈ R 2n×2n are tangent linear approximations with for sufficiently small δx 0 and δg.Consequently the gradient of J is where (M 0,i ) T and H T i are the adjoint operators, with (M 0,i ) T modelling backwards in time from t i to t 0 .Since the construction of the adjoint model uses the derivative with respect to the optimisation parameter we have to distinguish between the adjoint model related to initial values x 0 and flux factors g.See Appendix A for a detailed derivation of (M 0,i ) T with respect to the initial values and flux factors.A careful balancing of the background error covariance matrices (BECM) B and K and the matrix R for the observation errors is decisive for the analysis.

Model description
The model applied here is derived from the multiscale EURopean Air pollution Dispersion-Inverse Model (EURAD-IM), which is a state of the art Chemistry Transport Model (CTM) (Elbern et al., 2007;Goris and Elbern, 2015) including gas phase, aerosols and pollen.
The model domain uses the Lambert conformal projection, where the center of the coarse domain is located at 54 • N latitude, 12.5 • E longitude and has the staggered Arakawa C-grid structure in this study.A nesting configuration with a horizontal resolution of 15×15 km 2 and 5×5 km 2 (Fig. 3 (a)) is used.The model domain includes 23 vertical levels using terrain following σ coordinates up to 100 hPa.For the parent domain constant lateral boundary conditions are taken, the usage of lateral boundary values from global as provided in Copernicus Atmosphere Monitoring Service (CAMS) is feasible as well.
For simplicity the background concentration of CO 2 for all domains is assumed to be constant 386 ppmV.

Observations
The measurement towers used in this study are part of the FLUXNET (http://fluxnet.ornl.gov/ ) network, with a spatial focus on the TR32 Rur catchment area (http://tr32new.uni-koeln.de/) in western Germany.All stations are listed in detail in Table  sites, see also Table 1. 1. Measurements are taken every 30 minutes and the data will be assimilated between 06 and 18 UTC, resulting in 340 observations in total.Except from two towers (BIS and WUE) all towers provide measurements below 30 m, which is in the surface layer of our model.Four towers provide also measurements in several vertical heights.To ease comparison no distinction between high altitude stations as Ochsenkopf and flat stations is made here.

Meteorological Input
The EURAD-IM uses the Weather Research and Forecasting (WRF) model version 3.6.1 (Skamarock et al., 2008), a mesoscale numerical weather prediction system as meteorological driver, to provide hourly meteorological input.The necessary meteo- Information System (GIS).

Biogenic CO 2 fluxes
Hourly photosynthesis, leaf-, and soil-respiration are calculated using the Community Land Model (CLM) version 4.0 (Oleson et al., 2010), which is used as land surface scheme for WRF.While photosynthesis is already included in this configuration of CLM according to Collatz et al. (1991) and Farquhar et al. (1980), leaf respiration was added according to the same publications.While leaf respiration is in dependence of the maximum carboxylation capacity, the calculation of photosynthesis depends further on the internal leaf CO 2 and O 2 pressure, the amount of radiation, the availability of the enzyme Rubisco, and the Michaelis-Menten constants for CO 2 and O 2 .The Leaf Area Index (LAI) is monthly Moderate Resolution Imaging Spectroradiometer (MODIS) data, averaged for the years 2001-2010, provided as a part of the WRF preprocessing system.
The CLM sub-grid hierarchy is simplified to allow only one Plant Functional Type (PFT) per grid cell, following Oleson et al.
Soil respiration is implemented as an Arrhenius type equation in dependence of temperature of soil layer four (ca.11-15 cm depth), following Lloyd and Taylor (1994, Eq. (8)).Other CO 2 fluxes from the biosphere are neglected in this work.

Forward model of EURAD-IM
The transport-diffusion equation which describes the dispersion of the mixing ratio of CO 2 , C(r, t) given in dependence of spatial location r ∈ R 3 and time t ∈ [t 0 , t N ], as a passive tracer can be written as (Elbern and Schmidt, 2001) The equation includes terms for advection, turbulent diffusion, and additional fluxes.Here u designates the wind field, the air density and K e is the eddy diffusion tensor.The sinks and sources to the atmosphere, consisting of anthropogenic emissions, biogenic respiration and photosynthesis are aggregated to the flux e.
The EURAD-IM forward model discretises Eq. ( 8) with an operator splitting approach due to different numerical characteristics of the major transport processes advection and turbulent diffusion (Yanenko, 1971).The operators for advection and diffusion are processed for time steps t i to t i+1/2 and t i+1/2 to t i+1 while fluxes are inserted instantaneously at time points t i+1/2 .A symmetric order of the operators where the flux operator is defined as (background fluxes U b are given in µmol CO2 m 2 s and t := t i+1 − t i is given in seconds as well) We remember that U b i+1/2 f represents actually three fluxes, each with an independent flux factor for anthropogenic emissions, photosynthesis, and respiration respectively.
Tests with a non-monotone advection scheme display strong deterioration of CO 2 mixing ratio if strong spatial gradients of the wind strength occur, see Fig. 5.To account for this problem the Walcek scheme (Walcek, 2000) is used to solve the advection process.Applying dimensional dependent densities, allows splitting of the forward equation into one-dimensional realisations.
The Walcek scheme solves the 1-dim transport equation based on the van Leer algorithm (Van Leer, 1977) with monotonic constraints and a flux adjustment around local mixing ratio extrema.This aggregates mass numerically and reduces numerical diffusion at extrema.It is absolute monotone, positive definite, of order two, produces only low numeric diffusion, and is numerical efficient.Although an exact parallelisation of the Walcek scheme is numerically not efficient, a good approximation with a small overlap of two grid cells was realised, which preserves the numerical efficiency.

Adjoint EURAD-IM model
The adjoint model M T discretises the adjoint of the forward PDE, Eq. ( 8).Let C * designate the adjoint variable, then the adjoint PDE has the form where φ represents perturbation from the reference CO 2 flux e of Eq. ( 8).
Although the flux factor f = e g is constant during the forward run, the adjoint flux factor f * changes with (M) T during the background run and, in contrast to Eq. ( 9 operator splitting to solve Eq. ( 12), : with T i,i+1 merging the transport operators from time t i to t i+1 See Appendix A for a detailed derivation of Eq. ( 13).
The adjoint diffusion and flux operators D T and (F f ) T are constructed with the automatic differentiation tool Tapenade (Hascoet and Pascual, 2013) from the discretisation of the forward PDE Eq. ( 8): The derivative of the functions specified by D and F f are calculated applying the chain rule to get the tangent linear of the forward operators.This linearised code is transposed afterwards resulting in D T and (F f ) T .
To construct the adjoint of the advection operators (A 1 ) T and (A 2 ) T a different approach has been chosen.The Walcek scheme is not differentiable due to the monotony and automatic differentiation is not directly applicable to obtain adjoint code.Following the suggestion of Gou and Sandu (2011) the continuous adjoint for advection has been used.The adjoint PDE of Eq. ( 11) can be also used by the forward Walcek scheme, if the reverse winds and a rescaling for C * is used.The work cited above also suggests for 4D-Var optimisation problems to construct adjoint routines for diffusion and advection, as it is done here: automatic differentiation for diffusion and the continuous adjoint for advection.

Preconditioning of the cost function and minimisation
The background error covariance matrices (BECM) B and K are designed to capture the spatial correlation between initial values and surface fluxes respectively.Increasing radii of influence lead to larger condition numbers of B, which makes efficient minimisation more difficult.Using the increments δx 0 and δg multiplied with the inverse square roots of B and K, a formulation of J equivalent to Eq. ( 4) is possible: The gradient of J with respect to v and w reads then 1.The forward run, solving Eq. ( 8).Our a priori knowledge x b 0 , g b is used as first guess, allowing to apply only the square root (and its transposed) of the BECM's B and K.
2. The calculation of the gradient of J during the backward time loop and its preconditioning resulting in: 3. The usage of limited memory quasi-Newton minimiser L-BFGS (Liu and Nocedal, 1989;Nocedal, 1980), modified for parallel usage, which calculates a new vector (v T , w T ) T and saves it for the next iteration.
4. The determination of improved initial states x 0 = B 1/2 v − x b 0 and flux factors g = K 1/2 w − g b that can now be used iteratively for step 1.
Steps 1 to 4 are repeated until the convergence criterion is fulfilled.In practice the two most recent evaluations of the cost function must differ by less than 1.0 E−6.

Background error covariance matrices modelling
Both BECM's B and K are modelled using the diffusion approach.Following Weaver and Courtier (2001) a generalised diffusion equation (GDE) is formulated with a polynomial in the Laplacian, which is self-adjoint.This allows a factorisation of B into B 1/2 B T/2 and analog for K. Also anisotropic and inhomogeneous spatial radii of influence can be modelled in this manner.

Initial value background error covariance matrix
The modelling of B is the same as in Elbern et al. (2007) and is given in short form here for convenience.
The background error standard deviation depends on the initial concentration with c IV = 0.025.The decomposition of B (and analog for K) is given by where Σ is the diagonal matrix of standard deviations, C is the correlation matrix with Λ a diagonal normalisation matrix and L v,h the vertical and horizontal diffusion operator.W is also a diagonal matrix accounting for the extension of each vertical layer k = 1, . . ., k max = 23.13) iteratively and the transport operator T (Eq.( 14)), we derive with an equidistant time interval [t 0 , . . ., t N ]: t := t i+1 − t i for i = 0, . . ., N − 1, see Appendix A for a detailed derivation.
Thus the adjoint model with respect to the flux factor is proportional to U b (T) T .
With regard to Eq. ( 17) the quantity K T/2 U b (T) T is proportional to the observational part of the gradient of the preconditioned cost function.We want to construct the diagonal of K such that we achieve the following properties of the 4D-Var system: 1. Priority of optimisation of fluxes is ordered according to their strength.
2. Satisfactory sensitivity of the optimisation system to fluxes with small absolute amount.

Robustness of the system.
A compromise between counteracting properties 2 and 3 has to be accomplished.Identical twin experiments not presented here indicate that for constant diagonal of K, the 4D-Var system EURAD-IM shows a tendency to optimise only the largest fluxes, but is too conservative for smaller fluxes and thus property 2 is not satisfied.This especially holds true for anthropogenic emissions, dominated by few very large point sources.
Using a specific construction for the standard deviation of K in dependence of enables an approximation of ∇J which fulfills the three properties above see Appendix B for a derivation of Eqs. ( 24) and (25).Equation ( 24) is implemented for each of the three fluxes considered, thus the constant c and max s |U b (s)| are with respect to anthropogenic emissions, photosynthesis, and respiration respectively.
Thus, for l = 1 (i.e. the diagonal of K is constant) the right hand side of Eq. ( 25) depends linearly on U b .With increasing l the sensitivity of the system to smaller fluxes increases while the robustness decreases.
After several tests l = 4 has been chosen for anthropogenic emissions and l = 2 for photosynthesis and respiration, which is caused by different spatial heterogeneity of the three fluxes.The off-diagonal elements of K are calculated with the diffusion approach analog to the technique outlined for matrix B. A correlation of 0.4 is used between photosynthesis and respiration, whereas no correlation is assumed between anthropogenic emissions and the two biogenic fluxes.

Numerical experiments with synthetic data
The general concept of identical twin experiments is to execute a forward run, of which selected values are used as synthetic or artificial observations.This run is commonly called the nature run.The first guess, also called a priori run, starts thereafter with disturbed initial values and flux rates.It is also called the background run.The identical twin analysis procedure aims to assess how well the 4D-Var analysis is able to reconstruct the disturbed parameters, assimilating the given synthetic observations of varying completeness.The model run with optimised parameters is called the analysis run.Daley (1991) values the benefits of identical twin experiments, however concludes, they "err on the optimistic side".
Numerical experiments are executed with four different background configurations listed in Table 2.For each configuration two experiments are executed.One optimising (1) only flux factors for anthropogenic emissions, biogenic respiration, and photosynthesis and another optimising (2) initial values and flux factors jointly for each grid cell.We call (1) hereinafter "only flux factor analysis" and (2) "joint analysis".
The model domain encompasses central Europe with 5 km resolution (see Fig. 3).After a model spinup time of 30 h spanning 23 July 00 UTC to 24 July 2012 06 UTC a 12 h 4D-Var analysis run is initialised from 06-18 UTC.Synthetic measurements from the nature run are taken every 30 minutes (the model time step is 120 s) at eleven stations (see Fig. 3 and Table 1).
The observation errors are constantly 2 ppmV and the error covariance matrices R i are diagonal.The standard deviation for initial background values is x b 0 (k) • c IV (Eq.( 18)).The standard deviation of the flux factors (i.e. the constant c in Eq. ( 24)) equals 1.4 for anthropogenic emissions, 0.6 for photosynthesis, and 0.8 for respiration respectively.The matrix K is modelled with l = 4 (see Sect. 3.8) for anthropogenic emissions and with l = 2 for biogenic fluxes.The reason is that the spatial distribution of flux variations is much smoother for biogenic fluxes compared to anthropogenic emissions.While the amount of biogenic fluxes at different grid cells has mostly the same order of magnitude, the main contribution to anthropogenic emissions is caused by very few power plants, e.g Weisweiler and Niederaußem in the Rur catchment area.Figure 6 shows a difference plot of configuration 4 of the NmA (nature run minus joint analysis run) initial values in different vertical layers.White spots indicating compliance of the analysis and nature run.The overall initial value correction is very good with a small overestimation at the lower layers.Due to the setup of the experiment with sparse observations the meteorology is decisive for the distribution of corrections in the analysis run.Since easterly winds are predominant during the assimilation window, correction of the initial values occurs at all measurement stations and to the east of them.The vertical boundary layer rises up to 2000 m, such that correction of initial values can be seen up to this height (Fig. 6 (h)).Artificial overestimation can be seen in layer 12 (Fig. 6 (g)) southeast of Ochsenkopf.Additional tests not presented here, showed that a longer assimilation window increases this artificial deterioration of the analysis.Due to temporal high frequency measurements and the diffusive nature of atmospheric transport the LBFGS amplifies small wiggles, primarily during later iterations, resulting in an artificial over-and underestimation.
A comparison of the assimilated flux factors for configuration four, calculated once with joint analysis and once with only flux factor analysis, illustrates the benefits of the joint analysis.The results of the joint analysis in the left column of Fig. 7 are discussed first.
The flux factor analysis of anthropogenic emissions is dominated by a few but large point sources.In the Rur catchment area, due to easterly winds, the power plant Niederaußem is captured by the observation sites Jülich and Selhausen and is better reanalysed than Weisweiler, which is located a few kilometres downstream from both observation sites.The optimised flux factors of biogenic respiration (Fig. 7 (c)) and photosynthesis (Fig. 7 (e)) meet the nature run very well.Although slight underestimation persists, the flux factors of the joint analysis are mainly between background and nature run.
The analysis of the only flux factor optimisation has a stronger overestimation of anthropogenic emissions as compared to the joint analysis due to 2 ppmV higher initial values of the background run (Fig. 7 (a), (b)).Both biogenic fluxes are underestimated in large areas, as observations at early times result in a too strong forcing of photosynthesis, decreasing also biogenic respiration due to later observations.In the area of the observation stations Jülich and Selhausen, biogenic respiration is also overestimated (Fig. 7 (d)), a clear deterioration of analysis performance compared to the joint analysis.
A comparison of the cost reduction for the four configurations (listed in Table 2) for joint optimisation and only flux factor optimisation is shown in Fig. 8.For each configuration the cost function shows a stronger decrease (by a factor of ≈ 20) for the joint optimisation compared to the flux factor optimisation.Time-series of CO 2 concentrations for configuration 4 are depicted in Fig. 9.In general a good compliance of the nature run and joint analysis at the observation sites is seen.The CO 2 concentration of the background run and the only flux factor analysis at the initial time is always two ppmV higher than the nature run.Surprisingly also the only flux factor analysis is often in accordance with the nature run, except at Selhausen, Jülich, and Hegyhatsal.As can be seen from Fig. 7 2.
shows clearly that just an evaluation of the time-series does not guarantee a sufficient analysis.
The vertical concentration profiles at the last hour of the assimilation window of the towers with measurements in higher layers We can now write in accordance to Eq. ( 9) and ( 10) as F is equivalent to Eq. ( 10).Using Eq. ( A3), the adjoint model can be derived which is Eq. ( 13).Equation (A6) can now be used iteratively for the calculation of (M 0,i ) T .For the sake of shorter notation we write which is Eq. ( 23).
Appendix B: Derivation of the diagonal of K In this Sect.the specific construction of the diagonal of K (Eq.( 24)) and an approximation of the gradient of the cost function (Eq.( 25)) is shown.First an approximation for the flux factor part of such that we get Eq.( 25).As already mentioned in Sect.3.6 and 3.8.2 the notation presented here is straightforward to apply also for three fluxes but is avoided here for the sake of clarity.

Figure 1 .
Figure 1.The diurnal time profile of power generation given by Memmesheimer et al. (1995) and spatial averaged biogenic fluxes calculated

Figure 2 .
Figure 2. If a priori initial values and flux rates are defective, the joint optimisation combines the possibility of analysing both error sources.

Figure
Figure 3. Topography of (a) the nests with 15 and 5 km resolution and (b) the 5 km domain with the location of the synthetic observation Figure 3. Topography of (a) the nests with 15 and 5 km resolution and (b) the 5 km domain with the location of the synthetic observation rological initial and lateral boundary data are taken from six-hourly ECMWF (http://www.ecmwf.int)reanalysis, interpolated to 0.225 • × 0.225 • .The most important variables are wind, temperature, shortwave radiation and planetary boundary layer height.3.3 Anthropogenic CO 2 emissions Anthropogenic emissions, introduced as first guess for inversion, are taken from Toegepast Natuurwetenschappelijk Onderzoek (TNO) inventory (H.Denier van der Gon, personal communication, May 2013), following the methodology outlined in Kuenen et al. (2014) and Pouliot et al. (2012) for air pollutants.It provides point and area sources of CO 2 splitted into ten source categories for Europe for the reference year 2005.Annual values are disaggregated for each source category in dependence of month, day and hour.See also Memmesheimer et al. (1995) for details.Hourly values are then interpolated to model time step.Area sources are scaled down to 5 km resolution, using land cover and infrastructure information with the Geographic

Figure 4 .
Figure 4. WRF-CLM output of photosynthesis, leaf and soil respiration at 23 July 2012 13 UTC with 5 km horizontal resolution.

Figure 5 .
Figure 5. Zoom of the 5 km horizontal resolution grid, started with constant mixing ratio after four hour run-time at a high vertical level (∼ 373 − 282 hPa) barely influenced from surface fluxes.The wind field is normalised to 30 m/s.
Geosci.Model Dev.Discuss., doi:10.5194/gmd-2016-132,2016 Manuscript under review for journal Geosci.Model Dev.Published: 19 August 2016 c Author(s) 2016.CC-BY 3.0 License.3.8.2Flux factor background error standard deviation We present the variance modelling for the flux factor BECM K. Joint optimisation of initial values and flux factors requires a distinction of the adjoint models with respect to the two optimisation parameters.Although the flux factor f = e g is constant during one iteration, the adjoint flux factor f * changes with (M) T depending on the background flux strength U b .We want to investigate the influence of U b on ∇J hereinafter.Using Eq. (

13
Geosci.Model Dev.Discuss., doi:10.5194/gmd-2016-132,2016 Manuscript under review for journal Geosci.Model Dev.Published: 19 August 2016 c Author(s) 2016.CC-BY 3.0 License.The non-diagonal entries of B and K are calculated using an influence radius of 30 km at the bottom layer, 45 km at the top of the modelled planetary boundary layer and 60 km at the top of the model domain.The L-BFGS optimisation is carried out for at most 30 iterations.
the correction of the flux factors close to Wuestebach, Cabauw, and Ochsenkopf, from the optimisation of only flux factors is wrong.This

Figure 6 .
Figure 6.Zoom of the difference from initial values NmA (nature run minus joint analysis run) of configuration 4 for different layers (black crosses show synthetic observation sites).The given heights are valid for the U.S. standard atmosphere.

Figure 7 .Figure 8 .
Figure 7. Analysed flux factors of anthropogenic emissions (first row), biogenic respiration (second row), and photosynthesis (third row).The analysis at the left column shows results of the joint initial value and flux factor assimilation while the right column shows an analysis which optimised only flux factors.The two biogenic fluxes are shown at the surface level, while the anthropogenic emissions are given at ≈ 270 m height, which is the layer with the highest impact due to power plants.Green plus signs indicate the location of the two biggest power plants Niederaußem and Weisweiler.

Figure 9 .
Figure 9. Time-series of CO2 concentration of configuration four at observations sites for observations (red), background run (black), joint analysis (blue) and flux factor (FF) analysis (green).

Table 2 .
Configuration of background initial states and flux factors, applied for anthropogenic emissions, photosynthesis, and respiration.