CarbonTracker

The StateVector and EnsembleMember Object

Revision History: File created on 28 Jul 2010.

The module statevector implements the data structure and methods needed to work with state vectors (a set of unknown parameters to be optimized by a DA system) of different lengths, types, and configurations. Two baseclasses together form a generic framework:

  • StateVector
  • EnsembleMember

As usual, specific implementations of StateVector objects are done through inheritance from these baseclasses. For an example of designing your own StateVector class, we refer to Chapter 5: Modifying the state vector.

class da.baseclasses.statevector.StateVector(DaCycle=None)

The StateVector object first of all contains the data structure of a statevector, defined by 3 attributes that define the dimensions of the problem in parameter space:

  • nlag
  • nparameters
  • nmembers

The fourth important dimension nobs is not related to the StateVector directly but is initialized to 0, and later on modified to be used in other parts of the pipeline:

  • nobs

These values are set as soon as Initialize() is called from The Inverse Pipeline. Additionally, the attribute isOptimized is set to False, indicating that the StateVector holds a-priori values and has not yet been modified by The Optimizer Object.
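The bookkeeping described above can be sketched as follows. This is a minimal stand-in and not the actual da.baseclasses.statevector code: the class name ToyStateVector, the initialize method that takes the dimensions as arguments, and the list-based storage are all assumptions made for illustration.

```python
class ToyStateVector:
    """Hypothetical stand-in for StateVector; only the dimensions are modelled."""

    def initialize(self, nlag, nmembers, nparameters):
        # The three dimensions of the problem in parameter space.
        self.nlag = nlag
        self.nmembers = nmembers
        self.nparameters = nparameters
        # nobs starts at 0 and is modified later, elsewhere in the pipeline.
        self.nobs = 0
        # False means the values are still a-priori (not yet optimized).
        self.isOptimized = False
        # One slot per lag; each slot will later hold nmembers ensemble members.
        self.ensemble = [[] for _ in range(nlag)]


sv = ToyStateVector()
sv.initialize(nlag=5, nmembers=10, nparameters=3)
print(sv.nlag, sv.nmembers, sv.nparameters, sv.nobs, sv.isOptimized)
```

In the real class, Initialize() is called without arguments by the pipeline; passing the dimensions explicitly here only keeps the sketch self-contained.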

StateVector objects can be filled with data in two ways:

  1. By reading the data from file
  2. By creating the data through a set of method calls

Option (1) is invoked using the method ReadFromFile(). Option (2) consists of a call to the method MakeNewEnsemble().

Each option makes use of the same call to GetNewMember(), to create a data container to be filled: the EnsembleMember.

Once the StateVector object has been filled with data, it is used in the pipeline and only two more methods are invoked from there:

  • Propagate(), to advance the StateVector from t=t to t=t+1
  • WriteToFile(), to write the StateVector to a NetCDF file for later use

The methods are described below:

Initialize()

Initialize the object by specifying the dimensions

ReadFromFile(filename)
Parameters:filename – the full filename for the input NetCDF file
Return type:None

Read the StateVector information from a NetCDF file and put it in a StateVector object. In principle the input file will have only two datasets inside, called:

  • meanstate, dimensions [nlag, nparameters]
  • ensemblestate, dimensions [nlag, nmembers, nparameters]

This NetCDF information can be written to file using WriteToFile().

WriteToFile(filename)
Parameters:filename – the full filename for the output NetCDF file
Return type:None

Write the StateVector information to a NetCDF file for later use. In principle the output file will have only two datasets inside, called:

  • meanstate, dimensions [nlag, nparameters]
  • ensemblestate, dimensions [nlag, nmembers, nparameters]

This NetCDF information can be read back into a StateVector object using ReadFromFile().

MakeNewEnsemble(lag, covariancematrix=None)
Parameters:
  • lag – an integer indicating the time step in the lag order
  • covariancematrix – a matrix to draw random values from
Return type:None

Make a new ensemble. The argument lag refers to the position in the lagged state vector, which is counted as [1, 2, 3, 4, 5, ..., nlag]. Note that lag=1 corresponds to an index of 0 in Python, hence the notation lag-1 in the indexing used by this method.

The optional covariancematrix argument holds a matrix of dimensions [nparameters, nparameters] from which the ensemble members are drawn. If this argument is not passed, it will be substituted with an identity matrix of the same dimensions.
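One common way to draw members from a covariance matrix is to multiply standard-normal random vectors by a Cholesky factor of that matrix. The sketch below illustrates this together with the lag-1 indexing and the identity default. It is not taken from the CarbonTracker source: the function names, the nested-list storage, the explicit rng argument, and the choice to draw zero-mean deviations are assumptions, and only the Python standard library is used to keep the example self-contained.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def make_new_ensemble(ensemble, lag, nmembers, nparameters,
                      covariancematrix=None, rng=None):
    """Fill slot lag-1 of `ensemble` with nmembers random parameter vectors."""
    rng = rng or random.Random()
    if covariancematrix is None:
        # Default: identity matrix of dimensions [nparameters, nparameters].
        covariancematrix = [[1.0 if i == j else 0.0 for j in range(nparameters)]
                            for i in range(nparameters)]
    L = cholesky(covariancematrix)
    members = []
    for _ in range(nmembers):
        z = [rng.gauss(0.0, 1.0) for _ in range(nparameters)]
        # x = L z has covariance L L^T = covariancematrix.
        members.append([sum(L[i][k] * z[k] for k in range(i + 1))
                        for i in range(nparameters)])
    ensemble[lag - 1] = members  # lag is 1-based, Python lists are 0-based


nlag, nmembers, nparameters = 3, 4, 2
ensemble = [None] * nlag
make_new_ensemble(ensemble, lag=1, nmembers=nmembers, nparameters=nparameters,
                  covariancematrix=[[4.0, 2.0], [2.0, 3.0]],
                  rng=random.Random(0))
```

After the call, only ensemble[0] (lag=1) is filled; the slots for lags 2 and 3 are left untouched.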

GetNewMember(memberno)
Parameters:memberno – an integer indicating the ensemble member number
Return type:an empty ensemblemember object

Return an ensemblemember object

Propagate()
Return type:None

Propagate the parameter values in the StateVector to the next cycle. This means a shift by one cycle step for all states that will be optimized once more, and the creation of a new ensemble for the time step that just comes in for the first time (step=nlag). In the future, this routine can incorporate a formal propagation of the statevector.
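The shift described above can be illustrated with a plain Python list that holds one entry per lag. This is only a sketch of the idea: the function propagate, its make_new callback, and the string placeholders are hypothetical and stand in for the real per-lag ensembles and for the call that creates a new ensemble.

```python
def propagate(ensemble, make_new):
    """Shift lagged ensembles by one step and append a fresh one at lag nlag."""
    nlag = len(ensemble)
    # Lags 2..nlag become lags 1..nlag-1; the oldest lag drops out of the window.
    shifted = ensemble[1:]
    # The time step entering the window for the first time gets a new ensemble.
    shifted.append(make_new(nlag))
    return shifted


ensemble = ["week1", "week2", "week3"]  # placeholders for per-lag ensembles
ensemble = propagate(ensemble, lambda lag: "new@lag%d" % lag)
print(ensemble)  # ['week2', 'week3', 'new@lag3']
```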

class da.baseclasses.statevector.EnsembleMember(membernumber)
An ensemble member object consists of:
  • a member number
  • parameter values
  • an observation object to hold sampled values for this member

Ensemble members are initialized by passing only an ensemble member number; all data is added by methods from the StateVector. Ensemble member objects have almost no functionality except to write their data to file using the method WriteToFile().

__init__(membernumber)
Parameters:membernumber – integer ensemble member number
Return type:None

An EnsembleMember object is initialized with only a number, and holds two attributes as containers for later data:

  • ParameterValues, will hold the actual values of the parameters for this member
  • ModelSample, will hold an Observation object and the model samples resulting from this member's data

WriteToFile(DaCycle)
Return type:None

Write the EnsembleMember information to a NetCDF file for later use. The standard output filename is parameters.DDD.nc, where DDD is the number of the ensemble member. The standard output file location is the dir.input of the DaCycle object. In principle the output file will have only one dataset inside, called parametervalues, which is of dimensions nparameters. This dataset can be read and used by an ObservationOperator object.

Note

If more or other information is needed to complete the sampling of the ObservationOperator, you can simply inherit from the baseclass EnsembleMember and override this WriteToFile function.
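A rough sketch of such an override is shown here. To stay runnable without CarbonTracker or a NetCDF library, it defines its own stand-in EnsembleMember that writes plain text rather than NetCDF. The dict-style access DaCycle["dir.input"], the three-digit zero padding of DDD, the subclass name MyEnsembleMember, and the extra.DDD.txt file are all assumptions for illustration only.

```python
import os
import tempfile


class EnsembleMember:
    """Stand-in for the baseclass; writes plain text instead of real NetCDF."""

    def __init__(self, membernumber):
        self.membernumber = membernumber
        self.ParameterValues = None
        self.ModelSample = None

    def WriteToFile(self, DaCycle):
        # Assumed naming: parameters.DDD.nc with DDD zero-padded to 3 digits.
        name = "parameters.%03d.nc" % self.membernumber
        with open(os.path.join(DaCycle["dir.input"], name), "w") as f:
            f.write(" ".join(str(v) for v in self.ParameterValues))


class MyEnsembleMember(EnsembleMember):
    """Hypothetical subclass that writes one extra file per member."""

    def WriteToFile(self, DaCycle):
        # Keep the standard output, then add whatever else the sampling needs.
        super().WriteToFile(DaCycle)
        name = "extra.%03d.txt" % self.membernumber
        with open(os.path.join(DaCycle["dir.input"], name), "w") as f:
            f.write("member %d\n" % self.membernumber)


with tempfile.TemporaryDirectory() as tmp:
    member = MyEnsembleMember(7)
    member.ParameterValues = [0.9, 1.1]
    member.WriteToFile({"dir.input": tmp})
    written = sorted(os.listdir(tmp))
print(written)  # ['extra.007.txt', 'parameters.007.nc']
```

Calling the parent method first keeps the standard parameter file intact, so an existing ObservationOperator that expects it would not need to change.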