Guide to config-files

General remarks

Instead of .csv files als in previos versions, now .toml markup is used. An overview is provided here https://github.com/toml-lang/toml

Setup sample files

To illustrate the configuration, some sample files are needed.

# make a folder in the the directory as larda, larda-cfg, and larda-connectordump
mkdir -p example-data/categorize/2018 && cd example-data/categorize/2018/
wget http://devcloudnet.fmi.fi/cnet/limassol/processed/categorize/2018/20180208_limassol_categorize.nc
cd ../../../
mkdir -p example-data/classification/2018 && example-data/classification/2018
wget http://devcloudnet.fmi.fi/cnet/limassol/products/classification/2018/20180208_limassol_classification.nc

Campaigns config

The larda-cfg/campaigns.toml config file is used to provide the general context of the data.

[lacros_cycare_example]
    location = "Limassol"
    coordinates = [34.677, 33.038]
    altitude = 11
    mira_azi_zero = 154
    duration = [["20161018", "today"]]
    systems = ["CLOUDNET", "MIRA"]
    system_only.CLOUDNET =  [['20161101', '20190107'], ['20180101', '20180214']]
    system_only.MIRA =  [['20161101', '20180401']]
    cloudnet_stationname = 'limassol'
    info_text_loc = 'default'
    #info_text_loc = 'info_lacros.toml'
    param_config_file = 'params_cycare_example.toml'
    connectordump = '/home/larda3/larda-connectordump/'

Parameter config

The second level of configuration defines the parameters for each system in a file such as params_cycare_example.toml. The file lists different systems, such as MIRA, CLOUDNET or POLLYNET. Each systems’ config has three parts. An example is given below.

path

Defines the paths, where to find the (netcdf) files containing the parameters.

base_dir
directory to start filesearch
matching_subdirs
regex, that describes the subpaths (including filename) that are matching_subdirs
date_in_filename
named groups in regex that identify the part of the filename, that contains the date

Note

implicitly it is assumed, that the timestamp in the filename is the beginning of measurements

generic

Set of properties, that is used for each parameter as default if not specified explicitly for the parameter itself.

params

Define the settings for each variable, that should be handled by larda.

time_variable
name of the time variable in the netcdf file
time_conversion
function that converts the time variable to unix timestamp
time_microsec_variable
if given specifies the variable containing the microseconds
time_millisec_variable
if given specifies the variable containing the milliseconds
range_variable
name of the range variable in the netcdf file
range_conversion
function that converts the range variable to meters. 'none' if already given in that unit
var_conversion
function that converts the variable variable to meters. 'none' if no conversion is desired
colormap
colormap to use by default
which_path
name of the path definition that matches the files for this parameter
ncreader
which reader to use
identifier_rg_unit
name of the range unit attribute in the netcdf varibale
identifier_var_unit
name of the var unit attribute in the netcdf varibale
identifier_var_lims
name of the var limits attribute in the netcdf varibale
var_lims
define limits of variable directly
var_name
name of the variable
vel_variable
velocity variable for reading spectra
dimorder
toggle the order of dimensions (i.e. mira nc file)
meta.*
dictionary of meta information extracted from variables, var attributes or global attributes

Example

The section for the Cloudnet configration in the params_cycare_example.toml might look like below. The absolute paths in base_dir will likely have to be adapted.

[CLOUDNET]
    [CLOUDNET.path.categorize]
        # mastering regex (here to exclude ppi and stuff)
        base_dir = '/home/larda3/example-data/categorize/'
        matching_subdirs = '(\d{4}\/\d{8}.*.nc)'
        date_in_filename = '(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})_'
    [CLOUDNET.path.productsclass]
        # mastering regex (here to exclude ppi and stuff)
        base_dir = '/home/larda3/example-data/classification/'
        matching_subdirs = '(\d{4}\/\d{8}.*.nc)'
        date_in_filename = '(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})_'
    [CLOUDNET.generic]
        # this general settings need to be handed down to the params
        time_variable = 'time'
        range_variable = 'height'
        colormap = "gist_rainbow"
        time_conversion = 'beginofday'
        range_conversion = 'sealevel2range'
        var_conversion = 'none'
        ncreader = 'timeheight'
        # if identifier is given read from ncfile, else define here
        identifier_rg_unit = 'units'
        identifier_var_unit = 'units'
        identifier_var_lims = 'plot_range'
        identifier_fill_value = 'missing_value'
        #var_lims = [-40, 20]
        meta.version = "gattr.software_version"
        meta.history = "gattr.history"
        meta.source = "vattr.source"
        meta.latitude = "var.latitude"
    [CLOUDNET.params.Z]
        variable_name = 'Z'
        which_path = 'categorize'
        var_conversion = 'z2lin'
    [CLOUDNET.params.LDR]
        variable_name = 'ldr'
        which_path = 'categorize'
        var_conversion = 'z2lin'
    [CLOUDNET.params.T]
        variable_name = 'temperature'
        which_path = 'categorize'
        range_variable = 'model_height'
    [CLOUDNET.params.beta]
        variable_name = 'beta'
        which_path = 'categorize'
    [CLOUDNET.params.depol]
        variable_name = 'lidar_depolarisation'
        which_path = 'categorize'
        var_unit = '%'
        var_lims = [0.0, 0.3]
    [CLOUDNET.params.CLASS]
        variable_name = 'target_classification'
        which_path = 'productsclass'
        var_unit = ""
        var_lims = [0, 10]
        colormap = 'cloudnet_target'
        fill_value = -99

Note

var_conversion allows for chained functions, such as var_conversion = 'z2lin,extrfromaxis2(0)'. See pyLARDA.helpers.get_converter_array().

A template option is available for repeating datasets in different campaigns:

[CLOUDNET]
    template = 'temp_cloudnet.toml'
    [CLOUDNET.path.categorize]
        # mastering regex (here to exclude ppi and stuff)
        base_dir = '/home/larda3/example-data/categorize/'
        matching_subdirs = '(\d{4}\/\d{8}.*.nc)'
        date_in_filename = '(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})_'
    [CLOUDNET.path.productsclass]
        # mastering regex (here to exclude ppi and stuff)
        base_dir = '/home/larda3/example-data/classification/'
        matching_subdirs = '(\d{4}\/\d{8}.*.nc)'
        date_in_filename = '(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})_'

The generic and params section are then defined in larda-cfg/temp_cloudnet.toml. In the pyLARDA.ParameterInfo.ParameterInfo, the template is updated with the campaign configuration. Hence, single generic or params configurations in the template can be overwritten.

The configuration can be checked by running python3 ListCollector.py Afterwards the connectordump at larda-connectordump/lacros_cycare_example/connector_CLOUDNET.json should look similar to

{
"categorize": [
    [
    [
        "20180208-000000",
        "20180209-000000"
    ],
    "./2018/20180208_limassol_categorize.nc"
    ]
],
"productsclass": [
    [
    [
        "20180208-000000",
        "20180209-000000"
    ],
    "./2018/20180208_limassol_classification.nc"
    ]
]
}