Validation Experiment Protocols

This page gives protocols for the different VALUE experiments, i.e., mostly technical details on the cross validation, the data sources and the data format. For the rationale behind every experiment please refer to 

Douglas Maraun, Martin Widmann, José M. Gutierrez, Sven Kotlarski, Richard E. Chandler, Elke Hertig, Joanna Wibig, Radan Huth, Renate A.I. Wilcke (2015) VALUE ‐ A Framework to Validate Downscaling Approaches for Climate Change Studies, Earth's Future, 3(1), 1-14. Doi: 10.1002/2014EF000259

Please refer to the terms and conditions for details about the data policy (including publication of the validation results and the downscaled data on the VALUE portal). The validation results generated by the VALUE validation portal for each method can be shared in three different modes: private use only; shared with VALUE members for purely scientific interest; or published publicly under a Creative Commons BY license, similar to open access journals. The downscaled data can be published separately (to participate in core VALUE publications, publication in public mode is mandatory for both the validation results and the data); redistribution of the raw data is not allowed. Every contributor is free to choose how many stations/regions to downscale and in which experiments to participate. The more stations and experiments are covered, the more rigorously a method is tested.

A special issue presenting the most relevant results is planned, based on the data uploaded prior to 30 September 2015 (deadline for experiments 1a and 1b). Key papers, which aim to present pan-European results, will be organised top-down. Contribution to these papers requires a substantial contribution to the downscaling and is subject to approval by the lead author. Additional papers (e.g., with regional foci or on special topics) can be designed via a grassroots approach, in coordination with the VALUE steering group. A link to organise these papers will follow soon.

EXPERIMENTS:

 


EXPERIMENT 1(a): Perfect Predictor, Station Data (ECA&D)

Time Range and Cross Validation

01.01.1979 - 31.12.2008

Sub-periods for 5-fold cross validation:
1979 - 1984; 1985 - 1990; 1991 - 1996; 1997 - 2002; 2003 - 2008 

Repeat the calibration/validation process of the downscaling method five times, taking each of the sub-periods in succession as the test set and the remaining four sub-periods for calibration (e.g. the first test set would be 1979-1984 with the downscaling method calibrated using 1985-2008; the second test set would be 1985-1990 with the method calibrated using 1979-1984 and 1991-2008; and so on). The resulting five cross-validated test sub-series should be combined into one series spanning 01.01.1979 - 31.12.2008.
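The fold logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `downscale` is a hypothetical user-supplied callable (not part of any VALUE tool) that calibrates on the given years and returns a mapping from date to downscaled value for the test years.

```python
from datetime import date, timedelta

# Test sub-periods as listed in the protocol (inclusive years).
FOLDS = [(1979, 1984), (1985, 1990), (1991, 1996), (1997, 2002), (2003, 2008)]


def split_years(test_fold, folds=FOLDS):
    """Return (calibration_years, test_years) for one test sub-period."""
    first, last = test_fold
    test_years = list(range(first, last + 1))
    calib_years = [
        year
        for lo, hi in folds
        if (lo, hi) != test_fold
        for year in range(lo, hi + 1)
    ]
    return calib_years, test_years


def all_days(first=date(1979, 1, 1), last=date(2008, 12, 31)):
    """List every calendar day of the full validation period."""
    n_days = (last - first).days + 1
    return [first + timedelta(days=i) for i in range(n_days)]


def cross_validate(downscale, folds=FOLDS):
    """Run the five calibration/test rounds and merge the test output.

    `downscale(calib_years, test_years)` is a hypothetical callable that
    returns a dict {date: value} covering the test years only.
    """
    combined = {}
    for fold in folds:
        calib_years, test_years = split_years(fold)
        combined.update(downscale(calib_years, test_years))
    # Sort by date so the five sub-series form one continuous series.
    return dict(sorted(combined.items()))
```

Comparing the keys of the combined series against `all_days()` is one simple way to confirm that the merged result spans the whole period without gaps.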

Predictors

Predictors are to be taken from ERA-Interim and can be downloaded from the ECMWF.

A standard pre-selection of commonly used predictors is available from the User Data Gateway (UDG) of the Santander MetGroup (this dataset is used by the statistical downscaling portal). Registration (and acceptance of the ECMWF data policy) is required. Subsets of this dataset can be 1) downloaded in a user-friendly form using a web browser or 2) remotely accessed from R (more information here; also in pdf). The same predictors from several CMIP5 Earth System Models will also be available for succeeding experiments.

For groups applying bias correction techniques, a dataset (text file) with the ERA-Interim data (precip and temp) for the gridboxes closest to the 86 stations is available on the VALUE datasets page. This dataset has been prepared to facilitate the work of these groups.

Additionally, groups applying bias correction techniques should use a second predictor dataset, obtained from a 0.11° resolution RCM (KNMI-RACMO22E) driven by ERA-Interim (taken from the EURO-CORDEX dataset). To minimize the workload for this experiment, this data has been processed and prepared in the same ASCII format as the observations, using the gridboxes nearest to the 86 stations (VALUE_RACMO011_86_v2.zip; see the VALUE datasets page for details). The results for this predictor dataset should be uploaded to the VALUE validation portal under the experiment name Experiment_1a_RCM. Note that Experiment_1a should be used in the standard case (when the predictors are taken from ERA-Interim).

Predictands

Daily station data for the 86 stations selected from ECA&D across Europe (VALUE_ECA_86_v2.zip). This dataset is available for download on the VALUE datasets page (a new version with some corrections of outliers and inconsistencies has been available since 17/4/2014).

Data Upload

Contributions should be submitted to the VALUE validation portal indicating the following experiments: Experiment_1a (predictors from ERA-Interim); Experiment_1a_RCM (predictors from the KNMI RCM driven by ERA-Interim; only for Bias Correction methods).

Data Format for Upload

Data need to be uploaded as ASCII files, separately for each variable. Data for different stations are combined into one file. For deterministic methods, one file is required. For stochastic methods, 100 realisations from the underlying (potentially) time-varying pdf are required, each realisation in a single file, and all combined into a single zipped archive.
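For stochastic methods, the packaging step might look like the following sketch. The member file names (`realisation_001.txt` and so on) are an assumption made for illustration; the protocol does not prescribe a naming scheme, so check the validation portal for any requirements.

```python
import io
import zipfile


def zip_realisations(realisations, archive, n_expected=100):
    """Pack one ASCII file per realisation into a single zip archive.

    realisations -- list of strings, each the full text of one ASCII file
    archive      -- output path or writable binary file object
    n_expected   -- number of realisations required (100 in the protocol)
    """
    if len(realisations) != n_expected:
        raise ValueError(
            f"expected {n_expected} realisations, got {len(realisations)}"
        )
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for i, text in enumerate(realisations, start=1):
            # Hypothetical naming scheme; adapt to the portal's conventions.
            zf.writestr(f"realisation_{i:03d}.txt", text)


if __name__ == "__main__":
    # Tiny in-memory demonstration with a made-up station code.
    demo = [f"YYYYMMDD,000001\n19790101,{i}.0\n" for i in range(100)]
    buffer = io.BytesIO()
    zip_realisations(demo, buffer)
    with zipfile.ZipFile(buffer) as zf:
        print(len(zf.namelist()), "files in archive")
```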

The format of each ASCII file is identical to that of the predictand dataset, and consists of a header and data rows for each date. The first column contains the dates (YYYYMMDD); all other columns contain the data for each downscaled station. Data are comma-separated, and missing values are denoted as NaN by default (however, any other missing-value code can be chosen and indicated in the validation portal in the uploading phase).

Header row:
YYYYMMDD, station_id of first downscaled station,..., station_id of last downscaled station

Data rows:
date, data for first downscaled station,..., data for last downscaled station

station_id refers to the code of the station as described in the file stations.txt of the VALUE_ECA_86 dataset (see the description of the data format for more details).
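As an illustration of the layout above, the following Python sketch writes one such file. The station codes in the usage example are made up, and the two-decimal number formatting and the absence of spaces after commas are assumptions; compare the result with the predictand files before uploading.

```python
import csv
import io
import math
from datetime import date


def write_upload_file(out, station_ids, rows, missing="NaN"):
    """Write downscaled series in the comma-separated upload layout.

    out         -- writable text file object
    station_ids -- station codes, in column order
    rows        -- iterable of (date, values) with one value per station;
                   None or float('nan') marks a missing value
    missing     -- missing-value code (NaN by default, as in the protocol)
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["YYYYMMDD", *station_ids])
    for day, values in rows:
        if len(values) != len(station_ids):
            raise ValueError(f"{day}: expected {len(station_ids)} values")
        cells = [
            missing if v is None or math.isnan(v) else f"{v:.2f}"
            for v in values
        ]
        writer.writerow([day.strftime("%Y%m%d"), *cells])


if __name__ == "__main__":
    # Hypothetical station codes and values, for demonstration only.
    buffer = io.StringIO()
    write_upload_file(
        buffer,
        ["000001", "000002"],
        [(date(1979, 1, 1), [1.5, None]), (date(1979, 1, 2), [0.0, 2.25])],
    )
    print(buffer.getvalue(), end="")
```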


EXPERIMENT 1(b.1): Perfect Predictor, Gridded data (E-OBS, closest gridboxes to the stations)

Time Range, Cross Validation, Predictors and Data Format

Same as Experiment_1a

Predictands

E-OBS gridded observations for the (0.22°) gridboxes closest to the 86 locations used in Experiment_1a. The dataset VALUE_EOBS_86_v2.zip is available for download on the VALUE datasets page.

Data Upload and Data Format

Contributions should be submitted to the VALUE validation portal indicating the following experiments: Experiment_1b_1 (predictors from ERA-Interim); Experiment_1b_1_RCM (predictors from the KNMI RCM driven by ERA-Interim; only for Bias Correction methods). 

The data format should be the same as the previous experiment.
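Since several experiments reuse the Experiment 1a file layout, a quick local check before uploading may save a round trip. The sketch below is not the portal's own validation; it only tests the properties stated in the format description. The optional check for consecutive dates is an added assumption and should be switched off for files that legitimately contain gaps.

```python
import csv
import io
from datetime import datetime, timedelta


def check_upload_file(text_file, missing="NaN", require_contiguous=True):
    """Return (station_ids, n_days) or raise ValueError on a format problem."""
    reader = csv.reader(text_file, skipinitialspace=True)
    header = next(reader)
    if header[0].strip() != "YYYYMMDD":
        raise ValueError("first header cell must be YYYYMMDD")
    station_ids = [cell.strip() for cell in header[1:]]

    previous = None
    n_days = 0
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        if len(row) != len(header):
            raise ValueError(f"wrong number of columns in row {row[0]}")
        # strptime raises ValueError if the date is not valid YYYYMMDD.
        day = datetime.strptime(row[0].strip(), "%Y%m%d").date()
        if (
            require_contiguous
            and previous is not None
            and day != previous + timedelta(days=1)
        ):
            raise ValueError(f"dates are not consecutive before {day}")
        for cell in row[1:]:
            if cell.strip() != missing:
                float(cell)  # raises ValueError for non-numeric entries
        previous = day
        n_days += 1
    return station_ids, n_days
```

For example, `check_upload_file(open("my_output.txt"))` would return the station codes and the number of data rows for a well-formed file; the file name here is purely illustrative.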


EXPERIMENT 1c: Perfect Predictor, nested station data (spatial aspects)

Time Range, Cross Validation, Predictors and Data Format

Same as Experiment_1a. Moreover, a dataset (text file) with the ERA-Interim data (precip and temp) for the gridboxes closest to the 53 stations is available on the VALUE datasets page. This dataset has been prepared to facilitate the work of groups applying bias correction techniques.

Predictands

Daily station data for precipitation and temperatures for a set of 53 stations from ECA in Germany. The dataset VALUE_ECA_53_Germany_spatial_v1.zip is available for download on the VALUE datasets page. This dataset will be used for the spatial validation case study of Experiment 1a.

Data Upload and Data Format

Contributions should be submitted to the VALUE validation portal indicating the experiment Experiment_1c (predictors from ERA-Interim). 

The data format should be the same as the one described for Experiment 1a.


EXPERIMENT 1d: Perfect Predictor, station data (multi-variable aspects)

Time Range, Cross Validation, Predictors and Data Format

Same as Experiment_1a. Moreover, a dataset (text file) with the ERA-Interim data (precip, temp, wind, humidity and cloud cover) for the gridboxes closest to the 12 stations is available on the VALUE datasets page. This dataset has been prepared to facilitate the work of groups applying bias correction techniques.

Predictands

Subset of VALUE_ECA_86 including twelve stations in Germany with daily data for precipitation, temperatures, daily mean wind speed, daily maximum wind speed, relative humidity, and cloud cover. The dataset VALUE_ECA_12_Germany_multivar_v3.zip is available for download on the VALUE datasets page. This dataset is used for the multi-variable case study.

Data Upload and Data Format

Contributions should be submitted to the VALUE validation portal indicating the experiment Experiment_1d (predictors from ERA-Interim). 

The data format should be the same as the one described for Experiment 1a.


[NEW] EXPERIMENT 2(a): GCM Predictor, Station Data (ECA&D)

Using the models trained as in Experiment 1(a), but calibrated on the whole period (1979-2008; no cross-validation in this case), Experiment 2(a) consists of obtaining daily downscaled values from the predictors of a particular GCM: for the historical scenario over 1986-2005 (the AR5 reference period) and for the RCP8.5 scenario over 2040-2100 (covering mid- and long-term future periods). The main goal of this experiment is to attribute the large variability observed in simulated historical trends in Experiment 1a to a number of potential factors: 1) families of methods, 2) size of the domain, 3) predictors used.

The predictands are the same as in Experiment 1a (VALUE_ECA_86_v2.zip), as is the data format of the outputs, which are to be uploaded to the VALUE validation portal (Experiment_2) before the deadline.

Predictors are to be taken from ERA-Interim for training and from EC-EARTH_r12i1p1 projections for downscaling.

The same subset of common predictors has been prepared (trimmed to the European domain) for ERA-Interim and EC-Earth and is available for download on the VALUE datasets page.

Outputs from the historical (1986-2005) and future RCP8.5 (2040-2100) scenarios will be sent as separate files or joined in a single file for each station following the same format described in Experiment 1a.
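If the downscaled GCM output is held as one long daily series, separating it into the two requested periods could be done along the following lines. The scenario labels `historical` and `rcp85` are hypothetical dictionary keys chosen for this sketch, not names required by the portal.

```python
from datetime import date

# Inclusive year ranges taken from the experiment description.
PERIODS = {"historical": (1986, 2005), "rcp85": (2040, 2100)}


def scenario_for(day):
    """Return the scenario label for a date, or None if outside both periods."""
    for name, (first, last) in PERIODS.items():
        if first <= day.year <= last:
            return name
    return None


def split_by_scenario(rows):
    """Group (date, values) rows by scenario, dropping out-of-range dates."""
    parts = {name: [] for name in PERIODS}
    for day, values in rows:
        name = scenario_for(day)
        if name is not None:
            parts[name].append((day, values))
    return parts
```

Each group can then be written to its own file, or the two groups can be concatenated into one file, using the same layout as in Experiment 1a.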

Some background information on the definition of this experiment is given in the document VALUE_Experiment2.pdf. Note that this experiment was described in Maraun et al. (2015) as Experiment 3.

This experiment is being expanded to cover an additional domain in South America.

[NEW] EXPERIMENT 2(b): GCM Predictor, Gridded data (E-OBS)

As Experiment 1(a) but considering the E-OBS gridded data as predictand. 

Attachment: VALUE_Experiment2.pdf (832.28 KB)