RTV Tutorial on Configuration Files
Quick Reference:
To execute a validation scenario from an RTV configuration file, run rtv from the command line and pass the path to the configuration file:
rtv /path/to/config/file
Currently supported formats:
yaml
json
NOTE: This tutorial uses the yaml format for most examples.
Structure
A valid configuration file for RTV should have two main sections:
definitions - this section holds a list of the framework’s entities that will be used in the validation scenario.
actions - this section holds a list of actions that will be performed during validation scenario execution.
Minimal example:
yaml:
definitions:
  - name: csv_reader
    class: CSVReader
    delimiter: "|"
actions:
  - read:
      reader: csv_reader
      source: vector.csv
      output_name: vector_data
json:
{
  "definitions": [
    {
      "name": "csv_reader",
      "class": "CSVReader",
      "delimiter": "|"
    }
  ],
  "actions": [
    {
      "read": {
        "reader": "csv_reader",
        "source": "vector.csv",
        "output_name": "vector_data"
      }
    }
  ]
}
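The two-section layout can be sanity-checked before a file is handed to rtv. The following is a minimal sketch using only Python's standard-library json module. check_sections is a hypothetical helper written for this illustration, not part of RTV, and RTV's own loader may apply different or stricter rules.

```python
import json

# The JSON minimal example, embedded as a string for the demo.
CONFIG_TEXT = """
{
  "definitions": [
    {"name": "csv_reader", "class": "CSVReader", "delimiter": "|"}
  ],
  "actions": [
    {"read": {"reader": "csv_reader", "source": "vector.csv",
              "output_name": "vector_data"}}
  ]
}
"""

REQUIRED_SECTIONS = ("definitions", "actions")


def check_sections(config):
    """Return a list of problems with the top-level layout (empty if none).

    Hypothetical helper for illustration; not part of RTV.
    """
    problems = []
    for section in REQUIRED_SECTIONS:
        if section not in config:
            problems.append(f"missing section: {section}")
        elif not isinstance(config[section], list):
            problems.append(f"section is not a list: {section}")
    return problems


config = json.loads(CONFIG_TEXT)
print(check_sections(config))           # no problems for the minimal example
print(check_sections({"actions": {}}))  # both kinds of problem are reported
```

Returning a list of problems instead of raising on the first one lets a single run report everything that is wrong with the file.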
Definitions
Each element of the list in the definitions section of the configuration file should have the following required fields:
name: You can think of it as an alias or a variable name that you can later use in the config to reference the defined entity.
class: The constructor class name of the entity.
The remaining definition fields are arbitrary parameters specific to the entity. In the previous example, the delimiter field is a parameter of CSVReader.
NOTE: You can find a list of available entities/classes and their parameters in the following sections of this tutorial.
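The split between the two required fields and the entity-specific parameters can be illustrated in Python. This is only a sketch: split_definition is a hypothetical helper, and how RTV actually constructs entities from a definition is not described in this tutorial.

```python
REQUIRED_FIELDS = ("name", "class")


def split_definition(definition):
    """Split one definitions entry into (name, class_name, params).

    Hypothetical helper for illustration; RTV's real handling may differ.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in definition]
    if missing:
        raise ValueError(f"definition is missing required field(s): {missing}")
    # Everything that is not a required field is treated as a parameter.
    params = {
        key: value
        for key, value in definition.items()
        if key not in REQUIRED_FIELDS
    }
    return definition["name"], definition["class"], params


definition = {"name": "csv_reader", "class": "CSVReader", "delimiter": "|"}
name, class_name, params = split_definition(definition)
print(name, class_name, params)
```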
Actions
The common structure of an actions section entry is as follows:
actions:
  - <action_type>:
      - <action_param>: ...
        # ...
      - <action_param>: ...
        # ...
The set of <action_param> fields is specific to each action type.
Example with read as the <action_type>:
actions:
  - read:
      - reader: csv_reader
        source: vector.csv
        output_name: vector_table_data
      - reader: txt_reader
        source: vector.txt
        output_name: vector_text_data
NOTE: You will find information on the available <action_type> values and their related <action_param> fields in the following section of this tutorial.
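The examples in this tutorial show an action type holding either a single mapping of parameters (the minimal example) or a list of such mappings (the read example). A tool that inspects a parsed config can flatten both shapes into one sequence of calls. The sketch below assumes the config has already been parsed into Python lists and dicts. iter_calls is a hypothetical helper, not an RTV function.

```python
def iter_calls(actions):
    """Yield (action_type, params) pairs in the order they appear.

    Accepts both shapes used in this tutorial: a single mapping of
    parameters, or a list of such mappings under one action type.
    Hypothetical helper for illustration; not part of RTV.
    """
    for entry in actions:
        for action_type, body in entry.items():
            bodies = body if isinstance(body, list) else [body]
            for params in bodies:
                yield action_type, params


actions = [
    {"read": [
        {"reader": "csv_reader", "source": "vector.csv",
         "output_name": "vector_table_data"},
        {"reader": "txt_reader", "source": "vector.txt",
         "output_name": "vector_text_data"},
    ]},
]

for action_type, params in iter_calls(actions):
    print(action_type, params["source"], "->", params["output_name"])
```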
During the validation run, the actions are executed in the order in which they are defined in the config, so the following example will lead to an error:
actions:
  - transform:
      input: vector_data
      output_name: transformed_vector_data
      transformers: vector_transposer
  - read:
      reader: csv_reader
      source: vector.csv
      output_name: vector_data
The transform action will raise an exception when it tries to access the vector_data entry, because that entry only becomes available after the read action has executed successfully.
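An ordering mistake like this one can be caught before the run by walking the actions in order and tracking which data entries exist at each step. The sketch below is a hypothetical pre-flight check, not an RTV feature. The field names are the ones used in this tutorial, while the CONSUMES and PRODUCES tables and the find_order_errors helper are assumptions made for illustration.

```python
# Parameters of each action type that name data entries which must already
# exist, and the action types that create an entry via output_name.
CONSUMES = {
    "transform": ("input",),
    "validate": ("reference", "target"),
    "write": ("input",),
}
PRODUCES = ("read", "transform", "validate")


def find_order_errors(actions):
    """Return messages for data entries that are used before they exist.

    Hypothetical check, not part of RTV. For brevity it assumes each
    action entry holds a single mapping of parameters.
    """
    available = set()
    errors = []
    for position, entry in enumerate(actions, start=1):
        for action_type, params in entry.items():
            for field in CONSUMES.get(action_type, ()):
                if params[field] not in available:
                    errors.append(
                        f"action {position} ({action_type}): "
                        f"'{params[field]}' is not available yet"
                    )
            if action_type in PRODUCES:
                available.add(params["output_name"])
            elif action_type == "free":
                # free removes entries from the data store.
                available.difference_update(params["targets"])
    return errors


bad_order = [
    {"transform": {"input": "vector_data",
                   "output_name": "transformed_vector_data",
                   "transformers": "vector_transposer"}},
    {"read": {"reader": "csv_reader", "source": "vector.csv",
              "output_name": "vector_data"}},
]

print(find_order_errors(bad_order))        # reports the transform action
print(find_order_errors(bad_order[::-1]))  # read first: no errors
```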
Available actions
read
Used to read data from arbitrary source(s), convert it to the RTV internal data representation, and save it to the current scenario’s data store.
Fields:
reader: The name of the Reader entity to use for the action execution.
source: A path to a source.
output_name: A unique (within the current scenario) name that will be used to store and reference the action’s result.
pattern: Optional field, a regex pattern to match more than one source file. If this field is provided, then source should be a path to a directory containing the source files to match against the pattern. Defaults to an empty string.
prefix_key: Optional field, a prefix string to prepend to every key of the resulting data entry. Defaults to an empty string.
Example:
Read the reference.csv source file and the files in the iterations/ directory that match the pattern, and save the resulting data as reference and target respectively:
definitions:
  - name: csv_reader
    class: CSVReader
actions:
  - read:
      - reader: csv_reader
        source: reference.csv
        output_name: reference
        prefix_key: ref
      - reader: csv_reader
        source: iterations/
        pattern: iter_(\d+).csv
        # will match: iter_001.csv, iter_002.csv...
        output_name: target
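A pattern can be tried out against a few file names with Python's re module before it goes into a config. The sketch below assumes the whole file name has to match (re.fullmatch); this tutorial does not state how RTV applies the pattern, and the file names are made up. It also shows that the unescaped dot in the example pattern matches any character, so escaping it gives a stricter match.

```python
import re

# The pattern from the example, and a variant with the dot escaped.
PATTERN = r"iter_(\d+).csv"
STRICT_PATTERN = r"iter_(\d+)\.csv"

# Hypothetical directory listing used only for this demonstration.
filenames = ["iter_001.csv", "iter_002.csv", "iter_003_csv", "notes.txt"]

# Assumption: the whole file name must match the pattern.
loose = [name for name in filenames if re.fullmatch(PATTERN, name)]
strict = [name for name in filenames if re.fullmatch(STRICT_PATTERN, name)]

print(loose)   # includes iter_003_csv, because "." matches any character
print(strict)  # only the two names ending in a literal ".csv"
```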
write
Used to write a data entry to an output destination using a Writer entity.
Fields:
input: The name of the data entry to write to output.
writer: The name of the defined Writer entity to use for the action execution.
output: The output destination for the action’s result. The actual type depends on the writer implementation.
Example:
Write the result data entry to a json file named validation_result.json using a JSONWriter entity.
definitions:
  # ...
  - name: json_writer
    class: JSONWriter
  # ...
actions:
  # ...
  - write:
      input: result
      writer: json_writer
      output: validation_result
transform
Used to transform data entries using Transformer entities and save the result as a new data entry.
Fields:
input: The name of the data entry to transform.
transformers: The name (or a list of names) of the Transformer entities to use for the action execution.
output_name: A unique name that will be used to store and later reference the result of the action.
Example:
Transform the result data entry using inverse_transformer and save the transformed result to the result_transformed data entry.
definitions:
  # ...
  - name: inverse_transformer
    class: InverseTransformer
  # ...
actions:
  # ...
  - transform:
      input: result
      transformers: inverse_transformer
      output_name: result_transformed
  # ...
validate
Used to validate a target data entry against a reference data entry using one or more Validation entities.
Fields:
reference: The name of the data entry to use as the reference.
target: The name of the data entry to use as the target.
validations: The name (or a list of names) of the Validation entities to use for the action execution.
output_name: A unique name that will be used to store and reference the result of the action.
Example:
Validate the a data entry against the b data entry using the v1 validation and write the resulting data entry to result.
definitions:
  # ...
  - name: mae
    class: MeanAbsoluteError
    threshold: 0.5
  - name: v1
    class: StrategyValidation
    strategies: mae
    keys: all
  # ...
actions:
  # ...
  - validate:
      reference: b
      target: a
      validations: v1
      output_name: result
  # ...
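To give a feel for what a mean-absolute-error check with a threshold computes, here is a small standalone sketch. It uses the standard definition of mean absolute error. How RTV's MeanAbsoluteError and StrategyValidation classes work internally, and whether a value equal to the threshold counts as a pass, are not covered by this tutorial, so the comparison below is an assumption and the numbers are made up.

```python
def mean_absolute_error(reference, target):
    """Average of |reference[i] - target[i]| over paired values."""
    if len(reference) != len(target) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - t) for r, t in zip(reference, target)) / len(reference)


# Made-up numbers standing in for the b (reference) and a (target) entries.
reference = [1.0, 2.0, 3.0, 4.0]
target = [1.5, 2.0, 2.0, 4.5]
THRESHOLD = 0.5  # same value as in the mae definition

error = mean_absolute_error(reference, target)
# Assumption: the check passes when the error does not exceed the threshold.
passed = error <= THRESHOLD
print(error, passed)
```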
free
Used to remove data entries from the current scenario’s data store.
Fields:
targets: Names of the data entries to remove.
Example:
Remove the a and b data entries.
actions:
  - free:
      targets: [a, b]