Data columns: field_spec and field_path¶
There are some points to note about the nomenclature that the pypond
and pond
code bases use to refer to the “columns of data” in the time series event objects. This TimeSeries
:
DATA = dict(
name="traffic",
columns=["time", "value", "status"],
points=[
[1400425947000, 52, "ok"],
[1400425948000, 18, "ok"],
[1400425949000, 26, "fail"],
[1400425950000, 93, "offline"]
]
)
contains two columns: value
and status
.
However this TimeSeries
:
DATA_FLOW = dict(
name="traffic",
columns=["time", "direction"],
points=[
[1400425947000, {'in': 1, 'out': 2}],
[1400425948000, {'in': 3, 'out': 4}],
[1400425949000, {'in': 5, 'out': 6}],
[1400425950000, {'in': 7, 'out': 8}]
]
)
contains only one column direction
, but that column has two more columns - in
and out
- nested under it. In the following examples, these nested columns will be referred to as “deep paths.”
When specifying columns to the methods that set, retrieve and manipulate data, we use two argument types: field_spec
and field_path
. They are similar yet different enough to warrant this document.
field_path¶
A field_path
refers to a single column in a series. Any method that takes a field_path
as an argument only acts on one column at a time. The value passed to this argument can be either a string, a list or None
.
String variant¶
When a string is passed, it can be one of the following formats:
- simple path - the name of a single “top level” column. In the
DATA
example above, this would be eithervalue
orstatus
. - deep path - the path pointing to a single nested columns with each “segment” of the path delimited with a period. In the
DATA_FLOW
example above, the incoming data could be retrieved withdirection.in
as thefield_path
.
List variant¶
When a list
(or tuple
) is passed as a field_path
, each element in the iterable is a single segment of the path to a column. So to compare with the string examples:
['value']
would be equivalent to the stringvalue
.['direction', 'in']
would be equivalent to the stringdirection.in
.
This is particularly important to note because this behavior is different than passing a list to a field_spec
arg.
None
¶
If no field_path
is specified (defaulting to None
), then the default column value
will be used.
field_spec¶
A field_spec
refers to one or more columns in a series. When a method takes a field_spec
, it may act on multiple columns in a TimeSeries
. The value passed to this argument can be either a string, a list or None
.
String variant¶
The string variant is essentially identical to the field_path
string variant - it is a path to a single column of one of the following formats:
- simple path - the name of a single “top level” column. In the
DATA
example above, this would be eithervalue
orstatus
. - deep path - the path pointing to a single nested columns with each “segment” of the path delimited with a period. In the
DATA_FLOW
example above, the incoming data could be retrieved withdirection.in
as thefield_path
.
List variant¶
Passing a list
(or tuple
) to field_spec
is different than the aforementioned behavior in that it is explicitly referring to one or more columns. Rather than each element being segments of a path, each element is a full path to a single column.
Using the previous examples:
['in', 'out']
would act on both thein
andout
columns from theDATA
example.['direction.in', 'direction.out']
- here each element is a fully formed “deep path” to the two data columns in theDATA_FLOW
example.
The lists do not have to have more than one element: ['value'] == 'value'
.
NOTE: accidentally passing this style of list to an arg that is actually a field_path
will most likely result in an EventException
being raised. Passing something like ['in', 'out']
as a field_path
will attempt to retrieve the nested column in.out
which probably doesn’t exist.
None
¶
If no field_spec
is specified (defaulting to None
), then the default column value
will be used.
field_spec_list¶
This is a less common variant of the field_spec
. It is used in the case where the method is going to specifically act on multiple columns. Otherwise, it’s usage is identical to the list variant of field_spec
.