Data Sets
Data sets define the data that can be presented in visualisations within AireFrame. This page describes the different types of data sets and how to create them.
Once you have created a data set, it can be used within data pipelines, which form the basis of visualisations.
WARNING
A data set cannot be edited once it has been created. If you need to make changes, you will need to create a new data set.
TIP
A data set can only be deleted if it is not being used by any data pipelines or composite data sets.
Source Data Sets
A source data set defines data that is extracted directly from a data source.
To create a data extract, you must provide a JSON definition that describes the data you want to extract. The definition format is specific to the data source you are using. More information about the definitions for the bundled data sources is available:
- AireFrame (Internal)
- AireForms
- AireFlow
- AireGlu - the JSON definition is determined by the schema returned from the user-configured endpoint
TIP
AireFrame uses the Monaco editor to provide syntax highlighting and code completion.
Use the keyboard shortcut Ctrl + Space to bring up the code completion menu.
Subject Identifier
For some data sources, you will need to specify which subject identifier is passed to the data source when extracting the data.
The options are:
- Internal - AireFrame's internal UUID-based identifier
- External - The identifier received from the subject provider
- Custom Field - Any required custom field (configurable on the subject configuration page)
WARNING
If you use a custom field as the subject identifier, you must ensure that the field is populated for the subject(s) being viewed. An error will occur if the field is not populated.
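The three options and the failure case in the warning above can be illustrated with a minimal Python sketch. It is not AireFrame code: `Subject`, `IdentifierType` and `resolve_subject_identifier` are hypothetical names introduced only for this illustration, and the error type is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class IdentifierType(Enum):
    """Hypothetical names for the three options listed above."""
    INTERNAL = "internal"
    EXTERNAL = "external"
    CUSTOM_FIELD = "custom_field"


@dataclass
class Subject:
    """Hypothetical subject record, used only for this illustration."""
    internal_id: str
    external_id: str
    custom_fields: dict[str, str] = field(default_factory=dict)


def resolve_subject_identifier(
    subject: Subject,
    identifier_type: IdentifierType,
    custom_field: Optional[str] = None,
) -> str:
    """Return the identifier that would be passed to the data source."""
    if identifier_type is IdentifierType.INTERNAL:
        return subject.internal_id
    if identifier_type is IdentifierType.EXTERNAL:
        return subject.external_id
    # Custom field: the value must be populated for the subject being viewed.
    value = subject.custom_fields.get(custom_field or "")
    if not value:
        raise ValueError(f"Custom field {custom_field!r} is not populated")
    return value


subject = Subject("uuid-1", "ext-1", {"custom_ref": "REF-42"})
resolved = resolve_subject_identifier(
    subject, IdentifierType.CUSTOM_FIELD, "custom_ref"
)
```

Asking for a custom field that the subject does not have raises the error, which mirrors the behaviour described in the warning.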
Composite Data Sets
A composite data set allows you to combine data from multiple source data sets into a single data set. This is useful when you want to create visualisations over data from multiple sources.
TIP
Composite data sets can only be created from cacheable source data sets.
It is important to note that no ordering is applied to the data points in a composite data set. The displayed order is determined by the visualisation configuration that uses the data set.
Stacked Data Sets
A stacked data set combines multiple data sets by 'stacking' the data points on top of each other.
To do this you define stacked fields, which contain one or more related fields from the source data sets. All fields within a stacked field must have the same data type.
Example
Given these two data sets:
A1 | A2 |
---|---|
1 | "A" |
2 | "B" |

B1 | B2 |
---|---|
101 | "Y" |
102 | "Z" |
We define two stacked fields:
- Field 1 with fields A1, B1
- Field 2 with fields A2, B2
We would get the following stacked data set:
Field 1 | Field 2 |
---|---|
1 | "A" |
2 | "B" |
101 | "Y" |
102 | "Z" |
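The stacking behaviour in this example can be approximated with a short Python sketch. It is only an illustration of the idea, not how AireFrame implements it: `stack_data_sets` is a hypothetical helper, and rows are modelled as plain dictionaries.

```python
from typing import Any

Row = dict[str, Any]


def stack_data_sets(
    data_sets: list[list[Row]],
    stacked_fields: dict[str, list[str]],
) -> list[Row]:
    """Stack rows from several data sets under shared output field names."""
    stacked: list[Row] = []
    for rows in data_sets:
        for row in rows:
            out: Row = {}
            for name, source_fields in stacked_fields.items():
                # Take the value of the first source field present in this row.
                out[name] = next(
                    (row[f] for f in source_fields if f in row), None
                )
            stacked.append(out)
    return stacked


data_set_a = [{"A1": 1, "A2": "A"}, {"A1": 2, "A2": "B"}]
data_set_b = [{"B1": 101, "B2": "Y"}, {"B1": 102, "B2": "Z"}]
stacked_fields = {"Field 1": ["A1", "B1"], "Field 2": ["A2", "B2"]}

result = stack_data_sets([data_set_a, data_set_b], stacked_fields)
```

`result` holds the same four rows as the table above. The row order here simply follows the input order. As noted earlier, a composite data set applies no ordering of its own.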
Linked Data Sets
A linked data set allows you to link related data points across multiple data sets. For example, you may want to show information about a form submission alongside a task for that form submission.
TIP
Data points are always linked by subject. It is not possible to link data points across subjects.
To do this you must define a pipeline for each source data set. There are a few requirements for the pipeline:
- Each pipeline must contain at least one aggregator to ensure only a single data point is returned from each source data set for linking.
- Each pipeline must return a unique set of field keys to ensure there is no overlap between the data points returned from each source data set.
- All output field keys are prefixed with the source data set key.
Example
Given these two data sets:
A
1 | 2 |
---|---|
1 | "A" |
2 | "B" |
B
1 | 2 |
---|---|
101 | "Y" |
102 | "Z" |
We define a pipeline for each data set:
- For data set A, we use the max by aggregator based on column 1
- For data set B, we use the min by aggregator based on column 1
We would get the following linked data set:
A_1 | A_2 | B_1 | B_2 |
---|---|---|---|
2 | "B" | 101 | "Y" |
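As a rough Python sketch of this example: each data set is reduced to a single row by its aggregator, and the surviving fields are merged under prefixed keys. `max_by`, `min_by` and `link` are hypothetical helpers written for this illustration, and the `_` separator is taken from the column names in the table above.

```python
from typing import Any

Row = dict[str, Any]


def max_by(rows: list[Row], column: str) -> Row:
    """Return the row with the largest value in the given column."""
    return max(rows, key=lambda row: row[column])


def min_by(rows: list[Row], column: str) -> Row:
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


def link(aggregated: dict[str, Row]) -> Row:
    """Merge one row per data set, prefixing each field with the data set key."""
    return {
        f"{data_set_key}_{field_key}": value
        for data_set_key, row in aggregated.items()
        for field_key, value in row.items()
    }


data_set_a = [{"1": 1, "2": "A"}, {"1": 2, "2": "B"}]
data_set_b = [{"1": 101, "2": "Y"}, {"1": 102, "2": "Z"}]

linked = link({"A": max_by(data_set_a, "1"), "B": min_by(data_set_b, "1")})
```

`linked` contains the single row shown in the table above. Because every key is prefixed, the two data sets can both use the column names `1` and `2` without clashing.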
There is a further option to group by correlation id. For this to work correctly, the correlation ids must match across the data sets.
Correlation Id Grouping Example
Given these two data sets:
A
Correlation Id | 1 | 2 |
---|---|---|
"c1" | 1 | "A" |
"c2" | 2 | "B" |
B
Correlation Id | 1 | 2 |
---|---|---|
"c1" | 101 | "Y" |
"c1" | 102 | "Z" |
"c2" | 102 | "Z" |
We define a pipeline for each data set:
- For data set A, we use the max by aggregator based on column 1
- For data set B, we use the min by aggregator based on column 1
We would get the following linked data set:
Correlation Id | A_1 | A_2 | B_1 | B_2 |
---|---|---|---|---|
"c1" | 1 | "A" | 101 | "Y" |
"c2" | 2 | "B" | 102 | "Z" |