Data Sets

Data sets define the data that can be presented in visualisations within AireFrame. This page describes the different types of data sets and how to create them.

Once you have created a data set, it can be used within data pipelines, which form the basis of visualisations.

WARNING

A data set cannot be edited once it has been created. If you need to make changes, you will need to create a new data set.

TIP

A data set can only be deleted if it is not being used by any data pipelines or composite data sets.

Source Data Sets

A source data set defines data that is extracted directly from a data source.

To create a source data set, you must provide a JSON definition that describes the data you want to extract. The format of this definition is specific to the data source you are using. More information about the definitions for the bundled data sources is available:

AireFrame (Internal)
AireForms
AireFlow
AireGlu - The JSON definition is determined by the schema returned from the user-configured endpoint

TIP

AireFrame uses the Monaco editor to provide syntax highlighting and code completion.

Use the keyboard shortcut Ctrl + Space to bring up the code completion menu.

Subject Identifier

For some data sources, you will need to specify which subject identifier is passed to the data source when extracting the data.

The options are:

Internal - AireFrame's internal UUID-based identifier
External - The identifier received from the subject provider
Custom Field - Any required custom field (configurable on the subject configuration page)

WARNING

If you use a custom field as the subject identifier, you must ensure that the field is populated for the subject(s) being viewed. An error will occur if the field is not populated.
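
The three options and the warning above can be illustrated with a minimal sketch. This is not AireFrame's implementation: the `Subject` class, the `IdentifierType` enum and the `resolve_subject_identifier` function are hypothetical names invented for illustration, and the exact error AireFrame raises is not specified here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class IdentifierType(Enum):
    """Hypothetical names for the three documented options."""
    INTERNAL = "internal"
    EXTERNAL = "external"
    CUSTOM_FIELD = "custom_field"


@dataclass
class Subject:
    """Hypothetical subject record, for illustration only."""
    internal_id: str
    external_id: str
    custom_fields: dict[str, str] = field(default_factory=dict)


def resolve_subject_identifier(
    subject: Subject,
    identifier_type: IdentifierType,
    custom_field_name: Optional[str] = None,
) -> str:
    """Return the identifier that would be passed to the data source."""
    if identifier_type is IdentifierType.INTERNAL:
        return subject.internal_id
    if identifier_type is IdentifierType.EXTERNAL:
        return subject.external_id
    # Custom field: the value must be populated, otherwise it is an error.
    value = subject.custom_fields.get(custom_field_name or "")
    if not value:
        raise ValueError(
            f"Custom field {custom_field_name!r} is not populated for this subject"
        )
    return value


subject = Subject(
    internal_id="3f0c6f0e-0000-0000-0000-000000000001",
    external_id="EXT-42",
    custom_fields={"hospital_number": "H123"},
)
internal = resolve_subject_identifier(subject, IdentifierType.INTERNAL)
custom = resolve_subject_identifier(
    subject, IdentifierType.CUSTOM_FIELD, "hospital_number"
)
```

In this sketch, asking for a custom field that the subject does not carry raises a `ValueError`, which mirrors the documented behaviour that an unpopulated field results in an error.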

Composite Data Sets

A composite data set allows you to combine data from multiple source data sets into a single data set. This is useful when you want to create visualisations over data from multiple sources.

TIP

Composite data sets can only be created from cacheable source data sets.

No ordering is applied to the data points in a composite data set. The displayed order is determined by the visualisation configuration that uses the data set.

Stacked Data Sets

A stacked data set combines multiple data sets by 'stacking' the data points on top of each other.

To do this you define stacked fields, which contain one or more related fields from the source data sets. All fields within a stacked field must have the same data type.

Example

Given these two data sets:

A1	A2
1	"A"
2	"B"

B1	B2
101	"Y"
102	"Z"

We define two stacked fields:

Field 1 with fields A1, B1
Field 2 with fields A2, B2

We would get the following stacked data set:

Field 1	Field 2
1	"A"
2	"B"
101	"Y"
102	"Z"
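
The stacking behaviour in this example can be sketched in a few lines of Python. This is an illustrative model only, not AireFrame's implementation: the `stack_data_sets` function and the representation of a data set as a list of dictionaries are assumptions made for the sketch.

```python
from typing import Any

Row = dict[str, Any]


def stack_data_sets(
    data_sets: dict[str, list[Row]],
    stacked_fields: dict[str, list[str]],
) -> list[Row]:
    """Stack the rows of several data sets under shared stacked-field names.

    `stacked_fields` maps each stacked field name to the source fields it
    contains (hypothetical structure, for illustration only).
    """
    stacked: list[Row] = []
    for rows in data_sets.values():
        for row in rows:
            out: Row = {}
            for stacked_name, source_fields in stacked_fields.items():
                # Use whichever of the related source fields this row carries.
                for source_field in source_fields:
                    if source_field in row:
                        out[stacked_name] = row[source_field]
                        break
            stacked.append(out)
    return stacked


data_set_a = [{"A1": 1, "A2": "A"}, {"A1": 2, "A2": "B"}]
data_set_b = [{"B1": 101, "B2": "Y"}, {"B1": 102, "B2": "Z"}]

result = stack_data_sets(
    {"A": data_set_a, "B": data_set_b},
    {"Field 1": ["A1", "B1"], "Field 2": ["A2", "B2"]},
)
```

Here `result` holds the same four data points as the table above. The row order in the sketch simply follows the input order; as noted earlier, a composite data set itself applies no ordering.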

Linked Data Sets

A linked data set allows you to link related data points across multiple data sets. For example, you may want to show information about a form submission alongside a task for that form submission.

TIP

Data points are always linked by subject. It is not possible to link data points across subjects.

To do this you must define a pipeline for each source data set. There are a few requirements for the pipeline:

Each pipeline must contain at least one aggregator to ensure only a single data point is returned from each source data set for linking.
Each pipeline must return a unique set of field keys to ensure there is no overlap between the data points returned from each source data set.
- All output field keys are prefixed with the source data set key

Example

Given these two data sets, A and B respectively:

1	2
1	"A"
2	"B"

1	2
101	"Y"
102	"Z"

We define a pipeline for each data set:

For data set A, we use the max by aggregator based on column 1
For data set B, we use the min by aggregator based on column 1

We would get the following linked data set:

A_1	A_2	B_1	B_2
2	"B"	101	"Y"
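
The linking step in this example can be modelled as follows. This is a sketch under stated assumptions, not AireFrame's implementation: `max_by`, `min_by` and `link_data_sets` are hypothetical helpers, each pipeline is reduced to a single aggregator, and the underscore separator in the prefixed keys is taken from the example table above.

```python
from typing import Any, Callable

Row = dict[str, Any]
Aggregator = Callable[[list[Row]], Row]


def max_by(field_key: str) -> Aggregator:
    """Aggregator that keeps the row with the largest value for `field_key`."""
    return lambda rows: max(rows, key=lambda row: row[field_key])


def min_by(field_key: str) -> Aggregator:
    """Aggregator that keeps the row with the smallest value for `field_key`."""
    return lambda rows: min(rows, key=lambda row: row[field_key])


def link_data_sets(
    data_sets: dict[str, list[Row]],
    aggregators: dict[str, Aggregator],
) -> Row:
    """Reduce each data set to one row, then merge the rows side by side."""
    linked: Row = {}
    for data_set_key, rows in data_sets.items():
        single = aggregators[data_set_key](rows)
        for field_key, value in single.items():
            # Prefix with the data set key so field keys never overlap.
            linked[f"{data_set_key}_{field_key}"] = value
    return linked


data_sets = {
    "A": [{"1": 1, "2": "A"}, {"1": 2, "2": "B"}],
    "B": [{"1": 101, "2": "Y"}, {"1": 102, "2": "Z"}],
}
linked = link_data_sets(data_sets, {"A": max_by("1"), "B": min_by("1")})
```

The aggregator guarantees a single data point per source data set, and the key prefix guarantees unique field keys, which are the two pipeline requirements listed above.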

There is a further option to group by correlation id. For this to work correctly, the correlation ids must match across the data sets.

Correlation Id Grouping Example

Given these two data sets, A and B respectively:

Correlation Id	1	2
"c1"	1	"A"
"c2"	2	"B"

Correlation Id	1	2
"c1"	101	"Y"
"c1"	102	"Z"
"c2"	102	"Z"

We define a pipeline for each data set:

For data set A, we use the max by aggregator based on column 1
For data set B, we use the min by aggregator based on column 1

We would get the following linked data set:

Correlation Id	A_1	A_2	B_1	B_2
"c1"	1	"A"	101	"Y"
"c2"	2	"B"	102	"Z"
