TIMELINE_COLUMN - TIMELINE_TABLE
Description
The TIMELINE_COLUMN and TIMELINE_TABLE functions create a continuous timeline for a given time unit (e.g., days). They can be used to aggregate values from one or multiple tables onto one common time axis. This makes it easy to show the trend of a KPI over a period of time.
The TIMELINE functions create a temporary table in the background (the timeline table) with all timestamps of the specified unit in a defined range. The difference is that TIMELINE_COLUMN returns the timestamp column of the temporary timeline table, and TIMELINE_TABLE references the full timeline table. Therefore, the use of the functions depends on whether a column or a table is required. As an example, inside PU-functions or for COUNT_TABLE, the TIMELINE_TABLE function must be used.
It is possible to specify a column based partitioning, in which case the time range is added for each partition.
The timeline table is always sorted by the timestamp column. If a partition is specified, the timestamps within each partition are sorted.
Syntax
TIMELINE_COLUMN and TIMELINE_TABLE have the same arguments:
TIMELINE_COLUMN ( timeunit, time_columns [, partition ] [, time_range] )
TIMELINE_TABLE ( timeunit, time_columns [, partition ] [, time_range] )
timeunit:
YEARS
,QUARTERS
,MONTHS
,WEEKS
,DAYS
,HOURS
,MINUTES
time_columns: Zero or more timestamp columns, which should participate in the timeline table. They are propagated to all other occurrences of TIMELINE_COLUMN and TIMELINE_TABLE in the same query.
partition:
PARTITION [ partition_configuration ] BY ( partition_column AS alias, ... )
partition_column AS alias: One or more optional columns to specify the partitioning of the timeline. The alias is needed to be able to access the respective partition column in the timeline table.
partition_configuration: One of the following options:
FIRST_OCCURRENCE TO LAST_OCCURRENCE
. For every partition, adds missing timestamps between the first and last value of the partition (default).FIRST_OCCURRENCE TO TIMELINE_END
. For every partition, adds missing timestamps between the first value of the partition and the end of the timeline.TIMELINE_START TO LAST_OCCURRENCE
. For every partition, adds missing timestamps between the start of the timeline and the last value of the partition.TIMELINE_START TO TIMELINE_END
. For every partition, adds missing timestamps in the full time range of the timeline.
time_range: Manually sets the time range for the timeline using DATE constants. One of:
FROM ( date_constant )
: Specifies the start of the timeline.TO ( date_constant )
: Specifies the end of the timeline.FROM ( date_constant ), TO ( date_constant )
: Specifies start and end of the timeline.
TIMELINE result
TIMELINE_COLUMN and TIMELINE_TABLE both create a temporary timeline table, which contains one timeline column. This timeline column contains all distinct rounded timestamps inside the given range. Inside the range, no timestamp based on the time unit is missing. Timestamps that are not part of the input column(s) are filled in automatically. All rows are distinct.
Partitioning
The partition columns specify for which groups the timeline is created. The timeline for each group is added to the timeline table. If no time range is provided, the start and end timestamp of the timeline are determined by the minimum and maximum timestamp among all groups (i.e. the smallest and largest timestamp among all contributing timestamp columns).
The timeline table stays sorted by the timestamps in the partitions. Because of this, window functions like LEAD and LAG can be used without having to specify an ORDER BY. The partition configuration specifies which values are filled in for each partition.
Row limit
The overall maximum number of output rows is limited to 100 million rows. When using PARTITION BY with a lot of groups, this limit can be reached easily since the timeline is added for each group. If the generated table exceeds this limit, the TIMELINE query needs to be adapted by using the FROM and TO options to further restrict the timeline, or by choosing a different time unit.
Configuration Propagation
The TIMELINE configuration is propagated to all other occurrences of TIMELINE_COLUMN and TIMELINE_TABLE in the same query. Some configurations can be conflicting, which would result in different timeline tables. Other configurations are not conflicting and will be merged.
Conflicting configurations
Conflicting configurations lead to different timeline tables if different conflicting configurations occur inside the same query. Since different timeline tables cannot be joined, this usually leads to "No common table" errors. Therefore, inside one query, it is best practice to specify the configuration only once inside one TIMELINE_COLUMN or TIMELINE_TABLE occurrence. The configuration propagation will propagate this configuration to all other occurrences of TIMELINE_COLUMN and TIMELINE_TABLE.
The following configurations are conflicting:
timeunit: It is not possible to have different time units in the same timeline table.
partition: The PARTITION BY statement needs to be written completely once; different PARTITION BYs from different TIMELINE_COLUMN or TIMELINE_TABLE occurrences will not be merged.
time_range: It is not possible to have different time ranges in the same timeline table.
Merge-able configurations
This configuration is merge-able:
time_columns: All occurring timestamp columns from different TIMELINE_COLUMN and TIMELINE_TABLE occurrences in the same query will be merged. This means that all of those time columns contribute to the same timeline table.
NULL handling
NULL values in the input timestamp columns behave as if they were not there. That means that they do not influence the range of the timeline, and there will be no NULL value inside the timeline column. When all input columns are empty or all values of all input columns are NULL, then the resulting timeline has zero rows, except when a range is given by the FROM and TO parameters.
This also holds when the timeline is partitioned: Partitions where the range for that partition cannot be calculated because there are no (non-null) timestamp values in the input timestamp column, and there is no time range given by FROM and TO, the partition will not be part of the resulting timeline table.
Nesting
It is not allowed to nest multiple TIMELINE_COLUMN or TIMELINE_TABLE occurrences, i.e. it is not allowed to apply TIMELINE_COLUMN or TIMELINE_TABLE on another timeline table.
Joins
The timeline table cannot be joined to the rest of the Data Model. But the TIMELINE_TABLE function can be used as a target inside a PU-function to aggregate values from the data model onto the timeline.
Usage of TIMELINE_TABLE inside PU-functions
The timeline table is a table with distinct rows. Therefore, it is possible to use TIMELINE_TABLE inside PU-functions (similar to DOMAIN_TABLE) to aggregate values of the Data Model onto the timeline. This enables the calculation of different KPIs on the same timeline.
When using TIMELINE_TABLE as a target inside a PU-function, a timestamp column must be set inside the TIMELINE_TABLE call. The source values will be aggregated based on the timestamps of this timestamp column. If multiple timestamp columns are specified, the first one is taken as the target reference.
Filter propagation
Filters on tables from the Data Model are not propagated to the timeline table to ensure a continuous time axis even if filters are set. It is also not possible to use filter-aware functions (like FILTER_TO_NULL) inside any input argument of TIMELINE_COLUMN and TIMELINE_TABLE.
However, FILTER_TO_NULL can be used inside the PU-functions to only aggregate values that fulfill the filter condition to the timeline. By default, PU-functions ignore filters.
Filter propagation from the timeline table to the other tables from the Data Model is currently not supported. Filters which are applied directly to the timeline table only affect the timeline table itself, as well as potentially other temporary tables that are derived from the timeline table, e.g. DOMAIN_TABLEs.
[1] Missing timestamps are filled in and the timestamps are rounded to the given time unit, i.e. days in this case. Since no time range is given, the minimum and maximum timestamp are the start and end date: | |||||||||||||||
| |||||||||||||||
|
[2] Using the time range one can crop and extend the timeline between two given dates: | ||||||||||||||
| ||||||||||||||
|
[3] This example shows how to calculate a KPI on a timeline using a PU-function. The query calculates the number of activities per day. Days without any activities are also contained in the result. The example also shows how the configuration propagation works: In the first dimension, which returns the timeline column, only the time unit DAYS is specified. In the second dimension, the origin timestamp column is specified inside TIMELINE_TABLE. In the background, one single timeline table is created from both specifications: | ||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||
|
[4] This example shows how KPIs from different tables can be brought to the same timeline. The Data Model contains two Activity tables, which are not connected. For each Activity table, the number of activities per day is calculated. The query is similar to the one from the previous example, the only difference is that now, there is one additional dimension which calculates the KPI for the second Activity table. Since the timestamp columns specified in both TIMELINE_TABLE occurrences are merged into one configuration, both timestamp columns contribute to the same timeline table. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
[5] In this simple example, the timeline table is partitioned by the Country. For each partition, the timeline column contains all timestamps between the smallest and largest timestamp from that partition. Since the TIMELINE_TABLE occurrence is not used as a target table inside a PU-function, but used to access the partition column of the TIMELINE_TABLE, it is not necessary to specify a timestamp column inside the TIMELINE_TABLE function. All configurations are placed inside the TIMELINE_COLUMN function, and are propagated to the TIMELINE_TABLE function: | ||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||
|
[6] Same scenario as above, but with the Since the end of the time range is not specified using the | ||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||
|
[7] This example shows that a filter on a table from the Data Model is not propagated to the timeline table. Even though the FILTER statement filters on the source timestamp column of the timeline, the timeline column still shows the full time range. By using FILTER_TO_NULL inside the PU_COUNT, the filter is still taken into account in the actual PU_COUNT calculation, which returns 0 for values that do not pass the filter condition: | |||||||||||||||||||||
| |||||||||||||||||||||
|
[8] This example illustrates the behavior of PU-functions when aggregating on a TIMELINE that is created from two timestamp columns of two connected tables. In the example query, the first column returns the timeline column of the timeline table. Inside the TIMELINE_COLUMN statement, the time unit to be used is specified (DAYS). The participating timestamp columns are specified in the TIMELINE_TABLE calls from the following four dimensions. These four dimensions, all created by PU_SUM statements, behave as follows:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|