ZSCORE
Description
This function calculates the z-score over an INT or a FLOAT. The output type is always FLOAT.
ZSCORE
can act as a standardization of data by mapping each value to the distance to the mean in multiples of standard deviations. This is especially useful in evaluating simple 2-, 3- or 6-sigma rules for outlier detection on a column.
Syntax
ZSCORE ( table.column [, PARTITION BY ( partition_column, ... )] )
column: The source column where rows are taken from.
partition_column: Optional partition column to specify groups in which
ZSCORE
should operate.
Tips
The
ZSCORE
of large FLOAT columns can have a large error.ZSCORE
over DATE columns should utilize a DateTime projection function.
Constants
If the input column contains only a single value, ZSCORE
returns NULL
NULL handling
If the input column contains NULL values, they are ignored and do not affect the calculation of the mean or standard deviation. The output for a NULL input value is NULL as well.
Partitioning
The partition columns specify groups. The ZSCORE
function operates independently within every group.
Examples
[1] Simple | ||||||||||||||||
| ||||||||||||||||
|
[2] | ||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||
|
[3] In this example, the input column contains NULL values. NULL values are ignored and do not affect the calculation of the z-score. The output for a NULL input value is NULL as well: | ||||||||||||||||
| ||||||||||||||||
|
[4] | ||||||||||||||||||||||
| ||||||||||||||||||||||
|
[5] | ||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||
|
[6] | ||||||||||||||||||||||||||||||||||||||||||
| ||||||||||||||||||||||||||||||||||||||||||
|
[7] For a single input value, NULL is returned: | ||||||||
| ||||||||
|