EXCEPT and EXCEPT ALL Operators
The EXCEPT operator returns the rows from the first query that are not present in the result of the second query. It is used to perform set difference operations between two result sets.
EXCEPT ALL returns all rows from the first query that are not present in the second query, including duplicates. This means if a row appears N times in the first query and M times in the second, it will appear max(N-M, 0) times in the result.
Note:
EXCEPT ALLis supported in Spark-based engine, but is not available in Vertica.
Syntax
query1 EXCEPT [ALL] query2
query1: The first SELECT statement.query2: The second SELECT statement.ALL(optional): If specified, preserves duplicate rows (available in Spark-based engine only).
Parameters
Both
query1andquery2must return the same number of columns, and corresponding columns must have compatible data types.
Returns
Returns the set of rows from
query1that are not inquery2.If
ALLis specified (available in Spark-based engine only), duplicate rows are preserved according to set difference semantics.Column names in the result are taken from the first query.
Limits / Notes
The number and types of columns must match in both queries.
EXCEPT ALLis not supported in Vertica. Attempting to use it in these environments will result in aSQLValidationerror.EXCEPTwithoutALLremoves duplicates from the result (set semantics).NULL values are treated as equal when comparing rows.
Examples
Basic EXCEPT usage
-- Given these tables: -- table_a: id | name -- 1 | 'Alice' -- 2 | 'Bob' -- 3 | 'Charlie' -- table_b: id | name -- 2 | 'Bob' -- 4 | 'David' > SELECT id, name FROM table_a EXCEPT SELECT id, name FROM table_b;
Result:
id | name 1 | Alice 3 | Charlie
EXCEPT with duplicates (both engines)
-- Given: -- orders_2023: customer_id | product_id -- 1 | 100 -- 1 | 100 -- duplicate -- 2 | 200 -- 3 | 300 -- orders_2024: customer_id | product_id -- 1 | 100 -- 4 | 400 > SELECT customer_id, product_id FROM orders_2023 EXCEPT SELECT customer_id, product_id FROM orders_2024;
Result:
customer_id | product_id 2 | 200 3 | 300
Note: The duplicate (1, 100) is removed in the result.
EXCEPT ALL usage (extended syntax)
The EXCEPT ALL operator is supported in Spark-based engine. Unlike a standard EXCEPT, which returns only distinct rows, EXCEPT ALL retains duplicate rows based on the number of occurrences in the first set minus the occurrences in the second set.
Example in Spark-based engine:
-- Returns rows from the first table that are not in the second, preserving duplicates SELECT customer_id, product_id FROM orders_2023 EXCEPT ALL SELECT customer_id, product_id FROM orders_2024;
Result:
customer_id | product_id 1 | 100 -- One instance remains (2 in first - 1 in second = 1) 2 | 200 3 | 300
Note: The EXCEPT ALL operator is not supported in Vertica. In these environments, use the standard EXCEPT operator (which returns distinct rows) or utilize a LEFT ANTI JOIN for logic requiring duplicate retention.
EXCEPT with multiple columns
-- employees_dept_a: emp_id | name | salary -- 1 | 'John' | 50000 -- 2 | 'Jane' | 60000 -- 3 | 'Mike' | 55000 -- employees_dept_b: emp_id | name | salary -- 2 | 'Jane' | 60000 -- 4 | 'Sarah' | 65000 > SELECT emp_id, name, salary FROM employees_dept_a EXCEPT SELECT emp_id, name, salary FROM employees_dept_b;
Result:
emp_id | name | salary 1 | John | 50000 3 | Mike | 55000