Skip to main content

EXCEPT and EXCEPT ALL Operators

The EXCEPT operator returns the rows from the first query that are not present in the result of the second query. It is used to perform set difference operations between two result sets.

EXCEPT ALL returns all rows from the first query that are not present in the second query, including duplicates. This means if a row appears N times in the first query and M times in the second, it will appear max(N-M, 0) times in the result.

Note: EXCEPT ALL is supported in Spark-based engine, but is not available in Vertica.

Syntax

query1
EXCEPT [ALL]
query2
  • query1: The first SELECT statement.

  • query2: The second SELECT statement.

  • ALL (optional): If specified, preserves duplicate rows (available in Spark-based engine only).

Parameters

  • Both query1 and query2 must return the same number of columns, and corresponding columns must have compatible data types.

Returns

  • Returns the set of rows from query1 that are not in query2.

  • If ALL is specified (available in Spark-based engine only), duplicate rows are preserved according to set difference semantics.

  • Column names in the result are taken from the first query.

Limits / Notes

  • The number and types of columns must match in both queries.

  • EXCEPT ALL is not supported in Vertica. Attempting to use it in these environments will result in a SQLValidation error.

  • EXCEPT without ALL removes duplicates from the result (set semantics).

  • NULL values are treated as equal when comparing rows.

Examples

Basic EXCEPT usage

-- Given these tables:
-- table_a: id | name
--          1  | 'Alice'
--          2  | 'Bob'  
--          3  | 'Charlie'

-- table_b: id | name
--          2  | 'Bob'
--          4  | 'David'

> SELECT id, name FROM table_a
  EXCEPT
  SELECT id, name FROM table_b;

Result:

id | name
1  | Alice
3  | Charlie

EXCEPT with duplicates (both engines)

-- Given:
-- orders_2023: customer_id | product_id
--              1          | 100
--              1          | 100  -- duplicate
--              2          | 200
--              3          | 300

-- orders_2024: customer_id | product_id  
--              1          | 100
--              4          | 400

> SELECT customer_id, product_id FROM orders_2023
  EXCEPT
  SELECT customer_id, product_id FROM orders_2024;

Result:

customer_id | product_id
2           | 200
3           | 300

Note: The duplicate (1, 100) is removed in the result.

EXCEPT ALL usage (extended syntax)

The EXCEPT ALL operator is supported in Spark-based engine. Unlike a standard EXCEPT, which returns only distinct rows, EXCEPT ALL retains duplicate rows based on the number of occurrences in the first set minus the occurrences in the second set.

Example in Spark-based engine:

-- Returns rows from the first table that are not in the second, preserving duplicates
SELECT customer_id, product_id FROM orders_2023
EXCEPT ALL
SELECT customer_id, product_id FROM orders_2024;

Result:

customer_id | product_id
1           | 100        -- One instance remains (2 in first - 1 in second = 1)
2           | 200
3           | 300

Note: The EXCEPT ALL operator is not supported in Vertica. In these environments, use the standard EXCEPT operator (which returns distinct rows) or utilize a LEFT ANTI JOIN for logic requiring duplicate retention.

EXCEPT with multiple columns

-- employees_dept_a: emp_id | name     | salary
--                   1     | 'John'   | 50000
--                   2     | 'Jane'   | 60000
--                   3     | 'Mike'   | 55000

-- employees_dept_b: emp_id | name     | salary
--                   2     | 'Jane'   | 60000
--                   4     | 'Sarah'  | 65000

> SELECT emp_id, name, salary FROM employees_dept_a
  EXCEPT  
  SELECT emp_id, name, salary FROM employees_dept_b;

Result:

emp_id | name | salary
1      | John | 50000
3      | Mike | 55000