SQL troubleshooting

If you're experiencing SQL issues with your transformation tasks, the following troubleshooting advice is available:

Delta extractions merge the delta chunks into their respective tables in the background. This process executes a DELETE statement to ensure that records existing in both the table and the new delta chunk are removed from the main table before being inserted, which avoids duplicates. For larger tables, and in cases where the delta chunk is too large, this operation can become too costly and may not complete within the maximum allocated time.
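
For reference, a minimal sketch of this merge pattern (the table and key column names are illustrative, not the actual Celonis implementation):

DELETE FROM MAIN_TABLE
WHERE (MANDT, VBELN) IN (SELECT MANDT, VBELN FROM DELTA_CHUNK);

INSERT INTO MAIN_TABLE
SELECT * FROM DELTA_CHUNK;

The cost of the DELETE grows with the size of both the main table and the delta chunk, which is why infrequent delta runs against large tables can exceed the time limit.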

Suggested solutions

  • Always run an initial full extraction of a table before proceeding with scheduling/running any delta extractions on that table.

  • Ensure that delta extractions are scheduled to run frequently (at least once a day), so that the background operations do not have to process overly large delta chunks.

  • Avoid running delta extraction tasks on tables that have not been extracted, either fully or by delta, within the last two days.

This error may appear in queries that join multiple tables, specifically those involving USR02. The table joins are performed in an order such that the memory needed for execution exceeds the allocated/available resources.

Suggested solutions

Users need to ensure that all temporary join tables used by the query have their table statistics refreshed by including the following command before the failing query, or at the transformation point that defines each table:

SELECT ANALYZE_STATISTICS('TABLE_NAME');

Source tables that have been extracted through Celonis will have Table Statistics refreshed during their extraction.
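
For example, if the failing query joins two temporary tables (names illustrative), refresh their statistics immediately before running it:

SELECT ANALYZE_STATISTICS('TMP_USR02_FILTERED');
SELECT ANALYZE_STATISTICS('TMP_CHANGE_LOG');

With fresh statistics, the optimizer can pick a join order that fits within the available memory.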

The data model load is invoking the materialization of large and costly/complex views, or the size of the tables is too big.

Suggested solutions

  • Identify data model views that involve large source tables (e.g. VBAP). As these are defined as views, their definition script needs to run every time they are invoked (e.g. during DM Load).

  • Check whether these views are using other views in their definition script. Invoking a view that uses a view is a very inefficient way of creating a data set, as sorted projections and statistics are not being used.

  • Consider whether some of these larger data model views (or the views they are using) can be defined as tables instead.

  • Consider using staging tables as described in our Vertica SQL Performance Optimization Guide. Using a view to bring together original table fields with a staging table containing calculated fields improves DM load performance and can also lower APC consumption (see the sketch after this list).

  • Ensure that all Data Model View definitions are properly optimized based on the guidance of the Vertica SQL Performance Optimization Guide.

  • If your table is extremely large, consider limiting the number of columns in the Data Model View/Table. Is a SELECT * from a transactional table necessary, or can the script be limited to only the dimensions needed for the analyses?
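
A minimal sketch of the staging-table pattern, assuming VBAP as the large source table and a hypothetical calculated field (all names are examples only):

-- Staging table: materializes the expensive calculated field once
CREATE TABLE STG_VBAP_CALC AS
SELECT MANDT, VBELN, POSNR,
       CASE WHEN ABGRU <> '' THEN 1 ELSE 0 END AS IS_REJECTED
FROM VBAP;

-- Data model view: original fields joined with the precomputed field
CREATE VIEW V_VBAP_DM AS
SELECT v.MANDT, v.VBELN, v.POSNR, v.MATNR, s.IS_REJECTED
FROM VBAP v
JOIN STG_VBAP_CALC s
  ON v.MANDT = s.MANDT AND v.VBELN = s.VBELN AND v.POSNR = s.POSNR;

The calculations are performed once when the staging table is rebuilt, so the data model view only performs a simple join at load time.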

The error may appear in transformations that perform de-duplication using the ROW_NUMBER function. Usually, in the first step, the table is created containing a NUM column (used for deleting the duplicates). After deleting the duplicates, attempting to drop the NUM column may fail with this error due to a Vertica projection dependency.

Suggested solutions

The solution is to add UNSEGMENTED ALL NODES to all statements that create tables with a row number column generated by the ROW_NUMBER function. This ensures that the row number column is not part of the table's segmentation and can therefore be dropped.
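
A minimal sketch of this pattern (table and column names are illustrative):

-- UNSEGMENTED ALL NODES keeps NUM out of any segmentation expression
CREATE TABLE ORDERS_DEDUP AS
SELECT *, ROW_NUMBER() OVER (PARTITION BY VBELN ORDER BY VBELN) AS NUM
FROM ORDERS_RAW
UNSEGMENTED ALL NODES;

DELETE FROM ORDERS_DEDUP WHERE NUM > 1;

ALTER TABLE ORDERS_DEDUP DROP COLUMN NUM;

Because no segmentation expression references NUM, the final DROP COLUMN no longer fails with a projection dependency error.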

This error may appear when a set of queries including an ANALYZE_STATISTICS() command is executed within the Transformation Workbench but not when run through a Data Job execution.

Suggested solutions

When running ANALYZE_STATISTICS() in the workbench, it has to be an independent operation. Instead of selecting all statements and executing them at the same time, users should execute the ANALYZE_STATISTICS() statement on its own.

For example:

  • DROP and CREATE table

  • Run SELECT ANALYZE_STATISTICS('TABLE_NAME');
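
In SQL terms (table names illustrative), this means two separate executions rather than one combined selection:

-- Execution 1: recreate the table
DROP TABLE IF EXISTS MY_TABLE;
CREATE TABLE MY_TABLE AS SELECT * FROM SOURCE_TABLE;

-- Execution 2: refresh statistics on its own
SELECT ANALYZE_STATISTICS('MY_TABLE');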

This error may appear if you are performing a transformation using a CREATE TABLE or UPDATE statement with a considerably large resulting data set and/or a complex script (i.e. multiple joins on large tables, expensive commands).

Suggested solutions

  • When creating the table, limit the query to a smaller data chunk.

  • Run additional INSERT statements into the new table, each filtering for a different chunk.

  • Consider limiting the query through a condition. We suggest selecting a field that appears towards the beginning of the table field list (i.e. the table projection is sorted by that field) and whose values are close to evenly distributed across the data set, such as a date or document type (see the sketch below).
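
A minimal sketch of this chunking approach (table names, column, and date boundaries are illustrative):

CREATE TABLE TGT_DOCS AS
SELECT * FROM SRC_DOCS WHERE ERDAT < '2023-01-01';

INSERT INTO TGT_DOCS
SELECT * FROM SRC_DOCS WHERE ERDAT >= '2023-01-01' AND ERDAT < '2024-01-01';

INSERT INTO TGT_DOCS
SELECT * FROM SRC_DOCS WHERE ERDAT >= '2024-01-01';

Each statement materializes a smaller chunk, so no single statement exceeds the available memory budget.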

This error occurs when data jobs performing DDL operations on the same table run in parallel. Ensure that such data jobs are scheduled in different time frames.

Suggested solutions

  • Review the query and note the tables, including tables referenced in views, that are being used.

  • Locate data jobs/transformations in the ETL pipeline that perform DDL operations on these objects.

  • Ensure there is no overlap between the query throwing the error and the schedules performing the DDL operations.

Memory errors, often sporadic, are primarily due to resource saturation within our dynamic, shared environment. Even without changes to transformations, unexpected spikes in system workload from large queries or concurrent operations can temporarily limit memory availability.

Suggested solutions

  • If the error occurs in transformation or data model load: Optimize and simplify queries to reduce overall memory requirements based on best practices for query optimization.

  • If the error occurs during the extraction process: Ensure that the feature VarcharOptimization V2 is activated.

  • Alternatively, set the maximum allowed string length for each affected table in the JDBC extraction task.

This error may appear when an object used in the transformation, either a table or a view, is not found in the Vertica database. These errors usually point to orchestration issues and misconfigurations.

Suggested solutions

  • Review the pipeline and the script to ensure the query is pointing to the correct schema.

  • Identify the transformation where the missing object is created or dropped.

  • Ensure tasks are scheduled in an appropriate sequence so the object is available when the query is executing.

This error may show up in a query if a function is used without the proper number or type of parameters that are required. It may also appear if the function used is either not available in Vertica or is not whitelisted by Celonis for use.

Suggested solutions

  • Review the function being used and ensure that it is available in Vertica and that all required parameters are properly defined. For more information, check the Vertica Functions Documentation.

  • If the function is available in Vertica, check against the lists of supported Case-Centric and Object-Centric (OCPM) functions in Celonis.