Conversion Error After Vertica Update
Problem
The following errors might occur when executing your transformation scripts (even if they have worked before):
ERROR: Could not convert "A" from column X to a float8
or
ERROR: Could not convert "" from column X to an int8
Root cause
This is due to the fact that for some teams the Vertica version was updated, from v9.3.1-0
to v10.1.0-0
, which has a different execution order for conversions.
For example we have this code:
SELECT * FROM CDPOS WHERE CDPOS.FNAME = 'BMENG' AND TO_NUMBER(CDPOS.VALUE_OLD) + TO_NUMBER(CDPOS.VALUE_NEW) > 0
And our data looks like this:
Until now our code was working, because first it filtered to 'BMENG' which is always a number, and then did the conversion.
But in the new Vertica version this is the other way around, first it tries to convert all values, also 'CPMPM' to a number, and only then it filters to 'BMENG'.
The error also occurs if we create a view and then use the view to do the conversion (the same with WITH clause as well):
CREATE VIEW CDPOS_V AS SELECT * FROM CDPOS CDPOS.FNAME = 'BMENG' ; SELECT * FROM CDPOS_V TO_NUMBER(CDPOS.VALUE_OLD) + TO_NUMBER(CDPOS.VALUE_NEW) > 0 ;
Solution
Important
A common problem with long running transformations and slow data model loads is due to missing statistics in Vertica. This issue can be resolved by adding the Vertica ANALYZE_STATISTICS statement directly in the SQL. For more information, refer to Vertica Transformations Optimization.
How to resolve this:
I. When there is an INNER JOIN (=JOIN) in the query: move the WHERE condition to the JOIN
SELECT * FROM ( SELECT * FROM CDPOS CDPOS.FNAME = 'BMENG' ) AS CDPOS INNER JOIN CDHDR AS CDHDR ON CDPOS.MANDANT = CDHDR.MANDANT AND CDPOS.OBJECTCLAS = CDHDR.OBJECTCLAS AND CDPOS.OBJECTID = CDHDR.OBJECTID AND CDPOS.CHANGENR = CDHDR.CHANGENR AND TO_NUMBER(CDPOS.VALUE_OLD) + TO_NUMBER(CDPOS.VALUE_NEW) > 0;
II. If there is no JOIN, but the error occurs when using a VIEW: convert the VIEW to a TABLE
DROP TABLE IF EXISTS CDPOS_V; CREATE TABLE CDPOS_V AS SELECT * FROM CDPOS CDPOS.FNAME = 'BMENG' ; SELECT * FROM CDPOS_V WHERE TO_NUMBER(CDPOS.VALUE_OLD) + TO_NUMBER(CDPOS.VALUE_NEW) > 0 ;
III. If none of the above applies: use LOCAL TEMPORARY TABLE
CREATE LOCAL TEMPORARY TABLE _TMP_CDPOS ON COMMIT PRESERVE ROWS AS SELECT * FROM CDPOS CDPOS.FNAME = 'BMENG'; SELECT * FROM _TMP_CDPOS as CDPOS TO_NUMBER(CDPOS.VALUE_OLD) + TO_NUMBER(CDPOS.VALUE_NEW) > 0;
or create and drop a temporary table:
DROP TABLE IF EXISTS _TMP_CDPOS; CREATE TABLE _TMP_CDPOS AS SELECT * FROM CDPOS CDPOS.FNAME = 'BMENG'; SELECT * FROM _TMP_CDPOS AS CDPOS TO_NUMBER(CDPOS.VALUE_OLD) + TO_NUMBER(CDPOS.VALUE_NEW) > 0; DROP TABLE IF EXISTS _TMP_CDPOS;
IV. Solution to the error in the standard O2C _CEL_O2C_DELIVERY_DATES transformation:
CREATE TABLE TMP_CDPOS_BMENG AS (SELECT * FROM CDPOS WHERE CDPOS.FNAME = 'BMENG'); ------------------------------------ SELECT CDPOS.MANDANT, CDPOS.TABKEY, CDPOS.VALUE_OLD, CDPOS.VALUE_NEW, CDPOS.FNAME, CDPOS.CHANGENR, CDHDR.OBJECTCLAS, CDHDR.OBJECTID, CAST(CDHDR.UDATE AS DATE) || ' ' || CAST(IFNULL(CAST(CDHDR.UTIME AS TIME), '23:59:59') AS TIME) AS EVENTTIME, ROW_NUMBER() OVER(PARTITION BY CDPOS.MANDANT, CDPOS.TABKEY ORDER BY TO_NUMBER(CDPOS.CHANGENR) DESC) AS ROWNUM FROM TMP_CDPOS_BMENG AS CDPOS LEFT JOIN CDHDR AS CDHDR ON 1=1 AND CDPOS.MANDANT = CDHDR.MANDANT AND CDPOS.OBJECTCLAS = CDHDR.OBJECTCLAS AND CDPOS.OBJECTID = CDHDR.OBJECTID AND CDPOS.CHANGENR = CDHDR.CHANGENR WHERE TO_NUMBER(CASE WHEN CDPOS.VALUE_NEW LIKE '%-' THEN CONCAT('-',REPLACE(LTRIM(CDPOS.VALUE_NEW),'-','')) ELSE CDPOS.VALUE_NEW END) > 0