Configuration
The sink offers a set of configuration keys alongside the Kafka Connect defaults for converters, consumer settings, and so on.
Here is the full list:
Key | Description | Type | Required | Default | Version |
---|---|---|---|---|---|
connect.ems.allow.null.pk | Allow messages with null values in the columns listed as primary keys. If disabled, the connector will fail after receiving such a message. NOTE: Enabling this may cause data inconsistencies on the Celonis Platform side. | BOOL | NO | false | From 1.7.1 |
connect.ems.authorization.key | Contains the Celonis Platform API Authorization header. It should be AppKey <<app-key>> or Bearer <<api-key>>. | STRING | YES | null | |
connect.ems.client.id | An optional parameter representing the client's unique identifier | STRING | NO | null | |
connect.ems.commit.interval.ms | The time interval in milliseconds to upload the data to Celonis Platform if the other two commit policies are not yet applicable. It cannot be less than 1 second. | LONG | YES | null | |
connect.ems.commit.records | The maximum number of records in the accumulated file before it is uploaded to Celonis Platform. | INT | YES | null | |
connect.ems.commit.size.bytes | The maximum size of the accumulated file before it is uploaded to Celonis Platform. It cannot be less than 100 KB. A file is also uploaded when one of the other commit policies is triggered, so a file smaller than the configured size can still be uploaded if the record count, the time interval, or a schema change comes first. | LONG | YES | null | |
connect.ems.connection.id | Optional parameter. It represents the unique Celonis Platform connection identifier. | STRING | NO | null | |
connect.ems.convert.decimals.to.double | Celonis Platform does not currently support ingesting decimal values in Parquet files. This flag enables conversion from decimal to double (float64). | BOOL | NO | false | |
connect.ems.data.fallback.varchar.length | An optional parameter representing the STRING (VARCHAR) length when the schema is created in Celonis Platform. This value must be between 1 and 65000. | STRING | NO | null | |
connect.ems.data.primary.key | Optional comma-separated list of columns that form the primary key of the Celonis Platform table. If not specified, no primary key is used, uniqueness is not enforced, and the data is not deduplicated. | STRING | NO | null | |
connect.ems.debug.keep.parquet.files | For debugging purposes, set this to true to make the connector keep the files after they have been uploaded. | BOOL | NO | false | |
connect.ems.embed.kafka.metadata | Include Kafka metadata fields (kafkaOffset, kafkaPartition, and kafkaTimestamp) in the target Celonis Platform table. | BOOL | YES | true | |
connect.ems.endpoint | Contains the Celonis Platform API endpoint in the form of: https://[team].[realm].celonis.cloud/continuous-batch-processing/api/v1/[pool-id]/items | STRING | YES | null | |
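connect.ems.error.policy | Specifies the action to be taken if an error occurs while inserting the data. There are three options: NOOP (the error is swallowed), THROW (the error is allowed to propagate), and RETRY (the write is retried, up to connect.ems.max.retries times). All errors will be logged automatically, even if the code swallows them. | STRING | NO | THROW | |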
connect.ems.explode.mode | When each incoming record is a list of records, this will explode (flatten) the records on output. The possible values are NONE and LIST. | STRING | NO | NONE | |
connect.ems.flattener.collections.discard | Discard array and map fields. Default behavior is to transform them into JSON-encoded strings. | BOOL | NO | false | |
connect.ems.flattener.enable | Enable message flattening transformation. This has to be set to true when source topic contains nested data. | BOOL | NO | true | |
connect.ems.flattener.jsonblob.chunks | The number of string chunks the input record should be JSON encoded into. The byte-size of each JSON-encoded chunk is driven by the connect.ems.data.fallback.varchar.length parameter, which needs to be supplied in order for this configuration key to be accepted. | INT | NO | null | |
connect.ems.inmemfs.enable | Rather than writing to the host file system, buffer parquet data files in memory | BOOL | NO | false | |
connect.ems.max.retries | The maximum number of times to re-attempt to write the records before the task is marked as failed. | INT | NO | 10 | |
connect.ems.obfuscation.fields | An optional value for comma-separated fields to obfuscate. It supports nested values, including arrays. | STRING | NO | null | |
connect.ems.obfuscation.method | The obfuscation method to use. The connector offers three: fix, sha1, and sha512. With fix, string values are replaced with *****. With sha512, a salt is required; see connect.ems.obfuscation.sha512.salt. | STRING | NO | fix | |
connect.ems.obfuscation.sha512.salt | Required only when connect.ems.obfuscation.method is set to sha512 and obfuscation fields have been set. If no obfuscation fields have been provided, this configuration is ignored. | STRING | NO | null | |
connect.ems.order.field.name | Optional parameter used only when primary keys are set. It needs to be a sortable field, present in the incoming data, to allow record deduplication for records sharing the same primary key(s). For details, see the Primary Key(s) section. | STRING | NO | null | |
connect.ems.parquet.row.group.size.bytes | The row group size, in bytes, of the rows in the generated parquet files. | INT | NO | 1048576 | From 1.7.2 |
connect.ems.row.size.bytes | Constrained to MIN(row.group.size.bytes, commit.size.bytes) | INT | NO | null | |
connect.ems.pool.explicit.close | Connection pool - Explicitly close connections on completion of request. | BOOL | NO | false | |
connect.ems.pool.keepalive | Connection pool - Number of milliseconds to keep connection alive. | LONG | NO | 300000 | |
connect.ems.pool.max.idle | Connection pool - Maximum number of idle connections to allow. | INT | NO | 5 | |
connect.ems.proxy.auth.password | The password for proxy authentication, if a proxy is required to access external services. | STRING | NO | null | |
connect.ems.proxy.auth.type | If a proxy is required to access external services, the type of proxy authentication to use. There is currently one available option: BASIC. | STRING | NO | null | |
connect.ems.proxy.auth.username | The username for proxy authentication, if a proxy is required to access external services. | STRING | NO | null | |
connect.ems.proxy.host | The hostname of the proxy server, if a proxy is required to access external services. | STRING | NO | null | |
connect.ems.proxy.port | The port number of the proxy server, if a proxy is required to access external services. | INT | NO | null | |
connect.ems.retry.interval | The interval to wait between retries when using RETRY mode. | LONG | NO | 1000 | |
connect.ems.sink.put.timeout.ms | The maximum time (in milliseconds) for the connector task to complete the upload of a single Parquet file before being flagged as failed. Note: This value should always be lower than max.poll.interval.ms. | LONG | NO | 288000 | From 1.8.1 |
connect.ems.target.table | The table in Celonis Platform to store the data. | STRING | YES | null | |
connect.ems.tmp.dir | The folder where temporary files are stored while the connector accumulates data. If not specified, System.getProperty("java.io.tmpdir") is used. | STRING | NO | System temp directory | |
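
Several of the keys above depend on each other, so it can help to check a configuration before submitting it to Kafka Connect. The following is a minimal sketch of such a check, not part of the connector: `validate_ems_config` is a hypothetical helper, and the endpoint, application key, and table name in the example are placeholders. It only covers the constraints stated in the table (required keys without a default, the authorization prefix, the one-second minimum commit interval, the 1 to 65000 VARCHAR range, the dependency of the JSON blob chunks on the VARCHAR length, and the SHA512 salt).

```python
# Keys the table marks as required and that have no default value.
REQUIRED_KEYS = [
    "connect.ems.authorization.key",
    "connect.ems.commit.interval.ms",
    "connect.ems.commit.records",
    "connect.ems.commit.size.bytes",
    "connect.ems.endpoint",
    "connect.ems.target.table",
]


def validate_ems_config(config):
    """Hypothetical pre-flight check; returns a list of problems (empty if none found)."""
    problems = []

    for key in REQUIRED_KEYS:
        if config.get(key) in (None, ""):
            problems.append(f"missing required key: {key}")

    # The authorization header must be 'AppKey <app-key>' or 'Bearer <api-key>'.
    auth = config.get("connect.ems.authorization.key")
    if auth and not auth.startswith(("AppKey ", "Bearer ")):
        problems.append("authorization key must start with 'AppKey ' or 'Bearer '")

    # The commit interval cannot be less than 1 second.
    interval = config.get("connect.ems.commit.interval.ms")
    if interval not in (None, "") and int(interval) < 1000:
        problems.append("connect.ems.commit.interval.ms must be at least 1000")

    # The fallback VARCHAR length must be between 1 and 65000.
    varchar = config.get("connect.ems.data.fallback.varchar.length")
    if varchar is not None and not 1 <= int(varchar) <= 65000:
        problems.append("fallback varchar length must be between 1 and 65000")

    # JSON blob chunking is only accepted when the VARCHAR length is supplied.
    if "connect.ems.flattener.jsonblob.chunks" in config and varchar is None:
        problems.append("jsonblob.chunks requires data.fallback.varchar.length")

    # sha512 needs a salt, but only when obfuscation fields are configured.
    fields = config.get("connect.ems.obfuscation.fields")
    method = config.get("connect.ems.obfuscation.method", "fix")
    salt = config.get("connect.ems.obfuscation.sha512.salt")
    if fields and method == "sha512" and not salt:
        problems.append("sha512 obfuscation requires connect.ems.obfuscation.sha512.salt")

    return problems


# Placeholder values for illustration only; replace them with your own.
example_config = {
    "connect.ems.endpoint": "https://my-team.my-realm.celonis.cloud/continuous-batch-processing/api/v1/my-pool-id/items",
    "connect.ems.authorization.key": "AppKey my-app-key",
    "connect.ems.target.table": "orders",
    "connect.ems.commit.interval.ms": "600000",
    "connect.ems.commit.records": "10000",
    "connect.ems.commit.size.bytes": "1000000",
}

for problem in validate_ems_config(example_config):
    print(problem)
```

The helper accepts values as strings or numbers, since connector properties are usually supplied as strings. It does not replace the connector's own validation, which remains authoritative.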