Connect with SAP for data extraction
The SAP connection offers several options to configure the extractions. These options are accessible using the data connection settings.
Before you begin:
To extract data from SAP to the Celonis Platform, make sure you've already installed the extraction client and the RFC module, and set up the technical user with the relevant permissions in your SAP instance. If this sounds like a lot, check Continuous extraction.
Creating a data connection between SAP and the Celonis Platform
You can now create a data connection between SAP and the Celonis Platform from your data pool diagram:
Click Data Connections.
Click Add Data Connection and select Connect to Data Source.
Select the source type based on this table:
| | S/4HANA | SAP ECC | SAP ECC 5.0 | SAP ECC 4.6C |
|---|---|---|---|---|
| Supported version | All | Minimum SAP ECC 6 EHP 4 | Only SAP ECC 5.0 | Only SAP ECC 4.6C / 4.7 |
| Connection type (when creating a new Data Connection) | SAP | SAP | SAP | SAP 4.6C (Note: available based on request) |
| Required RFC module | Celonis_RFC_Data_Extraction | Celonis_RFC_Data_Extraction | Celonis_RFC_Data_Extraction_ECC5 | Celonis_RFC_Data Extraction_ECC4.6C |
Unsupported features
-
"Buffer chunks in memory for validation"
Changelog Extractions
Joins within the Extraction
connection via middleware (SAP PI/PO and Message Server)
advanced settings
Configure the following connection details:
Name: The name assigned to this connection.
Host: Hostname of the system.
System number: A two-digit code, e.g. 00.
Client: A three-digit code, e.g. 100.
User: The user that was created in Step 1.
Compression type: Choose "Native Compression" if supported (recommended), otherwise GZIP, SAPCAR or uncompressed (not recommended).
Maximum parallel table extractions: Enter the number of tables that can be extracted in parallel.
Click Test Connection and correct any issues highlighted.
If you receive an error, check your connection details, then verify that your user is not locked and the SAP system is running. If a connection can be established, you will be redirected back to the connection overview and you will see a notification that the connection has been established.
Click Save.
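The format rules for the connection details above (a two-digit system number, a three-digit client) can be checked before you click Test Connection. The following Python sketch is illustrative only: the `SapConnection` class and its field names are hypothetical and are not part of any Celonis API.

```python
import re
from dataclasses import dataclass


@dataclass
class SapConnection:
    """Hypothetical container mirroring the fields of the connection form."""
    name: str
    host: str
    system_number: str  # two-digit code, e.g. "00"
    client: str         # three-digit code, e.g. "100"
    user: str


def validate(conn: SapConnection) -> list[str]:
    """Return a list of human-readable problems; the list is empty if none were found."""
    problems = []
    if not conn.host.strip():
        problems.append("Host must not be empty.")
    if not re.fullmatch(r"[0-9]{2}", conn.system_number):
        problems.append("System number must be exactly two digits, e.g. 00.")
    if not re.fullmatch(r"[0-9]{3}", conn.client):
        problems.append("Client must be exactly three digits, e.g. 100.")
    return problems
```

Keeping the codes as strings preserves leading zeros, so "00" is not silently turned into 0.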
In most cases, the extractor connects to the SAP system directly. However, sometimes a middleware mediates all connections between external services and SAP.
SAP PI/PO
This option enables the connection using PI/PO (more information). Once it is selected, the "PI/PO Adapter" dropdown becomes available with two options: RFC or SOAP. Depending on the selected adapter, the standard or generic SOAP extractor should be used.
RFC
Select this option when the PI/PO uses RFC Adapters to connect to SAP. The standard on-premise SAP Extractor can be used with this option.
The following fields should be defined:
Gateway Host: The host of the PI/PO system to which the extractor should connect.
Gateway Port: The port of the PI/PO system to which the extractor should connect.
Program ID: The program ID of the Celonis program in PI/PO.
SOAP
Select this option when the PI/PO uses SOAP Adapters to connect to SAP. In this scenario, the Generic SOAP PI/PO Extractor should be used rather than the standard SAP Extractor.
Note
If the SOAP Adapters are used, the customer should also generate WSDL files, which should be placed in a folder, preferably in the same directory as the Celonis Extractor.
The following fields become available:
Use TLS: Select this if you want to connect to the WSDL endpoints via https.
WSDL Files Directory: Enter the directory where the WSDL files have been copied (see the note above).
User: The PI/PO user for the authentication.
Password: The PI/PO user password.
Message Server
Enables connecting to an SAP server through Logon Groups (SAP Load Balancing). With this approach, the extractor connects to a Message Server, which is mapped to specific application servers. See Advanced configuration for more information.
Use Change Logs: Enables the Real-Time Extraction via Change Logs (more information).
Include change type/timestamp in extracted data: Extends each table with a column about the change type (insert/update) and the respective change date.
Extract in SAP foreground process when 1 chunk or fewer: Small amounts of records are extracted via "direct call" to bypass the background job queues. This speeds up the extraction times.
Chunk size: The number of entries that are contained in one chunk (default: 50,000).
It's possible to turn chunking off entirely by adding the following statement to the file application-local.yml in the on-premise Extractor package:
chunked: false
Number of rows to store in memory: Number of rows from the joined table to store in memory (default: 10,000). This number can be lowered in case of memory issues.
SAP Job Prefix: Defines the naming convention of SAP background jobs (default: "CEL_EX_") (more information).
Run on any SAP Server: If activated, the server on which the SAP background job should be run is not specified. SAP then decides the server to run it. By default the current application server is selected. This option includes SAP high availability systems.
Buffer chunks in memory for validation (reduce the Chunk size when enabled): This option should only be enabled when there are issues with corrupt files as it slows down the extraction process.
Number of retries in case validation fails (default: 100)
Retry interval (seconds) (default: 30)
Extract Change Log data of the specified client only: When you enable this setting, the real-time extractions will only extract the data for the client that is defined in the connection. When you disable it, data from all clients is extracted.
SAP systems are usually multi-client environments, where different clients are writing to the same database and tables. However, the real-time extension triggers are client independent, meaning that they capture changes by all clients and log them in the same Change Log table. This may create complications if two separate clients want to extract from the same system, and you can use this setting if you need to avoid that type of issue.
It's also possible to enable or disable this setting by adding one of the following statements to the file application-local.yml in the on-premise Extractor package:
clientDependent: true (to enable)
clientDependent: false (to disable)
Disable auto deletion of old files in Z-CELONIS_TARGET: Disables the automated cleanup of the "leftover" files from the Z_CELONIS_TARGET folder. This option should be selected if advised by your Celonis team.
Use non-chunked Change Log: When using the Change Logs for real time extraction, the data is read in a chunked manner by default to avoid deadlocks in the database. Select this option to override that setting and allow the data to be read without chunks. This option should be selected if advised by your Celonis team.
Use non-chunked Change Log cleanup: When cleaning up the data from the Change Logs, the data is cleaned up in chunks by default to avoid deadlocks in the database. Select this option to override that setting and allow the data cleanup to be done without chunks. This option should be selected if advised by your Celonis team.
Enable cold data extraction: Check this box to archive old data and free up working memory. The table is partitioned based on the age of the data, so the aged data is moved to the persistent memory and is not available unless it is explicitly invoked. Selecting this option makes the aged data available for extraction along with the current data. To extract old data, you must have RFC module 3.8.0 or later installed. For more information, see Extracting aged data.
Use SNC (SAP Secure Network Communications): Enables data encryption between the RFC module and the extractor via SNC.
Enable data extraction via staging table: This feature is supported by RFC v3.9.0 and above. Apart from this setting, no additional setup is required. The data will be stored in a database table during the extraction and is cleaned up once the extraction is complete.
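The interaction between the Chunk size setting and the "Extract in SAP foreground process when 1 chunk or fewer" setting described above can be illustrated with a little arithmetic. This is a minimal sketch under stated assumptions: it assumes a chunk is a fixed-size batch of rows and that the chunk count is the row count divided by the chunk size, rounded up. The function names are hypothetical, and the RFC module's actual decision logic is not documented here.

```python
DEFAULT_CHUNK_SIZE = 50_000  # default chunk size stated in the settings above


def chunk_count(row_count: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Number of chunks needed for row_count rows (assumed: ceiling division)."""
    if row_count < 0 or chunk_size <= 0:
        raise ValueError("row_count must be >= 0 and chunk_size must be > 0")
    # Integer ceiling division avoids floating-point rounding on large tables.
    return -(-row_count // chunk_size)


def runs_in_foreground(row_count: int,
                       chunk_size: int = DEFAULT_CHUNK_SIZE,
                       foreground_enabled: bool = True) -> bool:
    """Assumed rule: foreground ("direct call") only when one chunk or fewer is needed."""
    return foreground_enabled and chunk_count(row_count, chunk_size) <= 1
```

Under these assumptions, a table with 120,000 rows needs three chunks at the default size and would go through the background job queue, while a table with 50,000 rows or fewer fits in one chunk.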
Advanced SAP connection configuration
When configuring the connection between your SAP system and the Celonis Platform, the following configuration options are available:
The SAP Extractor supports connecting via an SAP Message Server. To use this functionality simply check the option 'Use Logon Group (SAP Load Balancing)' and configure the following options:
1) Enter the host name or IP of your SAP Message Server
2) Enter the port number of your SAP Message Server (36<INSTANCE NUMBER>)
3) Enter the Logon Group identifying your set of SAP application servers (e.g. PUBLIC)
Finally, fill in the rest of the standard required connection fields and save the connection.
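The Message Server port pattern from step 2 (36 followed by the instance number) can be derived as follows. This is a small illustrative helper, not part of the extractor; the function name is hypothetical, and it assumes the instance number is given as a two-digit string.

```python
import re


def message_server_port(instance_number: str) -> int:
    """Build the port 36<INSTANCE NUMBER> from a two-digit instance number."""
    if not re.fullmatch(r"[0-9]{2}", instance_number):
        raise ValueError("instance number must be exactly two digits, e.g. '00'")
    return int("36" + instance_number)
```

For example, instance number 01 gives port 3601.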
Preparations in the SAP system
Make sure that the firewall between the extractor server and SAP allows connections on port 48XX (where XX is the SAP system number).
Set up SNC on the SAP Server.
Save the certificate for your SAP Server's SNC PSE.
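To verify the firewall requirement above from the extractor server, a plain TCP connection attempt to port 48XX is usually enough. The sketch below uses only the Python standard library; the function names are hypothetical, and a successful TCP connection only shows that the port is open, not that SNC is configured correctly.

```python
import socket


def snc_port(system_number: str) -> int:
    """Port 48XX, where XX is the two-digit SAP system number."""
    if len(system_number) != 2 or not system_number.isdecimal():
        raise ValueError("system number must be exactly two digits, e.g. '00'")
    return 4800 + int(system_number)


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A call such as `port_is_reachable("sap.example.com", snc_port("00"))` (the host name here is a placeholder) returns False if the firewall drops or rejects the connection.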
Preparations on the extractor server
To create our client PSE, from the extractor installation directory run the 'snc_create_pse.sh' script, providing your desired distinguished name and PSE password, e.g.:
./snc_create_pse.sh "CN=<YOUR_CHOSEN_CN>, OU=IT, O=CELONIS, C=DE" <your_chosen_password>
Now add the certificate from step 2 above by running the 'snc_add_pse_cert.sh' script, providing the same PSE password that you used when generating the PSE, e.g.:
./snc_add_pse_cert.sh ~/IDES.crt <your_chosen_password>
As part of step 1 a client certificate is also generated named 'RFC.crt'. Import it into your SNC PSE on the SAP system (Blog → Import Client Certificate to Server PSE)
Start the extractor using ./start_with_snc.sh (if you start it from outside the extractor installation directory, provide that directory as a parameter to the script).
Use the distinguished name of the SAP Server's SNC PSE certificate as the SNC partner name in your Data Connection, e.g. 'p:CN=IDES, OU=DEV, O=CELONIS, C=DE' (don't forget the p!).
Steps 2 - 3 can be repeated for each SAP Server you wish to connect to. Afterwards, create separate Data Connections with the appropriate SNC partner names.
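Because the leading "p:" in the SNC partner name is easy to forget, it can help to build the value from the certificate's distinguished name programmatically. A minimal sketch with a hypothetical helper name:

```python
def snc_partner_name(distinguished_name: str) -> str:
    """Prefix a distinguished name with 'p:' unless it already has the prefix."""
    dn = distinguished_name.strip()
    if not dn:
        raise ValueError("distinguished name must not be empty")
    return dn if dn.startswith("p:") else "p:" + dn
```

The helper is idempotent, so passing a value that already starts with "p:" returns it unchanged.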
Considerations when running as a Windows service
The provided 'install_with_snc.bat' will install a Windows service that bootstraps the extractor for SNC connections.
Open the Administrative Tools > Services window on your Windows server
Stop the service
Open the Properties > Log On dialog
Change the service user account to the user that ran 'snc_create_pse.bat'
Start the service
After allowing around 20-30 seconds for the service to start up, try the connection test
The standard implementation of the Celonis SAP Extractor assumes direct communication between the Extractor service and the RFC Module. However, some customers use PI/PO as a mediator between all external parties and SAP systems, which makes direct communication between the Extractor service and the RFC Module impossible.
For these setups, Celonis also supports extraction via PI/PO. In this scenario, the Extractor service conducts all communication with the RFC Module via PI/PO, using either RFC Adapters or SOAP endpoints.
For the integration via SOAP Adapter, the customer should create adapters/endpoints in PI/PO and map them to the RFC Functions of the Celonis RFC package. The customer should then generate WSDL files for these endpoints, which are later used when setting up the connection between the Celonis Platform and SAP.
The following Function Modules should be mapped:
/CELONIS/FM_NEW_EXTRACT
/CELONIS/FM_CANCEL_EXTRACT
/CELONIS/FM_CLEANUP_FILES
/CELONIS/FM_GET_JOB_STATUS
/CELONIS/FM_CONFIG_TEST
/CELONIS/FM_GET_EXTR_FILE_LIST
/CELONIS/FM_GET_EXTRACT_FILE
/CELONIS/FM_DELETE_JOB_LOG
/CELONIS/FM_GET_JOB_LOG
/CELONIS/FM_GET_SYS_INFO
/CELONIS/FM_GET_TABLE_LIST
/CELONIS/FM_GET_CHECKED_TABLES
/CELONIS/FM_GET_TABLE_METADATA
/CELONIS/FM_CL_NEW_EXTRACT (required only for real-time extractions)
/CELONIS/FM_CL_RM_EXTRACT (required only for real-time extractions)
/CELONIS/FM_CL_GET_TABLE_NAME (required only for real-time extractions)
These are part of the Function Groups: /CELONIS/EXTRACTION and /CELONIS/CL_EXTRACTION.
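When mapping the Function Modules listed above, a quick completeness check can catch a missing endpoint before the first extraction. The sketch below hard-codes the module names from the list; the `missing_modules` helper is hypothetical, and how you obtain the set of modules actually mapped in PI/PO is left open.

```python
CORE_MODULES = frozenset({
    "/CELONIS/FM_NEW_EXTRACT",
    "/CELONIS/FM_CANCEL_EXTRACT",
    "/CELONIS/FM_CLEANUP_FILES",
    "/CELONIS/FM_GET_JOB_STATUS",
    "/CELONIS/FM_CONFIG_TEST",
    "/CELONIS/FM_GET_EXTR_FILE_LIST",
    "/CELONIS/FM_GET_EXTRACT_FILE",
    "/CELONIS/FM_DELETE_JOB_LOG",
    "/CELONIS/FM_GET_JOB_LOG",
    "/CELONIS/FM_GET_SYS_INFO",
    "/CELONIS/FM_GET_TABLE_LIST",
    "/CELONIS/FM_GET_CHECKED_TABLES",
    "/CELONIS/FM_GET_TABLE_METADATA",
})

# Required only for real-time extractions.
REALTIME_MODULES = frozenset({
    "/CELONIS/FM_CL_NEW_EXTRACT",
    "/CELONIS/FM_CL_RM_EXTRACT",
    "/CELONIS/FM_CL_GET_TABLE_NAME",
})


def missing_modules(mapped, realtime: bool = False) -> list[str]:
    """Return the required Function Modules that are absent from `mapped`, sorted by name."""
    required = (CORE_MODULES | REALTIME_MODULES) if realtime else CORE_MODULES
    return sorted(required - set(mapped))
```

An empty result means every required module has a mapping; pass `realtime=True` when real-time extractions are planned.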
Note
The RFC Module should be set up as usual on the SAP side for the PI/PO connection to work.
The diagrams below describe how the systems communicate with each other.

Starting with RFC module version 3.9, there's no need to create a folder in the SAP file system to store the extracted data temporarily. Instead, you can use the newest data extractor, which relies on a staging table inside the SAP application database to store the temporary files.
How to extract data using staging table
To enable data extraction using a staging table:
Go to Data > Data Integration.
Go to your Data Pool.
Click Data Connections.
Next to the selected Data Connection, click Edit.
Toggle Enable Advanced Settings.
Select Enable data extraction via staging table.
From now on, extraction will be done using the staging table instead of the file system.
The transport for the Celonis RFC Module includes a table called /CELONIS/BUFFTAB, which is used to temporarily store the extracted data before pushing it to the Celonis Platform cloud. This storage is necessary because large extractions run as a background process in SAP, which periodically dumps the extracted data to files so that it does not overload the memory.
The table has the following structure:
EXID - Execution ID (PK).
Filename - name of the file (PK).
Timestamp - information on when the file was written to the table.
Size - size of the data in KB.
xdata - the compressed data.
The staging table is cleaned up immediately after the file is fetched by the SAP extraction client. This keeps the table clean and makes sure that it does not grow unnecessarily.
As a safeguard, on each file fetch, we also clean up all the files older than one day. Even if the SAP extraction client is down for some reason and the background jobs keep adding files to the staging table without a subsequent fetch, once the service is restored, the next extraction will remove all the leftover files.
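The cleanup behavior described above (delete the fetched file immediately, and also delete anything older than one day on each fetch) can be modeled in a few lines. This is an in-memory Python illustration of the described behavior only, not the ABAP implementation in the RFC module; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=1)  # safeguard threshold described above


@dataclass
class StagedFile:
    exid: str            # execution ID (part of the primary key)
    filename: str        # file name (part of the primary key)
    timestamp: datetime  # when the file was written to the table
    xdata: bytes         # the compressed data


class StagingTable:
    """Toy stand-in for the staging table, keyed by (exid, filename)."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str], StagedFile] = {}

    def write(self, row: StagedFile) -> None:
        self._rows[(row.exid, row.filename)] = row

    def fetch(self, exid: str, filename: str, now: datetime) -> bytes:
        # 1) Remove the fetched file right away so the table does not grow.
        row = self._rows.pop((exid, filename))
        # 2) Safeguard: also drop leftover files older than one day.
        stale = [key for key, r in self._rows.items() if now - r.timestamp > MAX_AGE]
        for key in stale:
            del self._rows[key]
        return row.xdata

    def __len__(self) -> int:
        return len(self._rows)
```

In this model, a single fetch after an outage is enough to clear every leftover file older than one day, which matches the recovery scenario described above.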
If the table grows too big, you can empty it without running the extraction by using the following report:
/CELONIS/RP_TRUNCATE_STAGTAB
The generic SOAP PI/PO Extractor lets you connect to SAP ECC through the PI/PO middleware using SOAP Adapters.
In the SAP connection form, select PI/PO as the middleware and SOAP as the adapter, then fill in the connection details as described here.
Note
You will need to get the WSDL files for the PI/PO SOAP endpoints from your SAP administration team.
The extractor is implemented the same way as the standard SAP on-premise Extractor with the exception that instead of the standard .jar file, the PI/PO extractor .jar file should be used.
To download the on-premise extractor package and the SOAP PI/PO connector, go to the Download portal.
Rename the connector-sap-soap-pipo.jar file to connector-sap.jar.
Replace the .jar file in the extractor package with the renamed PI/PO extractor .jar file.
Go ahead with the standard steps of Installing
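The rename and replace steps above can be scripted. The following Python sketch assumes that connector-sap.jar sits directly in the root of the extractor package directory, which may not match your package layout. The function name is hypothetical, and the backup copy is an extra precaution that the steps above do not require.

```python
import shutil
from pathlib import Path


def install_pipo_connector(downloaded_jar: Path, extractor_dir: Path) -> Path:
    """Copy the PI/PO connector into the extractor package as connector-sap.jar."""
    target = extractor_dir / "connector-sap.jar"  # assumed location of the standard .jar
    if target.exists():
        # Keep the standard .jar as connector-sap.jar.bak (precaution, not a documented step).
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    # Copying under the new name performs the rename and the replacement in one step.
    shutil.copy2(downloaded_jar, target)
    return target
```

A call might look like `install_pipo_connector(Path("connector-sap-soap-pipo.jar"), Path("extractor"))`, where both paths are placeholders for your download location and extractor directory.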