Task Mining data processing and scheduling
Task Mining data processing overview
You can configure how and when your Task Mining data is processed. You can choose to process only new Task Mining data or to reprocess all data when new data is added. If only new Task Mining data is processed, you can optionally schedule when this processing is performed.
Task Mining data is processed in batches of up to 10 million rows for efficiency, but processing is limited by the number of:
Rows the data model can load into one table (typically two billion).
Concurrent users sending data per realm (up to 30,000 users for larger realms).
For more information, see the Workforce Productivity app. For information on tables, see the Task Mining table reference.
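To make these batch and limit figures concrete, here is a minimal Python sketch. The constants mirror the numbers above, while the helper names (`batches_needed`, `within_limits`) are hypothetical; the sketch does not call any Task Mining or Celonis API.

```python
# A minimal sketch of the documented figures above; the constants and helper
# names are hypothetical and do not call any Task Mining or Celonis API.
BATCH_SIZE_ROWS = 10_000_000        # rows processed per batch
MAX_TABLE_ROWS = 2_000_000_000      # typical per-table limit in the data model
MAX_USERS_PER_REALM = 30_000        # concurrent users sending data (larger realms)

def batches_needed(row_count: int) -> int:
    """Number of 10-million-row batches needed to process row_count rows."""
    return (row_count + BATCH_SIZE_ROWS - 1) // BATCH_SIZE_ROWS  # ceiling division

def within_limits(table_rows: int, concurrent_users: int) -> bool:
    """Rough check against the documented table-size and per-realm user limits."""
    return table_rows <= MAX_TABLE_ROWS and concurrent_users <= MAX_USERS_PER_REALM

print(batches_needed(25_000_000))             # a 25-million-row upload -> 3 batches
print(within_limits(1_500_000_000, 12_000))   # -> True
```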
Task Mining data processing options
Note
You access the data processing options from the Task Mining project home page by selecting Run & Schedule and choosing an option from the Run dropdown. Any issues are displayed in the Run History on the Run & Schedule screen.
| Data processing option | Description | Use case |
|---|---|---|
| New Task Mining data only (delta) | Processes new incoming data only. | Data needs to be updated as quickly as possible while keeping the data processing duration as short as possible. |
| Re-process existing Task Mining data (full) | Reprocesses all existing data to apply new or updated rules, labels, business events, or tasks. | Data updates need to be applied consistently across the entire data set so that analyses are always consistent. Because the entire data set is processed, the data processing duration is longer than if new data alone were processed. |
When new Task Mining data only (delta) is processed, the following steps are performed (a conceptual sketch follows the list):

1. The Task Mining Client software sends new data to the `user_interaction_event_log` table.
2. Default and custom Labels are applied to raw events.
3. The resulting events are stored in the `TM_Labeled_Data` table.
4. Tasks are applied and:
   - Tasks containing the list of task names defined by the user are stored in the `Tasks` table.
   - Task instances found for Tasks are stored in the `Task_Instances` table.
   - `Tasks_Join` is an n:n join table between a task instance and `TM_Labeled_Data` events.
5. The Data Model automatically reloads to reflect the newly processed data.
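The following Python sketch is a conceptual model of the delta flow above, not the actual Task Mining implementation. The table names come from the documentation; the in-memory `tables` dict and the `apply_labels` helper are hypothetical stand-ins for the real transformations.

```python
# Conceptual model of the delta flow above -- not Celonis code. The table
# names come from the documentation; the in-memory dict and the apply_labels
# helper are hypothetical stand-ins for the real transformations.
from typing import Dict, List

tables: Dict[str, List[dict]] = {
    "user_interaction_event_log": [],  # raw events from the Task Mining Client
    "TM_Labeled_Data": [],             # labeled events
    "Tasks": [],                       # user-defined task names
    "Task_Instances": [],              # detected task instances
    "Tasks_Join": [],                  # n:n join: task instance <-> labeled events
}

def apply_labels(raw_event: dict) -> dict:
    """Placeholder for applying default and custom Labels to a raw event."""
    return {**raw_event, "label": raw_event.get("label", "unlabeled")}

def process_delta(new_events: List[dict], task_names: List[str]) -> None:
    # 1. New data arrives in user_interaction_event_log.
    tables["user_interaction_event_log"].extend(new_events)

    # 2-3. Labels are applied and the results are stored in TM_Labeled_Data.
    labeled = [apply_labels(e) for e in new_events]
    tables["TM_Labeled_Data"].extend(labeled)

    # 4. Tasks are applied: task names, task instances, and the n:n join table.
    tables["Tasks"] = [{"task_name": name} for name in task_names]
    for i, event in enumerate(labeled):
        if event["label"] in task_names:  # naive task detection for illustration
            tables["Task_Instances"].append({"task_instance_id": i, "task_name": event["label"]})
            tables["Tasks_Join"].append({"task_instance_id": i, "event_id": event.get("event_id")})
    # 5. In the product, the Data Model then reloads automatically.
```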
When existing Task Mining data is re-processed (full), the following steps are performed (a conceptual sketch follows the list):

1. A user triggers re-processing in the Run & Schedule screen.
2. Temporary tables are created in the background. These temporary tables are essentially copies of existing tables and do not contribute to APC consumption.
3. The union of the `user_interaction_event_log` and `user_interaction_event_log_history` tables is queried in batches of 10 million rows until all sessions that were available when re-processing was triggered have been processed.
4. Default and custom Labels are applied to raw events.
5. The resulting events are stored in the `TM_Labeled_Data_reprocessing` table.
6. Tasks and Business Events are applied.
7. Steps 3 to 6 are performed for new data in batches of up to 10 million rows until all raw events have been processed.
8. The Data Model automatically reloads to reflect the newly processed data.
9. If re-processing is:
   - Successful, the temporary reprocessing tables, which contain the re-processed results, are renamed and replace the original tables until re-processing is performed again.
   - Unsuccessful, all temporary tables are deleted and the re-processing job execution status in the Run & Schedule screen is set to Failed.
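The following Python sketch is a conceptual model of the re-processing flow above, not the actual Task Mining implementation. The table names come from the documentation, while the `apply_rules` callable and the in-memory `tables` dict are hypothetical stand-ins; the sketch illustrates the batched union query, the temporary reprocessing table, and the success and failure outcomes.

```python
# Conceptual model of the re-processing flow above -- not Celonis code. The
# table names come from the documentation; apply_rules and the in-memory
# tables dict are hypothetical stand-ins for the real jobs.
from typing import Callable, Dict, List

BATCH_SIZE = 10_000_000  # rows per batch, as documented

def reprocess(tables: Dict[str, List[dict]],
              apply_rules: Callable[[List[dict]], List[dict]]) -> str:
    # 2. A temporary reprocessing table is created as a working copy.
    temp: Dict[str, List[dict]] = {"TM_Labeled_Data_reprocessing": []}

    # 3. Query the union of the current and history event logs in batches.
    union = (tables["user_interaction_event_log"]
             + tables["user_interaction_event_log_history"])
    try:
        for start in range(0, len(union), BATCH_SIZE):
            batch = union[start:start + BATCH_SIZE]
            # 4-6. Labels, Tasks, and Business Events are applied per batch.
            temp["TM_Labeled_Data_reprocessing"].extend(apply_rules(batch))
    except Exception:
        # Unsuccessful: temporary tables are discarded and the run is marked Failed.
        temp.clear()
        return "Failed"

    # Successful: the reprocessing table replaces the original labeled-data table.
    tables["TM_Labeled_Data"] = temp.pop("TM_Labeled_Data_reprocessing")
    return "Successful"
```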
Task Mining processing scheduling options
Tip
You access the processing scheduling options from the Task Mining project home page by selecting Run & Schedule and then selecting Schedule.
| Data processing scheduling option | Description | Use case |
|---|---|---|
| Run when new data is uploaded | Captured Task Mining data is processed every time new data is uploaded to the Task Mining Data Pool by the Task Mining Client software. There may be a delay of up to 20 minutes between the Task Mining Client software status showing as uploaded and the processing run starting. Available only for processing new Task Mining data (delta). | Data must be available in Studio as quickly as possible and there are no resource utilization issues that interfere with other data transformations. |
| Run by schedule | Captured Task Mining data is processed at a specified time and date. Available only for processing new Task Mining data (delta). | For performance reasons, data processing is run when users are not working, for example, overnight. |
| No schedule | Data processing is triggered by a user manually selecting Run in Run & Schedule. Available for both processing new Task Mining data (delta) and re-processing existing Task Mining data (full). | Gives flexibility when there are no specific timing or performance constraints. |
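As an illustration of how the three trigger modes differ, here is a hypothetical Python sketch; the `Schedule` enum and `should_run` function are assumptions made for this example and do not reflect any Task Mining API.

```python
# Hypothetical illustration of the three scheduling options above; the
# Schedule enum and should_run function are assumptions made for this
# example and do not reflect any Task Mining API.
from datetime import datetime, time
from enum import Enum

class Schedule(Enum):
    ON_UPLOAD = "Run when new data is uploaded"   # delta only
    BY_SCHEDULE = "Run by schedule"               # delta only
    NO_SCHEDULE = "No schedule"                   # delta or full, manual Run

def should_run(option: Schedule, new_data_uploaded: bool,
               now: datetime, scheduled_at: time) -> bool:
    """Decide whether a processing run should start under each option."""
    if option is Schedule.ON_UPLOAD:
        return new_data_uploaded              # may still lag the upload by up to ~20 minutes
    if option is Schedule.BY_SCHEDULE:
        return now.time() >= scheduled_at     # e.g. overnight, outside working hours
    return False                              # NO_SCHEDULE: the user selects Run manually

# Example: an overnight schedule at 02:00 triggers a run at 03:00.
print(should_run(Schedule.BY_SCHEDULE, False, datetime(2024, 1, 1, 3, 0), time(2, 0)))  # True
```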