Configure captured applications
To define which applications and data Task Mining captures, the Celonis Task Mining Client provides a rule system that specifies which events (e.g. left click) and which data should be captured.
This page explains the rule-based processing and filtering of events and how these rules are evaluated. Celonis calls these rules "Event Processing Rules".
The structure of a rule
A processing rule consists of three logical parts, namely:
The events that trigger the rule.
An (optional) condition to filter for specific events based on their metadata (e.g. the related application or URL).
An action that defines what to do with the event if it matches the rule.
The event part specifies which types of event (e.g. Left click, Text input, etc.) trigger the rule. This can be a single event type or multiple event types. For convenience, it is also possible to specify that the rule should trigger on all event types.
The condition part can be used to filter the matching events on their attached metadata. Only events that match the condition are selected for processing; all other events are passed over by this rule and evaluated against the following rules.
If an event matches the rule as defined by the event and the condition part, it is processed by this rule and captured. The action part defines exactly which data to capture for the event, e.g., which metadata fields should be stored for the event, which metadata fields should be hashed before storing, and whether a screenshot should be captured.
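To make the action part more concrete, the following Python sketch shows one way the "store, hash, screenshot" decisions could be applied to an event. It is an illustration only: the function name apply_action, the screenshot_requested key, and the use of SHA-256 are assumptions made for this example and do not describe how the Task Mining Client hashes or stores data.

```python
import hashlib


def apply_action(event: dict, hash_fields=(), screenshot=False) -> dict:
    """Return the data that would be captured for an event (illustrative only)."""
    captured = {}
    for field, value in event.items():
        if field in hash_fields and value is not None:
            # Assumption: SHA-256 stands in for whatever hashing the client uses.
            captured[field] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        else:
            captured[field] = value
    # Hypothetical flag recording whether the rule asked for a screenshot.
    captured["screenshot_requested"] = screenshot
    return captured


event = {"ProcessName": "EXCEL", "EnteredText": "secret", "ClipboardText": None}
captured = apply_action(event, hash_fields=("EnteredText", "ClipboardText"))
```

In this sketch the ProcessName value is stored as is, EnteredText is replaced by its hash, and the empty ClipboardText stays empty.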
Processing order of rules
It is possible to define an arbitrary number of rules. Every time the Task Mining Client detects an event, the event is evaluated against these rules to find a matching rule and is processed accordingly. This evaluation works similarly to a case-when statement, a well-known concept in many programming languages. Starting with the first rule, the event is checked against each rule in order until it matches one. This matching rule is then selected for processing and captures the event as specified in the rule. Please note that a matching rule consumes the event while processing it. This means that the rule check stops for this event, even though there might be further rules that would also match it. Consequently, the order of rules is important, as only the first rule that matches the event is considered for processing. Therefore, ambiguous rules should be avoided or properly ordered, i.e. specific rules should come first while more general rules should be specified further down the order.
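The first-match behavior described above can be sketched in a few lines of Python. This is a simplified model, not the client's implementation: the Rule class, the first_match function, and the representation of events as dictionaries are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    name: str
    event_types: Optional[frozenset]    # None stands for "all event types"
    condition: Callable[[dict], bool]   # predicate over the event's metadata

    def matches(self, event: dict) -> bool:
        type_ok = self.event_types is None or event["type"] in self.event_types
        return type_ok and self.condition(event)


def first_match(rules: Sequence[Rule], event: dict) -> Optional[Rule]:
    """Return the first matching rule; later rules are never consulted."""
    for rule in rules:
        if rule.matches(event):
            return rule  # the rule consumes the event, so evaluation stops here
    return None  # no rule matched


rules = [
    Rule("specific", frozenset({"Left click"}), lambda e: e["ProcessName"] == "chrome"),
    Rule("general", None, lambda e: e["ProcessName"] == "chrome"),
]
click = {"type": "Left click", "ProcessName": "chrome"}
typed = {"type": "Text input", "ProcessName": "chrome"}
```

Here first_match(rules, click) selects the rule named "specific" although "general" would match too, while first_match(rules, typed) falls through to "general".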
Default rule
An event may not match any of the defined rules. By default, these events are not captured. However, there might be use cases where these events should be captured as well but maybe only with minimal data and without screenshots. To achieve this, it is possible to define a default rule to specify which data should be captured. As the default rule is applied to all events that did not match any rule so far, it does not have any event or condition part.
Often, the default rule is set up to just discard the event by using "Default skip".
Example
The following example demonstrates how the events are processed by the rules. For this, we define four rules, each capturing a specific subset of events.
Example rules
RULE 'Word'
DESCRIPTION 'Capture Word with all data'
ON ALL EVENTS
IF ProcessName = 'WINWORD'
THEN LOG ALL

RULE 'Excel'
DESCRIPTION 'Capture excel with hashed input/clipboard text'
ON ALL EVENTS
IF ProcessName = 'EXCEL'
THEN LOG ALL HASH EnteredText, ClipboardText

RULE 'Chrome mouse events'
DESCRIPTION 'Capture mouse events with screenshot for Chrome'
ON 'Left click', 'Mouse wheel', 'Mouse wheel (up)', 'Mouse wheel (down)', 'Right click'
IF ProcessName = 'chrome'
THEN LOG ALL TAKE SCREENSHOT ACTIVE_WINDOW

RULE 'Other Chrome events'
DESCRIPTION 'Capture all other events for chrome without screenshot'
ON ALL EVENTS
IF ProcessName = 'chrome'
THEN LOG ALL

DEFAULT SKIP
The first rule is triggered by all event types but only matches events coming from Word. Matching events will be captured with all provided data, but without screenshots.
The second rule is triggered by all event types, too. However, it only accepts events coming from Excel. Similar to the first rule, it captures all provided data but hashes the values for the EnteredText and ClipboardText data fields.
The third rule is only triggered by mouse events ('Left click', 'Mouse wheel', 'Mouse wheel (up)', 'Mouse wheel (down)', 'Right click'); other event types, such as keyboard inputs, bypass this rule. Additionally, the rule only accepts events coming from Chrome. Matching events will be recorded with all data and a screenshot of the active window.
The fourth rule is triggered by all event types and only accepts events coming from Chrome. Matching events are captured with all data and without screenshots.
Any event that does not match one of these rules is skipped by the default rule (DEFAULT SKIP).
The defined rules are mutually exclusive, i.e. they only match distinct subsets of events. The only exception is the pair of Rule 3 and Rule 4: Rule 3 matches a subset of the events matched by Rule 4. The following image outlines how events are evaluated against these rules. For simplification, events are only shown with their event type and application.
The first event (Entered text | WINWORD) represents a text input in Word. The evaluation starts with Rule 1, which matches the event: the rule triggers for all event types, and its condition matches the data fields of the event. Therefore, Rule 1 processes the event and captures it with all data fields but without screenshots. As the processing rule consumes the event, none of the following rules are evaluated.
For the second event (Left click | EXCEL), the evaluation starts with Rule 1 again. Even though the rule triggers for all event types, it does not match the event, as the application (i.e., field ProcessName) is different. The evaluation continues with Rule 2, which triggers for all event types and also matches the application. Therefore, Rule 2 captures the event with all data fields and without screenshots. The values for the EnteredText and ClipboardText data fields are hashed before capturing. The evaluation then ends without considering the following rules.
The third event (Left click | chrome) is first evaluated against Rule 1 and then Rule 2. As neither matches due to the different applications, the evaluation continues with Rule 3, which is triggered by the "Left click" event type. The application matches as well, so Rule 3 processes the event and captures all data fields. Additionally, it takes a screenshot of the active window. Theoretically, Rule 4 would match the event as well. However, the event is consumed by the previous rule. Consequently, Rule 4 is not evaluated and the event is only captured once.
The fourth event (Entered text | chrome) is evaluated against Rule 1 and then Rule 2, but does not match them due to the different applications. For Rule 3, the application would match; however, the event does not trigger the rule as it is not a mouse event. The evaluation then continues with Rule 4, which triggers for all event types and also matches the application. Therefore, the event is captured with all data fields but without screenshots.
The fifth event (Entered text | slack) is evaluated against all rules in the order of their definition. Even though some rules (namely Rules 1, 2, and 4) trigger for the event type, no rule matches the application of the event. As the default rule is set to DEFAULT SKIP, the event is finally skipped and not captured at all.
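The walkthrough of the five events can be reproduced with a small Python simulation. It models each example rule only by its name, its triggering event types, the required ProcessName, and its action text. The tuple layout and the process function are assumptions made for this illustration, not part of the Task Mining Client.

```python
MOUSE = frozenset({
    "Left click", "Mouse wheel", "Mouse wheel (up)", "Mouse wheel (down)", "Right click",
})

# (rule name, triggering event types or None for ALL EVENTS, ProcessName, action)
RULES = [
    ("Word", None, "WINWORD", "LOG ALL"),
    ("Excel", None, "EXCEL", "LOG ALL HASH EnteredText, ClipboardText"),
    ("Chrome mouse events", MOUSE, "chrome", "LOG ALL TAKE SCREENSHOT ACTIVE_WINDOW"),
    ("Other Chrome events", None, "chrome", "LOG ALL"),
]
DEFAULT = ("(default)", "SKIP")


def process(event_type, process_name):
    """Return (rule name, action) of the first matching rule, else the default."""
    for name, types, proc, action in RULES:
        if (types is None or event_type in types) and process_name == proc:
            return name, action
    return DEFAULT


EVENTS = [
    ("Entered text", "WINWORD"),
    ("Left click", "EXCEL"),
    ("Left click", "chrome"),
    ("Entered text", "chrome"),
    ("Entered text", "slack"),
]
for event_type, process_name in EVENTS:
    rule, action = process(event_type, process_name)
    print(f"{event_type} | {process_name} -> {rule}: {action}")
```

In this model the five events resolve, in order, to the rules Word, Excel, Chrome mouse events, Other Chrome events, and finally the default skip, which mirrors the evaluation described above.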
Pitfalls and troubleshooting
Wrong order of rules
If all rules are mutually exclusive, the order of rules is not important because every event would only match one particular rule. However, if rules overlap, the order of the rules is crucial. Consider Rules 3 and 4 from the example above: both match events coming from Chrome. While Rule 3 only triggers for mouse events, Rule 4 triggers for any event.
Assume we change the order of the rules, i.e. move Rule 4 before Rule 3. The fourth event (Entered text | chrome) would still be processed by Rule 4. The third event (Left click | chrome), however, would now be processed by Rule 4 as well and captured without a screenshot, which is not intended. In fact, Rule 3 would never process any event, because the events it matches are a proper subset of the events matching Rule 4, which would consume all of them.
To avoid such issues, it is recommended to define the rules in a way that they only match mutually exclusive subsets of the events. If it is not possible to avoid overlaps or if it is more convenient to define overlapping rules, the more specific rules should always be defined before the more generic rules.
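The ordering problem described above can be checked mechanically in a simplified model. The Python sketch below flags rules that can never fire because an earlier rule already covers all of their events. It assumes that every condition is a single ProcessName equality, as in the example rules. The helper names covers and shadowed_rules are hypothetical, and real conditions may need a more careful comparison.

```python
MOUSE = frozenset({
    "Left click", "Mouse wheel", "Mouse wheel (up)", "Mouse wheel (down)", "Right click",
})

# (rule name, triggering event types or None for ALL EVENTS, required ProcessName)
GOOD_ORDER = [
    ("Word", None, "WINWORD"),
    ("Excel", None, "EXCEL"),
    ("Chrome mouse events", MOUSE, "chrome"),
    ("Other Chrome events", None, "chrome"),
]
# Same rules with Rule 4 moved before Rule 3.
BAD_ORDER = GOOD_ORDER[:2] + [GOOD_ORDER[3], GOOD_ORDER[2]]


def covers(earlier, later):
    """True if every event matching `later` also matches `earlier`."""
    _, e_types, e_proc = earlier
    _, l_types, l_proc = later
    if e_proc != l_proc:
        return False
    if e_types is None:          # earlier rule triggers on all event types
        return True
    return l_types is not None and l_types <= e_types


def shadowed_rules(rules):
    """Names of rules that an earlier rule makes unreachable."""
    return [
        later[0]
        for i, later in enumerate(rules)
        if any(covers(earlier, later) for earlier in rules[:i])
    ]
```

With the original order, shadowed_rules(GOOD_ORDER) reports nothing. With the swapped order, shadowed_rules(BAD_ORDER) reports "Chrome mouse events", the rule that would never process an event.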
Consider NULL values
When defining rules, you should always be aware of NULL values. While a few data fields are guaranteed to have a value for all events (e.g. ProcessName), many data fields are only available for specific event types (e.g. ClipboardText is only available for clipboard-related events). Furthermore, fields might not have a value for technical reasons. Missing values of the data fields are represented as NULL.
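As a generic illustration of why missing values deserve attention, the following Python sketch models NULL as None and makes the handling explicit in each check. It does not reproduce how the Task Mining Client evaluates NULL in a condition. The helpers field_equals and field_contains, and the choice to treat a missing value as "no match", are assumptions made for this example.

```python
def field_equals(event: dict, field: str, expected: str) -> bool:
    """Equality check that treats a missing (None) field as 'no match'."""
    value = event.get(field)
    return value is not None and value == expected


def field_contains(event: dict, field: str, needle: str) -> bool:
    """Substring check that treats a missing (None) field as 'no match'."""
    value = event.get(field)
    return value is not None and needle in value


# ProcessName is always present; ClipboardText only for clipboard-related events.
click = {"ProcessName": "chrome", "ClipboardText": None}
paste = {"ProcessName": "chrome", "ClipboardText": "invoice 4711"}
```

Without the explicit None check, the substring test on the click event would raise a TypeError in Python. The same question of what a comparison against a missing value should yield applies to any condition that refers to a field which is not guaranteed to be present.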
Avoid mixing allow-listing and deny-listing
When defining rules, you should keep them consistent and only use either allow-listing (recommended) or deny-listing. Mixing both approaches often leads to unintended results, e.g., capturing undesired events or missing out on events that should have been captured.