MATCH_PROCESS

Description

MATCH_PROCESS matches the variants of a process against a given pattern.

Similar functionality is provided by MATCH_PROCESS_REGEX. MATCH_PROCESS uses Nodes and Edges to match the cases. Nodes consist either of a single activity or a list of activities. Edges describe how the nodes are linked together.

Syntax

 MATCH_PROCESS ( [ activity_table.string_column ,] node (, node)* CONNECTED BY edge (, edge)* )

activity_table.string_column: A string column of an activity table. Usually, the activity column of an activity table is used.
node: NODE | OPTIONAL | LOOP | OPTIONAL_LOOP | STARTING | ENDING [ single_activity (, single_activity )* ] AS node_name
- single_activity: [LIKE] activity (Activity name. LIKE allows you to use wildcards in your activity name. LIKE reacts case sensitive. )
edge: DIRECT | EVENTUALLY [ edge_start_node, edge_end_node ]
- edge_start_node: node_name
- edge_end_node: node_name

Node

A node consists of one or more activities. If multiple activities are given, it means one of those activities.

Node Types

NODE: Node which has to be part once in the case, without any restrictions on where the node has to be.
STARTING: Node which has to happen at the beginning of a case.
ENDING: Node which has to happen at the end of a case.
LOOP: Node which occurs at least once but can also be repeated.

Edge Types

DIRECT: edge_end_node has to follow directly after the edge_start_node
EVENTUALLY: between edge_start_node and edge_end_node other activities can be placed

Result

MATCH_PROCESS returns an INT column, which flags all matching cases with 1 and all non matching cases with 0. The resulting column is temporarily added to the case table and is often used in combination with a filter.

Tips

Instead of specifying the activity column, it is also possible to use another string column of the activity table. For example, you can match cases with a specified sequence of user types.

Examples

[1]

Here MATCH_PROCESS flags all cases in which one activity A is followed directly by activity B with a 1.

Query

Column1

         "Activities_CASES"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ 'A' ] as src , NODE [ 'B' ] as tgt CONNECTED BY DIRECT [ src , tgt ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'B'	Tue Jan 01 2019 13:01:00.000
'1'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'A'	Tue Jan 01 2019 13:00:00.000
'2'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'B'	Tue Jan 01 2019 13:03:00.000

Activities_CASES

CASE_ID : string
'1'
'2'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1
'2'	0

[2]

Here is MATCH_PROCESS combined with a filter. The result are only cases in which one activity A is followed by activity B.

Query

Filter

         FILTER MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ 'A' ] as src , NODE [ 'B' ] as tgt CONNECTED BY DIRECT [ src , tgt ] ) = 1;

Column1

         "Activities"."CASE_ID"

Column2

         "Activities"."ACTIVITY"

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'B'	Tue Jan 01 2019 13:01:00.000
'1'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'A'	Tue Jan 01 2019 13:00:00.000
'2'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'B'	Tue Jan 01 2019 13:03:00.000

Activities_CASES

CASE_ID : string
'1'
'2'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : string
'1'	'A'
'1'	'B'
'1'	'C'

[3]

If an activity has not only to be directly followed by another activity but can come any time later the keyword EVENTUALLY can be used. In this example MATCH_PROCESS flags all cases in which one activity A is followed eventually by activity B with a 1.

Query

Column1

         "Activities_CASES"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ 'A' ] as src , NODE [ 'B' ] as tgt CONNECTED BY EVENTUALLY [ src , tgt ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'B'	Tue Jan 01 2019 13:01:00.000
'1'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'A'	Tue Jan 01 2019 13:00:00.000
'2'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'B'	Tue Jan 01 2019 13:03:00.000

Activities_CASES

CASE_ID : string
'1'
'2'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1
'2'	1

[4]

Here is an example in which node 'node_ab' has two activities. This means that a matching case needs an activity C which comes either after A or B.

Query

Column1

         "Activities_CASES"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ 'A' , 'B' ] as node_ab , NODE [ 'C' ] as node_c CONNECTED BY DIRECT [ node_ab , node_c ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'B'	Tue Jan 01 2019 13:01:00.000
'1'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'A'	Tue Jan 01 2019 13:00:00.000
'2'	'C'	Tue Jan 01 2019 13:02:00.000
'2'	'B'	Tue Jan 01 2019 13:03:00.000

Activities_CASES

CASE_ID : string
'1'
'2'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1
'2'	1

[5]

Nodes can represent loops. Here matching cases can have between activity A and C at least one or more activities of type B.

Query

Column1

         "Activities_CASES"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ 'A' ] AS node_a , LOOP [ 'B' ] AS loop_b , NODE [ 'C' ] AS node_c CONNECTED BY DIRECT [ node_a , loop_b ] , DIRECT [ loop_b , node_c ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'B'	Tue Jan 01 2019 13:01:00.000
'1'	'B'	Tue Jan 01 2019 13:02:00.000
'1'	'C'	Tue Jan 01 2019 13:04:00.000
'2'	'A'	Tue Jan 01 2019 13:00:00.000
'2'	'B'	Tue Jan 01 2019 13:01:00.000
'2'	'C'	Tue Jan 01 2019 13:02:00.000
'3'	'A'	Tue Jan 01 2019 13:00:00.000
'3'	'C'	Tue Jan 01 2019 13:02:00.000

Activities_CASES

CASE_ID : string
'1'
'2'
'3'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1
'2'	1
'3'	0

[6]

A loop node can also consist of multiple activities. PROCESS_MATCH accepts than all given activities, without regarding order or number of occurrences till another activity is found.

Query

Column1

         "Activities_CASES"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ 'A' ] AS node_a , LOOP [ 'B' , 'C' ] AS loop_bc , NODE [ 'D' ] AS node_d CONNECTED BY DIRECT [ node_a , loop_bc ] , DIRECT [ loop_bc , node_d ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'C'	Tue Jan 01 2019 13:01:00.000
'1'	'B'	Tue Jan 01 2019 13:02:00.000
'1'	'B'	Tue Jan 01 2019 13:03:00.000
'1'	'D'	Tue Jan 01 2019 13:05:00.000

Activities_CASES

CASE_ID : string
'1'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1

[7]

Nodes can be forced to be at the start or the end of a case.

Query

Column1

         "Activities_CASES"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."ACTIVITY" , STARTING [ 'A' ] AS node_a , ENDING [ 'B' ] AS node_b CONNECTED BY DIRECT [ node_a , node_b ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date
'1'	'A'	Tue Jan 01 2019 13:00:00.000
'1'	'B'	Tue Jan 01 2019 13:01:00.000
'2'	'A'	Tue Jan 01 2019 13:02:00.000
'2'	'B'	Tue Jan 01 2019 13:03:00.000
'2'	'C'	Tue Jan 01 2019 13:05:00.000
'3'	'A'	Tue Jan 01 2019 13:06:00.000
'3'	'A'	Tue Jan 01 2019 13:07:00.000
'3'	'B'	Tue Jan 01 2019 13:08:00.000

Activities_CASES

CASE_ID : string
'1'
'2'
'3'

Foreign Keys

Activities.CASE_ID

Activities_CASES.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1
'2'	0
'3'	0

[8]

It is not required to match cases based on the activity column. In this example, all cases where user type A is directly followed by user type B are flagged with a 1.

Query

Column1

         "Cases"."CASE_ID"

Column2

         MATCH_PROCESS ( "Activities"."USERTYPE" , NODE [ 'A' ] as src , NODE [ 'B' ] as tgt CONNECTED BY DIRECT [ src , tgt ] )

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	TIMESTAMP : date	USERTYPE : string
'1'	'X'	Tue Jan 01 2019 13:00:00.000	'A'
'1'	'Y'	Tue Jan 01 2019 13:01:00.000	'B'
'1'	'Z'	Tue Jan 01 2019 13:02:00.000	'A'
'2'	'X'	Tue Jan 01 2019 13:00:00.000	'B'
'2'	'Z'	Tue Jan 01 2019 13:02:00.000	'A'
'2'	'Y'	Tue Jan 01 2019 13:03:00.000	'A'

Cases

CASE_ID : string
'1'
'2'

Foreign Keys

Activities.CASE_ID

Cases.CASE_ID

Result

Column1 : string	Column2 : int
'1'	1
'2'	0

[9]

Query

Filter

         FILTER MATCH_PROCESS ( "Activities"."ACTIVITY" , NODE [ LIKE 'A%' ] as src , NODE [ 'C' ] as tgt CONNECTED BY DIRECT [ src , tgt ] ) = 1;

Column1

         "Activities"."CASE_ID"

Column2

         "Activities"."ACTIVITY"

Input

Output

Activities

CASE_ID : string	ACTIVITY : string	Column3 : date	Column4 : date
'1'	'AB'	Tue Jan 01 2019 13:00:00.000	Tue Jan 01 2019 13:02:00.000
'1'	'B'	Tue Jan 01 2019 13:02:00.000	Tue Jan 01 2019 13:08:00.000
'1'	'C'	Tue Jan 01 2019 13:03:00.000	Tue Jan 01 2019 13:04:00.000
'1'	'D'	Tue Jan 01 2019 13:06:00.000	Tue Jan 01 2019 13:07:00.000

Result

Column1 : string	Column2 : string
'1'	'AB'
'1'	'B'
'1'	'C'
'1'	'D'

MATCH_PROCESS

Description

Syntax

Node

Node Types

Edge Types

Result

Tips

Examples

See also:

Search results