Model miner types

Two different miner types are available when mining your model: strict and balanced. Both miner types analyze the process graph that is created from an event log and try to match the patterns of events within the process graph. They use different approaches where patterns of events can’t be matched.

You should normally use the balanced miner type, as it typically creates models with a good trade-off between fitness, precision, and complexity. However, if the number of variants in your model is very low or runtimes are very long, we recommend trying the strict miner type instead. In most scenarios, there is no performance difference between the balanced and strict miner types.

Quantifying the quality of a model

You can evaluate the quality of a process model by checking whether it is a good representation of the data and whether it unintentionally allows undesirable behavior.

Visualizing the two ‘quality dimensions’ of fitness and precision lets you assess the behavior seen in the event log, the behavior allowed by the model, and the overlap between the two. Always consider both fitness and precision when evaluating model quality: a model can have high fitness but very low precision, which makes it so generic as to be useless.

Explaining model quality dimensions

Dimension	Description
Fitness	Describes how much of the behavior in the event log is allowed by the model. Models with high fitness represent almost all the behavior observed in the event log.
Precision	Describes how much additional behavior not present in the event log is allowed by the model. Models with high precision do not allow significant extra behavior beyond the behavior observed in the event log.
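
As a rough illustration of the two dimensions, the sketch below treats both the event log and the model as finite sets of traces. This is a deliberately simplified, set-based reading. The function names `fitness` and `precision` and the `generic_model` data are hypothetical, and this is not necessarily how the miner calculates its own values.

```python
def fitness(log_traces, model_traces):
    """Share of distinct log traces that the model allows (simplified)."""
    log_set = set(log_traces)
    return len(log_set & set(model_traces)) / len(log_set)


def precision(log_traces, model_traces):
    """Share of traces allowed by the model that were seen in the log (simplified)."""
    model_set = set(model_traces)
    return len(model_set & set(log_traces)) / len(model_set)


# Hypothetical example data: two observed traces and a model that allows four.
log = [("A", "B", "C"), ("A", "C")]
generic_model = [("A", "B", "C"), ("A", "C"), ("C", "B", "A"), ("B", "A")]

print(fitness(log, generic_model))    # 1.0: every observed trace is allowed
print(precision(log, generic_model))  # 0.5: half of the allowed traces were never observed
```

Both values fall between 0 and 1. A model that allows more unobserved traces keeps its fitness but loses precision.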

Complexity comparison between miner types

The complexity of models created by the different miner types is shown below. Using the same data, the first example shows a model that was mined using the strict miner type, the second using the balanced miner type. The balanced miner type process graph is significantly less complex.

Model examples

Both fitness and precision are typically quantified by values between 0 (low precision/fitness) and 1 (high precision/fitness). The examples below use the information given in this table:

case_id	Activity
1	A
1	B
1	C
2	A
2	C
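
The table can be read as one trace per case. The following minimal sketch groups the rows by `case_id`, assuming that the row order reflects the order of events (the table has no timestamp column). The variable names are illustrative only.

```python
from collections import defaultdict

# Rows of the example table as (case_id, activity), in their listed order.
rows = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "C")]

# Collect the activities of each case in order of appearance.
grouped = defaultdict(list)
for case_id, activity in rows:
    grouped[case_id].append(activity)

# Freeze each activity list into a tuple so traces can be compared and hashed.
traces = {case_id: tuple(activities) for case_id, activities in grouped.items()}

print(traces)  # {1: ('A', 'B', 'C'), 2: ('A', 'C')}
```

The event log therefore contains two traces: <A, B, C> for case 1 and <A, C> for case 2.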

High fitness and high precision model

This model has high fitness since it allows all the behavior seen in the event log, and high precision since it does not allow any extra behavior. Perfect fitness and perfect precision mean that the behavior expressed in the event log is exactly the same as the behavior allowed by the model.

Image showing a model that is high fitness and high precision.

Lower fitness and high precision model

This model does not allow the trace for case_id == 2, so it has lower fitness than the high fitness and high precision model. It still has high precision because it does not allow any additional behavior.

Image showing a model that is of lower fitness and high precision.
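
To make the comparison concrete, the sketch below encodes each model as the set of traces it allows and checks the two cases against it. The set encoding, the case-level fitness measure, and the function names are assumptions made for illustration, not the product's calculation.

```python
# Traces of the example event log, keyed by case_id.
log = {1: ("A", "B", "C"), 2: ("A", "C")}

# Each model is written as the set of traces it allows (an assumed encoding).
high_fitness_model = {("A", "B", "C"), ("A", "C")}
lower_fitness_model = {("A", "B", "C")}  # does not allow the trace of case 2


def case_fitness(cases, model):
    """Fraction of cases whose trace is allowed by the model."""
    return sum(trace in model for trace in cases.values()) / len(cases)


def rejected_cases(cases, model):
    """Case IDs whose trace the model does not allow."""
    return [case_id for case_id, trace in cases.items() if trace not in model]


print(case_fitness(log, high_fitness_model))     # 1.0
print(case_fitness(log, lower_fitness_model))    # 0.5
print(rejected_cases(log, lower_fitness_model))  # [2]

# Neither model allows a trace that is missing from the log, so precision stays high.
print(lower_fitness_model - set(log.values()))   # set()
```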

High fitness and very low precision model

This flower model allows the three activities A, B and C to occur in any combination. It has high fitness because it allows all the behavior in the event log, but very low precision because it also allows many other traces, such as <C, B, A, A, B, C>.

Image showing a model that is high fitness and very low precision.
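
The following sketch shows why a flower model scores so poorly on precision. It assumes the flower model accepts any sequence built from A, B and C, and it limits the count to traces of length 1 to 3 purely to keep the number finite. The function name `flower_allows` is hypothetical.

```python
from itertools import product

ACTIVITIES = ("A", "B", "C")


def flower_allows(trace):
    """A flower model accepts any sequence made only of its activities."""
    return all(activity in ACTIVITIES for activity in trace)


log = {("A", "B", "C"), ("A", "C")}
extra = ("C", "B", "A", "A", "B", "C")

print(all(flower_allows(trace) for trace in log))  # True: high fitness
print(flower_allows(extra))                        # True: unobserved behavior is allowed

# Enumerate every trace of length 1 to 3 that the flower model allows.
allowed = [t for n in range(1, 4) for t in product(ACTIVITIES, repeat=n)]
print(len(allowed))                     # 39
print(sum(t in log for t in allowed))   # 2
```

Even with this short length limit, only 2 of the 39 allowed traces appear in the event log, and the gap grows with every additional step.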