Clustering Techniques for Reducing Noise in Business Process Mining

Cases under study

Id  Name                    Description                                                                                                         Size (KLOC)
S1  University Enrollments  Supports university students’ self-enrollment at various Spanish universities                                       26.7
S2  Villasante Lab          Manages a laboratory in the water and waste industry                                                                28.8
S3  Magic Table             Creates, manages and simulates decision tables for associating conditions with domain-specific actions              33.3
S4  Diabetes Care           Android application for diabetes patients that analyzes blood (through an external device) and suggests diet plans  9.9

Ontologies used in the Semantic Clustering Algorithm

Name            Ontology
Academia        go
Chemical        go
Decision Table  go
Diabetes Care   go

Summary of the Data Collected in the Multi-Case Study

Op/Alg          Time (ms)  Size   Density  Connectivity  Separability  Precision  Recall  F-measure
RF1             21.493     1.000  0.369    0.208         0.970         0.556      0.796   0.647
RF2             38.216     0.804  0.323    0.026         1.000
RF3             29.002     0.598  0.339    0.026         0.651
RF4             3.699      0.480  1.000    0.000         0.426
RF5             2.624      0.415  0.644    0.033         0.288
RF6             13.881     0.237  0.626    0.129         0.000
RF7             0.984      0.237  0.626    0.129         0.000
RF8             24.896     0.273  0.222    0.226         0.073
RF9             21.625     0.492  0.000    0.436         0.080
CL1-Structural  78.148     0.000  0.693    0.141         0.080         0.898      0.532   0.642
CL2-Syntactic   264.945    0.411  0.098    1.000         0.074         0.818      0.599   0.649
CL3-Semantic    8108.374   0.452  0.081    0.793         0.076         0.804      0.630   0.660
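
As a reading aid for the last three columns, the following minimal sketch recomputes the F-measure as the harmonic mean of the tabulated precision and recall. It assumes the standard F1 definition, which this page does not state, and the helper name `f_measure` is hypothetical. The recomputed values come out somewhat higher than the reported ones, which would be consistent with the reported F-measure being averaged per business process rather than derived from the averaged precision and recall; that explanation is a guess, not something the page confirms.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), copied from the rows of the
# summary table that carry these three values.
reported = {
    "RF1": (0.556, 0.796, 0.647),
    "CL1-Structural": (0.898, 0.532, 0.642),
    "CL2-Syntactic": (0.818, 0.599, 0.649),
    "CL3-Semantic": (0.804, 0.630, 0.660),
}

for name, (p, r, f_reported) in reported.items():
    # Compare F1 of the tabulated averages with the reported figure.
    print(f"{name}: F1 of averages = {f_measure(p, r):.3f}, reported = {f_reported:.3f}")
```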

Analysis Charts
Charts: tasks, sequence flows, size, density, separability, connectivity, clustering time, precision/recall

Sample of initial business processes
Study Id  BP Id  Source BP  After CL1  After CL2  After CL3
S1        1      go         go         go         go
S1        2      -          -          -          -
S1        3      go         go         go         go
S1        4      go         go         go         go
S1        5      -          -          -          -
S1        6      go         go         go         go
S1        7      go         go         go         go
S1        8      -          -          -          -
S1        9      go         go         go         go
S2        10     go         go         go         go
S2        11     go         go         go         go
S2        12     -          -          -          -
S2        13     go         go         go         go
S2        14     -          -          -          -
S2        15     -          -          -          -
S2        16     -          -          -          -
S2        17     -          -          -          -
S2        18     -          -          -          -
S2        19     -          -          -          -
S2        20     go         go         go         go
S2        21     -          -          -          -
S2        22     go         go         go         go
S3        23     -          -          -          -
S3        24     -          -          -          -
S3        25     -          -          -          -
S3        26     -          -          -          -
S3        27     go         go         go         go
S3        28     go         go         go         go
S3        29     -          -          -          -
S3        30     go         go         go         go
S3        31     go         go         go         go
S3        32     go         go         go         go
S3        33     -          -          -          -
S3        34     -          -          -          -
S3        35     go         go         go         go

Ontology-based similarity Charts

Master Data (EXCEL FILE)

Charts: simsize, simdensity, simconnectivity, simseparability, ClusteredTasks Similarity