In this thesis, we explore the application of various discretization methods combined with the Apriori and FP-growth algorithms for mining association rules on artificial and real datasets. Special emphasis is placed on examining how discretizing numerical attributes affects the quality and interpretability of the generated rules, as assessed by different interestingness measures such as chi-square, lift, recall, and accuracy. The analysis shows that the Relative Unsupervised Discretization (RUDE) method was effective in identifying rules involving discretized attributes, while the other methods primarily uncovered well-known, general rules composed of categorical attributes. Additionally, the combination of discretization method and interestingness measure significantly influenced the evaluation of rules, even those involving purely categorical attributes. This work contributes to understanding the impact of discretization on rule mining and opens up possibilities for future research in this area.
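
To make the interestingness measures named above concrete, the following minimal sketch computes lift, recall, accuracy, and the chi-square statistic for a single rule A -> B from raw transaction counts. The function name `rule_measures` and its parameters are hypothetical, and the formulas are the commonly used definitions based on a 2x2 contingency table; the exact definitions used in the thesis may differ.

```python
def rule_measures(n_ab, n_a, n_b, n):
    """Common interestingness measures for a rule A -> B.

    n_ab: transactions containing both A and B
    n_a:  transactions containing A
    n_b:  transactions containing B
    n:    total number of transactions
    Assumes 0 < n_a < n and 0 < n_b < n (no empty marginals).
    """
    # Remaining cells of the 2x2 contingency table.
    n_a_notb = n_a - n_ab
    n_nota_b = n_b - n_ab
    n_nota_notb = n - n_a - n_b + n_ab

    # Lift: observed co-occurrence relative to independence.
    lift = (n_ab * n) / (n_a * n_b)
    # Recall: share of B-transactions that the rule covers.
    recall = n_ab / n_b
    # Accuracy: share of transactions where A and B agree.
    accuracy = (n_ab + n_nota_notb) / n

    # Chi-square: compare observed cells with counts expected
    # if A and B were independent.
    observed = [n_ab, n_a_notb, n_nota_b, n_nota_notb]
    expected = [
        n_a * n_b / n,
        n_a * (n - n_b) / n,
        (n - n_a) * n_b / n,
        (n - n_a) * (n - n_b) / n,
    ]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    return {
        "lift": lift,
        "recall": recall,
        "accuracy": accuracy,
        "chi_square": chi_square,
    }


# Illustrative counts only, not data from the thesis.
measures = rule_measures(n_ab=30, n_a=40, n_b=50, n=100)
```

Because each measure weighs the four contingency cells differently, the same rule can rank high under one measure and low under another, which is consistent with the observation that the choice of measure influences how rules are evaluated.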