Ensemble methods combine a number of weak classifiers, aggregating their predictions into a more accurate model. The gain is not only an increase in classification accuracy but also a decrease in bias and variance. The aim of our work is to build ensembles of probabilistic classification rules. Probabilistic rules differ from common classification rules by outputting a class probability estimate instead of a crisp class decision. We build an ensemble model named Random Gaussian Rule Set (RGRS), inspired by the Random Forest method. Our model draws a random sample of examples and creates a conjunction of elementary Gaussian rules based on the distribution of their attribute values. We also describe an alternative approach in which we form an AdaBoost ensemble using the CN2-SD rule-building algorithm. We evaluate both methods with cross-validation and compare them to existing approaches. While the AdaBoost ensemble only improves on the base algorithm, the RGRS results are comparable to classification models such as C4.5rules, but trail behind Random Forest and the fuzzy rule classifier FURIA.
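To illustrate the core notion of an elementary Gaussian rule, the following is a minimal sketch: each rule fits a Gaussian to one attribute's values in a class sample and outputs a soft membership degree, and a conjunction combines the elementary rules into a single probabilistic score. The class names, the bell-shaped membership function, and the product-based conjunction are assumptions for illustration, not the paper's exact formulation.

```python
import math

class GaussianRule:
    """Elementary Gaussian rule for one attribute: membership is the
    Gaussian bell exp(-(x - mu)^2 / (2 * sigma^2)) fitted to the values
    of that attribute in a sample of examples (illustrative sketch)."""

    def __init__(self, values):
        n = len(values)
        self.mu = sum(values) / n
        var = sum((v - self.mu) ** 2 for v in values) / n
        # guard against zero variance so membership stays well-defined
        self.sigma = math.sqrt(var) or 1e-9

    def membership(self, x):
        return math.exp(-((x - self.mu) ** 2) / (2 * self.sigma ** 2))

def rule_conjunction(rules, example):
    # Conjunction of elementary rules as a product of memberships
    # (a common soft-AND choice; assumed here for illustration).
    score = 1.0
    for rule, x in zip(rules, example):
        score *= rule.membership(x)
    return score

# Fit one elementary rule per attribute on a random sample of examples
# from one class (two attributes, toy data).
sample = [(1.0, 5.0), (1.2, 4.8), (0.8, 5.2)]
rules = [GaussianRule([row[i] for row in sample]) for i in range(2)]

# The score is in (0, 1]: 1.0 at the attribute means, decaying away from them.
score_near = rule_conjunction(rules, (1.0, 5.0))
score_far = rule_conjunction(rules, (2.0, 3.0))
```

A full RGRS-style ensemble would build many such conjunctions on different random samples and aggregate their probabilistic outputs, analogously to how Random Forest aggregates tree votes.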