Recommendation systems are mostly used for commercial purposes. In machine learning, however, we often encounter data with missing values that could be obtained later through additional measurements. Recommendation system techniques can be used here to determine which data is most worthwhile to obtain, a problem known as active feature acquisition.
First, we train a machine learning model on the data. Using this model, we calculate the Shapley values of the attributes and use them as product ratings for the recommendation system. We then recommend to the user the subset of measurements with the highest Shapley values.
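The procedure above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the setup used in this work: the toy data, the least-squares linear model, the helper names `shapley_values`, `predict`, and `recommend`, the choice to replace absent features with the training mean inside the value function, and the choice to average absolute Shapley values over training rows to obtain one rating per attribute.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(predict, x, background):
    """Exact Shapley values for one row x.

    Assumption: features outside the coalition are set to the background mean.
    """
    n = len(x)
    baseline = background.mean(axis=0)

    def value(subset):
        # Model output when only the features in `subset` take their true values.
        z = baseline.copy()
        idx = np.array(subset, dtype=int)
        z[idx] = x[idx]
        return predict(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def recommend(ratings, missing_mask, k):
    """Indices of the k missing attributes with the highest rating."""
    candidates = np.flatnonzero(missing_mask)
    order = candidates[np.argsort(-ratings[candidates], kind="stable")]
    return order[:k].tolist()


# Toy data (hypothetical): the target depends strongly on attribute 0,
# moderately on attribute 2, weakly on attribute 3, and not at all on attribute 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([4.0, 0.0, -1.5, 0.5]) + 1.0

# Stand-in model: ordinary least squares with an intercept.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def predict(z):
    return float(z @ coef[:-1] + coef[-1])


# One rating per attribute: mean absolute Shapley value over some training rows.
ratings = np.mean([np.abs(shapley_values(predict, row, X)) for row in X[:50]], axis=0)

# A new instance is missing attributes 0, 1, and 2; recommend two measurements.
missing = np.array([True, True, True, False])
top = recommend(ratings, missing, k=2)
print(top)
```

The exact enumeration evaluates every coalition of attributes, so it is only practical for a handful of attributes; with more attributes a sampling-based approximation would be needed.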
On artificial data, we found that the procedure performs well when the training set contains no missing values, but even a few missing values degrade the results slightly. The reason likely lies in how we calculate Shapley values for missing data and in how the machine learning model handles them, but it is not yet fully clear. Based on the results on real data, we concluded that the performance of our method depends strongly on the relationship between the attributes and the target variable.