Multi-criteria based Item Recommendation Methods

This paper comprehensively investigates and compares the performance of various multi-criteria based item recommendation methods. The development of the methods consists of three main phases: predicting the rating per criterion; aggregating the rating predictions of all criteria; and generating the top-𝑁 item recommendations. The methods are labelled according to the approach used to predict the rating per criterion, i.e., Collaborative Filtering (CF), Content-based (CB), and Hybrid. For the experiments, we generate two variations of the dataset to represent the normal and cold-start conditions of a multi-criteria item recommendation system. The empirical analysis suggests that Hybrid and CF are best implemented under the normal and cold-start item conditions, respectively. On the other hand, CB should never be implemented on its own in a multi-criteria based item recommendation system under any condition.


INTRODUCTION
Recommender systems are a well-established research topic (G. Adomavicius & Tuzhilin, 2005; Zhang, Zhou, & Zhang, 2011). A multi-criteria recommendation system allows users to specify ratings based on various criteria (Gediminas Adomavicius, Manouselis, & Kwon, 2011; Aggarwal, 2016). For example, a user may rate a tourist attraction on four criteria: attraction, accessibility, amenities, and ancillary. Such a system implements a recommendation method in which the user's preference for an item is represented as a vector of ratings corresponding to the various criteria (Aggarwal, 2016).
Rating prediction algorithms can be categorized into three approaches: Collaborative Filtering (CF), Content-based (CB), and Hybrid (G. Adomavicius & Tuzhilin, 2005). The CF approach is the most popular rating prediction approach; it generates predictions based on the similarity of users or items (Su & Khoshgoftaar, 2009). The CB approach is popular for generating predictions when information about the item's content is available (G. Adomavicius & Tuzhilin, 2005; Lops, Gemmis, & Semeraro, 2011). Meanwhile, the Hybrid approach combines the Collaborative Filtering and Content-based approaches (G. Adomavicius & Tuzhilin, 2005; Burke, 2007). Researchers have proposed various methods based on the three rating prediction approaches (Gediminas Adomavicius et al., 2011; Fuchs & Zanker, 2012; Jannach, Karakaya, & Gedikli, 2012; Lakiotaki, Matsatsinis, & Tsoukias, 2011; Manouselis & Costopoulou, 2007). However, to the best of our knowledge, no in-depth work has comprehensively compared the performance of these multi-criteria recommendation method variations.
In this paper, we study variations of multi-criteria based item recommendation methods. The development of the methods consists of three main phases: predicting the rating per criterion; aggregating the rating predictions of all criteria; and generating the top-N item recommendations. The methods are labelled according to the approach used to predict the rating per criterion, i.e., Collaborative Filtering (CF), Content-based (CB), and Hybrid. For the experiments, we generate two variations of the dataset to represent the normal and cold-start conditions of a multi-criteria item recommendation system. The contributions of this paper are as follows: (1) an analysis of multi-criteria based recommendation methods that shows the different approaches to implementing the rating prediction per criterion phase, and (2) a performance comparison of three multi-criteria based item recommendation methods.

METHOD
In this section, we explain how to develop the multi-criteria based methods for generating a top-N list of item recommendations for a target user. The inputs of the method are the multi-criteria rating histories of users, the item features data, and the significance score of each criterion. At this stage, we can identify and denote the sets of users, items, criteria, significance scores of the criteria, features, items that have been rated by each user, and features of each item. Table I lists the notations used in this paper to symbolize those sets.
The development of the multi-criteria based item recommendation method consists of three main phases: (a) predicting the rating per criterion, (b) aggregating the rating predictions of all criteria, and (c) generating the top-N item recommendations. Figure 1 shows the framework of the method.

Predicting Rating per Criterion
The calculation of rating prediction can be categorized into three approaches: Collaborative Filtering, Content-based, and Hybrid approach (G. Adomavicius & Tuzhilin, 2005).

Collaborative Filtering Multi-criteria based Approach
The Collaborative Filtering (CF) approach is the most popular rating prediction approach (Su & Khoshgoftaar, 2009). One way to implement this approach is to calculate the prediction based on the similarities of users' rating preferences (Gediminas Adomavicius et al., 2011; Bilge & Yargıç, 2017). Per criterion, we calculate the similarity between the target user and every other user using the cosine similarity function. Once these similarities are calculated, we form the set of top nearest neighbours of the target user. The CF rating prediction of the target user for an item on a given criterion is then calculated from the ratings of these neighbours, where the size of the neighbour set does not exceed the chosen neighbourhood size.
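The CF phase described above can be sketched as follows. This is a minimal illustration rather than the paper's exact formulation: the data layout (`ratings[user][item]` holding the ratings for a single criterion), the function names, the choice to take the dot product over co-rated items only, the restriction of neighbours to users who rated the target item, and the similarity-weighted average used for the prediction are all assumptions.

```python
import math

def cosine_sim(ra, rb):
    """Cosine similarity of two users' ratings on one criterion.

    ra, rb: dict mapping item -> rating. The dot product runs over co-rated
    items; each norm runs over all items rated by that user (an assumption).
    """
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    dot = sum(ra[i] * rb[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in ra.values()))
    norm_b = math.sqrt(sum(v * v for v in rb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_cf(ratings, user, item, k):
    """Predict `user`'s rating of `item` on one criterion from at most k neighbours."""
    # Candidate neighbours: other users who rated the target item (assumption).
    sims = [
        (cosine_sim(ratings[user], other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbours = sorted(sims, key=lambda t: t[0], reverse=True)[:k]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        return 0.0  # no usable neighbour: fall back to a neutral value
    # Similarity-weighted average of the neighbours' ratings (assumed form).
    return sum(s * ratings[v][item] for s, v in neighbours) / total
```

In a multi-criteria setting, `predict_cf` would be called once per criterion, each time with that criterion's rating dictionary.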

Content-based Multi-criteria based Approach
The Content-based (CB) approach is popular for generating predictions when information about the item's content is available (G. Adomavicius & Tuzhilin, 2005; Lops et al., 2011). One way to implement this approach is to calculate the prediction based on the weight of each item feature (Uluyagmur, Cataltepe, & Tayfur, 2012). Per criterion, we calculate the weight of each item feature for the target user using a weighting function. The CB rating prediction of the target user for an item on a given criterion is then calculated from the weights of that item's features.

Hybrid Multi-criteria based Approach
The Hybrid approach combines the Collaborative Filtering and Content-based approaches (G. Adomavicius & Tuzhilin, 2005; Burke, 2007). One way to implement this approach is to calculate the prediction as the average of the CF and CB rating predictions (Aggarwal, 2016). The Hybrid rating prediction of the target user for an item on a given criterion is therefore the mean of the corresponding CF and CB predictions.
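A rough sketch of the CB and Hybrid predictions for a single criterion is given below. The weighting function used here (the mean rating a user gave to the rated items that carry a feature) and the CB prediction (the mean weight of the target item's features) are illustrative assumptions, since the paper's own weighting function may differ; only the Hybrid step, a plain average of the CF and CB predictions, follows directly from the description above. All function and parameter names are hypothetical.

```python
def feature_weights(user_ratings, item_features):
    """Weight of each feature for one user on one criterion.

    Assumed weighting: the mean rating the user gave to the rated items
    that carry the feature. The paper's own weighting function may differ.
    """
    sums, counts = {}, {}
    for item, rating in user_ratings.items():
        for f in item_features[item]:
            sums[f] = sums.get(f, 0.0) + rating
            counts[f] = counts.get(f, 0) + 1
    return {f: sums[f] / counts[f] for f in sums}

def predict_cb(user_ratings, item_features, item):
    """CB prediction: mean weight of the target item's features known to the user."""
    weights = feature_weights(user_ratings, item_features)
    known = [weights[f] for f in item_features[item] if f in weights]
    return sum(known) / len(known) if known else 0.0

def predict_hybrid(cf_pred, cb_pred):
    """Hybrid prediction: plain average of the CF and CB predictions."""
    return (cf_pred + cb_pred) / 2
```

Because `predict_cb` relies only on the target item's features and the user's own history, it can score an item that nobody has rated yet, which is the situation the cold-start experiments examine.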

Generating Top-N Item Recommendation
The top-N list of item recommendations for a target user is generated based on the aggregated rating predictions. An item is included in the list if its aggregated rating prediction is among the N highest values over all candidate items for that user.
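The last two phases can be sketched as below. The aggregation function is an assumption: a weighted mean of the per-criterion predictions, using the criteria significance scores as weights. The ranking step simply keeps the N items with the highest aggregated values. Function names and the dictionary-based data layout are hypothetical.

```python
def aggregate(pred_per_criterion, significance):
    """Combine per-criterion predictions into one score.

    Assumed form: mean of the predictions weighted by the significance
    score of each criterion.
    """
    total_weight = sum(significance[c] for c in pred_per_criterion)
    weighted = sum(significance[c] * p for c, p in pred_per_criterion.items())
    return weighted / total_weight

def top_n(aggregated, n):
    """Return the n items with the highest aggregated rating predictions."""
    ranked = sorted(aggregated, key=lambda item: aggregated[item], reverse=True)
    return ranked[:n]
```

For instance, `top_n({"a": 3.2, "b": 4.1, "c": 1.0}, 2)` keeps items `b` and `a`, in that order.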

Toy Example
This section shows examples of how the multi-criteria based methods for generating a top-N list of item recommendations for a target user are developed based on the CF, CB, and Hybrid approaches. Figure 2, Figure 3, and Figure 4 respectively show the toy examples of multi-criteria rating data, item features data, and criteria significance scores. The complete calculation of the three main phases in the multi-criteria based item recommendation method is presented in Figure 5. Figure 3 shows that there are five item features in the feature set ∆, and that the feature sets of the four items are, in order, {1, 2, 3, 5}, {1, 3, 4, 5}, {1, 5}, and {1, 4, 5}, where each number is a feature index. Meanwhile, Figure 4 shows that the significance scores of the first, second, and third criteria are 2, 3, and 1, respectively.

RESULTS AND DISCUSSION
In this section, we present the empirical analysis by conducting experiments to evaluate the performance of the multi-criteria based item recommendation methods built in this paper.

Dataset and Experiment Procedure
This paper uses a tourism multi-criteria rating dataset that consists of 77 users, 50 tourist attractions (items), and 900 ratings. Table II lists the details of the dataset. Our experiments use 5-fold cross-validation, in which each fold has a training set and a test set.
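A 5-fold split of the rating records can be sketched as follows. How the folds were actually drawn is not stated, so the seeded random shuffle and the round-robin assignment of records to folds are assumptions, as are the function and parameter names.

```python
import random

def k_fold_splits(ratings, k=5, seed=0):
    """Yield (training set, test set) pairs for k-fold cross-validation."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)          # reproducible shuffle
    folds = [shuffled[i::k] for i in range(k)]     # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test
```

With 900 rating records and `k=5`, each test set holds 180 records and each training set the remaining 720.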
For the experiments, we generate two variations based on the tourism dataset to represent the normal and cold-start conditions:
• TN: The dataset is refined such that all items and users in the test set occur several times in the training set. This dataset represents the normal condition.
• TCS: The dataset is refined such that the test set contains items that have not been rated by any user in the training set. This dataset represents the condition in which the cold-start item problem occurs.
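The two conditions can be checked on a given split with a sketch like the one below. Records are assumed to be `(user, item)` pairs, and because "several times" is not quantified, the `min_count` threshold is a hypothetical parameter.

```python
from collections import Counter

def cold_start_items(train, test):
    """Items that appear in the test set but have no rating in the training set."""
    train_items = {item for _, item in train}
    return {item for _, item in test if item not in train_items}

def is_normal_condition(train, test, min_count=2):
    """True if every test user and test item occurs at least min_count times in training."""
    user_counts = Counter(user for user, _ in train)
    item_counts = Counter(item for _, item in train)
    return all(
        user_counts[user] >= min_count and item_counts[item] >= min_count
        for user, item in test
    )
```

A split would match the normal condition when `is_normal_condition` holds, and the cold-start item condition when `cold_start_items` returns a non-empty set.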

Evaluation Criteria
Our recommendation methods build the model using the training set and then use it to generate top-N item recommendations for the target users in the test set. The evaluation is conducted by comparing the top-N list of item recommendations for a user to that user's ground-truth items in the test set. We use the Average Precision (AP) evaluation metric to measure the performance of the recommendations; the AP score is computed over the top-N list of item recommendations for each target user. In the AP formula, the indicator function (•) returns 1 when the condition within the brackets is fulfilled, and 0 otherwise.
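One common way to compute AP over a top-N list is sketched below. The normalisation, dividing by the smaller of N and the number of ground-truth items, is an assumption; AP has several variants in the literature and the paper's exact formula may use a different denominator. The function name and arguments are hypothetical.

```python
def average_precision(recommended, ground_truth, n):
    """AP of the first n recommended items against the ground-truth items."""
    truth = set(ground_truth)
    if not truth:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:n], start=1):
        if item in truth:            # indicator function: 1 on a hit, 0 otherwise
            hits += 1
            score += hits / rank     # precision at this rank
    # Assumed normalisation: best achievable number of hits in the top n.
    return score / min(n, len(truth))
```

Averaging this score over all target users in the test set gives a single performance value per method and per N.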

Performance Comparison
In this sub-section, we compare the performance at N = {1, …, 20} of the three multi-criteria based item recommendation methods developed in this paper. For ease of explanation, we label the methods according to the approach used to predict the rating per criterion, i.e., CF, CB, and Hybrid. Figure 6 shows the performance comparison of the three methods on the TN dataset. We can observe that CF and Hybrid have comparable performance, while CB performs the worst. Note that Hybrid slightly outperforms CF when N ≥ 5.
These results suggest that, under the normal condition, Hybrid is the best method to implement in a multi-criteria based item recommendation system. Figure 7 shows the performance comparison on the TCS dataset. We can see that CF achieves the best performance, followed by Hybrid and CB. These results suggest that a multi-criteria based item recommendation system should implement the CF method under the cold-start item condition.
Additionally, the poor performance of CB on both datasets indicates that this method should not be implemented on its own in a multi-criteria based item recommendation system under any condition. It is also worth noting that CF is more effective than CB at providing recommendations under the cold-start item condition, since CF outperforms CB on the TCS dataset.

CONCLUSION
In this paper, we develop three multi-criteria based item recommendation methods. The methods are labelled according to the approach used to predict the rating per criterion, i.e., Collaborative Filtering (CF), Content-based (CB), and Hybrid. For a deeper analysis, we generate two variations of the dataset to represent the normal and cold-start conditions of a multi-criteria item recommendation system. The experimental results suggest that Hybrid and CF are best implemented under the normal and cold-start item conditions, respectively. On the other hand, CB should never be implemented on its own in a multi-criteria based item recommendation system under any condition. Nevertheless, we believe that this study needs to be expanded further. Our recommendations are to implement: (1) other functions of the CF, CB, and Hybrid approaches for predicting the rating per criterion; and/or (2) other functions for aggregating the rating predictions of all criteria. Alternatively, an application could be built to test the user experience of a multi-criteria item recommendation system based on the CF and Hybrid methods developed in this paper.