Normalization based Multi-Criteria Collaborative Filtering Approach for Recommendation System

A multi-criteria collaborative filtering recommendation system allows its users to rate items based on several criteria. Users naturally differ in how they rate items: some are quite generous, while others tend to be rather stingy. Given these diverse rating patterns, implementing a normalization technique in the system helps to reveal the latent relationships within the multi-criteria rating data. This paper analyses and compares the performances of two methods that implement the normalization based multi-criteria collaborative filtering approach. The framework of the method development consists of three main processes: multi-criteria rating representation, multi-criteria rating normalization, and rating prediction using a multi-criteria collaborative filtering approach. The developed methods are labelled according to the implemented normalization technique and multi-criteria collaborative filtering approach, i.e., Decoupling normalization and Multi-Criteria User-based approach (DMCUser) and Decoupling normalization and Multi-Criteria Item-based approach (DMCItem). Experiment results on the real-world Yelp Dataset show that DMCItem outperforms DMCUser at most Top-𝑁 in terms of Precision and Normalized Discounted Cumulative Gain (NDCG). Although DMCUser can perform better than DMCItem at large Top-𝑁, it is still more practical to implement DMCItem rather than DMCUser in a multi-criteria recommendation system, since users tend to show more interest in items at the top of the list.


INTRODUCTION
Multi-criteria collaborative filtering recommendation systems allow their users to rate items of interest according to several criteria, giving them the opportunity to express a detailed judgement of each item. Users naturally differ in how they rate items: some are quite generous, while others tend to be rather stingy. Given these dissimilar rating patterns, implementing a normalization technique in the recommendation system helps to reveal the latent relationships within the multi-criteria rating data.
Researchers have studied the benefits of normalization techniques. Jin and Si (2004) have shown that the decoupling normalization technique performs better than the Gaussian technique when implemented on a single-criterion collaborative filtering approach. Bilge and Yargıç (2017) likewise confirm the superior performance of the decoupling normalization technique when implemented on the multi-criteria collaborative filtering approach. However, existing studies are limited to the user-based collaborative filtering approach. This leaves a research gap, since collaborative filtering approaches are categorized into two models, i.e., user-based and item-based (Aggarwal, 2016).
This paper studies two methods that implement the normalization based multi-criteria collaborative filtering approach. The framework of the method development consists of three main processes: multi-criteria rating representation, multi-criteria rating normalization, and rating prediction using a multi-criteria collaborative filtering approach. The developed methods are labelled according to the implemented normalization technique and multi-criteria collaborative filtering approach, i.e., Decoupling normalization and Multi-Criteria User-based approach (DMCUser) and Decoupling normalization and Multi-Criteria Item-based approach (DMCItem). A series of experiments is conducted using the Yelp Dataset, which contains multi-criteria hotel ratings.
The contributions of this paper are: (1) a framework of the normalization based multi-criteria collaborative filtering approach for recommendation systems; and (2) a performance comparison between two multi-criteria recommendation methods that implement the decoupling normalization technique on the user-based and item-based models.

METHOD
The framework of the normalization based multi-criteria collaborative filtering approach for recommendation systems consists of three main processes (as illustrated in Figure 1): multi-criteria rating representation, multi-criteria rating normalization, and rating prediction using a multi-criteria collaborative filtering approach.


Multi-criteria Rating Normalization
From the example in Figure 2, we can observe that the users have different tendencies in rating items. Users $u_1$ and $u_3$ tend to rate items quite generously, while $u_2$ and $u_4$ tend to rate rather stingily. Given the diverse rating patterns, the purpose of implementing a normalization technique is to reveal the latent relationships within the multi-criteria rating data by transforming the ratings onto another scale (Bilge & Yargıç, 2017). This paper implements the decoupling normalization technique, which has previously been shown to perform better than the Gaussian (Jin & Si, 2004) and Z-Score (Bilge & Yargıç, 2017) techniques. Decoupling normalization treats the ratings as ordinal numbers and transforms them into probabilistic scores. The transformation is based on the following assumptions (Jin, Si, Zhai, & Callan, 2003):
1. A user is more likely to prefer items rated as rating category $R$ when the user has rated many items as no more than rating category $R$.
2. A user is less likely to prefer items rated as rating category $R$ when the user has rated many items as rating category $R$.
The decoupling normalized multi-criteria rating of user $u$ towards item $i$ according to criterion $c$ is calculated as follows (Bilge & Yargıç, 2017):
$$\hat{r}_{u,i,c} = P_{u,c}(\mathrm{Rating} \le r_{u,i,c}) - \frac{P_{u,c}(\mathrm{Rating} = r_{u,i,c})}{2}$$
where $r_{u,i,c}$ is the original rating. $P_{u,c}(\mathrm{Rating} \le R)$ represents the proportion of items that are rated as no more than rating category $R$:
$$P_{u,c}(\mathrm{Rating} \le R) = \frac{|\{\, i \in I_u : r_{u,i,c} \le R \,\}|}{|I_u|}$$
where $\forall u \in U$ and $\forall c \in C$, and $P_{u,c}(\mathrm{Rating} = R)$ represents the proportion of items that are rated exactly with rating category $R$:
$$P_{u,c}(\mathrm{Rating} = R) = \frac{|\{\, i \in I_u : r_{u,i,c} = R \,\}|}{|I_u|}$$
where $\forall u \in U$ and $\forall c \in C$. Here $U$ is the set of users, $C$ is the set of criteria, and $I_u$ is the set of items rated by user $u$.
The complete algorithm of the decoupling normalization technique is given in a separate listing. The results of implementing the technique on the toy data in Figure 2 are shown in Figure 4. We can observe that decoupling normalization reveals that $u_1$ and $u_2$, as well as $u_3$ and $u_4$, actually have similar tastes despite their different rating tendencies. These latent relationships cannot be revealed from the original multi-criteria ratings.
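As a rough illustration of the normalization step, the following Python sketch applies the decoupling formula to the ratings of one user on one criterion. The function name `decouple_normalize` and the two toy users are hypothetical and are not the data of Figure 2; they only show how a generous and a stingy rater can end up with identical normalized scores.

```python
from collections import Counter


def decouple_normalize(ratings):
    """Decoupling normalization for ONE user and ONE criterion.

    ratings: dict mapping item id -> original rating (treated as ordinal).
    Returns a dict mapping item id -> P(Rating <= r) - P(Rating = r) / 2.
    """
    n = len(ratings)
    counts = Counter(ratings.values())
    normalized = {}
    for item, r in ratings.items():
        # Proportion of this user's items rated no more than r.
        p_le = sum(cnt for value, cnt in counts.items() if value <= r) / n
        # Proportion of this user's items rated exactly r.
        p_eq = counts[r] / n
        normalized[item] = p_le - p_eq / 2
    return normalized


# Hypothetical toy data: same ordering of preferences, different rating scale usage.
generous = {"i1": 5, "i2": 4, "i3": 5, "i4": 4}
stingy = {"i1": 2, "i2": 1, "i3": 2, "i4": 1}

print(decouple_normalize(generous))
print(decouple_normalize(stingy))
```

Both calls return the same scores (0.75 for the preferred items, 0.25 for the others), which is the kind of latent agreement the normalization is meant to expose.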

Rating Prediction using Multi-Criteria Collaborative Filtering Approach
The multi-criteria collaborative filtering approach consists of five processes: (1) similarity per criterion, (2) similarity selection overall criteria, (3) rating prediction per criterion, (4) aggregation of rating prediction overall criteria, and (5) recommendation generation. This paper implements two well-known multi-criteria collaborative filtering approaches: Multi-Criteria User-based and Multi-Criteria Item-based. For ease of explanation, we name the method that combines Decoupling normalization with the Multi-Criteria User-based approach DMCUser, whereas the method that combines Decoupling normalization with the Multi-Criteria Item-based approach is labelled DMCItem.

Similarity per Criterion
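In DMCUser, the similarity between user $u_1$ and user $u_2$ per criterion $c$ is calculated as follows:
$$sim_c(u_1,u_2) = \frac{\sum_{i \in I_{u_1} \cap I_{u_2}} (\hat{r}_{u_1,i,c} - \bar{r}_{u_1,c})(\hat{r}_{u_2,i,c} - \bar{r}_{u_2,c})}{\sqrt{\sum_{i \in I_{u_1} \cap I_{u_2}} (\hat{r}_{u_1,i,c} - \bar{r}_{u_1,c})^2}\,\sqrt{\sum_{i \in I_{u_1} \cap I_{u_2}} (\hat{r}_{u_2,i,c} - \bar{r}_{u_2,c})^2}}$$
where $I_u$ is the set of items that have been rated by user $u$ and $\bar{r}_{u,c}$ is the average of the normalized ratings of user $u$ based on criterion $c$. On the other hand, DMCItem calculates the similarity between item $i_1$ and item $i_2$ per criterion $c$ as follows:
$$sim_c(i_1,i_2) = \frac{\sum_{u \in U_{i_1} \cap U_{i_2}} (\hat{r}_{u,i_1,c} - \bar{r}_{u,c})(\hat{r}_{u,i_2,c} - \bar{r}_{u,c})}{\sqrt{\sum_{u \in U_{i_1} \cap U_{i_2}} (\hat{r}_{u,i_1,c} - \bar{r}_{u,c})^2}\,\sqrt{\sum_{u \in U_{i_1} \cap U_{i_2}} (\hat{r}_{u,i_2,c} - \bar{r}_{u,c})^2}}$$
where $U_i$ is the set of users that have rated item $i$.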
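The per-criterion similarities can be sketched in Python as below. This is a minimal sketch under stated assumptions: it assumes a Pearson correlation over co-rated items for users and an adjusted cosine (ratings centred on each user's mean) for items, which is a common textbook choice but may differ in detail from the authors' implementation. The function names and the toy ratings are hypothetical.

```python
from math import sqrt


def user_mean(user_ratings):
    """Average normalized rating of one user on one criterion."""
    return sum(user_ratings.values()) / len(user_ratings)


def user_similarity(r_c, u1, u2):
    """Pearson correlation between two users on one criterion.

    r_c: dict user -> {item: normalized rating} for a single criterion.
    """
    common = set(r_c[u1]) & set(r_c[u2])
    m1, m2 = user_mean(r_c[u1]), user_mean(r_c[u2])
    num = sum((r_c[u1][i] - m1) * (r_c[u2][i] - m2) for i in common)
    d1 = sqrt(sum((r_c[u1][i] - m1) ** 2 for i in common))
    d2 = sqrt(sum((r_c[u2][i] - m2) ** 2 for i in common))
    return num / (d1 * d2) if d1 and d2 else 0.0


def item_similarity(r_c, i1, i2):
    """Adjusted cosine between two items on one criterion (assumed form)."""
    common = [u for u in r_c if i1 in r_c[u] and i2 in r_c[u]]
    means = {u: user_mean(r_c[u]) for u in common}
    num = sum((r_c[u][i1] - means[u]) * (r_c[u][i2] - means[u]) for u in common)
    d1 = sqrt(sum((r_c[u][i1] - means[u]) ** 2 for u in common))
    d2 = sqrt(sum((r_c[u][i2] - means[u]) ** 2 for u in common))
    return num / (d1 * d2) if d1 and d2 else 0.0


# Hypothetical normalized ratings for one criterion.
r_c = {
    "u1": {"a": 0.75, "b": 0.25},
    "u2": {"a": 0.75, "b": 0.25},
    "u3": {"a": 0.25, "b": 0.75},
}
print(user_similarity(r_c, "u1", "u2"))  # close to 1: same taste
print(user_similarity(r_c, "u1", "u3"))  # close to -1: opposite taste
print(item_similarity(r_c, "a", "b"))
```

Returning 0.0 when a denominator vanishes is a design choice of this sketch, so that pairs with no variation on their co-rated items simply carry no weight.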

Similarity Selection Overall Criteria
The similarity selection is the process of determining which similarity is considered the most significant, given the list of multi-criteria similarities. In this paper, we assume that the most significant similarity is that of the criterion with the lowest value across all criteria:
$$sim(u_1,u_2) = \min_{c \in C} sim_c(u_1,u_2)$$
for a pair of users in DMCUser, and analogously $sim(i_1,i_2) = \min_{c \in C} sim_c(i_1,i_2)$ for a pair of items in DMCItem. Once the similarities over all criteria are calculated, we find the top-$h$ similar users or items to form the user or item neighbourhood, depending on which collaborative filtering approach is implemented. For DMCUser, $N_u(i)$ is formed as the set of the top-$h$ users most similar to the target user $u$ that have rated item $i$. For DMCItem, $N_i(u)$ is formed as the set of the top-$h$ items most similar to item $i$ that have previously been rated by the target user $u$. Note that, in the rest of this paper, we denote $h$ as the neighbourhood size, where $h \ge |N_u(i)|$ in DMCUser or $h \ge |N_i(u)|$ in DMCItem.
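The selection and neighbourhood steps can be sketched as follows. The helper names and the similarity values are hypothetical; the sketch assumes the candidate list has already been restricted to users who rated the target item (DMCUser) or to items rated by the target user (DMCItem).

```python
def overall_similarity(per_criterion_sims):
    """Select the lowest per-criterion similarity as the overall similarity."""
    return min(per_criterion_sims)


def top_h_neighbours(target, candidates, sims, h):
    """Return at most h candidates, ordered by decreasing overall similarity.

    sims: dict (target, candidate) -> list of per-criterion similarities.
    """
    scored = [(overall_similarity(sims[(target, c)]), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:h]]


# Hypothetical per-criterion similarities between target user "u" and three candidates.
sims = {
    ("u", "v1"): [0.9, 0.2, 0.8],
    ("u", "v2"): [0.6, 0.5, 0.7],
    ("u", "v3"): [0.4, 0.9, 0.9],
}
candidates = ["v1", "v2", "v3"]

print(top_h_neighbours("u", candidates, sims, h=2))
print(top_h_neighbours("u", candidates, sims, h=5))
```

With `h=5` only three neighbours are returned, which illustrates why the neighbourhood can be smaller than the chosen size $h$.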

Rating Prediction per Criterion
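The rating prediction per criterion in DMCUser is calculated as follows:
$$p_{u,i,c} = \bar{r}_{u,c} + \frac{\sum_{v \in N_u(i)} sim(u,v)\,(\hat{r}_{v,i,c} - \bar{r}_{v,c})}{\sum_{v \in N_u(i)} |sim(u,v)|} \tag{8}$$
Meanwhile, the rating prediction per criterion in DMCItem is formulated as:
$$p_{u,i,c} = \frac{\sum_{j \in N_i(u)} sim(i,j)\,\hat{r}_{u,j,c}}{\sum_{j \in N_i(u)} |sim(i,j)|} \tag{9}$$
where $p_{u,i,c}$ is the predicted rating of user $u$ for item $i$ on criterion $c$, $\hat{r}_{u,i,c}$ is the normalized rating, $\bar{r}_{u,c}$ is the average normalized rating of user $u$ on criterion $c$, $sim(\cdot,\cdot)$ is the selected similarity over all criteria, and $N_u(i)$ and $N_i(u)$ are the user and item neighbourhoods, respectively.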
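A minimal Python sketch of the two per-criterion predictions is given below. It assumes the standard neighbourhood-based forms (a mean-centred, similarity-weighted average for the user-based case and a similarity-weighted average for the item-based case); these forms, the function names, and the toy values are assumptions for illustration and are not taken from the authors' code.

```python
from statistics import fmean


def predict_user_based(r_c, sims, u, item, neighbours):
    """User-based prediction on one criterion (assumed mean-centred form).

    r_c: dict user -> {item: normalized rating}; sims: dict neighbour -> similarity to u.
    """
    mean_u = fmean(r_c[u].values())
    num = sum(sims[v] * (r_c[v][item] - fmean(r_c[v].values())) for v in neighbours)
    den = sum(abs(sims[v]) for v in neighbours)
    # Fall back to the user's own mean when no neighbour carries any weight.
    return mean_u if den == 0 else mean_u + num / den


def predict_item_based(user_ratings, sims, neighbours):
    """Item-based prediction on one criterion (assumed weighted-average form).

    user_ratings: {item: normalized rating} of the target user;
    sims: dict neighbour item -> similarity to the target item.
    """
    den = sum(abs(sims[j]) for j in neighbours)
    if den == 0:
        return fmean(user_ratings.values())
    return sum(sims[j] * user_ratings[j] for j in neighbours) / den


# Hypothetical data: predict the score of item "x" for user "u".
r_c = {
    "u": {"a": 0.75, "b": 0.25},
    "v1": {"a": 0.75, "b": 0.25, "x": 0.75},
    "v2": {"a": 0.25, "b": 0.75, "x": 0.25},
}
user_pred = predict_user_based(r_c, {"v1": 1.0, "v2": -1.0}, "u", "x", ["v1", "v2"])
item_pred = predict_item_based(r_c["u"], {"a": 0.8, "b": 0.2}, ["a", "b"])
print(user_pred, item_pred)
```

In the user-based example, the dissimilar neighbour `v2` rated `x` below its own mean, so its negative similarity pushes the prediction upwards just as the similar neighbour `v1` does.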

Aggregation of Rating Prediction Overall Criteria
The aggregation of rating predictions over all criteria combines the rating prediction of each criterion into a single rating score. This paper implements the Weighted Linear Sum (WLS) technique (Barredo & Bosque-Sendra, 1998), which uses criteria weighted scoring to distinguish the importance of each criterion. Let $W = \{w_1, w_2, \ldots, w_{|C|}\}$ be the set of criteria weighted scoring that satisfies $\sum_{w_c \in W} w_c = 1$ and $|W| = |C|$, where $C$ is the set of criteria. The aggregation of rating predictions is calculated as:
$$p_{u,i} = \sum_{c \in C} w_c \, p_{u,i,c} \tag{10}$$
where $p_{u,i,c}$ is the predicted rating of user $u$ for item $i$ on criterion $c$. The system generates the list of recommendations by offering the target user the Top-$N$ items that have the highest aggregated prediction scores.
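The aggregation and recommendation steps can be sketched as follows. The weight vectors and predictions below are hypothetical illustrations of "equal importance" versus "one dominant criterion"; they are not the values used in the experiments.

```python
def aggregate(pred_per_criterion, weights):
    """Weighted Linear Sum of per-criterion predictions for one (user, item) pair."""
    if len(pred_per_criterion) != len(weights):
        raise ValueError("one weight per criterion is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * p for w, p in zip(weights, pred_per_criterion))


def top_n(predictions, weights, n):
    """Rank items by aggregated prediction and keep the n best.

    predictions: dict item -> list of per-criterion predictions for the target user.
    """
    scores = {item: aggregate(p, weights) for item, p in predictions.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical predictions on four criteria (e.g. Overall, Useful, Funny, Cool).
predictions = {
    "h1": [0.9, 0.1, 0.1, 0.1],
    "h2": [0.4, 0.6, 0.6, 0.6],
    "h3": [0.2, 0.2, 0.2, 0.2],
}
equal = [0.25, 0.25, 0.25, 0.25]          # every criterion equally important
overall_heavy = [0.7, 0.1, 0.1, 0.1]      # first criterion dominates (illustrative values)

print(top_n(predictions, equal, 2))
print(top_n(predictions, overall_heavy, 2))
```

The two weightings rank `h1` and `h2` in opposite order, which shows how the choice of weights can change the recommendation list even when the per-criterion predictions stay the same.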

Experiment Setup
This paper uses the Yelp Dataset, which contains ratings of hotels based on four criteria: Overall, Useful, Funny, and Cool. We filter the dataset so that only users who have rated at least 3 hotels are kept, which results in 25,000 ratings from 2,595 users on 5,209 hotels. The rating categories range from 1 to 5 for Overall, from 0 to 110 for Useful, from 0 to 59 for Funny, and from 0 to 103 for Cool. We apply 5-fold cross-validation as the evaluation method and use both Precision and Normalized Discounted Cumulative Gain (NDCG) as the evaluation metrics.
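For reference, the two metrics can be sketched in Python as below. The sketch assumes binary relevance (an item is either relevant to the user in the test fold or not) and the usual logarithmic discount; how relevance is actually defined in the experiments is not stated here, so this is an assumption. The function names and toy lists are hypothetical.

```python
from math import log2


def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommended items that are relevant."""
    return sum(1 for item in recommended[:n] if item in relevant) / n


def ndcg_at_n(recommended, relevant, n):
    """NDCG at n with binary relevance (assumed) and a log2 rank discount."""
    dcg = sum(
        1 / log2(rank + 1)
        for rank, item in enumerate(recommended[:n], start=1)
        if item in relevant
    )
    # Ideal ranking places all relevant items (up to n of them) first.
    ideal_hits = min(len(relevant), n)
    idcg = sum(1 / log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical recommendation list and test-fold relevant items for one user.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c"}

print(precision_at_n(recommended, relevant, 4))
print(ndcg_at_n(recommended, relevant, 4))
```

Precision ignores where the hits occur inside the top-n list, whereas NDCG rewards hits that appear closer to the top.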

Impact of Neighbourhood Size
The neighbourhood size (ℎ) is the parameter that controls the per-criterion rating prediction formulated in Equation (8) for DMCUser and Equation (9) for DMCItem. In this paper, we vary ℎ over {5, 10, 20, 30, 40, 50}. We can observe that DMCUser performs best when ℎ = 5 and deteriorates for larger ℎ. Meanwhile, DMCItem achieves its best performance at the largest ℎ, i.e., ℎ = 50. These results indicate that the two methods need different neighbourhood sizes to reach their best performances: DMCUser needs a considerably smaller ℎ than DMCItem.

Impact of Criteria Weighted Scoring
The criteria weighted scoring ($W$) is the parameter that controls the aggregation of rating predictions over all criteria formulated in Equation (10). This paper tries several variations of $W$ (listed in Table 1) such that either every criterion has the same importance or only one criterion is considered the most important while the others are set as equally important. Note that we give a label to each variation of $W$ for ease of explanation. The impacts of $W$ on DMCUser and DMCItem in terms of Precision and NDCG are shown in Figure 6. We can observe that DMCUser and DMCItem achieve their best performances on WLS2 and WLS5, respectively. These results indicate that, for DMCUser, the first criterion (i.e., Overall) is the most important of the four criteria. In contrast, for DMCItem, the last criterion (i.e., Cool) is the most significant.

Performance Comparison
The performances of DMCUser and DMCItem are compared using the best parameters found in the sensitivity analysis: DMCUser uses ℎ = 5 and WLS2, while DMCItem uses ℎ = 50 and WLS5. We analyze the comparisons over Top-𝑁 with 𝑁 = {1, 2, 3, ⋯, 20}. Figure 7 shows the performance comparisons in terms of Precision and NDCG. We can observe that DMCItem outperforms DMCUser at most Top-𝑁, i.e., up to 𝑁 = 16 in terms of Precision and 𝑁 = 17 in terms of NDCG. In other words, DMCUser outperforms DMCItem only when 𝑁 is large. However, since a target user usually tends to choose items at the top of the recommended list (Mohan, Chen, & Weinberger, 2011; Wang, Wang, Li, & He, 2013), it is more practical to implement DMCItem rather than DMCUser in the recommendation system. Additionally, these findings agree with those of single-criterion collaborative filtering studies, i.e., that the item-based approach performs better than the user-based approach (Aggarwal, 2016; Gong, 2009; Ifada, Susanti, & Mula'ab, 2019; Sarwar, Karypis, Konstan, & Riedl, 2001).

CONCLUSION
This paper analyses and compares the performances of two methods that implement the normalization based multi-criteria collaborative filtering approach for a recommendation system. The developed methods are labelled according to the implemented normalization technique and the multi-criteria collaborative filtering approach, i.e., Decoupling normalization and Multi-Criteria User-based approach (DMCUser) and Decoupling normalization and Multi-Criteria Item-based approach (DMCItem). Experiment results on the Yelp Dataset, which contains multi-criteria hotel ratings, show that DMCItem outperforms DMCUser at most Top-𝑁 in terms of Precision and NDCG. Although DMCUser can perform better than DMCItem at large Top-𝑁, it is still more practical to implement DMCItem rather than DMCUser in a multi-criteria recommendation system, since users tend to show more interest in items at the top of the list.
For further study, one possible plan is to implement other rating normalization techniques. Alternatively, a fusion of the user-based and item-based multi-criteria collaborative filtering approaches could be implemented to improve the methods developed in this paper.