Elsevier

Applied Soft Computing

Volume 109, September 2021, 107552

Contextual recommender system for E-commerce applications

https://doi.org/10.1016/j.asoc.2021.107552

Highlights

  • A hybrid collaborative filtering model is proposed for the recommender system.
  • It is aware of both the context and semantics of user and item textual details.
  • User embeddings are prepared using word2vec (w2v).
  • Item embeddings are generated using a convolutional neural network (CNN).
  • PMF is used as the collaborative filtering technique.
  • The model is primarily proposed for missing-rating prediction.
  • The model can be used as a recommendation model.
  • The proposed model is tested on three real-world datasets.

Abstract

Today’s global village of organizations, social applications, and commercial websites provides huge amounts of information about products, individuals, and activities. This leads to a plethora of content that requires effective handling to obtain the desired information. A recommendation system (RS) suggests relevant items to a user according to his/her preferences. It processes various kinds of information related to users and items. However, RSs suffer from data sparsity. Generally, deep learning techniques are used in RSs for deep analysis of item contents to create precise recommendations. However, the effective handling of user reviews in parallel with item reviews is still an open research domain that can be further explored. In this paper, a hybrid model that handles both user and item metadata concurrently is proposed with the aim of solving the sparsity problem. To demonstrate the viability of the proposed methodology, a series of experiments was performed on three real-world datasets. The results show that the proposed model outperforms other state-of-the-art approaches, to the best of our knowledge.

Keywords

Probabilistic matrix factorization
Hybrid collaborative filtering
Convolutional neural network
Textual embedding


1. Introduction

In the present era of the socioeconomic race, recommendation systems (RSs) are a central component of numerous e-business organizations, movie sites, e-libraries, article and news portals, and music and social forums. RSs are used by firms to infer the likes and dislikes of users, fans, and potential customers. Commercial companies and social website organizations extensively use their own developed RSs. Such organizations generate considerable revenue from user-aligned recommendations; valid recommendations play a vital role and increase their business revenue manifold. RSs have accordingly gained significance with the broad utilization of the Internet, where a plethora of data about a single entity is readily available. An RS focuses on user needs and provides what the user wishes to have or expects to be recommended. RSs are content retrieval processes that suggest the necessary relevant information from a large amount of data; such huge bodies of data are gathered by business organizations with the passage of time [1]. RSs are beneficial for both users and organizations in terms of valid recommendations and ample revenue. They improve customers’ decision-making process for online products and allow them to find relevant information quickly [2]. In this regard, different recommendation methodologies have been proposed in the literature, such as collaborative filtering (CF), content-based filtering (CBF), and hybrid filtering (HF) [1], [3], [4]. Netflix, Amazon, Facebook, Google News, YouTube, Twitter, LinkedIn, and many other organizations have deployed RSs to target their potential customers, aiming to show recommendations on their websites according to each customer’s preferences. RS performance depends on the historical interaction between users and items. Such user-to-item interaction can be mapped in the form of an explicit rating matrix; the same interaction can also be formed implicitly from user reviews and item descriptions.
However, the data are sparse because most users do not give reviews or ratings for every item they purchase. Therefore, such mappings (both implicit and explicit) remain sparse. To handle this sparsity through the concurrent handling of both user and item information, a hybrid model is proposed in this paper. The hybrid model combines the semantic and contextual ingredients of the user and item textual descriptions with the explicit rating data. This hybrid collaborative filtering model aims to solve the data sparsity problem, where the number of items rated by users is very small.
The proposed model is termed the contextual hybrid model for RS; it effectively handles rating matrix data sparsity. The proposed model integrates the generated embeddings of users and items into the corresponding latent factors of probabilistic matrix factorization (PMF). These user and item embeddings are developed concurrently by w2v and a CNN: w2v captures the semantics of the text, and the CNN extracts the contextual details of the textual description. Moreover, feature locality information is captured by dividing each feature vector into sections and then applying the max-pool operation to each section in the max-over pooling layer [5]. PMF is used as the collaborative filtering technique; it performs well on sparse, imbalanced, and large datasets when supplied with auxiliary information. It can predict ratings close to users’ actual ratings, and the closer the predicted ratings are, the more accurate the item recommendations that can be provided to the user. Experimental evidence illustrates that a contemporaneous understanding of the semantics and context of the constituent entities improves the performance of RSs. The major contributions of our paper are summarized as follows:
  • A hybrid contextual model is proposed, which improves the accuracy of rating prediction by merging the semantic and contextual details of both users and items into collaborative filtering.
  • User/item latent models are converged for accurate recommendations based on their corresponding semantic and contextual details. In this regard, a convolutional neural network generates semantically enhanced contextual embeddings of users and items.
  • The feature locality information is captured by dividing each feature vector into sections and then applying the max-pool operation to each section in the max-over pooling layer [5].
  • Experimentation shows effective handling of data sparsity and enhanced rating prediction and item recommendation accuracy compared with other state-of-the-art models, to the best of our knowledge.
The rest of this paper is organized as follows. Section 2 provides a review of the literature related to background knowledge. Our proposed model is presented in Section 3. Experimental findings and comparisons are discussed in Section 4. Finally, we conclude in Section 5.

2. Background and related work

An RS is built upon various collaborative filtering (CF), content-based filtering (CBF), and hybrid filtering (HF) techniques. The workflow of these methodologies depends on how the provided data are processed. Each of these techniques is briefly explained to provide background knowledge. CF, the most widely used RS technique, makes recommendations on the basis of user/item historical data. It exploits the behavior of users and the reviews on items to predict ratings and attempt accurate item recommendations. This technique is built on neighborhood search, where groups of other people (having the same interests, likes, and dislikes) are examined with respect to the current user; subsequently, similar items are recommended. It assimilates implicit ratings built in the form of a sparse matrix wherein missing ratings are to be predicted based on historical ratings. This improves the decisions made by clients based on neighboring similar users [6]. CF is further subdivided into memory-based and model-based filtering. The former is based on the past history of users or items, whereas the latter attempts to discover entities (neighbors) that have similar features. In model-based RSs, learning methods are implemented to build models trained on user preferences or item trends, based on which missing ratings are predicted and recommendations are made [4], [7].
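As a concrete illustration of memory-based CF, the sketch below finds a user's nearest neighbor by cosine similarity over co-rated items; the toy rating matrix and the similarity choice are illustrative assumptions, not taken from this paper.

```python
import numpy as np

# Toy user-item rating matrix (0 = missing rating); values are illustrative.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity computed over co-rated items only."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    a, b = a[mask], b[mask]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Nearest neighbor of user 0 among the remaining users.
sims = [cosine_sim(R[0], R[u]) for u in range(1, R.shape[0])]
nearest = 1 + int(np.argmax(sims))   # user index with the highest similarity
```

Items rated highly by the nearest neighbor but unseen by the current user would then be candidate recommendations.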
CBF recommends items even if no ratings for an item are present in its previous record. It is based on item contents and the feedback given by clients through ratings or reviews. Because it builds knowledge about users/items, CBF is also called cognitive filtering. In light of this, client profiles are created from the given data, which are further utilized for making recommendations and rating predictions. As more data are given by the client, more accurate recommendations are made [8]. In the CBF technique, machine learning-based algorithms develop user/item profiles and capture their trends/choices. Accordingly, items similar to the user profile are recommended to the user. To better grasp the item profile, consider news/books/products that have different features, individuals, concepts, contents, properties, etc. On the other hand, the user profile is composed of the user’s biodata, likes/dislikes, age, etc. [9]. CBF recommends items as soon as valid details about the items are available [2], [9], [10]. The CBF methodology is akin to clustering items into two sections: one cluster is relevant to the client’s preferences, and the other is irrelevant. The CBF technique recommends related/relevant items to users according to their preferences. In this regard, user profiles are built on the basis of user document content.
Hybrid filtering is a joint venture of both collaborative and content-based techniques. To synergize the strengths of both CB and CF, hybrid filtering suppresses the limitations of both while incorporating positive aspects. It provides more precise suggestions and performs comparatively better than both CBF and CF. Different hybrid filtering methods have been presented by researchers in the academic literature; however, this is still an open area of research, particularly for data sparsity and cold-start problems [2], [11], [12].
A vast amount of research on RSs using side information has been published in the literature. Earlier RSs utilize either CBF or CF techniques for recommendation tasks. In CF, products are recommended by a model based on client aptitude, e.g., nearest neighbors [13], matrix factorization (MF) [14], nonnegative MF (NMF), and singular value decomposition (SVD) [15]. Various methodologies have been adopted to improve PMF performance by processing auxiliary information, following the proposed Bayesian version of PMF [16], [17], [18]. Another filtering method, named trust awareness, is presented in [8]. It uses a collaborative technique based on the concept that CF can be influenced by user reputation (computed by propagating user trust). In [19], the Eigentaste algorithm is proposed, which uses universal queries to elicit client ratings and applies principal component analysis (PCA) for recommendations.
CBF, another strategy utilized for RSs, suggests products based on the contents (side/auxiliary data) of either clients or items. In [20], RS techniques are presented that utilize the local information of a product and generate distributed representations while neglecting metadata. The technique suggested in [21] is capable of identifying small articles for recommendations related to home improvement; the recommendation is made on the basis of comparing the user profile with the available document contents. Hybrid approaches have been created by researchers to improve RS performance by combining the strengths of CF and CBF [2]. [22] proposed the restricted Boltzmann machine, which finds item similarities and maps them in CF. Cross-domain MF along with the coordinate transfer model was explored by the authors in [23]. The same is studied in further depth in [24] for a use case where no shared client exists across domains; subsequently, a generative model for finding comparative groupings between various domains was proposed. In [25], the multiview deep neural network (MvDNN) is presented; it maps users and items into a shared space. In this approach, user features/profiles are formulated from their past history, and films are suggested on the basis of nearest neighbors with identical profiles who watched the film being recommended. In [12], the hidden factor and topic (HFT) model, which is based on topic modeling techniques, is proposed. The model combines the latent review topics to interpret the rating dimensions of either users or items; the results show comparatively better performance than when reviews or ratings are used separately. CADE (collaborative denoising autoencoder), presented in [26], predicts top-N recommended items; it utilizes only the rating matrix and neglects auxiliary information. In [11], a unified model integrating CBF and CF to solve the cold start problem is proposed; it applies topic modeling to user reviews to improve recommendation performance.
Latent Dirichlet allocation (LDA), a topic modeling technique that captures topic details of given data, was proposed in [27]. It addresses the word2vec (w2v) limitation of representing data only locally, i.e., semantics only [28]. In [29], the authors proposed the topical word embeddings (TWE) model, which relates latent topic models to each word in text. TWE is learned on both the semantics and the corresponding topics; therefore, document vector representations influenced by different topics are obtained. In [30], collaborative deep learning (CDL) was proposed, where the joint functioning of deep content representation learning and collaborative filtering can improve RS performance. CTR [31] is a topic regression model wherein CF and topic modeling are performed simultaneously.
Recently, deep learning (DL) techniques have gained importance in the domains of AI, machine learning, natural language processing (NLP), and speech recognition. DL is widely utilized for implementing RSs using both CF and CBF. In [32], a marginalized denoising autoencoder (mDA) is applied to learn item latent representations while neglecting randomly corrupted features; this also reduces computational complexity and training costs. In [33], hybrid CF and content-based music recommendations are researched; the authors learn music content latent factors by using matrix factorization and employ DL techniques to regenerate them for better song suggestions. The DeepCoNN model in [34] uses both user and item embeddings as latent factors and models their interactions through factorization machines; it is based on a dual neural network that maps latent factors into a shared space for making recommendations. [35] proposed a hybrid collaborative filtering model in which nonnegative matrix factorization is used as the collaborative filtering technique and its latent factors are initialized with user embeddings. [36] presented a deep semantic-based hybrid model built on the integration of lda2vec with PMF. [37] proposed dual-regularized MF with deep neural networks to cope with the sparsity issue of RSs. It adopts a multilayered model, stacking a CNN and a gated RNN to formulate independently distributed representations of user and item contents; the model essentially integrates the CNN with RNN models to better learn word dependencies and the sequential information of words, and an MF algorithm is then applied at the prediction layer to compute the final rating prediction. [38] presents a CF technique named deep MF that integrates DL with MF; it undertakes successive reinforcement of MF with a layered framework that utilizes the knowledge gathered on one layer as input to the following layers.
Some researchers have adopted the RNN model and demonstrated improved performance for recommendation systems. The majority of RNN-based models are applied specifically to formulate session-based recommendations. In [39], [40], the authors proposed a feature-rich session-based method that exploits a parallel RNN architecture to enhance the performance of the system. The proposed approach typically encompasses several RNNs, one for each representation of the item, and the hidden states of the RNN models are integrated to generate the scores for all items; in this method, a specific user’s session can be considered a sequence of clicks. [41] proposed a context-aware RS that extends the CF technique and learns the nonlinear interactions between the latent features of users and items and a sequential latent context representation. In [42], the authors proposed a CNN-based recommender system that obtains the user profile and predicts his/her ratings using an attention-based CNN; moreover, it recommends the top-n online courses to interested students according to their profiles. In [43], an e-business recommendation system is proposed in which the user-to-item rating matrix and user reviews are considered; the technique combines review mining and deep learning with an attention mechanism.
Our proposed model formulates vector representations of item and user textual contents through w2v and a CNN, thus capturing both the semantics and the context of words. It combines the generated representations (having both semantic and contextual details) with a collaborative technique aimed at improving rating prediction accuracy and top-n item suggestions. A comparative analysis of the proposed model with the state-of-the-art models (to the best of our knowledge) is presented in Table 1 based on each model’s strengths and weaknesses.

Table 1. Comparative analysis of models.

Model       | Inputs                              | Task                           | Metric                                  | Feature vector | Semantics   | Context
------------|-------------------------------------|--------------------------------|-----------------------------------------|----------------|-------------|------------
PMF [14]    | Ratings data                        | Ratings prediction             | RMSE                                    | No             | No          | No
CDL [30]    | Ratings data, user documents        | Ratings prediction             | mAP, Recall                             | LDA            | No          | No
CTR [31]    | Ratings data, user documents        | Ratings prediction             | Recall                                  | LDA            | No          | No
ConvMF [44] | Ratings data, user documents        | Ratings prediction             | RMSE                                    | TF–IDF         | No          | User only
CERMF [45]  | Ratings data, user & item documents | Ratings prediction             | RMSE, MAE                               | TF–IDF         | No          | User & item
DRMF [37]   | Ratings data, user & item documents | Ratings prediction & item rec. | RMSE, MAE, Precision, Recall            | GloVe          | No          | User & item
CapsMF [46] | Ratings data, user & item documents | Ratings prediction & item rec. | RMSE, MAE, Precision, Recall            | GloVe          | No          | User & item
CMF-HRS     | Ratings data, user & item documents | Ratings prediction & item rec. | RMSE, MAE, Precision, Recall, F1 score  | w2v            | User & item | User & item

3. Contextual matrix factorization-hybrid recommendation system (CMF-HRS)

In this section, the proposed framework, Contextual Matrix Factorization for Hybrid Recommendation System (CMF-HRS), is explained. In CMF-HRS, users’ and items’ textual documents containing reviews are taken as input to the framework. These textual documents are processed by w2v and the CNN to extract semantic and contextual details, respectively. In this regard, a user’s reviews of items indicate the user’s profile, and an item’s textual description shows the item’s profile. Both of these textual descriptions are known as auxiliary information, as they support the framework in building the user and item profiles in later stages. Here, w2v is applied to generate dense vector representations of the textual descriptions (Fig. 1).
After creating the user and item embeddings, both are concurrently processed by the CNN (used as an NLP technique). At the convolution layer, the CNN extracts the contextual ingredient of the textual description through convolution with a kernel [44]. Contextual details depict each word’s usage within a sentence, e.g., as a noun, verb, or adjective. Moreover, the word’s effect on other words in that sentence (syntax analysis) is also represented by the context in which that word is used. Therefore, user and item documents are converted into embeddings that encompass both their semantic and contextual ingredients.

Fig. 1. Creation of user and item embeddings using w2v.

Subsequently, the enriched embeddings are input to the collaborative filtering technique, i.e., probabilistic matrix factorization (PMF) (Fig. 2). Here, enriched embeddings are embeddings that encapsulate both the semantic and contextual details of the user and item textual descriptions. The user and item constituents extracted by the CNN are merged into the collaborative filtering technique. Different kinds of MF are used as collaborative filtering techniques in RSs. Matrix factorization divides a given matrix into lower-dimensional sub-matrices; in the RS, it decomposes the rating matrix into three sub-matrices that are categorized as the user, item, and data concept latent matrices. Collaborative filtering is the practical implementation of matrix factorization in which the user-to-item historical interaction is identified. The user-to-item rating matrix is input to the matrix factorization, and based on the identified relationships, it is contemplated how the user would rate items in the future; accordingly, better item recommendations are made to the user. PMF is used as the collaborative filtering technique in the proposed model, and both the user and item latent factors are initialized with the embeddings generated by w2v and the CNN. To prove the efficacy of the proposed model, a series of experiments was carried out on three real-world datasets, Amazon Instant Videos (AIV), Apps for Android (AA), and Yelp, where every user’s profile is described with reviews and every item’s profile with a corresponding description. This section elaborates on the generation of embeddings and gives a mathematical illustration of the proposed model. Subsequently, probabilistic matrix factorization, which is used to combine both item and user embeddings to predict missing ratings, is explained. Additionally, n items are recommended based on the predicted ratings.
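To make the decomposition idea concrete, the following sketch factorizes a small dense rating matrix into user and item latent factors via truncated SVD. This is an illustrative stand-in for PMF (detailed later in Section 3.1.2); the matrix values and dimensions are toy assumptions.

```python
import numpy as np

# Low-rank factorization of a small rating matrix: R ≈ P^T Q,
# illustrating how MF maps users and items into a shared latent space.
rng = np.random.default_rng(0)
M, N, D = 4, 5, 2                 # users, items, latent dimension
R = rng.integers(1, 6, size=(M, N)).astype(float)   # ratings in 1..5

# Truncated SVD gives the best rank-D approximation in the least-squares sense.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
P = (U[:, :D] * s[:D]).T          # D x M user latent factors
Q = Vt[:D, :]                     # D x N item latent factors
R_hat = P.T @ Q                   # reconstructed (dense) rating matrix

err = np.linalg.norm(R - R_hat)   # reconstruction error of the rank-D model
```

In PMF the same product structure is kept, but the factors are fit only on the observed entries of a sparse matrix, which is what makes missing-rating prediction possible.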

Fig. 2. Contextual matrix factorization for hybrid recommendation system.

3.1. Mathematical illustration of the proposed model

Let $P = \{1,2,3,\dots,M\}$ be the set of $M$ users and $Q = \{1,2,3,\dots,N\}$ the set of $N$ items. Suppose the users’ ratings of the items are represented in a 2D rating matrix $R_{mn} \in \{1,2,3,4,5\}^{M \times N}$. A user may or may not rate every item he/she has purchased; thus, some rating values are missing in the user-to-item rating matrix $R$. The role of a recommender system is to predict such missing ratings, and items are then recommended to users on the basis of the predicted ratings. The recommender system predicts these missing values from the known rating values together with the available user and item auxiliary information.
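A minimal sketch of this setting, assuming 0 marks a missing rating; the matrix values are illustrative:

```python
import numpy as np

# Illustrative M x N rating matrix; 0 marks a missing (unrated) entry.
R = np.array([
    [5, 0, 0, 3, 0],
    [0, 4, 0, 0, 1],
    [2, 0, 0, 0, 0],
])
O = (R > 0).astype(int)           # indicator: 1 where the user rated the item
sparsity = 1.0 - O.sum() / O.size # fraction of missing ratings
```

Here `O` plays the role of the binary indicator used later in the PMF objective: the model is fit only where `O` is 1, and the remaining entries are what the recommender must predict.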

3.1.1. Embeddings generation

The proposed RS utilizes w2v, as shown in Fig. 1, and a CNN to generate user and item embeddings. CNNs are dominant in the fields of image processing and computer vision as prominent solutions for classification, labeling, and anomaly detection in image frames, image segmentation, and edge detection. However, their utilization in natural language processing is also on the rise and is being explored by academic researchers. As an NLP technique, a CNN captures the contextual meaning of the documents’ vector representation, which is input to its embedding layer. The CNN has four consecutive layers: embedding, convolution, max-pool, and projection. User documents (UD) and item documents (ID) are processed by the w2v model for dense vector representation. Subsequently, the embedding layer of the CNN is initialized with these word vectors concatenated in the form of a meaningful matrix. The matrix encapsulates the semantics of the words present in the document. The detail of the captured semantics is proportional to the dimension of the word vectors, which is set to $d$; $d$ is kept the same for both w2v and the embedding layer of the CNN so that the two can be linked. User and item auxiliary documents are composed of word sequences, each word denoted by a $d$-dimensional vector generated by w2v. Let ID contain $n_1$ words for item 1, $n_2$ words for item 2, and $n$ words for the $N$th item. Similarly, UD denotes the user auxiliary documents with $m_1$ words for user 1, $m_2$ words for user 2, and $m$ words for the $M$th user. The item documents ID and user documents UD can be represented as follows:
$$\mathrm{ID} = \mathrm{ID}_1 \oplus \mathrm{ID}_2 \oplus \cdots \oplus \mathrm{ID}_N = [w_1, w_2, \dots, w_{n_1}] \oplus [w_1, w_2, \dots, w_{n_2}] \oplus \cdots \oplus [w_1, w_2, \dots, w_{N\mathrm{th}}]$$
Similarly:
$$\mathrm{UD} = \mathrm{UD}_1 \oplus \mathrm{UD}_2 \oplus \cdots \oplus \mathrm{UD}_M = [w_1, w_2, \dots, w_{m_1}] \oplus [w_1, w_2, \dots, w_{m_2}] \oplus \cdots \oplus [w_1, w_2, \dots, w_{M\mathrm{th}}]$$
where $w_{N\mathrm{th}}$ and $w_{M\mathrm{th}}$ represent the last words in the item and user documents, respectively, and $\oplus$ shows the concatenation of each item’s and user’s description. Subsequently, preprocessed corpora $\mathrm{ID}_c$ and $\mathrm{UD}_c$ are generated by text cleaning, removing stop words, deleting unwanted symbols, and applying tokenization and various other subtasks to improve the results of the following processing. Assume that $l_i$ and $l_u$ words in total are present in the processed item and user corpora, respectively. Then $\mathrm{ID}_c$ and $\mathrm{UD}_c$ can be denoted as follows:
$$\mathrm{ID}_c \xleftarrow{\text{cleaned corpus}} \mathrm{ID}_1, \mathrm{ID}_2, \mathrm{ID}_3, \dots, \mathrm{ID}_N \qquad \mathrm{UD}_c \xleftarrow{\text{cleaned corpus}} \mathrm{UD}_1, \mathrm{UD}_2, \mathrm{UD}_3, \dots, \mathrm{UD}_M$$
where
  • IDc: Item cleaned Corpus
  • UDc: User cleaned Corpus
  • N : Total number of Items
  • M : Total number of Users
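The corpus-cleaning step (lowercasing, punctuation/symbol removal, tokenization, stop-word removal) might be sketched as follows; the stop-word list and the sample review are illustrative assumptions:

```python
import re
import string

# Illustrative stop-word subset; a real pipeline would use a fuller list.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to"}

def clean(doc: str) -> list[str]:
    """Lowercase, strip punctuation/symbols, tokenize, and drop stop words."""
    doc = doc.lower()
    doc = doc.translate(str.maketrans("", "", string.punctuation))
    tokens = re.findall(r"[a-z0-9]+", doc)
    return [t for t in tokens if t not in STOP_WORDS]

UD = ["The camera is great, and the battery lasts!"]   # one toy user review
UD_c = [clean(d) for d in UD]
# → [['camera', 'great', 'battery', 'lasts']]
```

The cleaned token lists then form the corpora fed to w2v in the next step.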
To extract semantics and achieve a numerical representation, the w2v model is applied to the cleaned corpora. Accordingly, dense vector representations are obtained by the probabilistic model defined in Eq. (1), which is optimized with the objective function in Eq. (2). Such dense vectors are semantically well learned and densely populated. The cost function is the average negative log-likelihood; minimizing it maximizes predictive accuracy. It is applied to each word position $i=1$ to $i=w_n$ (or $i=w_m$) in the text of the respective item and user documents. Sentence representations are then computed by the model as the length-normalized average of the word vectors, Eq. (3) for items and Eq. (4) for users.
$$L(\theta) = \prod_{i=1}^{w_n} \prod_{-x < j < x,\ j \ne 0} P(w_{i+j} \mid w_i; \theta) \tag{1}$$
$$J(\theta) = -\frac{1}{w_n}\log L(\theta) = -\frac{1}{w_n}\sum_{i=1}^{w_n} \sum_{-x < j < x,\ j \ne 0} \log P(w_{i+j} \mid w_i; \theta) \tag{2}$$
$$W_{vi} = \frac{1}{w_n}\sum_{i=1}^{w_n} \frac{w_i}{\lVert w_i \rVert} \tag{3}$$
$$W_{vu} = \frac{1}{w_m}\sum_{i=1}^{w_m} \frac{w_i}{\lVert w_i \rVert} \tag{4}$$
where $W_{vi}$ and $W_{vu}$ represent item and user document sentences in the form of dense vectors. Here, $w_1, w_2, w_3, \dots, w_n$ are the vector embeddings of the words in each document, and $\lVert w_i \rVert$ is the L2 norm of the vector $w_i$. The dimension of each sentence vector representation is set to $d$ for both users and items. Accordingly, from the concatenated sentences in the preprocessed corpora $\mathrm{ID}_c$ and $\mathrm{UD}_c$, w2v develops the following matrix representations for items and users:
$$\mathrm{IE} = \begin{bmatrix} \mathrm{ID}_1 \\ \mathrm{ID}_2 \\ \vdots \\ \mathrm{ID}_N \end{bmatrix} = \begin{bmatrix} v_{10} & v_{11} & \cdots & v_{1d} \\ v_{20} & v_{21} & \cdots & v_{2d} \\ \vdots & & & \vdots \\ v_{l_i 0} & v_{l_i 1} & \cdots & v_{l_i d} \end{bmatrix}_{l_i \times d} \qquad \mathrm{UE} = \begin{bmatrix} \mathrm{UD}_1 \\ \mathrm{UD}_2 \\ \vdots \\ \mathrm{UD}_M \end{bmatrix} = \begin{bmatrix} v_{10} & v_{11} & \cdots & v_{1d} \\ v_{20} & v_{21} & \cdots & v_{2d} \\ \vdots & & & \vdots \\ v_{l_u 0} & v_{l_u 1} & \cdots & v_{l_u d} \end{bmatrix}_{l_u \times d}$$
Here $v_{l_i d}$ denotes the $d$th component of the $l_i$th vector in the item embedding matrix, and $v_{l_u d}$ the corresponding entry in the user embedding matrix. In the user and item embeddings, word vectors are sequentially connected and thus hold semantics only.
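The length-normalized averaging of Eqs. (3) and (4) can be sketched as follows; the word vectors are illustrative toy values:

```python
import numpy as np

def sentence_vector(word_vectors: np.ndarray) -> np.ndarray:
    """Length-normalized average of word vectors (cf. Eqs. (3)-(4)):
    each word vector w_i is divided by its L2 norm before averaging."""
    norms = np.linalg.norm(word_vectors, axis=1, keepdims=True)
    return (word_vectors / norms).mean(axis=0)

# Illustrative 3-word sentence with d = 4 dimensional embeddings.
W = np.array([[1., 0., 0., 0.],
              [0., 2., 0., 0.],
              [0., 0., 3., 0.]])
v = sentence_vector(W)
# Each row is unit-normalized first, so v = [1/3, 1/3, 1/3, 0].
```

Normalizing before averaging keeps unusually long word vectors from dominating the sentence representation.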
Subsequently, the embedding layer of the CNN is initialized by these generated embedding matrices. The CNN undertakes the lexical and syntax analysis of item and user documents to capture their fine-grained contextual details. Contextual details capture how the order of words matters in a sentence. For example, in the sentence “the movie Gladiator is an English language movie and has English actors”, the word English is used in two different contexts: in the first case it corresponds to the movie, and in the second it belongs to the actors in the movie. The CNN captures such contextual details of both user and item documents. Let $v_i \in \mathrm{IE}$ be the $i$th word vector of dimension $d$, and $v_u \in \mathrm{UE}$ likewise; an item sentence comprising $x$ words and a user sentence comprising $y$ words can then be represented as
$$v_{1:x} = v_1 \oplus v_2 \oplus \cdots \oplus v_x \qquad v_{1:y} = v_1 \oplus v_2 \oplus \cdots \oplus v_y$$
where $\oplus$ denotes concatenation. Generally, $v_{x:x+k}$ denotes the concatenation of the word vectors $v_x, v_{x+1}, v_{x+2}, \dots, v_{x+k}$.
In the convolutional layer, contextual information about item textual documents and user reviews is acquired through the convolution kernel applied over the concatenated word vectors. The size/length of the convolutional kernel defines the number of words considered in one convolution operation, and various window sizes are used to obtain content from the item and user documents. The convolution kernel $K \in \mathbb{R}^{d \times W}$ is passed over $W$ words that are convolved together, with their dimensions remaining fixed. Let $c_i$ and $c_u$, created from the convolution of the kernel with the item and user word vectors $v_{i:i+W-1}$ and $v_{u:u+W-1}$, be represented by
$$c_i = F(K_W * \mathrm{IE}(:, v_{i:i+W-1}) + bias_i) \qquad c_u = F(K_W * \mathrm{UE}(:, v_{u:u+W-1}) + bias_u)$$
where $*$ denotes the convolution operation, $K_W$ is the convolution kernel of size $W$, $bias_i \in \mathbb{R}$ and $bias_u \in \mathbb{R}$ are bias terms, and $F$ is a nonlinear function such as tanh, sigmoid, or ReLU. The convolutional kernel is applied to every possible window of word vectors in a sentence, generating the contextual feature maps
$$\theta_i = [c_1, c_2, c_3, \dots, c_{l_i - W + 1}] \qquad \theta_u = [c_1, c_2, c_3, \dots, c_{l_u - W + 1}]$$
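A minimal sketch of this convolution step, assuming ReLU for F and random toy embeddings; the document length, embedding dimension, and kernel width are illustrative:

```python
import numpy as np

def conv_feature_map(E: np.ndarray, K: np.ndarray, bias: float) -> np.ndarray:
    """Slide a d x W kernel K over an l x d embedding matrix E and apply
    ReLU, producing a feature map of length l - W + 1."""
    l, d = E.shape
    W = K.shape[1]
    theta = np.empty(l - W + 1)
    for i in range(l - W + 1):
        window = E[i:i + W].T                      # d x W slice of word vectors
        theta[i] = np.maximum(0.0, np.sum(K * window) + bias)
    return theta

rng = np.random.default_rng(1)
E = rng.standard_normal((10, 8))                   # 10 words, d = 8
K = rng.standard_normal((8, 3))                    # kernel spanning W = 3 words
theta = conv_feature_map(E, K, bias=0.1)           # feature map of length 8
```

Each entry of `theta` summarizes one W-word window, which is how the kernel picks up local word-order (contextual) information.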
In the max-over pooling layer, the operations $\max(\theta_i)$ and $\max(\theta_u)$ are usually applied to capture the most important contextual feature, i.e., the highest value of $\theta_i$ and $\theta_u$ in the feature map. Moreover, variable sentence length is addressed by padding to construct fixed-length feature vectors for both item and user documents. However, feature locality information can be lost by applying a direct max operation, and the frequency information of a feature that appears often may also be neglected. Therefore, to capture this information, each feature vector is divided into sections and the max operation is applied to each section to obtain the maximum contextual details [5]. The length of each section of the feature map is given by Eq. (5), where $l_u = l_i = l$ is assumed for ease of understanding and the entire feature map is divided into $\zeta$ sections:
$$Sec = \left\lfloor \frac{l - W + 1}{\zeta} \right\rfloor + 1 \tag{5}$$
$$\Theta_i = [\max(\theta_1, \dots, \theta_{Sec}),\ \max(\theta_{Sec+1}, \dots, \theta_{2\,Sec}),\ \dots,\ \max(\theta_{l_i - W + 1 - Sec}, \dots, \theta_{l_i - W + 1})] \tag{6}$$
$$\Theta_u = [\max(\theta_1, \dots, \theta_{Sec}),\ \max(\theta_{Sec+1}, \dots, \theta_{2\,Sec}),\ \dots,\ \max(\theta_{l_u - W + 1 - Sec}, \dots, \theta_{l_u - W + 1})] \tag{7}$$
In the output layer, the features captured in the max-pooling layer are converted into the form required to perform the subsequent task: here, initializing the latent factors of the collaborative filtering technique (PMF). The high-level features $\Theta_i$ and $\Theta_u$ are converted to the $d$-dimensional vector space by a nonlinear projection:
$$V = \tanh(M_2\{\tanh(M_1 \Theta_i + b_1)\} + b_2) \tag{8}$$
$$U = \tanh(M_2\{\tanh(M_1 \Theta_u + b_1)\} + b_2) \tag{9}$$
where $M_1$ and $M_2$ are projection matrices and $b_1, b_2$ are the bias vectors for $M_1, M_2$. Consequently, after going through the above methodology, the CNN and w2v jointly become functions ($f_{ni}$ for items and $f_{nu}$ for users) that take raw item and user documents as input and generate the latent models $V$ and $U$ while manipulating the internal weights $W$. The obtained $V$ and $U$ are linked to the PMF item and user latent models, respectively:
$$V_n = f_{ni}(W, \mathrm{ID}_n) \tag{10}$$
$$U_m = f_{nu}(W, \mathrm{UD}_m) \tag{11}$$
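The sectioned max-pooling of Eqs. (6)-(7) might be sketched as follows, using numpy's `array_split` to form the ζ sections; the feature-map values are illustrative:

```python
import numpy as np

def section_max_pool(theta: np.ndarray, zeta: int) -> np.ndarray:
    """Split the feature map into `zeta` contiguous sections and take the
    max of each, preserving coarse locality (cf. Eqs. (5)-(7))."""
    sections = np.array_split(theta, zeta)
    return np.array([s.max() for s in sections])

theta = np.array([0.2, 0.9, 0.1, 0.4, 0.7, 0.3])   # toy feature map
pooled = section_max_pool(theta, zeta=3)
# → array([0.9, 0.4, 0.7])
```

Unlike a single global max (which would keep only 0.9), the pooled vector records where in the sentence the strong activations occur.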

3.1.2. Probabilistic matrix factorization

Probabilistic matrix factorization (PMF), a collaborative filtering technique, represents a large matrix in a shared low-dimensional latent space as the product of smaller matrices with reduced dimensions. In the same spirit, PMF decomposes a large rating matrix into three smaller matrices that are interpretive and intuitive for a human to relate back to the parent rating matrix. The rating matrix R_mn ∈ [1, 2, 3, 4, 5]^{M×N} is decomposed by PMF into three matrices, P_m = P_{M×D}, Q_n = Q_{D×N}, and δ_D = δ_{D×D}, where
  • P_m = P_{M×D}: user latent factor
  • Q_n = Q_{D×N}: item latent factor
  • δ_D = δ_{D×D}: concept latent factor
The proposed RS technique integrates a modified CNN into PMF to enrich the user latent factor (P) and item latent factor (Q) with both the semantics and contextual details of items and users. The proposed model predicts a rating matrix R̂_mn for user m ∈ P_{M×D} who is expected to rate an item n ∈ Q_{D×N}. The predictions are based on the known ratings in the given rating matrix R_mn, in conjunction with the preprocessed auxiliary information ID_c and UD_c of items and users. Thus,

(12) R̂_mn = P_m^T · δ · Q_n

where δ reveals the presence of concepts in R_mn. To obtain the predicted rating matrix R̂_mn, it is necessary to determine the user matrix P_m and item matrix Q_n such that their product returns the predicted rating matrix. Both latent factors are calculated through an iterative process of minimizing a probabilistic cost function:

(13) ℒ = Σ_{m∈M} Σ_{n∈N} O_mn (R_mn − P_m^T Q_n)² + δ_m Σ_{m∈M} ||P_m||² + δ_n Σ_{n∈N} ||Q_n||²

where O is a binary function that is set to 1 if the user has rated the item and 0 otherwise. In Eq. (13), δ_m is the user regularization term and δ_n is the item regularization term. The performance of the model over this objective is monitored through the root mean square error (RMSE) and the mean absolute error (MAE). Probabilistic linearity over a Gaussian noise distribution is adopted for model convergence, wherein the conditional distribution over the observed ratings can be represented as follows:

(14) p(R_mn | P, Q, δ²) = Π_{m∈M} Π_{n∈N} ζ(R_mn | P_m^T Q_n, δ²)^{O_mn}
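As a toy illustration of the three-factor form of Eq. (12) (random factors and an identity concept matrix, not the trained model):

```python
import numpy as np

# An M x N rating matrix approximated by the product of
# P (M x D, user factors), delta (D x D, concept factors),
# and Q (D x N, item factors).
M, N, D = 4, 5, 2
rng = np.random.default_rng(1)
P = rng.random((M, D))          # user latent factor
delta = np.eye(D)               # concept latent factor (identity here)
Q = rng.random((D, N))          # item latent factor
R_hat = P @ delta @ Q           # predicted rating matrix, shape (M, N)
```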
In Eq. (14), ζ(R_mn | P_m^T Q_n, δ²) is a Gaussian distribution with mean P_m^T Q_n and variance δ². As the item and user latent factors are generated by the CNN from their corresponding textual documents, three CNN variables were considered: (i) the CNN hidden-layer weights W; (ii) the nth item document ID_n and the mth user document representation UD_m; and (iii) the Gaussian-noise variables α_n and α_m. Considering these variables leads to more accurate user and item probabilistic latent models:

V_n = fn_i(W, ID_n) + α_n,  s.t. α_n ∼ G(0, δ_v² O)
U_m = fn_u(W, UD_m) + α_m,  s.t. α_m ∼ G(0, δ_u² O)

(15) p(P | W, UD_m, δ_u²) = Π_{m∈M} ζ(P_m | fn_u(W, UD_m), δ_u² O)

(16) p(Q | W, ID_n, δ_v²) = Π_{n∈N} ζ(Q_n | fn_i(W, ID_n), δ_v² O)

The trainable variables, including the internal layer weights of the CNN, can be combined to approximate the proposed model as follows:

(17) p(P, Q, W | R_mn, ID_n, UD_m, δ², δ_v², δ_u²) ∝ p(R_mn | P, Q, δ²) · p(P | W, UD_m, δ_u²) · p(Q | W, ID_n, δ_v²)

3.2. Optimization

To converge the proposed model and to optimize both latent factors maximally, a posterior estimation is considered prudent:

(18) max_{P,Q} p(P, Q, W | R_mn, ID_n, UD_m, δ², δ_v², δ_u²) = max_{P,Q} [ p(R_mn | P, Q, δ²) · p(P | W, UD_m, δ_u²) · p(Q | W, ID_n, δ_v²) ]
Applying a negative logarithmic function to Eq. (18) gives:

(19) ℒ(P, Q) = Σ_{n=1}^{N} Σ_{m=1}^{M} (O_mn/2)(R_mn − P_m^T Q_n)² + (α_m/2) Σ_{m=1}^{M} ||P_m − fn_u(W, UD_m)||² + (α_n/2) Σ_{n=1}^{N} ||Q_n − fn_i(W, ID_n)||²

where α_m = δ²/δ_u², α_n = δ²/δ_v², and ||·||² is the Frobenius norm. Repetitive optimization was adopted for the user and item latent models (P and Q), fixing the remaining variables in turn. Taking the derivative of Eq. (19) with respect to P_m and Q_n yields the closed-form local minima:

(20) P_m = (Q O_m Q^T + α_m O_d)^{−1} (Q R_m + α_m fn_u(W, UD_m))

(21) Q_n = (P O_n P^T + α_n O_d)^{−1} (P R_n + α_n fn_i(W, ID_n))

where O_m and O_n are diagonal matrices with the indicator entries for m = 1, …, M and n = 1, …, N as elements, and O_d is the D×D identity matrix. Moreover, R_m and R_n are vectors with (R_mn)_{n=1}^{N} for user m and (R_mn)_{m=1}^{M} for item n. The overall abstraction of the mathematical flow of the model, together with its step-by-step implementation, is shown in Fig. 3.
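The closed-form user update of Eq. (20) can be sketched as below, assuming toy data and treating the CNN/w2v embeddings fn_u(W, UD_m) as a fixed input matrix (the item update of Eq. (21) is symmetric):

```python
import numpy as np

def update_user_factors(R, O, Q, cnn_user, alpha_m):
    """Closed-form user update of Eq. (20) with all other variables fixed.
    R: M x N ratings, O: M x N rated-indicator mask,
    Q: D x N item factors, cnn_user: M x D embeddings fn_u(W, UD_m)."""
    M, D = cnn_user.shape
    P = np.zeros((M, D))
    for m in range(M):
        Om = np.diag(O[m])                                # diagonal O_m
        A = Q @ Om @ Q.T + alpha_m * np.eye(D)            # Q O_m Q^T + a_m I
        b = Q @ (O[m] * R[m]) + alpha_m * cnn_user[m]     # Q R_m + a_m fn_u
        P[m] = np.linalg.solve(A, b)
    return P

# With a very large alpha_m the regularizer dominates, so the solved
# factors should stay close to the CNN/w2v embeddings.
rng = np.random.default_rng(2)
R, O = rng.random((3, 4)) * 5, np.ones((3, 4))
Q, emb = rng.random((2, 4)), rng.random((3, 2))
P = update_user_factors(R, O, Q, emb, alpha_m=1e8)
```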

4. Experimentation and results

This section elaborates on the performance evaluation of the proposed model based on real-world datasets. These datasets are primarily used in the field of RSs. It explains dataset preparation, i.e., preprocessing, followed by adjusting experimental parameters and results presented in tabular/graphical forms. Moreover, a comparative analysis of the proposed model with other state-of-the-art RS models is also provided in this section.

4.1. Dataset preprocessing

Three real-world datasets obtained from two different platforms were used to demonstrate the performance of the proposed model: AIV and AA, downloaded from1, and the Yelp dataset, obtained from2. The Amazon data bank comprises 22 subcategories of user reviews and product metadata and is widely used by researchers for RS experimentation, while the Yelp dataset is used for both sentiment analysis and RS research. We selected the two Amazon datasets, AIV and AA, to check the performance of the proposed model. In these datasets, item opinions are given by numerous users with different profiles, and a threshold is set on the reviews of a user to build his or her profile. The datasets were selected on the basis of sparsity and the availability of auxiliary information in the form of item plots and user reviews. Table 2 shows the statistics of the datasets.
The datasets were divided into training, validation, and testing sets at 80%, 10%, and 10%, respectively. User reviews and item textual descriptions were preprocessed by removing stop words, followed by part-of-speech (PoS) tokenization. The TF-IDF threshold was set to 0.5 to exclude less frequent words from the documents, i.e., words that appeared rarely and carried no meaningful clues in the text. The vocabulary size was fixed at 8000, and two thresholds were introduced: tA to limit the associated review size and tB to control the size of the corresponding document. A suitable combination of tA and tB may differ between users and items; however, it was kept constant (tA = 0.8 and tB = 0.5) for simplicity. Items with no ratings were removed from the datasets to obtain precise and accurate results. Subsequently, the preprocessed text was passed to the w2v word-embedding model to produce dense vector representations of user reviews and item plots. In this regard, the min-count was set to zero so that the semantics of every word in a sentence are captured and reflected in the vectors. The dimension of the vectors was set to 200 so that they could be integrated with the embedding layer of the CNN, whose dimension was likewise set to 200 and which was initialized with the w2v vector notation of the text. Moreover, different window sizes [3, 4, 5] were used by the CNN model, with a dropout ratio to avoid overfitting; the various windows capture the context of 3, 4, and 5 consecutive words, respectively. After the user and item contextual information was obtained through the CNN, the PMF latent factors were initialized with it.
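The 80/10/10 split described above can be sketched as follows (a stdlib-only illustration with hypothetical rating triples, not the authors' pipeline):

```python
import random

def split_ratings(ratings, seed=42):
    """Shuffle rating triples (user, item, rating) and split them
    into 80% train, 10% validation, and 10% test."""
    data = list(ratings)
    random.Random(seed).shuffle(data)   # deterministic shuffle
    n = len(data)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

# 100 toy triples -> 80 / 10 / 10
triples = [(u, i, 5) for u in range(10) for i in range(10)]
train, val, test = split_ratings(triples)
```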

Table 2. Datasets statistical data.

| Datasets | Users | Items | Ratings | Range | Density |
| --- | --- | --- | --- | --- | --- |
| AIV | 5130 | 1685 | 37 126 | [1 – 5] | 0.429% |
| AA | 87 271 | 13 209 | 752 937 | [1 – 5] | 0.065% |
| Yelp | 12 146 | 27 774 | 408 410 | [1 – 5] | 0.121% |

4.2. Evaluation metrics

To evaluate the performance of any devised model, different types of evaluation techniques are used by researchers [47]. Evaluation metrics can be divided into two categories: 1. prediction metrics and 2. classification metrics. The proposed model has been evaluated on the basis of RMSE (root mean square error) and MAE (mean absolute error), which are used as prediction metrics. Both are widely used to measure the error between actual and predicted values, and their joint convergence indicates improving model performance. Their implementation is based on the cost function for rating prediction:

(22) RMSE = sqrt( (1/|MN|) Σ_{m,n}^{M,N} (R_mn − R̂_mn)² )

(23) MAE = (1/|MN|) Σ_{m,n}^{M,N} |R_mn − R̂_mn|

where R_mn is the actual rating, R̂_mn is the predicted rating, |MN| is the total number of ratings, m indexes the current user, and n indexes the current item. Decreasing values of both RMSE and MAE with each epoch indicate improving model performance, and a lower minimum RMSE for a model depicts its better performance.
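A minimal sketch of Eqs. (22)–(23), computed only over observed ratings via a binary mask (toy matrices, not the experimental data):

```python
import numpy as np

def rmse_mae(R, R_hat, O):
    """RMSE and MAE over observed ratings only; O is the binary
    observed-entry mask corresponding to O_mn in the text."""
    err = (R - R_hat)[O == 1]
    return np.sqrt(np.mean(err ** 2)), np.mean(np.abs(err))

R     = np.array([[5.0, 3.0], [0.0, 4.0]])
R_hat = np.array([[4.0, 3.0], [2.0, 2.0]])
O     = np.array([[1, 1], [0, 1]])            # entry (2,1) is unobserved
rmse, mae = rmse_mae(R, R_hat, O)
# observed errors are 1, 0, 2 -> RMSE = sqrt(5/3), MAE = 1
```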
In addition to rating prediction, the proposed model also recommends the top-n items. Therefore, in addition to RMSE and MAE, precision and recall were adopted to evaluate the proposed model with respect to item recommendation. To define precision and recall, consider a list of the top-n items recommended to a user: precision denotes the fraction of the top-n recommended items that are relevant to the user, while recall denotes the fraction of all relevant items that appear in the top-n list. Moreover, to evaluate the overall performance of the proposed model with a single value, the F1-score is also calculated as the harmonic mean of precision and recall:

(24) Precision = |[Rel Items] ∩ [top N items]| / #(top N items)

(25) Recall = |[Rel Items] ∩ [top N items]| / #(Rel Items)

(26) F1-score = 2 / (1/Precision + 1/Recall)

where
[Rel Items] denotes the items relevant to the user and
[top N items] represents the top N items in the list of recommended items.
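The three metrics above can be sketched as follows (hypothetical item IDs; the relevant set stands for the items the user actually liked):

```python
def precision_recall_f1(recommended, relevant, n):
    """Precision@n, recall@n, and F1 in the sense of Eqs. (24)-(26)."""
    top_n = recommended[:n]
    hits = len(set(top_n) & set(relevant))
    precision = hits / n
    recall = hits / len(relevant)
    f1 = 0.0 if hits == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 2 of the 5 recommendations are relevant; 2 of the 3 relevant items retrieved
p, r, f1 = precision_recall_f1([3, 7, 1, 9, 4], relevant=[7, 9, 8], n=5)
# p = 0.4, r = 2/3, f1 = 0.5
```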
The overall performance of the proposed model has been compared with the following state-of-the-art models:
  • PMF [14] is a basic collaborative filtering technique in the field of RS that initializes both user and item latent factors randomly. It uses only the explicit user-to-item rating matrix as input to the collaborative filtering technique.
  • ConvMF [44] exploited movie plots and formulated item contextual embeddings with a CNN for integration into PMF. It takes into account both the explicit rating matrix and the implicit movie plots (item content). This approach improved RS performance; however, the user profile was not made part of the model.
  • CERMF [45] deployed two convolutional neural networks with randomly initialized embedding layers for both users and items. A probabilistic collaborative filtering technique was used by the model for rating prediction.
  • DRMF [37] used a dual-regularized deep neural network to generate both user and item latent factors simultaneously. The DNN was employed for a deep understanding of the auxiliary details and extracted meaningful outcomes.
  • CapsMF (capsule matrix factorization) [46] recently proposed the use of a capsule network for document representation, in which bidirectional RNNs provide a robust representation of user and item textual data. CapsMF extracts sequential patterns of the words present in sentences.
  • CMF-HRS is the proposed model, which learns the semantics and contextual details of users and items through the integrated functioning of w2v and a convolutional neural network. The learned embeddings are then fed to the latent factors of the collaborative filtering technique for rating prediction. Moreover, the binary form of the ratings matrix is used by the model for top-n item recommendation.

4.3. Results and analysis

Our experimental environment is based on a platform with an AMD Ryzen 7 3700X 8-core processor, 16 GB RAM, and a GeForce RTX 2080/PCIe/SSE2 NVIDIA GPU. All implementations were performed on 64-bit Ubuntu with Python 3.6 and the Keras library. The proposed model was subjected to a series of experiments on the real-world datasets (Amazon and Yelp) to study the impact of different parameters.

4.4. Effect of CNN embedding layer dimension d

‘d’ denotes the dimension of the word vectors generated by the w2v model. It also represents the embedding size of the convolution layer, as the two are identical for integration. The effect of ‘d’ on rating prediction was analyzed by varying it from 100 to 300, which showed the influence of the semantics and contextual contents of users and items on the predicted ratings. Table 3 and Table 4 depict the effect of different values of ‘d’ on rating prediction. It can be observed that changes in the dimension of the CNN embedding layer have a visible effect: the model performed optimally when ‘d’ was set to 200, and deviating to either side of ‘d’ = 200 increases the RMSE and MAE values. This confirms that the best semantics are captured by the model at an embedding dimension of 200. Therefore, it can be concluded that the CNN embedding layer captured the contextual details of the semantics presented to it by the w2v model in the form of dense vector representations. It is emphasized that the dimensions of the w2v model and the CNN embedding layer must be kept the same so that the vectors generated by w2v can be integrated into the CNN embedding layer.

Table 3. RMSE of CMF-HRS for ‘d’ = 100, ‘d’ = 200, and ‘d’ = 300.

| Model | Dataset | @d = 100 | @d = 200 | @d = 300 |
| --- | --- | --- | --- | --- |
| CMF-HRS | AIV | 0.712 | 0.651 | 0.719 |
| CMF-HRS | AA | 1.113 | 0.943 | 1.092 |
| CMF-HRS | Yelp | 0.976 | 0.957 | 0.967 |

Table 4. MAE of CMF-HRS for ‘d’ = 100, ‘d’ = 200, and ‘d’ = 300.

| Model | Dataset | @d = 100 | @d = 200 | @d = 300 |
| --- | --- | --- | --- | --- |
| CMF-HRS | AIV | 0.709 | 0.685 | 0.694 |
| CMF-HRS | AA | 0.916 | 0.863 | 0.893 |
| CMF-HRS | Yelp | 0.793 | 0.725 | 0.768 |

4.5. Effect of user and item latent factors dimension D

Capital D represents the dimension of the user and item latent factors P_{M×D} and Q_{D×N}. The effect of D on the predicted ratings was analyzed by varying the size of the user and item latent factors from 25 to 75 to check the sensitivity of the proposed model. Table 5 and Table 6 show the RMSE and MAE of the proposed model for D = 25, D = 50, and D = 75 on both Amazon datasets and Yelp. It can be concluded that the proposed model performs best at D = 50, where RMSE and MAE converge to their minimum values. At D = 75, an increase in both evaluation metrics was observed, indicating degraded performance. This implies that the distribution of item and user concepts in the corresponding latent factors also affects RS performance; precisely captured user/item latent factors therefore improve the performance of the proposed model.

Table 5. RMSE of CMF-HRS for D = 25, D = 50, and D = 75.

| Model | Dataset | @D = 25 | @D = 50 | @D = 75 |
| --- | --- | --- | --- | --- |
| CMF-HRS | AIV | 0.696 | 0.685 | 0.692 |
| CMF-HRS | AA | 1.102 | 0.9368 | 1.115 |
| CMF-HRS | Yelp | 1.058 | 0.9870 | 1.019 |

Table 6. MAE of CMF-HRS for D = 25, D = 50, and D = 75.

| Model | Dataset | @D = 25 | @D = 50 | @D = 75 |
| --- | --- | --- | --- | --- |
| CMF-HRS | AIV | 0.7051 | 0.6851 | 0.6913 |
| CMF-HRS | AA | 0.9045 | 0.8623 | 0.9168 |
| CMF-HRS | Yelp | 0.7685 | 0.7251 | 0.7831 |

4.6. Item’s recommendation performance

To evaluate the model for top-n item recommendation to a user, the rating matrix R_mn is converted into binary form. A user rating for an item ranges from 1 to 5, where 5 is the highest rating (a highly liked item) and 1 is the lowest (a disliked item). We considered ratings from 3 to 5 as ‘1’, denoting a relevant item, and ratings of 1 to 2 as ‘0’, denoting an irrelevant item. Thus, a cut-off threshold Th is set, and the matrix is converted to binary form based on the defined Th [48]: an item rated at or below Th is considered irrelevant, and one rated above Th is considered relevant. The optimum performance of the model was noted with regularization terms (δm = 100, δn = 10), D = 50 (dimension of the user and item latent factors), and d = 200 (embedding dimension). Moreover, precision, recall, and F1-score were computed to monitor the efficacy of the proposed model, whose improved performance is represented by a gradual rise in these evaluation metrics. Because the datasets are very sparse, very few items are relevant to each user, so differences in top-n recommendation performance are hardly detectable; in this case, it is more meaningful to examine the recall rate than the precision. Moreover, model performance cannot be monitored precisely for low values of n; therefore, n was varied from 50 to 300 for the proposed model. Table 7, Table 8, and Table 9 depict the gradual increase in precision, recall rate, and F1-score. It can therefore be concluded that as the number of top-n items increases, the performance of the model improves: if the user rates an increasing number of items, the subsequent recommendations will be more relevant and precise for his or her profile.
Moreover, varying the cut-off rating threshold (Th) affects the model performance. During the experiments, the threshold was set to Th = 2 and Th = 3 to observe this effect. In addition to Th, data sparsity also affects the recall rate of the model. Fig. 4 shows the model performance with n varying from 50 to 300 when the rating threshold was set to 2 and 3.
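The binarization step with the cut-off threshold Th can be sketched as follows (toy matrix; Th = 2 reproduces the 3–5 → ‘1’, 1–2 → ‘0’ mapping described above):

```python
import numpy as np

def binarize(R, th):
    """Mark items rated above th as relevant (1), otherwise irrelevant (0);
    unrated entries (0) also remain 0."""
    return (R > th).astype(int)

R = np.array([[5, 2, 0],
              [3, 1, 4]])
B = binarize(R, th=2)
# B -> [[1, 0, 0],
#       [1, 0, 1]]
```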

Fig. 4. Performance of CMF-HRS (a): Dataset AIV, (b): Dataset AA, (c): Dataset Yelp.

Table 7. Precision values of CMF-HRS for top n-recommendation.

| Precision | AIV Th 3 | AIV Th 2 | AA Th 3 | AA Th 2 | Yelp Th 3 | Yelp Th 2 |
| --- | --- | --- | --- | --- | --- | --- |
| @50 | 0.00308 | 0.00327 | 0.00163 | 0.00236 | 0.01023 | 0.01036 |
| @100 | 0.00337 | 0.00356 | 0.00201 | 0.00319 | 0.01032 | 0.01047 |
| @150 | 0.00377 | 0.00382 | 0.00293 | 0.00383 | 0.01042 | 0.01054 |
| @200 | 0.00413 | 0.00429 | 0.00364 | 0.00476 | 0.01056 | 0.01066 |
| @250 | 0.00441 | 0.00462 | 0.00502 | 0.00589 | 0.01065 | 0.01071 |
| @300 | 0.00493 | 0.00510 | 0.00593 | 0.00603 | 0.01069 | 0.01079 |

Table 8. Recall values of CMF-HRS for top n-recommendation.

| Recall | AIV Th 3 | AIV Th 2 | AA Th 3 | AA Th 2 | Yelp Th 3 | Yelp Th 2 |
| --- | --- | --- | --- | --- | --- | --- |
| @50 | 0.399 | 0.427 | 0.379 | 0.415 | 0.228 | 0.329 |
| @100 | 0.421 | 0.463 | 0.395 | 0.427 | 0.309 | 0.331 |
| @150 | 0.493 | 0.539 | 0.430 | 0.486 | 0.328 | 0.353 |
| @200 | 0.521 | 0.616 | 0.461 | 0.513 | 0.365 | 0.384 |
| @250 | 0.639 | 0.708 | 0.492 | 0.533 | 0.393 | 0.406 |
| @300 | 0.727 | 0.802 | 0.539 | 0.565 | 0.410 | 0.423 |

Table 9. F1-score of CMF-HRS for top n-recommendation.

| F1-score | AIV Th 3 | AIV Th 2 | AA Th 3 | AA Th 2 | Yelp Th 3 | Yelp Th 2 |
| --- | --- | --- | --- | --- | --- | --- |
| @50 | 0.0061 | 0.0064 | 0.0032 | 0.0046 | 0.0195 | 0.0200 |
| @100 | 0.0066 | 0.0070 | 0.0039 | 0.0063 | 0.0199 | 0.0202 |
| @150 | 0.0074 | 0.0075 | 0.0058 | 0.0076 | 0.0201 | 0.0205 |
| @200 | 0.0082 | 0.0085 | 0.0072 | 0.0094 | 0.0203 | 0.0207 |
| @250 | 0.0087 | 0.0091 | 0.0099 | 0.0115 | 0.0206 | 0.0209 |
| @300 | 0.0096 | 0.0097 | 0.0113 | 0.0117 | 0.0208 | 0.0210 |

4.7. Comparative analysis

The proposed model performs two different tasks, rating prediction and top-n item recommendation, on the basis of user and item profiles. These profiles are formulated by the model from the auxiliary information available in the form of textual documents. The model's parameters were tuned with a heuristic approach, and its performance was checked on the tuned parameters. Accordingly, the regularization terms were adjusted to (δm = 100, δn = 10) for optimum performance. The embedding dimension was set to 200 so that the maximum semantics and contextual details of the auxiliary information are captured by the w2v and CNN responsible for generating the embeddings. Moreover, the size of the user and item latent factors was fixed at 50 for an optimum distribution of the concepts present in the user-to-item rating matrix. The proposed model was evaluated using the RMSE and MAE metrics for the rating prediction task, and its top-n item recommendation performance was evaluated in terms of precision and recall. The evaluation was performed on the tuned parameters, and the results are shown in Table 10 and Table 11.

4.7.1. Model performance on rating prediction

It can be ascertained from Table 10 that the RMSE of PMF [14] is on the higher side because no auxiliary details are incorporated into the user and item latent factors; both are initialized randomly, and only the explicit rating data (rating matrix) are used. ConvMF [44] performs better than PMF in terms of RMSE and MAE, showing that combining an item's contextual details, learned by the CNN from the item's textual description, with the rating data improves RS performance. In addition to items, CERMF [45] also learned user textual descriptions in parallel with item descriptions by employing a dual convolutional neural network, which led it to perform comparatively better than ConvMF [44]. It can be argued that a deep understanding of both user and item auxiliary information improves RS performance. The same is confirmed by DRMF [37], which performed better than ConvMF [44] and its predecessors: in DRMF, user and item embeddings are generated by the combined functioning of convolutional and gated recurrent neural networks, and the learned document representation is then fed to collaborative filtering to predict ratings. Thus, considering the side information of both entities (users and items) improves recommendation accuracy and rating prediction. ConvMF [44], CERMF [45], and DRMF [37] utilize explicit rating data together with implicit details extracted from the auxiliary information of users and items. CapsMF [46] aptly improved RS performance by using a bidirectional GRU along with a capsule network, integrating the embedding layer of the CNN with the bidirectional GRU to learn the semantics of the user and item textual documents.
Moreover, when either the item or the user content alone is insufficient, incorporating the content information of both can be beneficial and significantly improve performance. This effect can be anticipated by comparing CERMF [45] with ConvMF [44], and DRMF [37] with DRMF-user [37] and DRMF-item [37]. The proposed model combines both the semantics and the contextual details of users in a single numerical matrix, and a similar matrix is concurrently obtained for items. Moreover, feature-locality information is captured by dividing each feature vector into sections and applying the max-pool operation to each section in the max-over pooling layer [5]. The user and item embeddings, carrying semantics and contextual details in the form of numerical matrices, are then used to regularize the user and item latent factors in collaborative filtering. It can be observed from Table 10 that the proposed model performs better than the state-of-the-art models, to the best of our knowledge. Figs. 5, 6, and 7 indicate the convergence of the proposed model in terms of decreasing RMSE and MAE. Thus, incorporating the semantic and contextual ingredients of both the user and the item improves RS performance.

Table 10. Ratings prediction comparison.

| Models | AIV RMSE | AIV MAE | AA RMSE | AA MAE | Yelp RMSE | Yelp MAE |
| --- | --- | --- | --- | --- | --- | --- |
| PMF | 1.2080 | 0.9493 | 1.4087 | 1.1805 | 1.2194 | 0.9697 |
| ConvMF | 1.0057 | 0.7508 | 1.2461 | 0.9718 | 1.0038 | 0.7863 |
| DRMF-Items | 0.9868 | 0.7259 | 1.2187 | 0.9275 | 0.9988 | 0.7755 |
| DRMF-User | 0.9722 | 0.7184 | 1.1863 | 0.9088 | 1.0022 | 0.7864 |
| CERMF | 0.9621 | 0.7353 | 1.2091 | 0.9101 | 0.9843 | 0.7697 |
| DRMF | 0.9426 | 0.6982 | 1.1789 | 0.9000 | 0.9865 | 0.7615 |
| CapsMF | 0.9593 | 0.7064 | 1.1570 | 0.8878 | – | – |
| CMF-HRS | 0.7010 | 0.6850 | 0.8631 | 0.8632 | 0.9532 | 0.7257 |

Fig. 5. Performance of CMF-HRS on AIV Dataset.


Fig. 6. Performance of CMF-HRS on AA Dataset.


Fig. 7. Performance of CMF-HRS on Yelp Dataset.

4.7.2. Model performance on top-n recommendations

The proposed model was evaluated for the task of top-n item recommendation with the help of precision and recall. For this task, the rating matrix was converted to binary form, and the model was compared with PMF [14], ConvMF [44], CERMF [45], DRMF [37], and CapsMF [46]. The rating matrix was converted to 0's and 1's, where R_mn = 1 if the user has rated the item and R_mn = 0 otherwise; for comparison with the other models, this conversion differs from the one described in Section 4.6, where it is based on the cut-off rating threshold Th. After the conversion, the regularization terms δm and δn, D (dimension of the user and item latent factors), and d (embedding dimension) were chosen. As the datasets are sparse, fewer relevant items are available per user, making differences in precision between the proposed model and the other models difficult to observe; under these conditions, it is more appropriate to compare the recall of the proposed model with that of the other state-of-the-art models. This is indicated by the small differences in the precision and recall values in Table 11, where n is set to 300 for simplicity.
It should be noted that the proposed model provides improved rating prediction and top-n item recommendation accuracy. The improved RMSE and MAE of the proposed model depict better rating prediction, whereas the gradual rise in precision and recall illustrates its optimum performance in top-n item recommendation. This performance results from incorporating the semantics and contextual details of the textual documents into the collaborative filtering technique (PMF); utilizing the contents of both user and item further improved RS efficiency.

Table 11. Top-n item recommendation performance comparison.

| Models | AIV Precision | AIV Recall | AA Precision | AA Recall | Yelp Precision | Yelp Recall |
| --- | --- | --- | --- | --- | --- | --- |
| PMF | 0.00506 | 0.772 | 0.00583 | 0.549 | 0.01067 | 0.421 |
| ConvMF | 0.00505 | 0.771 | 0.00583 | 0.546 | 0.01063 | 0.423 |
| DRMF-Items | 0.00510 | 0.780 | 0.00591 | 0.551 | 0.01084 | 0.429 |
| DRMF-User | 0.00522 | 0.810 | 0.00611 | 0.589 | 0.01064 | 0.432 |
| CERMF | 0.00518 | 0.801 | 0.00588 | 0.550 | 0.01025 | 0.410 |
| DRMF | 0.00519 | 0.802 | 0.00604 | 0.583 | 0.01084 | 0.430 |
| CapsMF | 0.00515 | 0.810 | 0.00451 | 0.520 | – | – |
| CMF-HRS | 0.00527 | 0.819 | 0.00615 | 0.586 | 0.01087 | 0.433 |

5. Conclusion and future work

Recommendation systems are used by organizations to recommend requisite items to potential users in accordance with their preferences; however, RSs suffer from the data sparsity problem. An RS performs either of two tasks: rating prediction and item recommendation. In this paper, a hybrid collaborative filtering model is proposed in which a convolutional neural network, along with w2v, is integrated with matrix factorization. CMF-HRS incorporates word2vec (semantics) and a CNN (context) into PMF to capture the content information of both item and user documents. Both user and item contents are exploited by w2v and the CNN to build optimum dense vector representations for rating prediction and item recommendation. The model learns the latent factors of both entities from client reviews along with item textual details: the CNN and w2v are responsible for extracting user/item content features, including feature-locality information, while the PMF is mandated with predicting missing ratings for the user and recommending the top-n items. This technique can effectively handle both the semantics and the context of user and item documents to prepare the corresponding user and item latent factors, and the suggested model can be applied to any dataset enriched with both implicit and explicit feedback from users and items. A series of experiments was performed to evaluate the proposed model in terms of rating prediction and item recommendation on three benchmark datasets: Amazon (Amazon Instant Videos and Apps for Android) and Yelp. The experimental findings showed better performance of the model compared with the other models considered. In future work, the temporal effect on latent factor generation can be explored. Moreover, the noise issue in recommender systems can also be researched, as it surely affects their performance. Additionally, embedding generation through a bidirectional RNN can be explored for subsequent integration into collaborative filtering techniques.

CRediT authorship contribution statement

Zafran Khan: Conceptualization, Methodology, Experimentation, Result compilation, Writing - original draft. Muhammad Ishfaq Hussain: Conceptualization, Methodology, Experimentation, Result compilation, Writing - original draft. Naima Iltaf: Investigation, Writing - review & editing. Joonmo Kim: Investigation, Writing - review & editing. Moongu Jeon: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP), South Korea grant funded by the Korea government (MSIT) (No.2014-3-00077, AI National Strategy Project & No.2019-0-01842, Artificial Intelligence Graduate School Program (GIST)) and Ministry of Culture, Sports and Tourism and Korea Creative Content Agency(Project Number: R2020070004).

References
