Notes on Collaborative Deep Learning for Recommender Systems

Collaborative Deep Learning for Recommender Systems

Authors: Hao Wang, Naiyan Wang, Dit-Yan Yeung

ABSTRACT

Collaborative filtering (CF) is a successful approach commonly used by many recommender systems.
Conventional CF-based methods use the ratings given to items by users as the sole source of information for learning to make recommendation.
To address the ++ratings sparsity problem++, auxiliary information may be utilized.
Collaborative topic regression (CTR) is an appealing recent method taking this approach which tightly couples the two components that learn from two different sources of information.
However, the latent representation learned by CTR may not be very effective when ++the auxiliary information is very sparse++. To address this problem, we generalize recent advances in deep learning from i.i.d. input to non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian model called collaborative deep learning (==CDL==), which jointly performs ++deep representation learning for the content information++ and ++collaborative filtering for the ratings (feedback) matrix++.
Result: extensive experiments on three real-world datasets show that CDL can significantly advance the state of the art.

Categories and Subject Descriptors:

[Information Systems]: Models and Principles - General;

[Computer Applications]: Social and Behavioral Sciences

Keywords:

Recommender systems; Deep learning; Topic model; Text mining

1. INTRODUCTION

Due to the abundance of choice in many online services, recommender systems (RS) now play an increasingly significant role.
Existing methods for RS can roughly be categorized into three classes:

==content-based methods==: make use of user profiles or product descriptions for recommendation.
==collaborative filtering (CF) based methods==: use the past activities or preferences, such as user ratings on items, without using user or product content information.
==hybrid methods==: seek to get the best of both worlds by combining content-based and CF-based methods.

The prediction accuracy of CF-based methods often drops significantly when ++the ratings are very sparse++. Moreover, ++they cannot be used for recommending new products++ which have yet to receive rating information from users. Consequently, it is inevitable for CF-based methods to exploit auxiliary information, and hence hybrid methods have gained popularity in recent years.

According to whether two-way interaction exists between ++the rating information++ and ++auxiliary information++, hybrid methods can be further divided into two sub-categories:

==Loosely coupled methods==: process the auxiliary information once and then use it to provide features for the CF models. (information flow is one-way)
==Tightly coupled methods==: the rating information can guide the learning of features, and the extracted features can further improve the predictive power of the CF models. (two-way interaction; this is the key point)

With two-way interaction, tightly coupled methods can automatically learn features from the auxiliary information and naturally balance the influence of the rating and auxiliary information.

The current state-of-the-art method, and the basis of the collaborative deep learning (CDL) method proposed in this paper: Collaborative topic regression (CTR) is a probabilistic graphical model that seamlessly integrates a topic model, latent Dirichlet allocation (LDA), and a model-based CF method, probabilistic matrix factorization (PMF).

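Goal: deep learning models learn effective representations but are inferior to shallow models such as CF at capturing the similarity and implicit relationships between items. This calls for integrating deep learning with CF by performing deep learning collaboratively.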

Deep learning models for CF (survey):

[28] uses restricted Boltzmann machines instead of the conventional matrix factorization formulation to perform CF. (This is a CF-based method because it does not incorporate content information.)
[9] extends this work by incorporating user-user and item-item correlations. (This is also a CF-based method because it does not incorporate content information.)
[24] uses low-rank matrix factorization in the last weight layer of a deep network to significantly reduce the number of model parameters and speed up training.
On music recommendation, [21, 39] directly use conventional CNN or deep belief networks (DBN) to assist representation learning for content information.

To address the challenges above, we develop a hierarchical Bayesian model called ==collaborative deep learning (CDL)== as a novel tightly coupled method for RS.

We first present a Bayesian formulation of a deep learning model called stacked denoising autoencoder (SDAE).
With this, we then present our CDL model which tightly couples deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix, allowing two-way interaction between the two.

Experiments show that CDL significantly outperforms the state of the art.

(Note: Although we present CDL as using SDAE for its feature learning component, CDL is actually a more general framework which can also admit other deep learning models such as deep Boltzmann machines, recurrent neural networks, and convolutional neural networks.)

==The main contribution:==

By performing deep learning collaboratively, ++CDL can simultaneously extract an effective deep feature representation from content and capture the similarity and implicit relationship between items (and users).++ The learned representation may also be used for tasks other than recommendation.
Unlike previous deep learning models which use simple targets such as classification and reconstruction, ++we propose to use CF as a more complex target in a probabilistic framework++.
Besides the algorithm for attaining maximum a posteriori (MAP) estimates, ++we also derive a sampling-based algorithm for the Bayesian treatment of CDL++, which, interestingly, turns out to be a Bayesian generalized version of back-propagation.
To the best of our knowledge, CDL is ++the first hierarchical Bayesian model to bridge the gap between state-of-the-art deep learning models and RS++. Besides, due to its Bayesian nature, CDL can be easily extended to incorporate other auxiliary information to further boost the performance.
Extensive experiments on three real-world datasets from different domains show that ++CDL can significantly advance the state of the art++.

2. NOTATION AND PROBLEM FORMULATION

Definition:

The entire collection of J items is represented by a J-by-S matrix $X_c$, where row j is the bag-of-words vector $X_{c,j*}$ for item j based on a vocabulary of size S.
With I users, we define an I-by-J binary rating matrix $R=[R_{ij}]_{I \times J}$.

Given part of the ratings in R and the content information $X_c$ , the problem is to predict the other ratings in R.
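To make the two inputs concrete, here is a minimal sketch of how $R$ and $X_c$ could be built from raw data. The helper names `build_rating_matrix` and `build_bag_of_words` and the toy data are hypothetical, and raw counts are assumed for the bag-of-words rows; the paper may preprocess them differently.

```python
def build_rating_matrix(num_users, num_items, interactions):
    """Return an I-by-J binary matrix with R[i][j] = 1 for each observed (i, j) pair."""
    R = [[0] * num_items for _ in range(num_users)]
    for i, j in interactions:
        R[i][j] = 1
    return R


def build_bag_of_words(docs, vocabulary):
    """Return a J-by-S count matrix; row j counts the vocabulary words in item j."""
    index = {word: s for s, word in enumerate(vocabulary)}
    X_c = [[0] * len(vocabulary) for _ in docs]
    for j, tokens in enumerate(docs):
        for token in tokens:
            if token in index:  # words outside the vocabulary are ignored
                X_c[j][index[token]] += 1
    return X_c


# Toy data (assumed): J = 2 items, S = 3 vocabulary words, I = 2 users.
vocabulary = ["deep", "learning", "topic"]
docs = [["deep", "learning", "deep"], ["topic", "model"]]
X_c = build_bag_of_words(docs, vocabulary)
R = build_rating_matrix(2, 2, [(0, 0), (1, 0), (1, 1)])
```

The task is then to fill in the unobserved entries of `R` using both `R` and `X_c`.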

(Note: an $L/2$-layer SDAE corresponds to an $L$-layer network.)

3. COLLABORATIVE DEEP LEARNING

3.1 Stacked Denoising Autoencoders

SDAE is a ++feedforward neural network++ for learning representations (encoding) of the input data by learning to predict the clean input itself in the output, as shown in Figure 2 of the paper.

[Figures: SDAE]
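As a rough illustration of the denoising idea, the sketch below corrupts an input, passes it through sigmoid layers of the form $\sigma(X_{l-1} W_l + b_l)$, and measures the squared error against the clean input. This is only a forward pass under assumed settings (masking noise, random weights, tiny layer sizes); training, and the weight-decay term of the SDAE objective, are left out. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def corrupt(x, drop_prob, rng):
    """Masking noise: zero each entry independently with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


def layer(x, W, b):
    """One sigmoid layer sigma(x W + b); W has len(x) rows and len(b) columns."""
    return [sigmoid(sum(x[k] * W[k][m] for k in range(len(x))) + b[m])
            for m in range(len(b))]


def sdae_forward(x_clean, weights, biases, drop_prob, rng):
    """Corrupt the input, run all layers, return (output, squared error vs. the clean input)."""
    h = corrupt(x_clean, drop_prob, rng)
    for W, b in zip(weights, biases):
        h = layer(h, W, b)
    loss = sum((c - o) ** 2 for c, o in zip(x_clean, h))
    return h, loss


rng = random.Random(0)
sizes = [4, 2, 4]  # an L = 2 layer network: one encoder layer, one decoder layer
weights = [[[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
           for n_in, n_out in zip(sizes, sizes[1:])]
biases = [[0.0] * n_out for n_out in sizes[1:]]
output, loss = sdae_forward([1.0, 0.0, 1.0, 0.0], weights, biases, 0.3, rng)
```

The important detail is that the loss compares the output with the clean input, not with the corrupted one that was fed in.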

3.2 Generalized Bayesian SDAE

[Figure: generative process of the generalized Bayesian SDAE, including Equation (1)]

(Note: If $\lambda_s$ goes to infinity, the Gaussian distribution in Equation (1) will become a ++Dirac delta distribution++. The model will degenerate to be a ++Bayesian formulation of SDAE++. )

(Note: the first $L/2$ layers of the network act as an encoder and the last $L/2$ layers act as a decoder.)
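The first note can be checked numerically. The sketch below assumes, as I read Equation (1), that each layer output is Gaussian around the deterministic activation with precision $\lambda_s$ (variance $1/\lambda_s$). The function name `sample_layer_output` and the activation values are made up for illustration.

```python
import random


def sample_layer_output(mean, lambda_s, rng):
    """Draw one sample from N(mean, (1 / lambda_s) * I)."""
    std = (1.0 / lambda_s) ** 0.5
    return [rng.gauss(m, std) for m in mean]


rng = random.Random(0)
activation = [0.2, 0.7, 0.5]  # stands in for sigma(X_{l-1} W_l + b_l)
max_deviation = {}
for lambda_s in (1.0, 1e4, 1e12):
    sample = sample_layer_output(activation, lambda_s, rng)
    # Largest distance between the sampled output and the deterministic activation.
    max_deviation[lambda_s] = max(abs(s - a) for s, a in zip(sample, activation))
```

As $\lambda_s$ grows the samples collapse onto the activation, which is the sense in which the Gaussian turns into a Dirac delta and the layers become deterministic.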

3.3 Collaborative Deep Learning

[Figure: generative process of CDL]

(Note: ++the middle layer++ $X_{L/2}$ serves as a bridge between the ratings and content information. This middle layer, along with the latent offset $\epsilon_j$, is the key that ++enables CDL to simultaneously learn an effective feature representation and capture the similarity and (implicit) relationship between items (and users)++. )
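How the middle layer enters the rating model can be written out as follows. This is my reconstruction of the relevant part of the paper's generative process, so treat the exact notation as an assumption: $K$ is the dimensionality of the latent vectors, and $C_{ij}$ is a confidence parameter ($C_{ij}=a$ if $R_{ij}=1$ and $C_{ij}=b$ otherwise, with $a > b$).

$$
\begin{aligned}
\epsilon_j &\sim \mathcal{N}(0,\ \lambda_v^{-1} I_K), \qquad v_j = \epsilon_j + X_{L/2,\,j*}^T, \\
u_i &\sim \mathcal{N}(0,\ \lambda_u^{-1} I_K), \\
R_{ij} &\sim \mathcal{N}(u_i^T v_j,\ C_{ij}^{-1}).
\end{aligned}
$$

The item vector $v_j$ is the content representation from the middle layer plus an offset learned from the ratings, so information flows in both directions.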

The graphical model of CDL when $\lambda_s$ approaches positive infinity:

[Figure: graphical model of the degenerated CDL]

3.4 Maximum A Posteriori Estimates

An EM-style algorithm for obtaining the MAP estimates:

[Figure: EM-style algorithm for the MAP estimates]

++When $\lambda_s$ approaches positive infinity++, training of the probabilistic graphical model of CDL would degenerate to simultaneously training two neural networks overlaid together with a common input layer (the corrupted input) but different output layers.
++When the ratio $\lambda_n/\lambda_v$ approaches positive infinity++, CDL will degenerate to a two-step model in which the latent representation learned using SDAE is put directly into the CTR.
++When $\lambda_n/\lambda_v$ goes to zero++, the other extreme, the decoder of the SDAE essentially vanishes.
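These limiting cases are easier to follow against the MAP objective. As I understand the paper, when $\lambda_s \to \infty$ maximizing the posterior is equivalent to maximizing the joint log-likelihood below. The symbols $f_e$ (encoder output, i.e. the middle layer), $f_r$ (reconstruction, i.e. the last layer), $X_{0,j*}$ (corrupted input of item j), $W^+$ (all weights and biases) and the confidence $C_{ij}$ are not defined in these notes and are introduced here for the sketch.

$$
\begin{aligned}
\mathscr{L} = {} & -\frac{\lambda_u}{2}\sum_i \|u_i\|_2^2
 - \frac{\lambda_w}{2}\sum_l \left(\|W_l\|_F^2 + \|b_l\|_2^2\right) \\
 & - \frac{\lambda_v}{2}\sum_j \left\|v_j - f_e(X_{0,j*}, W^+)^T\right\|_2^2
 - \frac{\lambda_n}{2}\sum_j \left\|f_r(X_{0,j*}, W^+) - X_{c,j*}\right\|_2^2 \\
 & - \sum_{i,j}\frac{C_{ij}}{2}\left(R_{ij} - u_i^T v_j\right)^2
\end{aligned}
$$

The ratio $\lambda_n/\lambda_v$ weighs the reconstruction term against the term that ties $v_j$ to the encoder output. If the ratio is very large the autoencoder is effectively trained on its own; if it is very small the reconstruction term, and with it the decoder, stops mattering.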

[Figures: update rules for the MAP estimates]
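The item-vector update can be sketched in code. With the network weights held fixed, the update I recall from the paper is $v_j \leftarrow (U C_j U^T + \lambda_v I_K)^{-1}(U C_j R_j + \lambda_v f_e(X_{0,j*}, W^+)^T)$, where $f_e$ is the encoder output for item j and $C_j$ is a diagonal matrix of confidences. The code below is a pure-Python sketch of that one step, not the authors' implementation; the function names, the toy vectors and the defaults `a=1.0`, `b=0.01` are assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[r][n] / M[r][r] for r in range(n)]


def update_item_vector(user_vectors, ratings_j, encoded_j, lambda_v, a=1.0, b=0.01):
    """Return (U C_j U^T + lambda_v I)^-1 (U C_j R_j + lambda_v * encoded_j)."""
    K = len(encoded_j)
    A = [[lambda_v if r == c else 0.0 for c in range(K)] for r in range(K)]
    rhs = [lambda_v * e for e in encoded_j]
    for u_i, r_ij in zip(user_vectors, ratings_j):
        c_ij = a if r_ij == 1 else b  # higher confidence for observed feedback
        for r in range(K):
            rhs[r] += c_ij * r_ij * u_i[r]
            for c in range(K):
                A[r][c] += c_ij * u_i[r] * u_i[c]
    return solve(A, rhs)


# Toy example (assumed values): two users, K = 2, item j rated by user 0 only.
user_vectors = [[1.0, 0.0], [0.0, 1.0]]
ratings_j = [1, 0]
encoded_j = [0.5, 0.5]  # stands in for the encoder output f_e for item j
v_j = update_item_vector(user_vectors, ratings_j, encoded_j, lambda_v=10.0)
```

The user update has the same form without the content term, so calling the same function with an all-zero `encoded_j`, the item vectors in place of `user_vectors`, and $\lambda_u$ in place of $\lambda_v$ gives a sketch of it.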

3.5 Prediction

Let D be the observed test data. We use the point estimates of $u_i$, $W^+$ and $\epsilon_j$ to calculate the predicted rating:

[Equation image]

We approximate the predicted rating as:

[Equation image]

4. EXPERIMENTS

4.1 Datasets

citeulike-a: 5551 users and 16980 items.
citeulike-t: 7947 users and 25975 items.
Netflix: 407261 users, 9228 movies, and 15348808 ratings.

(Note: After removing stop words, the top S discriminative words according to the ++tf-idf values++ are chosen to form the vocabulary (S is 8000, 20000, and 20000 for citeulike-a, citeulike-t, and Netflix, respectively).)
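The vocabulary selection could look roughly like the sketch below. The notes do not say how per-document tf-idf values are aggregated into one score per word, so scoring each word by its maximum tf-idf over all documents is an assumption, as are the function name `top_tfidf_vocabulary` and the toy corpus.

```python
import math
from collections import Counter


def top_tfidf_vocabulary(docs, stop_words, size):
    """Return the `size` words with the highest (maximum over documents) tf-idf."""
    docs = [[t for t in doc if t not in stop_words] for doc in docs]
    doc_freq = Counter(t for doc in docs for t in set(doc))
    n_docs = len(docs)
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            tfidf = (count / len(doc)) * math.log(n_docs / doc_freq[term])
            scores[term] = max(scores[term], tfidf)
    # Sort by descending score; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:size]


docs = [
    ["the", "deep", "model"],
    ["the", "topic", "model"],
    ["the", "deep", "deep", "network"],
]
vocabulary = top_tfidf_vocabulary(docs, stop_words={"the"}, size=2)
```

Words that appear in few documents but often within them score highest, which is what makes them discriminative.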

4.2 Evaluation Scheme

We use recall as the performance measure because the rating information is in the form of implicit feedback.

[Equation image: definition of recall]

Another evaluation metric is the mean average precision (mAP).
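Both metrics are computed per user and then averaged. Recall@M is the number of items the user likes among the top M recommendations divided by the total number of items the user likes. The sketch below implements that, plus one common form of average precision; the normalization by `min(number of liked items, m)` is an assumption, since conventions differ and the notes do not spell it out. Function names are hypothetical.

```python
def recall_at_m(ranked_items, liked_items, m):
    """Fraction of the user's liked items that appear in the top m of the ranking."""
    liked = set(liked_items)
    if not liked:
        return 0.0
    hits = sum(1 for item in ranked_items[:m] if item in liked)
    return hits / len(liked)


def average_precision(ranked_items, liked_items, m):
    """Average of precision@rank over the ranks (up to m) where a liked item appears."""
    liked = set(liked_items)
    if not liked:
        return 0.0
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked_items[:m], start=1):
        if item in liked:
            hits += 1
            total += hits / rank
    return total / min(len(liked), m)


ranked = ["a", "b", "c", "d"]  # items sorted by predicted rating, best first
liked = ["a", "c", "e"]        # held-out items the user actually likes
recall_2 = recall_at_m(ranked, liked, 2)
recall_4 = recall_at_m(ranked, liked, 4)
ap_4 = average_precision(ranked, liked, 4)
```

mAP is then the mean of `average_precision` over all users.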

4.3 Baselines and Experimental Settings

CMF: Collective Matrix Factorization is a model incorporating different sources of information by simultaneously factorizing multiple matrices. In this paper, the two factorized matrices are $R$ and $X_c$.
SVDFeature: SVDFeature is a model for feature-based collaborative filtering. In this paper we use the content information $X_c$ as raw features to feed into SVDFeature.
DeepMusic: DeepMusic is a model for music recommendation. We use the variant, a loosely coupled method, that achieves the best performance as our baseline.
CTR: Collaborative Topic Regression is a model performing topic modeling and collaborative filtering simultaneously as mentioned in the previous section.
CDL: Collaborative Deep Learning is our proposed model as described above. It allows different levels of model complexity by varying the number of layers.

4.4 Quantitative Comparison

[Figures and tables: quantitative comparison results]

4.5 Qualitative Comparison

With a more effective representation, CDL can capture the key points of articles and the user preferences more accurately. Besides, it can model the co-occurrence and relations of words better.

CDL is sensitive enough to changes of user taste and hence can provide more accurate recommendation.

5. COMPLEXITY ANALYSIS AND IMPLEMENTATION

The total time complexity is $O(JSK_1 + K^2J^2 + K^2I^2 + K^3)$.

CDL is very scalable.

6. CONCLUSION AND FUTURE WORK

We have demonstrated in this paper that state-of-the-art performance can be achieved by jointly performing ++deep representation learning for the content information++ and ++collaborative filtering for the ratings (feedback) matrix++.
As far as we know, CDL is the first hierarchical Bayesian model to bridge the gap between state-of-the-art deep learning models and RS.
The Bayesian nature of CDL also provides a potential performance boost if other side information is incorporated. Besides, as remarked above, CDL actually provides a framework that can also ++admit deep learning models other than SDAE++.

刘丽
2017-10-26

Last edited: 2017.12.11 06:25:54
