The Netflix Recommender System: Algorithms, Business Value, and Innovation

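User research found that a typical Netflix member loses interest after perhaps 60 to 90 seconds of choosing, having reviewed 10 to 20 titles on one or two screens. The goal of the recommender system is to help the member find something interesting within those two screens.
Beyond what each member watches, the system also uses how each member watches (e.g., the device, time of day, day of week, intensity of watching).
The recommender system is made up of several algorithms: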
1) Personalized Video Ranker (PVR)
Orders the entire catalog of videos (or subsets selected by genre or other filtering) for each member profile in a personalized way.
Because we use PVR so widely, it must be good at general-purpose relative rankings throughout the entire catalog; this limits how personalized it can actually be.
In other words, PVR has to rank every video within a category, and do so for every category, which in practice limits personalization.
2) Top-N Video Ranker
Finds the best few personalized recommendations in the entire catalog for each member, that is, focusing only on the head of the ranking, a freedom that PVR does not have because it gets used to rank arbitrary subsets of the catalog.
The Top-N ranker only has to rank the head of the catalog and pick the top N, so it has more freedom in its approach than PVR. Even so, the two share many properties.
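3) Trending Now
Used to drive the Trending Now row. It works especially well for two kinds of trends:

  • Seasonal trends, such as Valentine's Day
  • Short-term, real-time events, such as a hurricane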
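4) Continue Watching
The Continue Watching ranker sorts the subset of recently viewed titles based on our best estimate of whether the member intends to resume watching or rewatch. The main signals are:
  • The time elapsed since the title was last viewed
  • The point of abandonment (beginning, middle, or end)
  • The device used
  • Whether other [related] titles have been watched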
5) Video-Video Similarity
An unpersonalized algorithm that computes a ranked list of videos (the similars) for every video in our catalog. It drives the Because You Watched (BYW) rows; although the algorithm itself is unpersonalized, the choice of which BYW rows make it onto a homepage is personalized.
6) Page Generation: Row Selection and Ranking
Selects and orders rows from a large pool of candidates to create an ordering optimized for relevance and diversity. (How are relevance and diversity evaluated? See the blog post "Learning a Personalized Homepage".)
7) Evidence
Evidence selection algorithms evaluate all the possible evidence items that we can display for every recommendation, to select the few that we think will be most helpful to the member viewing the recommendation. In short, they choose which justification to show for a recommendation and how to present it.
For example, they decide whether to show that a certain movie won an Oscar or instead show the member that the movie is similar to another video recently watched by that member.
8) Search
a) Search recommends videos for a given query as alternative results for a failed search.
b) What we know about the searching member's taste is also especially important for us.
Search combines several algorithms:
  • One algorithm attempts to find the videos that match a given query
  • Another algorithm predicts interest in a concept given a partial query
  • A third algorithm finds video recommendations for a given concept
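
To make the Video-Video Similarity item (5) above more concrete, here is a minimal sketch of an unpersonalized "similars" table. The notes do not say how Netflix computes similarity, so the cosine similarity over co-watching members used here is an assumption chosen for illustration, and `build_similars`, `watch_history`, and the toy video ids are hypothetical names.

```python
import math
from collections import defaultdict


def build_similars(watch_history, top_k=3):
    """Return {video: [most similar videos, best first]}.

    watch_history maps a member id to the set of videos that member watched.
    Similarity between videos A and B is the cosine over their viewer sets:
    |viewers(A) & viewers(B)| / sqrt(|viewers(A)| * |viewers(B)|).
    This measure is an illustrative assumption, not the method from the paper.
    """
    viewers = defaultdict(set)  # video -> members who watched it
    for member, videos in watch_history.items():
        for video in videos:
            viewers[video].add(member)

    similars = {}
    for a in viewers:
        scored = []
        for b in viewers:
            if a == b:
                continue
            overlap = len(viewers[a] & viewers[b])
            if overlap:
                score = overlap / math.sqrt(len(viewers[a]) * len(viewers[b]))
                scored.append((score, b))
        # Highest score first; ties are broken by video id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        similars[a] = [b for _, b in scored[:top_k]]
    return similars


# Toy data: four members and four videos (all made up).
history = {
    "m1": {"A", "B"},
    "m2": {"A", "B", "C"},
    "m3": {"A", "C"},
    "m4": {"D"},
}
similars = build_similars(history)
```

The table is the same for every member, which matches the "unpersonalized" description; personalization would only enter later, when deciding which BYW rows to show on a given homepage.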
  1. Business value
    The effective catalog size (ECS) is a metric that describes how spread viewing is across the items in our catalog. It tells us how many videos are required to account for a typical hour streamed.
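    ECS is computed as follows:

    $$\mathrm{ECS}(p) = 2 \left( \sum_{i=1}^{N} p_i \, i \right) - 1$$

    where $p_i$ is the share of all hours streamed that came from the video with the $i$-th most hours streamed. Note that $p_i \ge p_{i+1}$ for $i = 1, \ldots, N-1$ and $\sum_{i=1}^{N} p_i = 1$.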
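
As a quick sanity check, ECS can be computed in a few lines. The sketch below assumes the definition from the paper, ECS(p) = 2 * sum_i(p_i * i) - 1, with videos sorted by descending share of hours and ranks starting at 1; the function name and the sample hour counts are made up for illustration.

```python
def effective_catalog_size(hours_by_video):
    """ECS(p) = 2 * sum_i(p_i * i) - 1.

    hours_by_video is a list of hours streamed per video, in any order.
    Shares p_i are sorted in descending order so that p_i >= p_(i+1),
    and ranks i start at 1.
    """
    total = sum(hours_by_video)
    if total <= 0:
        raise ValueError("total hours streamed must be positive")
    shares = sorted((h / total for h in hours_by_video), reverse=True)
    return 2 * sum(p * rank for rank, p in enumerate(shares, start=1)) - 1


all_on_one_video = effective_catalog_size([100, 0, 0, 0])
spread_evenly = effective_catalog_size([25, 25, 25, 25])
skewed = effective_catalog_size([70, 20, 10])
```

With all viewing on a single video the ECS is 1, with viewing spread evenly over N videos it is N, and the skewed three-video example lands at roughly 1.8, between the two extremes.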

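  2. Evaluation criteria
    Intuition does not necessarily match online results. For example, for "House of Cards", similars that looked more closely related performed worse than a broader set of results.
    We have observed that improving engagement (the time that our members spend viewing Netflix content) is strongly correlated with improving retention.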
    Whether a difference is statistically significant depends heavily on the number of members in each test cell. For example, if we find that 50% of the members in the test have retained when we compute our retention metric, then we need roughly 2 million members per cell to measure a retention delta of 50.05% - 49.95% = 0.1% with statistical confidence. This type of plot can be used as a guide to choose the sample size for the cells in a test. For example, detecting a retention delta of 0.2% requires the sample size traced by the black line labeled 0.2%, which changes as a function of the average retention rate when the experiment stops, being maximum (south of 500k members per cell) when the retention rate is 50%.

    [Figure: sample size per cell needed to detect a given retention delta, as a function of the average retention rate]
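
The sample sizes quoted above can be roughly reproduced with a textbook approximation. The notes do not state which formula was used, so the sketch below is an assumption: a two-sided z-test for the difference of two proportions with equal cell sizes, 95% confidence, and no explicit power term, giving n = 2 * z^2 * p(1 - p) / delta^2 per cell. The function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def members_per_cell(retention_rate, delta, confidence=0.95):
    """Approximate members needed per cell to detect a retention delta.

    Assumes a two-sided z-test on two proportions with equal cell sizes
    and no power term: n = 2 * z^2 * p * (1 - p) / delta^2.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95%
    variance = retention_rate * (1 - retention_rate)     # largest at p = 0.5
    return math.ceil(2 * z ** 2 * variance / delta ** 2)


n_small_delta = members_per_cell(0.5, 0.001)   # delta of 0.1%
n_larger_delta = members_per_cell(0.5, 0.002)  # delta of 0.2%
n_high_retention = members_per_cell(0.8, 0.002)
```

Under these assumptions a 0.1% delta at 50% retention needs a little over 1.9 million members per cell, and a 0.2% delta needs a little under 500k, in line with the figures quoted. Because p(1 - p) peaks at p = 0.5, the required sample size shrinks as the retention rate moves away from 50%.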

    Offline experiments speed up iteration. They allow us to iterate quickly on algorithm prototypes, and to prune the candidate variants that we use in actual A/B experiments.

  3. Key open problems
    1) Better Experimentation Protocols
    Better offline and online evaluation metrics are still needed to capture the overall benefit, for example when weighing long-term gains against short-term gains.
    2) Global Algorithms
    3) Controlling for Presentation Bias
    Introduce randomness into the recommendations.
    4) Page Construction
    It took us a couple of years to find a fully personalized algorithm to construct a page of recommendations that A/B tested better than a page based on a template (itself optimized through years of A/B testing).
    5) Member Coldstarting
    Today, our member coldstart approach has evolved into a survey given during the sign-up process, during which we ask new members to select videos from an algorithmically populated set that we use as input into all of our algorithms.
    6) Choosing the Best Evidence to Support Each Recommendation
    Highlight different aspects of a video, such as an actor or director involved in it.
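
For the presentation-bias item (3) above, the notes only say to introduce randomness into the recommendations. One simple way to do that, shown here purely as an illustration and not as the approach from the paper, is an epsilon-greedy style shuffle: each slot is filled with the best remaining item most of the time and with a uniformly random remaining item with probability `epsilon`. The function name, `epsilon`, and the returned `explored` flags are all assumptions.

```python
import random


def randomize_ranking(ranked, epsilon, rng):
    """Return (shown, explored) for one page impression.

    ranked   -- items in the order the ranker prefers, best first
    epsilon  -- probability that a slot is filled at random
    rng      -- a random.Random instance, passed in for reproducibility
    explored -- one flag per slot, True where the slot was filled at random,
                so that later analysis can separate explored impressions
    """
    remaining = list(ranked)
    shown, explored = [], []
    while remaining:
        if rng.random() < epsilon:
            pick = rng.randrange(len(remaining))  # random remaining item
            explored.append(True)
        else:
            pick = 0  # best remaining item according to the ranker
            explored.append(False)
        shown.append(remaining.pop(pick))
    return shown, explored


ranking = ["v1", "v2", "v3", "v4", "v5"]
shown, explored = randomize_ranking(ranking, epsilon=0.2, rng=random.Random(0))
```

Logging which slots were randomized is what makes the data useful afterwards: plays on randomly placed items are less affected by where the ranker chose to put them.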

  4. Further reading
    Learning a Personalized Homepage

    We want our recommendations to be accurate in that they are relevant to the tastes of our members, but they also need to be diverse so that we can address the spectrum of a member’s interests versus only focusing on one. We want to be able to highlight the depth in the catalog we have in those interests and also the breadth we have across other areas to help our members explore and even find new interests. We want our recommendations to be fresh and responsive to the actions a member takes, such as watching a show, adding to their list, or rating; but we also want some stability so that people are familiar with their homepage and can easily find videos they’ve been recommended in the recent past.
    The homepage is a two-dimensional layout of rows: moving horizontally within a row naturally provides relevance, while moving vertically across rows naturally provides diversity.
    Factors we consider important:

  • the quality of the videos in the row,
  • the amount of diversity on the page
  • the affinity of members for specific kinds of rows
  • and the quality of the evidence we can surface for each video.

A simple way to add in diversity is to switch from a row-ranking approach to a stage-wise approach using a scoring function that considers both a row as well as its relationship to both the previous rows and the previous videos already chosen for the page. Other approaches to greedily add diversity based on submodular function maximization can also be used.
Diversity can also be incorporated into the scoring model when considering the features of a row compared to the rest of the page, by looking at how similar the row is to the rest of the rows, or the videos in the row to the videos on the rest of the page.
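
The stage-wise idea can be sketched as a greedy loop: at each step, every remaining row is re-scored against what is already on the page, and the best one is added. The scoring function below (relevance minus a penalty for the fraction of the row's videos already shown) is an assumption made for illustration; the row names, relevance values, and `diversity_weight` are made up, and only overlap with previously chosen videos is modeled, not similarity between rows.

```python
def build_page(candidate_rows, num_rows, diversity_weight=0.5):
    """Greedily pick rows for a page, one stage at a time.

    candidate_rows maps a row name to (relevance, [video ids]).
    A row's score is its relevance minus diversity_weight times the
    fraction of its videos that earlier rows on the page already show.
    """
    chosen, seen = [], set()
    remaining = dict(candidate_rows)
    while remaining and len(chosen) < num_rows:

        def score(name):
            relevance, videos = remaining[name]
            overlap = sum(v in seen for v in videos) / len(videos) if videos else 0.0
            return relevance - diversity_weight * overlap

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=score)
        chosen.append(best)
        seen.update(remaining.pop(best)[1])
    return chosen


rows = {
    "Top Picks": (0.9, ["v1", "v2", "v3"]),
    "Trending Now": (0.8, ["v1", "v2", "v4"]),
    "Documentaries": (0.6, ["v7", "v8", "v9"]),
}
page_with_diversity = build_page(rows, num_rows=2, diversity_weight=0.5)
page_by_relevance = build_page(rows, num_rows=2, diversity_weight=0.0)
```

With the penalty switched off, the page is just the two most relevant rows, even though they share two of three videos. With the penalty on, the second slot goes to the less relevant but non-overlapping row, which is the trade-off the stage-wise approach is meant to expose.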
