Churn Analysis and Plan Recommendation F

Method 3: Matrix factorization
Previous section:
1. Collaborative filtering and Co-occurrence matrix
Used for product recommendations - Amazon
2. Limitations of collaborative filtering
No context, does not use features, etc.
Matrix factorization
Movie recommendation by Netflix
Use features of users and items (Classification model)
Use interactions of users and items (Collaborative filtering)
www.subhrajitroy.com | facebook.com/sroy.subhrajitroy | @sroy_subhrajit

Movie recommendation
Data columns: User, Movie, Rating
Each user watches only a few of the movies

Movie recommendation
Rating$(u, v)$: rating given by user $u$ for movie $v$
Ratings matrix: rows are users ($u$), columns are movies ($v$)
Known for white cells
Unknown (Rating = ?) for blue cells
Goal: Fill missing data
Use ratings given by all users
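
As a rough illustration of the ratings matrix described on this slide, the sketch below builds a small users-by-movies array in which unknown ratings are stored as NaN. The triples, the matrix size, and the variable names are all made up for the example; they are not from the slides.

```python
import numpy as np

# Hypothetical (user, movie, rating) triples; values are invented for illustration.
ratings = [
    (0, 0, 5.0),
    (0, 2, 3.0),
    (1, 1, 4.0),
    (2, 0, 1.0),
    (2, 3, 2.0),
]
n_users, n_movies = 3, 4

# NaN marks an unknown rating (a "blue cell"); filled entries are the "white cells".
rating_matrix = np.full((n_users, n_movies), np.nan)
for u, v, r in ratings:
    rating_matrix[u, v] = r

# Boolean mask of the cells whose rating is known.
known = ~np.isnan(rating_matrix)
print(rating_matrix)
print("known cells:", int(known.sum()), "of", rating_matrix.size)
```

Most cells stay NaN because each user rates only a few movies; filling those cells is the goal.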

Recommendations from known features
Movie recommendation and Ratings matrix
Describe movie $v$ by vector $R_v$
How much is it action, romance, drama, ….
$R_v = [0.2, 0.8, 1.3, \ldots]$
Describe user $u$ by vector $L_u$
How much he/she likes action, romance, drama, ….
$L_u = [0.7, 0, 2.1, \ldots]$
How much do the movie vector ($R_v$) and user vector ($L_u$) agree?
Find $\widehat{\text{Rating}}(u, v)$ by using $R_v$ and $L_u$

Recommendations from known features
For user $u$:
$R_v = [0.2, 0.8, 1.3, \ldots]$
$L_u = [0.7, 0, 2.1, \ldots]$
$0.2 \times 0.7 + 0.8 \times 0 + 1.3 \times 2.1 + \ldots = 7.3$
For user $u'$:
$R_v = [0.2, 0.8, 1.3, \ldots]$
$L_{u'} = [2.9, 0.01, 0.02, \ldots]$
$0.2 \times 2.9 + 0.8 \times 0.01 + 1.3 \times 0.02 + \ldots = 0.99$
Recommendations: Sort movies the user hasn't watched by $\widehat{\text{Rating}}(u, v)$
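
The scoring and sorting step on this slide can be sketched as follows. This is a minimal example that uses only the first three vector components shown on the slide; the elided terms are not reproduced, so the totals differ from the slide's. The second movie vector, the movie names, and the function names are hypothetical.

```python
import numpy as np

# First three components from the slide; the "…" terms are deliberately left out.
R_v = np.array([0.2, 0.8, 1.3])           # movie v: action, romance, drama
L_u = np.array([0.7, 0.0, 2.1])           # user u
L_u_prime = np.array([2.9, 0.01, 0.02])   # user u'


def predict_rating(user_vec, movie_vec):
    """Predicted rating as the inner product of user and movie vectors."""
    return float(np.dot(user_vec, movie_vec))


def recommend(user_vec, movie_vectors, watched):
    """Sort the movies the user has not watched by predicted rating, best first."""
    scores = {
        name: predict_rating(user_vec, vec)
        for name, vec in movie_vectors.items()
        if name not in watched
    }
    return sorted(scores, key=scores.get, reverse=True)


# "movie_w" is a made-up, action-heavy movie added only to show the ranking.
movie_vectors = {"movie_v": R_v, "movie_w": np.array([1.0, 0.1, 0.0])}

score_u = predict_rating(L_u, R_v)
score_u_prime = predict_rating(L_u_prime, R_v)
ranking_u = recommend(L_u, movie_vectors, watched=set())
ranking_u_prime = recommend(L_u_prime, movie_vectors, watched=set())
print(score_u, score_u_prime, ranking_u, ranking_u_prime)
```

User u, who likes drama, gets movie v ranked first, while the action-leaning user u' gets the hypothetical movie w first.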

Predictions in the matrix form
$\widehat{\text{Rating}}(u, v) = \langle L_u, R_v \rangle$
$\widehat{\text{Rating}}$ (Users × Movies matrix) $\approx L R$
Row $u$ of $L$ is $L_u$; column $v$ of $R$ is $R_v$
$R_v$ = [action, romance, drama, ….]
$L_u$ = [action, romance, drama, ….]
But you don't know topics of users and movies
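
In matrix form, all predictions can be computed with one matrix product. The sketch below assumes the topic vectors are known, stacks one user vector per row of `L` and one movie vector per column of `R`, and checks that each entry matches the per-pair inner product. The numbers beyond the slide's three visible components are invented.

```python
import numpy as np

# Hypothetical topic vectors: 2 users, 3 movies, 3 topics (action, romance, drama).
L = np.array([
    [0.7, 0.0, 2.1],     # L_u for user 0
    [2.9, 0.01, 0.02],   # L_u for user 1
])
R = np.array([
    [0.2, 0.8, 1.3],     # R_v for movie 0
    [1.0, 0.1, 0.0],     # R_v for movie 1
    [0.0, 2.0, 0.5],     # R_v for movie 2
]).T                     # transpose so each column is one movie vector

# Entry (u, v) of the product is the inner product of row u of L and column v of R.
predicted = L @ R
print(predicted.shape)   # (users, movies)
print(predicted)
```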

Matrix Factorization: Discovering topics from data
White squares = Data
Rating (Users × Movies matrix) $\approx L R$
$L$ and $R$: Parameters of the model

Residual sum of squares (RSS)
$\text{RSS}(L_u, R_v) = \left(\text{Rating}(u, v) - \langle L_u, R_v \rangle\right)^2$
$\text{RSS}(L, R) = \sum_{(u, v)} \left(\text{Rating}(u, v) - \langle L_u, R_v \rangle\right)^2$, summed over all $u$ and $v$ with a known rating
White squares = Data
FACTORIZE Rating (Users × Movies matrix) into $\approx L R$
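
A small sketch of the RSS computation, assuming unknown ratings are stored as NaN and only the known cells contribute to the sum. The function name `rss` and the toy numbers are illustrative choices, not from the slides.

```python
import numpy as np


def rss(rating_matrix, L, R):
    """Residual sum of squares over the cells whose rating is known (not NaN)."""
    known = ~np.isnan(rating_matrix)
    residual = rating_matrix - L @ R
    return float(np.sum(residual[known] ** 2))


# Toy example with one topic: L @ R equals [[1, 2], [2, 4]].
toy_ratings = np.array([[1.0, np.nan],
                        [3.0, 4.0]])
toy_L = np.array([[1.0], [2.0]])   # users x topics
toy_R = np.array([[1.0, 2.0]])     # topics x movies

toy_rss = rss(toy_ratings, toy_L, toy_R)
print(toy_rss)
```

Only the cell with rating 3 is mispredicted (as 2), and the unknown cell is ignored.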

Matrix Factorization and Limitations
$L_u \rightarrow \hat{L}_u$
$R_v \rightarrow \hat{R}_v$
Many efficient algorithms for factorization
Example: Stochastic Gradient Descent
(refer to this part of the e-book)
Use estimated $\hat{L}_u$ and $\hat{R}_v$ for recommendation
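
One way stochastic gradient descent could be applied here is sketched below. It minimizes the RSS over known ratings by visiting one known rating at a time. The toy ratings, the number of topics, the learning rate, and the epoch count are arbitrary assumptions, and the slides do not specify any particular variant of the algorithm.

```python
import numpy as np


def rss(rating_matrix, L, R):
    """Residual sum of squares over known (non-NaN) ratings."""
    known = ~np.isnan(rating_matrix)
    return float(np.sum((rating_matrix - L @ R)[known] ** 2))


def factorize_sgd(rating_matrix, n_topics=2, lr=0.01, n_epochs=1000, seed=0):
    """Estimate L (users x topics) and R (topics x movies) by plain SGD on the RSS."""
    rng = np.random.default_rng(seed)
    n_users, n_movies = rating_matrix.shape
    L = rng.normal(scale=0.1, size=(n_users, n_topics))
    R = rng.normal(scale=0.1, size=(n_topics, n_movies))
    known = np.argwhere(~np.isnan(rating_matrix))   # (u, v) index pairs
    for _ in range(n_epochs):
        rng.shuffle(known)                           # visit ratings in random order
        for u, v in known:
            err = rating_matrix[u, v] - L[u] @ R[:, v]
            L_u_old = L[u].copy()
            # Gradient step for one squared residual (factor 2 folded into lr).
            L[u] += lr * err * R[:, v]
            R[:, v] += lr * err * L_u_old
    return L, R


# Made-up ratings; NaN marks an unknown rating.
toy = np.array([
    [5.0, np.nan, 3.0, np.nan],
    [np.nan, 4.0, np.nan, np.nan],
    [1.0, np.nan, np.nan, 2.0],
])

rng0 = np.random.default_rng(0)
rss_before = rss(toy, rng0.normal(scale=0.1, size=(3, 2)), rng0.normal(scale=0.1, size=(2, 4)))
L_hat, R_hat = factorize_sgd(toy)
rss_after = rss(toy, L_hat, R_hat)
print(rss_before, rss_after)
print(L_hat @ R_hat)   # predictions, including the previously unknown cells
```

The product of the estimated factors also fills in the unknown cells, which is what the recommendation step uses.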
Cold start problem:
What if a new user or movie arrives?

Combining features and discovered topics
How to solve the cold start problem?
Features capture context: time of day, user information, etc.
Discovered topics from Matrix Factorization capture groups of users that behave similarly
Combine models to solve the cold start problem
1. Ratings for a new user from features only
2. Matrix Factorization topics become more important as more information about the user is discovered
Ensemble methods
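
The combination described above could, for instance, be a weighted average whose weight shifts from the feature-based prediction to the matrix factorization prediction as a user accumulates ratings. The weighting formula, the `half_point` parameter, and the function name below are assumptions made for illustration; the slides do not say how the two models are combined.

```python
def blended_rating(feature_pred, mf_pred, n_user_ratings, half_point=5):
    """Blend a feature-based prediction with a matrix factorization prediction.

    The weight on the matrix factorization prediction is 0 for a brand-new user
    and approaches 1 as the user rates more movies. `half_point` (hypothetical)
    is the number of ratings at which both models are weighted equally.
    """
    w = n_user_ratings / (n_user_ratings + half_point)
    return (1 - w) * feature_pred + w * mf_pred


new_user = blended_rating(3.0, 5.0, n_user_ratings=0)       # features only
some_history = blended_rating(3.0, 5.0, n_user_ratings=5)   # equal weights
long_history = blended_rating(3.0, 5.0, n_user_ratings=95)  # mostly topics
print(new_user, some_history, long_history)
```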

Blending models
Netflix Prize 2006-2009
Winning team blended over 100 models
Data: 100M ratings, 17,770 movies and 480,189 users

Goal: Predict 3M ratings to highest accuracy
Prize: 1 million USD
