Predictive analytics means using predictive models (which find patterns) to learn from data in order to predict a "score" (a probability) and to forecast.
Types of Models
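Predictive Models
- Find the relation between a unit's specific performance in a sample and one or more other attributes of that unit. In other words, find which factor or factors a phenomenon is related to: use the known outcomes and known factors in the training sample to build a model, then, outside the training sample, use the known factors to predict the outcome as a probability or score.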
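A minimal sketch of the train-then-score idea, assuming a single categorical factor and a made-up training sample (the channel names, the data, and the `train` helper are all hypothetical). The "model" here is just the observed rate of the outcome for each factor value, which is the simplest possible way to turn known outcomes into a score:

```python
from collections import defaultdict


def train(sample):
    """Build a score table from (factor, outcome) pairs, outcome being 0 or 1.

    Returns a dict mapping each factor value to the observed share of
    positive outcomes in the training sample.
    """
    counts = defaultdict(lambda: [0, 0])  # factor -> [positives, total]
    for factor, outcome in sample:
        counts[factor][0] += outcome
        counts[factor][1] += 1
    return {factor: pos / total for factor, (pos, total) in counts.items()}


# Hypothetical training sample: acquisition channel and whether the customer bought.
training_sample = [
    ("email", 1), ("email", 0), ("email", 1), ("email", 1),
    ("ads", 0), ("ads", 0), ("ads", 1), ("ads", 0),
]

model = train(training_sample)

# Out of sample: only the factor is known, so the model supplies the score.
new_customers = ["email", "ads"]
scores = [model[channel] for channel in new_customers]
print(scores)
```

A real predictive model would use more factors and a proper estimator, but the flow is the same: known factors plus known outcomes in, a score for unseen units out.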
Descriptive Models
- Used for customer segmentation and similar tasks: finding relationships between users and products, or between some users and other users.
Decision Models
- Find the optimal decision.
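This, however, is not quite true.
According to the Coursera course Customer Analytics, predictive analysis uses probability models and regression models to solve different problems. There are reasons for choosing each kind of model, and each comes with its own assumptions.
So the first thing is to understand which range of problems each method applies to, and also to check whether the assumptions of a specific model under a given method match reality. For example, a model may require that customers follow some random distribution?? But the actual customers do not follow that distribution??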
Regression
- Quantify the relationship among two or more variables.
- Explain the relationship between one dependent variable and several mutually independent variables.
- Suitable for determining optimal prices.
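A minimal sketch of the pricing use case, assuming demand is a straight-line function of price and using made-up numbers (the data, the `fit_line` helper, and the revenue formula are illustrative assumptions, not taken from the course). It fits demand = intercept + slope * price by ordinary least squares, then picks the price that maximizes revenue = price * demand:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations: units sold at each price point.
prices = [1, 2, 3, 4, 5]
demand = [90, 80, 70, 60, 50]

intercept, slope = fit_line(prices, demand)

# Revenue R(p) = p * (intercept + slope * p) is a parabola in p.
# With a negative slope it peaks where dR/dp = 0, i.e. p = -intercept / (2 * slope).
optimal_price = -intercept / (2 * slope)
print(intercept, slope, optimal_price)
```

The optimum only makes sense if the fitted slope is negative and the linear form holds near that price, which is exactly the kind of model assumption worth checking against reality.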
Quiz
In which of these situations would it be more appropriate to use a probability model rather than a regression/data-mining approach?
When we are not able to identify which customer will buy certain things, or when a customer will buy them.
Among the explanations below, which one is not a reason to favor a probability model over a regression-like (e.g., data-mining) model for long-run projections of customer behavior?
Probability: long-term forecasting; each individual's behavior is treated as random coin flips. When will this customer churn? If they survive through the next period, how many more periods will they survive?
Regression: short-term forecasting, e.g. for period two. How many purchases are going to happen in the next year, and who is going to churn or not? The outcome is a number of purchases, or whether someone stays with us or not.
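A minimal sketch of the coin-flip view, assuming every customer stays with the same constant probability each period (a plain geometric model; the 0.8 retention value and the function names are hypothetical, and the course's actual models may be richer than this):

```python
def survival_probability(retention, periods):
    """Probability that a customer is still active after `periods` coin flips."""
    return retention ** periods


def expected_further_periods(retention):
    """Expected number of additional periods survived.

    This is the sum of retention**k for k = 1, 2, ..., which equals
    retention / (1 - retention). Under a constant retention rate the
    answer is the same no matter how long the customer has survived so far.
    """
    return retention / (1 - retention)


retention = 0.8  # hypothetical per-period probability of staying

for t in (1, 2, 5, 10):
    print(t, round(survival_probability(retention, t), 4))

print(round(expected_further_periods(retention), 4))
```

Note that the model never says which customer churns or when; it only gives probabilities, which is why it suits long-run projections rather than one-by-one prediction.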
Answers that are not right:
1. It is not easy to find independent variables.
2. Regression assumes customers are not random, since it predicts customers one by one, while a probability model assumes randomness.
3. ??????? non-stationary?
When we refer to a “cohort,” we are talking about a group of customers who:
Share the same acquisition year, the same channel, or the same first purchase. Usually a time-based cohort.
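A minimal sketch of building time-based cohorts, assuming each customer record carries an acquisition year (the customer names, the years, and the `build_cohorts` helper are hypothetical):

```python
from collections import defaultdict


def build_cohorts(customers):
    """Group customer ids by acquisition year.

    `customers` is a list of (customer_id, acquisition_year) pairs.
    """
    cohorts = defaultdict(list)
    for customer_id, year in customers:
        cohorts[year].append(customer_id)
    return dict(cohorts)


# Hypothetical customer list.
customers = [("alice", 1995), ("bob", 1996), ("carol", 1995), ("dave", 1997)]

cohorts = build_cohorts(customers)
for year in sorted(cohorts):
    print(year, cohorts[year])
```

Swapping the year for the acquisition channel or the first product purchased gives the other kinds of cohort mentioned above.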
Referring back to the dataset (and model) we covered extensively, how would these two customers (both “acquired” in 1995) compare to each other, in terms of their expected future purchasing?
See the RFM model chart
Of various model summary figures, which one is the most diagnostic about the model’s overall validity?