Hive日常数据需求二

题目二:计算用户的回购率和复购率

复购率: 当前月份购买2次及以上的客户占所有客户比例

回购率:当前月份购买且上个月份也购买的客户占当月所有月份客户比例

思路:

复购率

1、对当月(2月份)的客户分组,计数购买次数
2、筛选购买次数为2以上的,认为是复购群体
回购率

1、筛选当月及上月部分
2、利用客户id进行当月连上月,推荐左连
3、对同一条客户id均有购买记录的,认为是回购群体

操作:

复购率
1、对当月(2月份)的客户分组,计数购买次数

select  customer_key, 
        count(customer_key) as ct
from    ods_sales_orders   
                
where   month(create_date) = 2
group by customer_key
limit 10;
image.png
PS:hive中groupby 后的字段必须在select后出现,且groupby后没有的字段不能在select出现,除了聚合函数,否则报错。

示例:


image.png

2、筛选购买次数为2以上的,认为是复购群体

select count(ct),
       count(if (ct>1,1,null)),
       count(if (ct>1,1,null))/count(ct) as fg_ratio
from  
(select customer_key,count(customer_key) as ct
from   ods_sales_orders                  
where  month(create_date) = 2
group by customer_key) 
as a ;
image.png
PS:注意在Hive中书写sql时,这里每行需要顶格写,否则会报错.

例如以如下方式写报错:

select count(ct),
       count(if (ct>1,1,null)),
       count(if (ct>1,1,null))/count(ct) as fg_ratio
from  
      (select customer_key,count(customer_key) as ct  #这行开始没有顶格写
        from   ods_sales_orders                  
        where  month(create_date) = 2
        group by customer_key) as a ;
image.png
操作:

回购率
1、关联本月与上月,找出回购客户

SELECT *
FROM (
    SELECT customer_key, substr(create_date, 1, 7) AS umonth
    FROM ods_sales_orders
    GROUP BY customer_key, substr(create_date, 1, 7)
) a
    LEFT JOIN (
        SELECT customer_key, substr(create_date, 1, 7) AS umonth
        FROM ods_sales_orders
        GROUP BY customer_key, substr(create_date, 1, 7)
    ) b
    ON a.customer_key = b.customer_key
        AND substring(a.umonth, 6, 2) = substring(b.umonth, 6, 2) - 1
LIMIT 10;
image.png

2、统计每个月份的消费人数情况即可得到回购率

SELECT a.umonth, COUNT(a.customer_key) AS mcount, COUNT(b.customer_key) AS lcount
    , concat(round(COUNT(b.customer_key) / COUNT(a.customer_key) * 100, 2), '%') AS ratio
FROM (
    SELECT customer_key, substr(create_date, 1, 7) AS umonth
    FROM ods_sales_orders
    GROUP BY customer_key, substr(create_date, 1, 7)
) a
    LEFT JOIN (
        SELECT customer_key, substr(create_date, 1, 7) AS umonth
        FROM ods_sales_orders
        GROUP BY customer_key, substr(create_date, 1, 7)
    ) b
    ON a.customer_key = b.customer_key
        AND substring(a.umonth, 6, 2) = substring(b.umonth, 6, 2) - 1
GROUP BY a.umonth;
image.png
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。