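That's right, the retention topic isn't finished yet. The previous two posts covered how to compute, for the active users of a given day, week, or month, their retention rate in later periods. There is another important metric, though: the retention rate of users newly acquired in a given period. It tells us about the quality of the users brought in on different dates, and comparing how cohorts from different periods retain afterwards is a useful reference when choosing the timing of future acquisition campaigns.
The implementation is simple. Building on the earlier queries, we only need to find each user's first login first: group by user and take the minimum timestamp, which gives every user's first login date. The SQL is as follows: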
SELECT
  user_id,
  DATE_FORMAT(min(time), "%Y-%m-%d" ) AS date 
FROM
  liucun
GROUP BY
  user_id
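
To sanity-check this query on a small sample, the same "group by user, take the minimum" logic can be sketched in Python. This is only an illustration: the sample rows and the `first_login_dates` helper are hypothetical, and the timestamp format is an assumption about how `time` is stored in `liucun`.

```python
from datetime import datetime

# Hypothetical (user_id, time) rows standing in for the liucun table.
rows = [
    (1, "2021-01-01 09:00:00"),
    (1, "2021-01-02 10:30:00"),
    (2, "2021-01-02 08:15:00"),
    (2, "2021-01-04 21:00:00"),
]

def first_login_dates(rows):
    """Return {user_id: earliest login date}, like GROUP BY user_id with MIN(time)."""
    first = {}
    for user_id, ts in rows:
        # Assumed timestamp format; adjust to match the real column.
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
        if user_id not in first or day < first[user_id]:
            first[user_id] = day
    return first

first = first_login_dates(rows)
for user_id, day in sorted(first.items()):
    print(user_id, day.isoformat())
```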
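With that as the base, left join the login table once more on user_id and compute how many days each login is from that user's first login date. This offset tells us on which day N after the first login the user was active, which is all we need to count N-day retained users and the retention rate. The SQL is as follows: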
SELECT 
  t1.*,
  DATE_FORMAT(lc1.time,"%Y-%m-%d") AS lcdate,
  DATEDIFF(date(lc1.time),date(t1.date)) daydiff
FROM
  (SELECT
    user_id,
    DATE_FORMAT(min(time), "%Y-%m-%d" ) AS date 
  FROM
    liucun
  GROUP BY
    user_id) as t1
LEFT JOIN liucun as lc1 on lc1.user_id = t1.user_id
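
The join above produces one row per login, each carrying the user's first login date and the day offset. A minimal Python sketch of the same idea follows; the `logins` sample and the `day_offsets` helper are hypothetical, and the dates are assumed to be already truncated to the day.

```python
from datetime import date

# Hypothetical login dates per user (already truncated to the day).
logins = {
    1: [date(2021, 1, 1), date(2021, 1, 2), date(2021, 1, 7)],
    2: [date(2021, 1, 2), date(2021, 1, 4)],
}

def day_offsets(logins):
    """Return (user_id, first_date, login_date, daydiff) rows, one per login."""
    out = []
    for user_id, days in logins.items():
        first = min(days)  # the user's first login date
        for d in sorted(days):
            # Same role as DATEDIFF(login date, first login date) in the SQL.
            out.append((user_id, first, d, (d - first).days))
    return out

offsets = day_offsets(logins)
for row in offsets:
    print(row)
```

Every user gets at least one row with an offset of 0, namely the first login itself.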

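From here the approach is the same as before, and we can compute daily retention for each day's new users. Counting the first-login day as day 1, daydiff=1 is next-day retention, daydiff=2 is 3-day retention, and daydiff=6 is 7-day retention. The SQL is as follows; see the previous two posts for a detailed explanation.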
SELECT
  date,
  COUNT(DISTINCT user_id) 当日新增户数,
  COUNT(DISTINCT CASE WHEN daydiff=1 THEN user_id ELSE NULL END) 次日用户数,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN daydiff=1 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 次日留存率,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN daydiff=2 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 三日留存率,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN daydiff=6 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 七日留存率
FROM
  (SELECT
    t1.*,
    DATE_FORMAT(lc1.time,"%Y-%m-%d") AS lcdate,
    DATEDIFF(date(lc1.time),date(t1.date)) daydiff
  FROM
    (SELECT
      user_id,
      DATE_FORMAT(min(time), "%Y-%m-%d" ) AS date
    FROM
      liucun
    GROUP BY
      user_id) as t1
  LEFT JOIN liucun as lc1 on lc1.user_id = t1.user_id) temp
GROUP BY
  date
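
To cross-check the numbers this query returns, the cohort logic can be reproduced on a tiny sample in Python. This is a sketch under assumptions: the `logins` data and the `daily_retention` helper are hypothetical, and the offsets 1, 2 and 6 simply mirror the daydiff values used in the SQL.

```python
from collections import defaultdict
from datetime import date

# Hypothetical login dates per user.
logins = {
    1: [date(2021, 1, 1), date(2021, 1, 2)],
    2: [date(2021, 1, 1), date(2021, 1, 3)],
    3: [date(2021, 1, 2), date(2021, 1, 3)],
}

def daily_retention(logins, offsets=(1, 2, 6)):
    """Return {first_date: (new_users, {offset: retention_pct})}."""
    cohort = defaultdict(set)    # first date -> users who first logged in that day
    retained = defaultdict(set)  # (first date, offset) -> users active at that offset
    for user_id, days in logins.items():
        first = min(days)
        cohort[first].add(user_id)
        for d in days:
            retained[(first, (d - first).days)].add(user_id)
    result = {}
    for first, users in cohort.items():
        # Distinct retained users divided by distinct new users, as a percentage.
        rates = {k: round(len(retained[(first, k)]) / len(users) * 100, 2)
                 for k in offsets}
        result[first] = (len(users), rates)
    return result

report = daily_retention(logins)
for first in sorted(report):
    print(first.isoformat(), report[first])
```

Using sets plays the role of COUNT(DISTINCT ...) in the SQL: a user who logs in several times on the same day is still counted once.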

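The monthly version follows the same idea as the previous post: join a helper calendar table (`date` here, keyed by 日期) so that the month numbers of the first login and of each later login can be subtracted. I won't explain it in detail again; see the previous post. The complete SQL is as follows: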
SELECT
  月份,
  COUNT(DISTINCT user_id) 当月新增用户,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN mdiff=1 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 次月留存率,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN mdiff=2 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 两月留存率,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN mdiff=3 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 三月留存率
FROM
  (SELECT
    t1.*,
    DATE_FORMAT(t1.date,"%Y-%m") 月份,
    DATE_FORMAT(lc1.time,"%Y-%m-%d") AS lcdate,
    d1.monthnum m0,
    d2.monthnum m1,
    d2.monthnum-d1.monthnum mdiff
  FROM
    (SELECT
      user_id,
      DATE_FORMAT(min(time), "%Y-%m-%d" ) AS date
    FROM
      liucun
    GROUP BY
      user_id) as t1
  LEFT JOIN liucun as lc1 on lc1.user_id = t1.user_id
  LEFT JOIN date as d1 ON date(t1.date)=d1.日期
  LEFT JOIN date as d2 ON date(lc1.time)=d2.日期) temp
GROUP BY
  月份
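
The subtraction `d2.monthnum-d1.monthnum` only works across year boundaries if `monthnum` is a running month counter rather than a 1 to 12 value. How the helper table is actually built is covered in the previous post, so the Python sketch below only illustrates that assumption; `month_index` and `month_diff` are hypothetical names.

```python
from datetime import date

def month_index(d):
    """Running month number that keeps increasing across year boundaries (assumed)."""
    return d.year * 12 + d.month

def month_diff(first_login, later_login):
    """Difference in calendar months, playing the role of mdiff in the SQL."""
    return month_index(later_login) - month_index(first_login)

# December to January of the next year is one month apart, not -11.
print(month_diff(date(2020, 12, 31), date(2021, 1, 1)))
# Two logins inside the same calendar month give 0.
print(month_diff(date(2021, 1, 1), date(2021, 1, 31)))
```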

Weekly retention works the same way. The SQL is as follows:
SELECT
  周次,
  COUNT(DISTINCT user_id) 当周新增用户,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN wdiff=1 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 次周留存率,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN wdiff=2 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 两周留存率,
  CONCAT(ROUND(COUNT(DISTINCT CASE WHEN wdiff=3 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 三周留存率
FROM
  (SELECT
    t1.*,
    d1.周次 周次,
    DATE_FORMAT(lc1.time,"%Y-%m-%d") AS lcdate,
    d2.weeknum-d1.weeknum wdiff
  FROM
    (SELECT
      user_id,
      DATE_FORMAT(min(time), "%Y-%m-%d" ) AS date
    FROM
      liucun
    GROUP BY
      user_id) as t1
  LEFT JOIN liucun as lc1 on lc1.user_id = t1.user_id
  LEFT JOIN date as d1 ON date(t1.date)=d1.日期
  LEFT JOIN date as d2 ON date(lc1.time)=d2.日期) temp
GROUP BY
  周次
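
As with months, `d2.weeknum-d1.weeknum` needs `weeknum` to be a running week counter. The sketch below shows one way such a counter could be derived in Python, assuming weeks start on Monday; that convention and the `week_index` and `week_diff` names are assumptions for illustration, not necessarily how the helper table defines `weeknum`.

```python
from datetime import date

def week_index(d):
    """Running week number, assuming weeks start on Monday."""
    # Ordinal 1 (0001-01-01) is a Monday in Python's proleptic Gregorian
    # calendar, so integer division by 7 changes value exactly on Mondays.
    return (d.toordinal() - 1) // 7

def week_diff(first_login, later_login):
    """Difference in calendar weeks, playing the role of wdiff in the SQL."""
    return week_index(later_login) - week_index(first_login)

# Sunday 2021-01-03 and Monday 2021-01-04 fall in consecutive weeks.
print(week_diff(date(2021, 1, 3), date(2021, 1, 4)))
# Monday 2021-01-04 and Sunday 2021-01-10 fall in the same week.
print(week_diff(date(2021, 1, 4), date(2021, 1, 10)))
```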
