一、项目背景
本项目对某线上店铺在三四月份的销售情况进行分析,从整体销售情况、用户消费行为两个方向进行分析。其中用户消费行为中,对用户忠诚度、性别分布、年龄分布和生命周期进行了深入分析,用来帮助运营人员明确后续工作方向。
二、分析思路
从整体销售情况和用户消费行为两个方面开始分析。针对于用户来说,从忠诚度、性别分布、年龄分布、用户生命周期等指标分析用户消费情况。
三、分析过程
源数据分为两个表orderinfo表(订单详情表)和userinfo表(用户信息表)
3.1 整体销售情况表现
3.1.1 统计整体销量情况
3.1.2 统计整体销售额情况
select
year(paidTime),
month(paidTime),
count(1) as 销量,
round(sum(price),2) as 销售额
from orderinfo
where isPaid="已支付"
and paidTime<>'0000-00-00 00:00:00'
group by year(paidTime),month(paidTime);
可以看出4月份的订单量和销售额比三月份少了将近一半,具体原因需要结合业务深入分析。
3.2 用户消费行为分析
3.2.1 用户忠诚度
由于源数据只有3月和4月的,所以我们这里分析3月份的复购率和回购率。
复购率:当月购买了多次的用户占当月用户的比例
# a、先筛选出3月份的消费情况
select
*
from orderinfo
where isPaid="已支付"
and month(paidTime)="03";
# b、统计一下每个用户在3月份消费了多少次
select
userid,
count(1) as cons
from orderinfo
where isPaid="已支付"
and month(paidTime)="03"
group by userid;
# c、对购买次数做一个判断,统计出来那些消费了多次(大于1次)的用户数
select
count(1) as userid_cons,
sum(if(cons>1,1,0)) as fugou_cons,
sum(if(cons>1,1,0))/count(1) as fugou_rate
from (select
userid,
count(1) as cons
from orderinfo
where isPaid="已支付"
and month(paidTime)="03"
group by userid
) a;
本月回购率:本月购买用户中有多少用户下个月又再次购买
# a、统计每年每月的一个用户消费情况
select
userid,
date_format(paidTime,'%Y-%m-01') as month_dt,
count(1) as cons
from orderinfo
where isPaid="已支付"
group by userid,date_format(paidTime,'%Y-%m-01');
# b、相邻月份进行关联,能关联上的用户说明就是回购
select
*
from (select
userid,
date_format(paidTime,'%Y-%m-01') as month_dt,
count(1) as cons
from orderinfo
where isPaid="已支付"
group by userid,date_format(paidTime,'%Y-%m-01')) a
left join (select
userid,
date_format(paidTime,'%Y-%m-01') as month_dt,
count(1) as cons
from orderinfo
where isPaid="已支付"
group by userid,date_format(paidTime,'%Y-%m-01')) b
on a.userid=b.userid
and date_sub(b.month_dt,interval 1 month)=a.month_dt;
# c、统计每个月份的消费人数情况及格得到回购率
select
a.month_dt,
count(a.userid) ,
count(b.userid) ,
count(b.userid) / count(a.userid)
from (select
userid,
date_format(paidTime,'%Y-%m-01') as month_dt,
count(1) as cons
from orderinfo
where isPaid="已支付"
group by userid,date_format(paidTime,'%Y-%m-01')) a
left join (select
userid,
date_format(paidTime,'%Y-%m-01') as month_dt,
count(1) as cons
from orderinfo
where isPaid="已支付"
group by userid,date_format(paidTime,'%Y-%m-01')) b
on a.userid=b.userid
and date_sub(b.month_dt,interval 1 month)=a.month_dt
group by a.month_dt;
3.2.2 男女用户消费差异
由于源数据信息有限,这里只分析男女用户消费频次差异;
a、统计每个用户的消费次数,注意要带性别
select
a.userid,
sex,
count(1) as cons
from orderinfo a
inner join (select * from userinfo where sex<>'') b
on a.userid=b.userid
group by a.userid,sex;
b、对性别做一个消费次数平均计算
select
sex,
avg(cons) as avg_cons
from (select
a.userid,
sex,
count(1) as cons
from orderinfo a
inner join (select * from userinfo where sex<>'') b
on a.userid=b.userid
group by a.userid,sex) a
group by sex;
查询结果显示,男女生消费频次几乎相等,证明性别对该商品的销售情况无影响。
3.2.3 年龄分布情况
# a、计算每个用户的年龄,并对年龄进行分层:0-10:1,11-20:2,21-30:3
select
userid,
birth,
now(),
ceil(timestampdiff(year,birth,now())/10) as age
from userinfo
where birth>'1901-00-00';
# b、关联订单信息,获取不同年龄段的一个消费频次和消费金额
select
a.userid,
age,
count(1) as cons,
sum(price) as prices
from orderinfo a
inner join (select
userid,
birth,
now(),
ceil(timestampdiff(year,birth,now())/10) as age
from userinfo
where birth>'1901-00-00') b
on a.userid=b.userid
group by a.userid,age;
# c、再对年龄分层进行聚合,得到不同年龄层的消费情况
select
age,
avg(cons),
avg(prices)
from (select
a.userid,
age,
count(1) as cons,
sum(price) as prices
from orderinfo a
inner join (select
userid,
birth,
now(),
ceil(timestampdiff(year,birth,now())/10) as age
from userinfo
where birth>'1901-00-00') b
on a.userid=b.userid
group by a.userid,age) a
group by age;
age列表示年龄除以10之后,向上取余的结果。所以4表示在[30-40)岁之间,以此类推。
由于源数据脏数据太多了,这个结果无法直观的看出年龄分布情况,这里只是了解下分析方法。
3.2.4 用户生命周期
用户生命周期为多次消费的用户,第一次和最后一次消费间隔是多少天;
select
userid,
min(paidTime),
max(paidTime),
datediff(max(paidTime), min(paidTime)) as 时间间隔
from orderinfo
where isPaid="已支付"
group by userid
having count(1)>1;