离线数仓三

指标开发-时间维度分析业务开发

需求分析

  • 统计2019年期间每个季度的销售订单笔数、订单总额
  • 统计2019年期间每个月的销售订单笔数、订单总额
  • 统计2019年期间每周一到周日的销售订单笔数、订单总额
  • 统计2019年期间国家法定节假日、休息日、工作日的订单笔数、订单总额

生成日期维度数据

经过需求分析我们发现指标统计需要的日期类型比较多如果在事实表的时间字段上做文章,比较复杂,甚至有些根本无法获取到比如节假日、休息日、工作日信息。所以我们可以考虑提前准备一张日期维度表,准备好指标统计的日期类型。

利用我们前面学习的kettle工具生成日期维度数据,需要借助于kettle的js脚本组件。

kettle实现思路:

  • 日期使用 Canlendar、SimpleDateFormat,编写Javascript脚本实现
  • 国家法定节假日实现,需要往http://timor.tech/api/holiday/info/2018-03-02年月日,发送请求,获取结果,以下为该REST API接口简介
1、接口地址:http://timor.tech/api/holiday/info/数字日期,支持https协议。
2、返回数据:正常工作日对应结果为 0,,休息日对应结果为 1, 法定节假日对应结果为 2
3、节假日数据说明:本接口包含2017年起的中国法定节假日数据,数据来源国务院发布的公告,每年更新1次,确保数据最新
4、示例:
http://timor.tech/api/holiday/info/2018-03-02
返回数据:
{"code":0,"type":{"type":0,"name":"周五","week":5},"holiday":null}

{
  "code": 0,              // 0服务正常。-1服务出错
  "type": {
    "type": enum(0, 1, 2), // 节假日类型,分别表示 工作日、周末、节日。
    "name": "周六",         // 节假日类型中文名,可能值为 周一 至 周日、假期的名字、某某调休。
    "week": enum(1 - 7)    // 一周中的第几天。值为 1 - 7,分别表示 周一 至 周日。
  },
  "holiday": {
    "holiday": false,     // true表示是节假日,false表示是调休
    "name": "国庆前调休",  // 节假日的中文名。如果是调休,则是调休的中文名,例如'国庆前调休'
    "wage": 1,            // 薪资倍数,1表示是1倍工资
    "after": false,       // 只在调休下有该字段。true表示放完假后调休,false表示先调休再放假
    "target": '国庆节'     // 只在调休下有该字段。表示调休的节假日
  }
}
类型 示例值 中文名
date_key string 20000101 代理键
date_value string 2000-01-01 年-月-日
day_in_year string 1 当年的第几天
day_in_month string 1 当月的第几天
is_first_day_in_month string y 是否月的第一天
is_last_day_in_month string n 是否月的最后一天
weekday string 星期一 星期
week_in_month string 1 月的第几个星期
is_first_day_in_week string y、n 是否周一
is_dayoff string y、n 是否休息日
is_workday string y、n 是否工作日
is_holiday string y、n 是否国家法定节假日
date_type string workday、weekend、holiday<br />工作日、周末、法定节假日 日期类型<br />工作日:workday<br />国家法定节假日:holiday<br /><br />休息日:weekend
month_number string 1、2、..、12 月份
year string 2000 年份
quarter_name string Q1 季度名称
quarter_number string 1 季度
year_quarter string 2000-Q1 年-季度
year_month_number string 2000-01 年-月份

参考数据:

dim_date.date_value dim_date.day_in_month dim_date.is_first_day_in_month dim_date.is_last_day_in_month dim_date.weekday dim_date.week_in_month dim_date.is_first_day_in_week dim_date.is_dayoff dim_date.is_workday dim_date.is_holiday dim_date.date_type dim_date.month_number dim_date.year dim_date.year_month_number dim_date.quarter_name dim_date.quarter_number dim_date.year_quarter
2019-1-1 1 y n 2 1 n n n y holiday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-3 3 n n 4 1 n n y n workday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-4 4 n n 5 1 n n y n workday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-5 5 n n 6 1 n y n n weekend 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-2 2 n n 3 1 n n y n workday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-6 6 n n 7 2 n y n n weekend 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-7 7 n n 1 2 y n y n workday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-8 8 n n 2 2 n n y n workday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-9 9 n n 3 2 n n y n workday 1 2019 Jan-19 Q1 1 2019-Q1
2019-1-10 10 n n 4 2 n n y n workday 1 2019 Jan-19 Q1 1 2019-Q1

Hive创建日期维度表

参考SQL语句:

DROP TABLE IF EXISTS `itcast_dw`.`dim_date`;
CREATE TABLE `itcast_dw`.`dim_date`(
    `date_key` string,
    `date_value` string,
    `day_in_year` string,
    `day_in_month` string,
    `is_first_day_in_month` string,
    `is_last_day_in_month` string,
    `weekday` string,
    `week_in_month` string,
    `is_first_day_in_week` string,
    `is_dayoff` string,
    `is_workday` string,
    `is_holiday` string,
    `date_type` string,
    `month_number` string,                        
    `year` string,
    `year_month_number` string,
    `quarter_name` string,
    `quarter_number` string,
    `year_quarter` string
)
stored as parquet TBLPROPERTIES('parquet.compression'='SNAPPY');

使用Kettle构建以下组件结构图

  • 生成记录
  • 增加序列
  • Javascript代码
  • Http Client组件(5个并行度)
  • JSON Input
  • Switch/case
  • 字段映射为常量
    • 工作日
    • 法定节假日
    • 节假日调休补班
    • 休息日
  • 字段选择
  • Parquet Output
kettle生成自定义日期维度数据组件图

组件配置

1、生成记录组件

生成记录组件-配置限制为 365
配置增加序列组件

3、Javascript组件,生成基本时间维度数据

 //初始日期
        var initStr = "2019-01-01";
        //转为日期对象
        var dateFormat = new java.text.SimpleDateFormat("yyyy-MM-dd");
        var initDate = dateFormat.parse(initStr);
        //获取日历对象
        var calendar = java.util.Calendar.getInstance();
        //设置日历的时间就是我们的初始时间
        calendar.setTime(initDate);
        //使用日历加减天,变换日期
        calendar.add(java.util.Calendar.DAY_OF_MONTH, setup);
        //获取增加一天之后的日期
        var newDate = calendar.getTime();
        var date_value = dateFormat.format(newDate);
        dateFormat.applyPattern("yyyyMMdd");
        var date_key = dateFormat.format(newDate);
        //一年当中第几天?
       var day_in_year = calendar.get(java.util.Calendar.DAY_OF_YEAR)+"";
        //一月当中第几天
        var day_in_month = calendar.get(java.util.Calendar.DAY_OF_MONTH)+"";
        //是否是月的第一天
        var is_first_day_in_month = "n";
        if (day_in_month.equals("1")) {
            is_first_day_in_month = "y";
        }
        //是否是月的最后一天
        var is_last_day_in_month = "n";
        calendar.add(java.util.Calendar.DAY_OF_MONTH, 1);
        var nextDay = calendar.get(java.util.Calendar.DAY_OF_MONTH) + "";
        if (nextDay.equals("1")) {
            is_last_day_in_month = "y";
        }
        //注意增加之后的日历对象需要再减回去
        calendar.add(java.util.Calendar.DAY_OF_MONTH, -1);

        //星期几
       var weekday= (calendar.get(java.util.Calendar.DAY_OF_WEEK)-1)+"";
       //星期日:1,星期六是7
        if(weekday.equals("0")){
            weekday="7";
        }
        //获取月中第几周,先减一保证时间正确(我们的规则)
        calendar.add(java.util.Calendar.DAY_OF_MONTH, -1);
        var week_in_month = calendar.get(java.util.Calendar.WEEK_OF_MONTH)+"";
        //再增加回去
        calendar.add(java.util.Calendar.DAY_OF_MONTH, 1);
        //是否是周一
        var is_first_day_in_week = "n";
        if (weekday.equals("1")) {
            is_first_day_in_week = "y";
        }

        var is_dayoff = "n";
        var is_workday = "n";
        var is_holiday = "n";
        var date_type = "workday";
        //定义查询的url
        var holidayUrl = "http://timor.tech/api/holiday/info/" + date_value;
        //月份
        dateFormat.applyPattern("MM");
        var month_number = dateFormat.format(newDate);

        dateFormat.applyPattern("yyyy");
        var year = dateFormat.format(newDate);
        var year_month_number = year + "-" + month_number;
        //季度
        var quarter_name = "";
        var quarter_number = "";
        var year_quarter = "";
        //按照我们的规则,否则默认是国外规则,星期日属于下个周
        switch (calendar.get(java.util.Calendar.MONTH) + 1) {
            case 1:
            case 2:
            case 3:
                quarter_name = "Q1";
                quarter_number = "1";
                year_quarter = year + "-" + quarter_name;
                break;
            case 4:
            case 5:
            case 6:
                quarter_name = "Q2";
                quarter_number = "2";
                year_quarter = year + "-" + quarter_name;
                break;
            case 7:
            case 8:
            case 9:
                quarter_name = "Q3";
                quarter_number = "3";
                year_quarter = year + "-" + quarter_name;
                break;
            case 10:
            case 11:
            case 12:
                quarter_name = "Q4";
                quarter_number = "4";
                year_quarter = year + "-" + quarter_name;
                break;

        }

4、配置HttpClient组件

配置HttpClient组件

5、配置JSON Input组件

配置JSON Input组件

6、配置 switch/case 组件

配置 switch/case 组件

7、配置工作日、法定节假日、休息日等组件

配置工作日、法定节假日、休息日等组件

8、配置字段选择组件

配置字段选择组件

9、配置parquet输出

配置parquet输出

验证数据

select * from itcast_dw.dim_date limit 10;

创建ads层数据表

drop table if exists itcast_ads.ads_order_date;
create table itcast_ads.ads_order_date(
    date_type string,   -- 时间维度类型
    date_val string,   -- 时间维度值
    order_count bigint, -- 订单笔数
    order_money double  -- 订单交易额
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;

统计半年期间每个季度的销售订单笔数、订单总额

时间维度类型:

  • 1 → 季度
  • 2 → 月度
  • 3 → 星期
  • 4 → 日期类型(节假日、休息日、工作日)

统计2019年期间每个季度的销售订单笔数、订单总额

insert overwrite table itcast_ads.ads_order_date partition(dt='20190909')
select
    '1' as date_type,                                      -- 时间维度类型,1表示季度
    t1.year_quarter as date_val,                           -- 季度名称
    count(orderid),                            -- 订单总数
    sum(totalMoney)                         -- 订单总额
from 
    itcast_dw.dim_date t1
    left join
    (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
    on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
group by t1.year_quarter
order by t1.year_quarter;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '1';

统计2019年期间每个月的销售订单笔数、订单总额

insert into table itcast_ads.ads_order_date partition(dt='20190909')
select
    '2' as date_type, --2表示月类型
    t1.year_month_number as date_val,
    count(orderid),                  -- 订单总数
    sum(totalMoney)                         -- 订单总额
from
    itcast_dw.dim_date t1
    left join
     (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
     on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
 group by 
    t1.year_month_number
order by t1.year_month_number;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '2';

统计2019年期间每周一到周日的销售订单笔数、订单总额

insert into table itcast_ads.ads_order_date partition(dt='20190909')
select
    '3' as date_type,
    t1.weekday as date_val,
     count(orderid),                  -- 订单总数
    sum(totalMoney)                         -- 订单总额
from 
    itcast_dw.dim_date t1
    left join
    (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
     on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
group by
    t1.weekday
order by t1.weekday;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '3';

统计2019年期间国家法定节假日、休息日、工作日的订单笔数、订单总额

insert into table itcast_ads.ads_order_date partition(dt='20190909')
select
    '4' as date_type,
    t1.date_type as date_val,
    count(orderid) as order_count,                  -- 订单总数
    sum(totalMoney) as order_money                        -- 订单总额
from 
    itcast_dw.dim_date t1
    left join
    (select * from itcast_dw.fact_orders where dt >='20190101' and dw_end_date='9999-12-31') t2
     on t1.date_value = date_format(createtime, 'yyyy-MM-dd')
group by
    t1.date_type
order by t1.date_type;

-- 测试
select * from itcast_ads.ads_order_date where dt='20190909' and date_type = '4';

指标开发-订单分析地域、分类维度分析业务开发

需求分析

集团总公司分为很多的分公司(销售事业部)

分公司
唐山市分公司
邯郸市分公司
邢台市分公司
保定市分公司
承德市分公司
沧州市分公司
廊坊市分公司
衡水市分公司

为了公司的经营需要,公司需要定期检查各个分公司的经营指标完成情况,运营部门提出数据分析需求:

  • 交易金额
  • 交易笔数
  • 微信交易笔数
  • 支付宝交易笔数

通过以上指标数据,公司的运营人员能够了解到各个分公司的经营情况,在出现经营问题时,能够及时发现,并根据问题,去寻找问题原因。

大区 城市 一级分类 二级分类 微信交易笔数 支付宝交易笔数 交易笔数 交易金额
0 9999 0 0 1001341 1238581 2239922 62348998

维度如下:

  • 商品分类
  • 行政区域

要求:最终可以根据不同大区、不同城市、不能分类级别查询交易数据,也就是要求支持不同维度的组合查询。

创建 dw 层数据表

该层主要创建维度表与事实表。为了便于识别,维度表增加 dim_ 前缀。事实表增加 fact_ 前缀:

fact_order_goods_wide

-- 1. 创建订单明细事实表
DROP TABLE IF EXISTS itcast_dw.fact_order_goods_wide;
create table if not exists itcast_dw.fact_order_goods_wide(
    order_id string,            -- 订单id
    goods_cat_3d_id string,     -- 商品三级分类id
    shop_id string,             -- 店铺id
    payment_id string,          -- 订单支付方式
    goods_num bigint,           -- 商品数量
    pay_money double,           -- 订单明细金额
    paytime string              -- 订单时间
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

dim_goods_cat

-- 表创建
DROP TABLE IF EXISTS itcast_dw.dim_goods_cat;
create table if not exists itcast_dw.dim_goods_cat(
    cat_3d_id string,                   -- 三级商品分类id
    cat_3d_name string,                 -- 三级商品分类名称
    cat_2d_id string,                    -- 二级商品分类Id
    cat_2d_name string,                  -- 二级商品分类名称
    cat_1t_id string,                   -- 一级商品分类id
    cat_1t_name string                  -- 一级商品分类名称
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

dim_shops

-- 创建dw层商铺维度表
DROP TABLE IF EXISTS itcast_dw.dim_shops;
-- 1. 店铺维度表
create table if not exists itcast_dw.dim_shops(
    shop_id string,             -- 店铺id,
    shop_name string,           -- 店铺名称
    city_id string,             -- 店铺所属城市组织机构id
    city_name string,           -- 店铺所属城市组织机构名称
    region_id string,         -- 店铺所属省份组织机构id
    region_name string        -- 店铺所属省份组织机构名称
)
partitioned by (dt string)
STORED AS PARQUET  TBLPROPERTIES('parquet.compression'='SNAPPY');

dim_payment

-- 1. 创建支付方式维度表
drop table if exists itcast_dw.dim_payment;
create table if not exists itcast_dw.dim_payment(
    payment_id string,          -- 支付方式id
    payment_name string         -- 支付方式名称
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

创建 ads 层数据表

-- 创建ads(数据集市层)订单分析表
DROP TABLE IF EXISTS itcast_ads.ads_trade_order;
create table if not exists itcast_ads.ads_trade_order(
    area_type string,               -- 区域范围:区域类型(全国、大区、城市)
    region_name string,             -- 区域名称
    city_name string,               -- 城市名称
    category_type string,           --分类类型(一级、二级)
    category_1st_name string,       -- 一级分类名称
    category_2d_name string,        -- 二级分类名称
    payment_name string,            -- 支付方式(所有、微信、支付宝、...)
    total_count bigint,             -- 订单数量
    total_goods_num bigint,         -- 商品数量
    total_money double              -- 支付金额
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;

ods层数据至dw层

fact_orders

直接查询之前准备好的拉链表:

select * from itcast_dw.fact_orders where  dt ='20190909' and dw_end_date = '9999-12-31' limit 5;

dim_goods

直接查询之前准备好的拉链表:

-- 测试
select * from itcast_dw.dim_goods where dw_end_date = '9999-12-31' limit 5;

dw层数据维度拉宽

数据维度表拉宽

区域店铺维度数据拉宽

店铺表(itcast_shops)与组织机构表(itcast_org)进行关联

列名称 数据类型 数据来源
shop_id string itcast_shops
shop_name string itcast_shops
city_id string itcast_org
city_name string itcast_org
region_id string itcast_org
region_name string itcast_org

参考代码:

-- 2. 加载订单组织机构维度表数据
insert overwrite table itcast_dw.dim_shops partition(dt='20190909')
select
    t1.shopid as shop_id,           -- 店铺id
    t1.shopname as shop_name,       -- 店铺名称
    t2.orgid as city_id,            -- 城市组织机构id
    t2.orgname as city_name,        -- 城市组织机构名称
    t3.orgid as region_id,          -- 区域组织机构id
    t3.orgname as region_name       -- 区域组织机构名称
from
    (select shopid, shopname, areaid from itcast_ods.itcast_shops where dt='20190909') t1                         -- 店铺数据
    left join
    (select orgid, parentid, orgname, orglevel from itcast_ods.itcast_org where orglevel=2 and dt='20190909') t2    -- 城市组织机构数据
    on t1.areaid = t2.orgid
    left join
    (select orgid, parentid, orgname, orglevel from itcast_ods.itcast_org where orglevel=1 and dt='20190909') t3    -- 省份组织机构数据
    on t2.parentid = t3.orgid;
    -- 测试数据
    select * from itcast_dw.dim_org limit 15;
商品分类维度数据拉宽

商品表维度数据拉宽

参考代码:

列名称 数据类型 数据来源
cat_3d_id string itcast_goods_cats
cat_3d_name string itcast_goods_cats
cat_2d_id string itcast_goods_cats
cat_2d_name string itcast_goods_cats
cat_1t_id string itcast_goods_cats
cat_1t_name string itcast_goods_cats
insert overwrite table itcast_dw.dim_goods_cat partition(dt='20190909')
select 
    t3.catid as cat_3d_id,          -- 三级分类id
    t3.catname as cat_3d_name,      -- 三级分类名称
    t2.catid as cat_2d_id,          -- 二级分类id
    t2.catname as cat_2d_name,      -- 二级分类名称
    t1.catid as cat_1t_id,          -- 一级分类id
    t1.catname as cat_1t_name       -- 一级分类名称
from 
    (select catid, catname, parentid from itcast_ods.itcast_goods_cats where cat_level=3 and dt='20190909') t3 -- 商品三级分类数据
    left join
    (select catid, catname, parentid from itcast_ods.itcast_goods_cats where cat_level=2 and dt='20190909') t2 -- 商品二级分类数据
    on t3.parentid = t2.catid
    left join
    (select catid, catname, parentid from itcast_ods.itcast_goods_cats where cat_level=1 and dt='20190909') t1 -- 商品一级分类数据
    on t2.parentid = t1.catid;
    
-- 测试数据
select * from itcast_dw.dim_goods_cat limit 5;
支付方式维度表拉宽
列名称 数据类型 数据来源
payment_id string itcast_payments
payment_name string itcast_payments
-- 2. 加载支付方式维度数据
-- 需要额外添加一行数据 0 -> 其他
insert overwrite table itcast_dw.dim_payment partition(dt='20190909')
select 
    t1.id as payment_id,        -- 支付方式id
    t1.payName as payment_name  -- 支付方式名称
from 
    (select id, payName from itcast_ods.itcast_payments where dt='20190909') t1;

    -- 测试查询支付方式维度数据
select * from itcast_dw.dim_payment limit 5;
事实表拉宽
列名称 数据类型 数据来源
order_id string itcast_orders
goods_cat_3d_id string dim_goods
shop_id string dim_goods
payment_id string itcast_orders
goods_num bigint itcast_order_goods
pay_money double itcast_orders
paytime string itcast_orders
-- 2. 拉宽订单明细事实表
insert overwrite table itcast_dw.fact_order_goods_wide partition(dt='20190909')
select
    t1.orderid as order_id,
    t3.goodscatid as goods_cat_3d_id,
    t3.shopid as shop_id,
    t1.paytype as payment_id,
    t2.goodsnum as goods_num,
    t2.goodsnum*t2.payprice  as pay_money,
    t1.paytime as paytime
from
    (select orderid, paytype, paytime from itcast_dw.fact_orders where dt='20190909') t1                     -- 订单表数据
    left join
    (select orderid, goodsid, goodsnum, payprice from itcast_ods.itcast_order_goods where dt='20190909') t2     -- 订单明细数
    on t1.orderid = t2.orderid
    left join
    (select goodsid, shopid, goodscatid from itcast_dw.dim_goods where dw_end_date = '9999-12-31') t3           -- 商品数量
    on t2.goodsid = t3.goodsid;
    
-- 测试数据
select * from itcast_dw.fact_order_goods_wide limit 5;

指标计算

-- 指标统计
-- 创建 ads 层结果表
create table if not exists itcast_ads.ads_trade_order(
    area_type string,               -- 区域范围:区域类型(全国、大区、城市)
    region_name string,             -- 区域名称
    city_name string,               -- 城市名称
    category_type string,           --分类类型(一级、二级)
    category_1st_name string,       -- 一级分类名称
    category_2d_name string,        -- 二级分类名称
    payment_name string,            -- 支付方式(所有、微信、支付宝、...)
    total_count bigint,             -- 订单数量
    total_goods_num bigint,         -- 商品数量
    total_money double              -- 支付金额
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;

指标计算汇总计算

维度数据:

  • 区域维度数据
  • 商品分类维度数据。

根据需求,需要依据这两个维度进行汇总。最终在业务系统中展示如下:

数据展示示例

公司高管拥有查询所有数据的权限,省份不选择时,看到的为全国范围内的数据,如果指定省份(本次开发的系统为大区)时,看到的为指定省份数据,如果指定某个城市,则看到的是指定城市的数据,同理商品分类维度类似推算。

由于区域维度为分级维度,分为全国、大区、城市三级。通常情况下,我们会计算出城市维度,然后对城市维度数据进行汇总计算得出大区维度和全国维度数据,实际上,由于一个订单,由于会有多个商品,这些商品属于不同的商店,这些商店又属于不同的区域,故不能简单累加计算。

ogid orderid shopid cityid regionid goodsid dt
120242 532 100050 北京 华北 100111 20190905
120243 532 100058 石家庄 华北 100176 20190905
120244 532 100062 上海 华东 100105 20190905

根据此订单信息分析,orderid为532的订单,实际上仅为一个订单,该订单共计3件商品,分别属于不同的商店,这些商店分属北京、石家庄、上海城市。

在区域维度计算时。精确到城市维度数据计算应该如下:

大区 城市 订单量
华北 北京 1
华北 石家庄 1
华东 上海 1

精确到大区维度数据计算应该如下:

大区 订单量
华北 1
华东 1

精确到全国维度数据计算时结果数据应该如下:

区域 订单量
全国 1

通过分析,发现如果使用已经计算过的城市维度的数据汇总至大区时,华北大区的订单量会被计算为2。故不能简单累加。每个区域级别的数据需要单独计算,同理在商品分类维度中,也有同样的情况。

全国、无商品分类维度的交易信息

由于需求中需要列出支付宝支付笔数,微信交易笔数与总交易笔数,根据数据信息,支付共计分为支付宝交易、微信交易、现金交易、其他交易和未知支付类型(支付方式,0:未知;1:支付宝,2:微信;3、现金;4、其他)

故每个维度的数据计算时,除按照支付类型进行分组获取各种支付类型支付的数据外,还要获取不分任何类型的数据。

需求:

-- 1、全国、无商品分类不分支付类型维度分析

-- 2、 全国、无商品分类分支付类型维度分析

注意事项:

第一个INSERT导入数据到数据集市,使用overwrite,第二个则应该使用into

计算如下:

-- 1、全国、无商品分类不分支付类型维度分析
insert overwrite table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t1.order_id) as total_count,--订单笔数需要去重
    sum(t1.goods_num) as total_goods_num,--总的商品数量
    sum(t1.pay_money ) as total_money
from
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t1;
    
-- 2、测试获取数据
select * from itcast_ads.ads_trade_order t where t.area_type = '全国';
-- 全国、无商品分类分支付类型维度分析
-- 注意:要使用 insert into,否则数据会被覆写
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    t1.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
(select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
left join
    (select * from itcast_dw.dim_payment where dt='20190909') t1   
    on t1.payment_id = t2.payment_id
group by t1.payment_name;

-- 测试查询数据
 select * from itcast_ads.ads_trade_order t1 where t1.payment_name != '所有' limit 5;

全国、一级商品分类维度交易信息

需求:

1、获取全国、一级商品分类维度的不分支付类型数据

2、获取全国、一级商品分类维度的分支付类型数据

-- 获取全国、一级商品分类维度的分支付类型数据
 -- 3.1 获取全国、一级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t1.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(case when t2.goods_num is null
       then 0
       else t2.goods_num
        end
       ) as total_goods_num,
    sum(case when t2.pay_money is null
        then 0
        else t2.pay_money
        end
        ) as total_money
from
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t1
    left join 
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.cat_3d_id = t2.goods_cat_3d_id
group by t1.cat_1t_name;
-- 测试
select * from itcast_ads.ads_trade_order where dt='20190909' and category_type = '一级分类' and payment_name = '所有';
 -- 3.2 获取全国、一级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t1.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_payment where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.payment_id = t2.payment_id
    left join 
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.payment_name, t3.cat_1t_name;

-- 测试查询数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '全国' and category_type = '一级分类' and payment_name != '所有';

全国、二级商品分类维度交易信息

需求:

1、获取全国、二级商品分类维度的分支付类型数据

2、获取全国、二级商品不分类维度的分支付类型数据

-- 全国、二级商品分类维度分析
-- 1. 获取全国、二级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    t3.cat_2d_name as category_2d_name,
    t1.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_payment where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.payment_id = t2.payment_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.payment_name, t3.cat_1t_name, t3.cat_2d_name;

-- 测试查询数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '全国' and category_type = '二级分类' limit 10;

-- 2. 获取全国、二级商品不分类维度的不分支付类型数据
insert overwrite table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '全国' as area_type,
    '' as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t2.cat_1t_name as category_1st_name,
    t2.cat_2d_name as category_2d_name,
    '所有' as payment_name,
    count(distinct t1.order_id) as total_count,
    sum(t1.goods_num) as total_goods_num,
    sum(t1.pay_money) as total_money
from
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t1
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t2
    on t1.goods_cat_3d_id = t2.cat_3d_id
group by t2.cat_1t_name, t2.cat_2d_name;

-- 测试查询数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '全国' and category_type = '二级分类' and payment_name = '所有' limit 10;

大区、无商品分类维度的交易信息

需求:

1、获取大区、无商品分类维度的分支付类型数据

2、获取大区、无商品分类维度的不分支付类型数据

参考代码:

-- 大区、无商品分类维度的交易信息
-- 1、获取大区、无商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    t3.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t3
    on t2.payment_id = t3.payment_id
group by t1.region_name, t3.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '所有';

-- 2、获取大区、不分商品不分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
group by t1.region_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '所有' and payment_name = '所有';

大区、一级商品分类维度交易信息

1、获取大区、一级商品分类维度的分支付类型数据

2、获取大区、一级商品分类维度的不分支付类型数据

参考代码:

-- 3. 大区、一级商品分类维度交易信息
-- 3.1、获取大区、一级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t4.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shop where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t4
    on t2.payment_id = t4.payment_id
group by t1.region_name, t3.cat_1t_name, t4.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '一级分类' and payment_name != '所有';


-- 3.2、获取大区、一级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shop where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t3.cat_1t_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '一级分类' and payment_name = '所有';

大区、二级商品分类维度交易信息

1、获取大区、二级商品分类维度的分支付类型数据

2、获取大区、二级商品分类维度的不分支付类型数据

参考代码:

-- 大区、二级商品分类维度交易信息
--- 1、获取大区、二级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t4.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t4
    on t2.payment_id = t4.payment_id
group by t1.region_name, t3.cat_1t_name, t3.cat_2d_name, t4.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '二级分类' limit 10;

-- 2、获取大区、二级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '大区' as area_type,
    t1.region_name as region_name,
    '' as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t3.cat_1t_name, t3.cat_2d_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '大区' and category_type = '二级分类' and payment_name='所有';

城市、无商品分类维度交易信息

1、 获取城市、无商品分类维度的分支付类型数据

2、获取城市、无商品分类维度的不分支付类型数据

参考代码:

-- #################
-- 城市、无商品分类维度交易信息
-- #################
-- 1. 获取城市、无商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    t3.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t3
    on t2.payment_id = t3.payment_id
group by t1.region_name, t1.city_name, t3.payment_name;

-- 获取测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '所有';

-- 2、获取城市、无商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '所有' as category_type,
    '' as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
group by t1.region_name, t1.city_name;


-- 获取测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '所有' and payment_name = '所有';

城市、一级商品分类维度交易信息

1、获取城市、一级商品分类维度的不分支付类型数据

2、获取城市、一级商品分类维度的分支付类型数据

参考代码:

-- #### 城市、一级商品分类维度交易信息
-- #################
-- 1、获取城市、一级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '一级分类' as category_type,
    t4.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    t3.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t3
    on t2.payment_id = t3.payment_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t4
    on t2.goods_cat_3d_id = t4.cat_3d_id
group by t1.region_name, t1.city_name, t3.payment_name, t4.cat_1t_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '一级分类';

-- 2、获取城市、一级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '一级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    '' as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t1.city_name, t3.cat_1t_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '一级分类' and payment_name = '所有';

城市、二级商品分类维度交易信息

1、获取城市、二级商品分类维度的不分支付类型数据

2、获取城市、二级商品分类维度的分支付类型数据

参考代码:

-- #### 城市、二级商品分类维度交易信息
-- #################
-- 1、获取城市、二级商品分类维度的分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    t3.cat_2d_name as category_2d_name,
    t4.payment_name as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
    left join
    (select * from itcast_dw.dim_payment where dt='20190909') t4
    on t2.payment_id = t4.payment_id
group by t1.region_name, t1.city_name, t3.cat_1t_name, t3.cat_2d_name, t4.payment_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '二级分类';

-- 2、获取城市、二级商品分类维度的不分支付类型数据
insert into table itcast_ads.ads_trade_order partition(dt='20190909')
select
    '城市' as area_type,
    t1.region_name as region_name,
    t1.city_name as city_name,
    '二级分类' as category_type,
    t3.cat_1t_name as category_1st_name,
    t3.cat_2d_name as category_2d_name,
    '所有' as payment_name,
    count(distinct t2.order_id) as total_count,
    sum(t2.goods_num) as total_goods_num,
    sum(t2.pay_money) as total_money
from
    (select * from itcast_dw.dim_shops where dt='20190909') t1
    left join
    (select * from itcast_dw.fact_order_goods_wide where dt='20190909') t2
    on t1.shop_id = t2.shop_id
    left join
    (select * from itcast_dw.dim_goods_cat where dt='20190909') t3
    on t2.goods_cat_3d_id = t3.cat_3d_id
group by t1.region_name, t1.city_name, t3.cat_1t_name, t3.cat_2d_name;

-- 测试数据
select * from itcast_ads.ads_trade_order where dt='20190909' and area_type = '城市' and category_type = '二级分类' and payment_name = '所有';

指标开发-用户订单指标业务开发

需求分析

  • 电商平台往往需要根据用户的购买数据来分析用户的行为
  • 根据用户的消费行为习惯,给运营部门提供用户分析数据指标
  • 主要围绕订单表开展分析

基于用户的订单指标

统计指标 所需字段
第一次消费时间 支付时间
最近一次消费时间 支付时间
首单距今时间 支付时间
尾单距今时间 支付时间
最小消费金额 订单支付金额
最大消费金额 订单支付金额
累计消费次数(不含退拒) 订单id、订单状态
累计消费金额(不含退拒) 订单支付金额、订单状态
近30天购买次数(不含退拒) 是否为近30天、订单id、订单状态
近30天购买金额(不含退拒) 是否为近30天、订单支付金额、订单状态
近30天购买次数(含退拒) 是否为近30天、订单id
近30天购买金额(含退拒) 是否为近30天、订单支付金额
客单价(含退拒) 订单支付金额、客户订单数
客单价(不含退拒) 订单支付金额、客户订单数、订单状态
近60天客单价(含退拒) 是否为近60天、订单支付金额、客户订单数
近60天客单价(不含退拒) 是否为近60天、订单支付金额、订单状态
常用收货地址 用户收货地址
常用支付方式 支付方式
学校下单总数 用户收货地址
单位下单总数 用户收货地址
家里下单总数 用户收货地址
上午下单总数 订单id、是否为上午
下午下单总数 订单id、是否为下午
晚上下单总数 订单id、是否为晚上

订单状态

  • 交易未成功(拒收/reject)

    • 订单未送达、送达后未签收或签收后办理了退货
    • 订单状态显示「交易未成功」
  • 取消(退货/return)

    • 订单中的商品缺货、订单超出了订单保留期限,或订单进行了取消操作
    • 订单显示「取消」状态
  • 等待支付

    • 用户选择 「网上支付」和 「银行转账」这两种付款方式
    • 在暂未收到款项时,订单会显示「等待支付」
  • 等待预售商品到货

    • 订购的商品为预售商品
    • 商品到货前订单会显示 「等待预售商品到货」
  • 正在配货

    • 订单正在库房配货。
  • 等待移仓

    • 订单中的商品当地库房缺货
    • 将从北京仓库调货至当地仓库,订单显示「等待移仓」
  • 正在移仓

    • 订单中的商品正从北京仓库发往当地仓库,订单会显示「正在移仓」
  • 已配货

    • 此订单已完成配货,正在等待发货
  • 已发货

    • 订单已从库房发出,正在配送途中,订单显示「已发货」
  • 已送达

    1. 已收到商品并在“我的订单”中进行了「收货反馈」,订单会显示「已送达」
    2. 未进行「收货反馈」操作,系统在发货后的20天默认变为「已送达」
    3. 订单状态为「已发货」,但物流配送信息显示「配送成功」时,系统默认态显示为「已送达」
    4. 国内平邮订单,未进行「确认收货」操作,系统在发货25天后默认收到商品,订单状态显示为「已送达」
    5. 海外订单,如果未进行「确认收货」操作,系统会在发货60天后默认您收到商品,订单状态显示为「已送达」
  • 交易成功

    • 若订单状态为」已送达」,且此状态后的15天内未发生退货,系统将默认变为「交易成功」

创建dw层表

1、创建itcast_dw.dim_user表

字段 说明
userId 用户id
loginName 登录名
userSex 性别
drop table if exists itcast_dw.dim_user;
create table itcast_dw.dim_user(
    userId          bigint,
    loginName       string,
    userSex         bigint
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

2、创建dw层用户收货地址表 itcast_dw.dim_user_address

--创建dw层dim_user_address表
drop table if exists itcast_dw.dim_user_address;
create table itcast_dw.dim_user_address(
    addressId    bigint,
    userId       bigint,
    userName     string,
    otherName    string,
    userPhone    string,
    areaIdPath   string,
    areaId       bigint,
    userAddress  string,
    isDefault    bigint,
    dataFlag     bigint,
    createTime   string
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

--从ods层itcast_user_address导出数据到dim_user_address表
insert overwrite table itcast_dw.dim_user_address partition(dt="20190909")
select 
addressId,
userId,
userName, 
otherName,
userPhone,
areaIdPath,
areaId,
userAddress,
isDefault,
dataFlag, 
createTime 
from itcast_ods.itcast_user_address where dt="20190909";

3、创建订单时间标志、地址标志宽表 fact_order_wide

字段名称 说明
... ...
flag30 近30天标志
flag60 近60天标志
flag90 近90天标志
flag180 近180天标志
flagTimeBucket 一天时间段标志(凌晨、早晨、上午...
othername 地址标志(家里、学校、工作单位..)
-- 地址拉宽
drop table itcast_dw.fact_order_wide;
create table itcast_dw.fact_order_wide(
    orderId bigint,
    orderStatus bigint,
    payType bigint,
    userId bigint,
    userAddressId bigint,
    payTime string,
    payMoney double,
    createtime string,
    flag30 bigint,
    flag60 bigint,
    flag90 bigint,
    flag180 bigint,
    flagTimeBucket string,
    othername string
)
partitioned by(dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

订单宽表ETL处理

加载用户维度数据

参考代码:

insert overwrite table itcast_dw.dim_user partition(dt='20190909')
select 
    userId,   
    loginName,
    userSex  
from
    itcast_ods.itcast_users where dt ='20190909' ;
    
--验证
select * from itcast_dw.dim_user limit 10;

时间近30天、90天、180天、订单上午、下午时间拉宽

参考代码:

insert overwrite table itcast_dw.fact_order_wide partition(dt='20190909')
select
    t1.orderId,
    t1.orderStatus,
    t1.payType,
    t1.userId,
    t1.userAddressId,
    t1.payTime,
    t1.totalMoney,
    t1.createtime,
    --近30天
    case when datediff(current_timestamp, t1.createtime) <= 30
        then 1
    else 0
    end as flag_30,
     --近60天
    case when datediff(current_timestamp, t1.createtime) <= 60
        then 1
    else 0
    end as flag_60,
     --近90天
    case when datediff(current_timestamp, t1.createtime) <= 90
        then 1
    else 0
    end as flag_90, 
     --近180天
    case when datediff(current_timestamp, t1.createtime) <= 180
    then 1
    else 0
    end as flag_180,
    --所属时间段
    case when hour(t1.createtime) >= 0 and hour(t1.createtime) < 6
        then '凌晨'
    when hour(t1.createtime) >= 6 and hour(t1.createtime) < 12
        then '上午'
    when hour(t1.createtime) >= 12 and hour(t1.createtime) < 14
        then '中午'
    when hour(t1.createtime) >= 14 and hour(t1.createtime) < 18
        then '下午'
    else '晚上'
    end as flag_time_bucket,
    --家里单位学校
    t2.othername
from 
   (select orderId,orderStatus,payType,userId,userAddressId,payTime,totalMoney,createtime from  itcast_dw.fact_orders
where dw_end_date ='9999-12-31') t1  
left join
    (select * from itcast_dw.dim_user_address where dt='20190909') t2
    on t1.userAddressId = t2.addressId;
        
-- 测试语句
select * from itcast_dw.fact_order_wide limit 5; 

指标开发

指标开发 一

指标 字段说明
第一次消费时间 支付时间
最近一次消费时间 支付时间
首单距今时间 支付时间
尾单距今时间------分析用户什么时候来购买商品以及多久没有购买了 支付时间
最小消费金额 订单支付金额
最大消费金额 订单支付金额

参考代码:

select
    t1.userid,
    t1.loginname,
    MIN(t2.payTime) as first_paytime, --首次下单时间
    MAX(t2.payTime) as lastest_paytime, --尾单时间
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t2.payTime)) as first_day_during_days,--首单距今
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t2.payTime)) as lastest_day_durning_days, --尾单距今
    MIN(t2.paymoney) as min_paymoney,
    MAX(t2.paymoney) as max_paymoney
from
   (select * from itcast_dw.fact_order_wide where dt='20190909') as t2
    left join
     (select * from itcast_dw.dim_user where dt='20190909') as t1
    on t1.userId = t2.userId
group by t1.userid,t1.loginname
limit 5;

指标开发二

指标
累计消费次数(不含退拒)
累计消费金额(不含退拒)
近30天购买次数(不含退拒)
近30天购买金额(不含退拒)
近30天购买次数(含退拒)
近30天购买金额(含退拒)----分析用户最近的消费能力

参考代码:

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30
      
from 
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
group by t2.userid,t2.loginname limit 5;


指标开发三

指标
客单价(含退拒)
客单价(不含退拒)
近60天客单价(含退拒)
近60天客单价(不含退拒)

参考代码:

-- 指标开发三
-- 1. 客单价(含退拒)
-- 2. 客单价(不含退拒)
-- 3. 近60天客单价(含退拒)-----分析用户消费水平
-- 4. 近60天客单价(不含退拒)

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
      
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      --客单价含退拒
      SUM(t1.paymoney) / SUM(1) AS atv,
       --客单价不含退拒
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
              --近60天客单价含退拒
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
          --近60天不含退拒
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback
      
from 
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
group by t2.userid,t2.loginname limit 5;


指标开发四

指标
学校下单总数
单位下单总数
家里下单总数
上午下单总数
下午下单总数
晚上下单总数

参考代码

select
    t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
        SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count-- 晚上下单总数
from
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
group by t2.userid,t2.loginname limit 5;

指标开发五

指标
常用收货地址:某个地址对应的订单数量越多则越常用

1、加载订单地址分析表

参考代码:

-- 创建订单地址分析表
drop table if exists itcast_ads.tmp_order_address;
create table itcast_ads.tmp_order_address(
    userId bigint,          -- 用户Id
    otherName string,       -- 地址类型(家里、学校...)
    totalCount bigint,     -- 下单数
    rn bigint       -- 下单排名
)
partitioned by (dt string)
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

--从tmp_order_datetag_wide统计最常用地址
insert overwrite table itcast_ads.tmp_order_address partition(dt='20190909')
select
t3.userid,
t3.othername,
t3.ordercount,
 row_number() over( partition by t3.userid order by ordercount desc ) rn  --partiton by userid:按照用户分组,order by ordercount :按照订单数量排序 降序  ,rn:组内的序号
from 
(select
    t1.userId as userid,
    t1.othername as othername,
   count(t1.orderid) as ordercount  -->每个用户每个地址订单的数量
from
    (select * from itcast_dw.fact_order_wide where dt='20190909') t1
   
group by t1.userid,t1.otherName order by t1.userid  ) t3  ;

-- 测试
select * from itcast_ads.tmp_order_address order by userId, rn limit 10;

2、统计常用收货地址指标

参考代码:

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
       SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count,    -- 晚上下单总数
      --最常用地址
max(case when t3.rn =1
then t3.othername
else ''
end) as most_usual_address 
      
from 
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
    left join
    (select * from itcast_ads.tmp_order_address where dt='20190909') as t3
    on t1.userId = t3.userId
group by t2.userid,t2.loginname limit 5;

指标开发六

指标
常用支付方式:某种支付方式对应的订单数量越多则越常用

1、加载支付方式排名

-- 创建支付方式分析表
drop table if exists itcast_ads.tmp_order_paytype;
create table itcast_ads.tmp_order_paytype(
    userid bigint,              -- 用户id
    payType bigint,           -- 支付类型id
    payName string,             -- 支付码
    totalCount bigint,          -- 订单总数
    rn bigint                   -- 等级
)
partitioned by (dt string)
STORED AS PARQUET;

-- 加载支付方式分析
insert overwrite table itcast_ads.tmp_order_paytype partition(dt='20190909')
select
    t3.*,
    row_number() over(partition by userId order by totalCount desc) rn
from
    (select
        t1.userId,
        t1.payType,
        t2.payment_name,
        sum(1) as totalCount   --sum(1)等同于count效果
    from
     (select * from itcast_dw.fact_order_wide where dt='20190909') t1
     left join 
        (select * from itcast_dw.dim_payment where dt='20190909') t2
        on t1.payType = t2.payment_id
    group by t1.userId, t1.payType,t2.payment_name) t3;
    
-- 测试
select * from itcast_ads.tmp_order_paytype limit 5;

2、统计常用支付方式指标

参考代码:

select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
       SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count,-- 晚上下单总数
      --最常用地址
max(case when t3.rn =1
then t3.othername
else ''
end) as most_usual_address,
--常用的支付方式
      MAX(case when t4.rn = 1
        then t4.payName
        else ''
        end) as most_usual_paytype
from
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
    left join
    (select * from itcast_ads.tmp_order_address where dt='20190909') as t3
    on t1.userId = t3.userId
    left join (select * from itcast_ads.tmp_order_paytype where dt='20190909') as t4
    on t1.userId = t4.userId
group by t2.userid,t2.loginname limit 5;


创建ads层表、加载数据

drop table if exists itcast_ads.user_order_measure;
create table itcast_ads.user_order_measure(
    userid string,                          -- 用户id
    username string,                        -- 用户名称
    first_paytime string,                   -- 第一次消费时间
    lastest_paytime string,                 -- 最近一次消费时间
    first_day_during_days bigint,           -- 首单距今时间
    lastest_day_durning_days bigint,        -- 尾单距今时间
    min_paymoney double,                    -- 最小消费金额
    max_paymoney double,                    -- 最大消费金额
    total_count_without_back bigint,        -- 累计消费次数(不含退拒)
    total_money_without_back double,        -- 累计消费金额(不含退拒)
    total_count_without_back_30 bigint,     -- 近30天累计消费次数(不含退拒)
    total_money_without_back_30 double,     -- 近30天累计消费金额(不含退拒)
    total_count_30 bigint,                  -- 近30天累计消费次数(含退拒)
    total_money_30 double,                  -- 近30天累计消费金额(含退拒)
    atv double,                             -- 客单价(含退拒)
    atv_withoutback double,                 -- 客单价(不含退拒)
    atv_60 double,                          -- 近60天客单价(含退拒)
    atv_60_withoutback double,              -- 近60天客单价(不含退拒)
    school_order_count bigint,              -- 学校下单总数
    company_order_count bigint,             -- 单位下单总数
    home_order_count bigint,                -- 家里下单总数
    am_order_count bigint,                  -- 上午下单总数
    pm_order_count bigint,                  -- 下午下单总数
    night_order_count bigint,               -- 晚上下单总数
    most_usual_address string,              -- 常用收货地址
    most_usual_paytype string               -- 常用支付方式
)
partitioned by (dt string)
row format delimited fields terminated by '\001' stored as TEXTFILE;
    

insert overwrite table itcast_ads.user_order_measure partition (dt='20190909')
select
   t2.userid,
    t2.loginname,
    MIN(t1.payTime) as first_paytime,
    MAX(t1.payTime) as lastest_paytime,
    DATEDIFF(CURRENT_TIMESTAMP, MIN(t1.payTime)) as first_day_during_days,
    DATEDIFF(CURRENT_TIMESTAMP, MAX(t1.payTime)) as lastest_day_durning_days,
    MIN(t1.paymoney) as min_paymoney,
    MAX(t1.paymoney) as max_paymoney,
sum(
case when t1.orderstatus !=10 and t1.orderstatus !=11
    then 1
    else 0
    end
) as total_count_without_back,--累计消费次数不含退拒,
sum(case when t1.orderstatus !=10 and t1.orderstatus !=11
   then t1.paymoney
   else 0
    end
   ) as total_money_without_back, --累计消费金额不含退拒
   --累计近30天消费次数不含退拒
   sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then 1
       else 0
       end
      ) as total_count_without_back_30,
    --累计近30天消费金额不含退拒
     sum(case when t1.flag30 =1 and t1.orderstatus !=10 and t1.orderstatus !=11
      then t1.paymoney
       else 0
       end
      ) as total_money_without_back_30,
        --累计近30天消费次数含退拒
   sum(case when t1.flag30 =1 
      then 1
       else 0
       end
      ) as total_count_without_30,
    --累计近30天消费金额含退拒
     sum(case when t1.flag30 =1 
      then t1.paymoney
       else 0
       end
      ) as total_money_with_back_30,
      SUM(t1.paymoney) / SUM(1) AS atv,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 
        then t1.paymoney else 0 end) / 
        SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 then 1 else 0
              end) AS atv_withoutback,
    SUM(case when t1.flag60 = 1 
        then t1.paymoney else 0 end) / SUM(case when t1.flag60 = 1
            then 1
            else 0
          end) AS atv_60,
    SUM(case when t1.orderStatus != 10 or t1.orderStatus != 11 and t1.flag60 = 1
        then t1.paymoney
        else 0
        end) / SUM(case when (t1.orderStatus != 10 or t1.orderStatus != 11) and t1.flag60 = 1
        then 1
        else 0
      end) AS atv_60_withoutback,
       SUM(case when t1.otherName = '学校'
        then 1
        else 0
        end) as school_order_count,      -- 学校下单总数
    SUM(case when t1.otherName = '单位'
        then 1
        else 0
        end) as company_order_count,    -- 单位下单总数
    SUM(case when t1.otherName = '家里'
        then 1
        else 0
        end) as home_order_count,        -- 家里下单总数
    SUM(case when t1.flagTimeBucket = '上午'
        then 1
        else 0
        end) as am_order_count,          -- 上午下单总数
    SUM(case when t1.flagTimeBucket = '下午'
        then 1
        else 0
        end) as pm_order_count,          -- 下午下单总数
    SUM(case when t1.flagTimeBucket = '晚上'
        then 1
        else 0
        end) as night_order_count,-- 晚上下单总数
      --最常用地址
max(case when t3.rn =1
then t3.othername
else ''
end) as most_usual_address,
--常用的支付方式
      MAX(case when t4.rn = 1
        then t4.payName
        else ''
        end) as most_usual_paytype
from
(select * from itcast_dw.fact_order_wide where dt="20190909") t1 
left join  
(select * from itcast_dw.dim_user where dt="20190909") t2 on 
t1.userid=t2.userid
    left join
    (select * from itcast_ads.tmp_order_address where dt='20190909') as t3
    on t1.userId = t3.userId
    left join (select * from itcast_ads.tmp_order_paytype where dt='20190909') as t4
    on t1.userId = t4.userId
group by t2.userid,t2.loginname ;


-- 测试
select * from itcast_ads.user_order_measure limit 10;
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 204,921评论 6 478
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 87,635评论 2 381
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 151,393评论 0 338
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 54,836评论 1 277
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 63,833评论 5 368
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,685评论 1 281
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,043评论 3 399
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,694评论 0 258
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 42,671评论 1 300
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,670评论 2 321
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,779评论 1 332
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,424评论 4 321
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,027评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,984评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,214评论 1 260
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,108评论 2 351
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,517评论 2 343

推荐阅读更多精彩内容

  • 数据仓库维度模型设计 维度建模基本概念 维度模型是数据仓库领域大师Ralph Kimall所倡导,他的《数据仓库工...
    pauls阅读 837评论 0 1
  • 1.数仓分层概述 数据仓库分为三层,自下而上为:数据引入层(ODS,Operation Data Store)数据...
    勇于自信阅读 4,045评论 0 3
  • 表的分类 1 实体表,一般是指一个现实存在的业务对象,比如用户,商品,商家等 2 维度表,一般是指对应一些业务状态...
    IsComing乜邪阅读 528评论 0 1
  • 第8章 大数据领域建模综述第9章 数据整合及管理体系第10章 维度设计(重点!)第11章 事实表设计 第8章 大数...
    天线嘟嘟茄阅读 2,943评论 0 1
  • 今天感恩节哎,感谢一直在我身边的亲朋好友。感恩相遇!感恩不离不弃。 中午开了第一次的党会,身份的转变要...
    迷月闪星情阅读 10,551评论 0 11