I. SQL Test Questions

Question 1

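Step 1:
Region information is stored in the shop table (shop_info); the maintainer and the delegation type are stored in the listing table (house_info).
Both tables are needed, so join them.
Every region must appear in the result, so use a left join from shop_info.
A listing is maintained when hold_ucid is not null. Put this condition in the on clause rather than in where: a where filter on b discards the null rows that the left join produces for regions without any maintained listing, which silently turns the left join into an inner join.
Use select * for now; the specific columns are chosen in step 2.

select *
from shop_info a
left join house_info b
on a.shop_code = b.hold_shop_code
and b.hold_ucid is not null;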

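Step 2:
For each region, count the maintained listings whose delegation type is 二手 (resale) and those whose type is 新房 (new homes).
Group by region and count conditionally with sum(if(condition, 1, 0)), the same technique used to count repeat buyers when computing a repurchase rate. Because the hold_ucid condition sits in the on clause, regions without any maintained listing still appear, with counts of 0.

select a.holdarea as 区域,
       sum(if(b.del_type = '二手', 1, 0)) as 二手房源量,
       sum(if(b.del_type = '新房', 1, 0)) as 新房房源量
from shop_info a
left join house_info b
on a.shop_code = b.hold_shop_code
and b.hold_ucid is not null
group by a.holdarea;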
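
Where the maintenance filter is placed decides whether every region survives the left join. The following is a minimal sketch of that effect, not the original environment: it runs on SQLite through Python's sqlite3 module, the sample rows and the region names North and South are invented for illustration, and CASE WHEN stands in for MySQL's if(), which SQLite lacks.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table shop_info (shop_code text, holdarea text);
    create table house_info (house_id integer, hold_shop_code text,
                             hold_ucid text, del_type text);
    insert into shop_info values ('S1', 'North'), ('S2', 'South');
    insert into house_info values
        (1, 'S1', 'u1', '二手'),
        (2, 'S1', 'u2', '新房'),
        (3, 'S1', null, '二手'),   -- unmaintained, must not be counted
        (4, 'S2', null, '新房');   -- South has no maintained listing
""")

# CASE WHEN plays the role of sum(if(..., 1, 0)) from the MySQL query.
query = """
    select a.holdarea,
           sum(case when b.del_type = '二手' then 1 else 0 end),
           sum(case when b.del_type = '新房' then 1 else 0 end)
    from shop_info a
    left join house_info b
      on a.shop_code = b.hold_shop_code {placement} b.hold_ucid is not null
    group by a.holdarea
    order by a.holdarea
"""

# Same query twice; only the keyword in front of the filter changes.
filter_in_on = conn.execute(query.format(placement="and")).fetchall()
filter_in_where = conn.execute(query.format(placement="where")).fetchall()

print(filter_in_on)     # [('North', 1, 1), ('South', 0, 0)]
print(filter_in_where)  # [('North', 1, 1)]
```

With the filter in the on clause, South is reported with zero counts; with the filter in where, South drops out of the result entirely.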

Question 2

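Step 1:
Today's date is current_date().
The date 30 days ago is today's date shifted back by 30 days: date_sub(current_date(), interval 30 day).
Following the hint in the question, first select the ids of listings whose most recent successful call is more than 30 days old. The date filter has to apply to each listing's latest successful call, max(create_date), not to individual rows; otherwise a listing with one old and one recent successful call would be selected as well.

select house_id
from telephone_info
where call_status != '接通失败'
group by house_id
having max(create_date) < date_sub(current_date(), interval 30 day);

Step 2: Following the hint in the question, select the ids of listings that have never had a successful call.

select distinct house_id
from telephone_info
where house_id not in (select distinct house_id from telephone_info where call_status != '接通失败');

Step 3: Use the two queries above as where conditions to select the ids of all listings with no successful call in the last 30 days (every call in that period has the status 接通失败, "connection failed").

select distinct house_id
from telephone_info
where house_id in (select house_id
    from telephone_info
    where call_status != '接通失败'
    group by house_id
    having max(create_date) < date_sub(current_date(), interval 30 day))
or house_id in (select distinct house_id
    from telephone_info
    where house_id not in (select distinct house_id from telephone_info where call_status != '接通失败'));

Step 4: Use the query above as a where condition and keep only successful call records, then group by listing id and take the latest call date. Listings that have never been connected have no successful record, so they do not appear in this final result.

select house_id as 房源id, max(create_date) as 电话拨打日期
from telephone_info
where house_id in (select distinct house_id
    from telephone_info
    where house_id in (select house_id
        from telephone_info
        where call_status != '接通失败'
        group by house_id
        having max(create_date) < date_sub(current_date(), interval 30 day))
    or house_id in (select distinct house_id
        from telephone_info
        where house_id not in (select distinct house_id from telephone_info where call_status != '接通失败')))
and call_status != '接通失败'
group by house_id;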
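
The difference between "has some successful call older than 30 days" and "the latest successful call is older than 30 days" is easy to check on a tiny call log. This is only a sketch under stated assumptions: it uses SQLite via Python's sqlite3 module instead of MySQL, the rows are invented, the success status 接通成功 is an assumed value (the answer above only names 接通失败), and "today" is pinned to a fixed date so the result is reproducible.

```python
import sqlite3

TODAY = "2024-03-31"  # assumed reference date; 30 days earlier is 2024-03-01

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table telephone_info (house_id integer, call_status text, create_date text);
    insert into telephone_info values
        (1, '接通成功', '2024-01-10'),  -- last success is long ago
        (1, '接通失败', '2024-03-20'),
        (2, '接通成功', '2024-01-05'),  -- an old success ...
        (2, '接通成功', '2024-03-25'),  -- ... but also a recent one
        (3, '接通失败', '2024-03-01');  -- never connected
""")

# Row-level filter: any successful call older than 30 days qualifies the listing.
row_level = conn.execute("""
    select distinct house_id
    from telephone_info
    where call_status != '接通失败' and create_date < date(?, '-30 day')
    order by house_id
""", (TODAY,)).fetchall()

# Group-level filter: the latest successful call must be older than 30 days.
latest_success_old = conn.execute("""
    select house_id, max(create_date)
    from telephone_info
    where call_status != '接通失败'
    group by house_id
    having max(create_date) < date(?, '-30 day')
    order by house_id
""", (TODAY,)).fetchall()

# Listings without a single successful call.
never_connected = conn.execute("""
    select distinct house_id
    from telephone_info
    where house_id not in (select house_id from telephone_info
                           where call_status != '接通失败')
""").fetchall()

print(row_level)           # [(1,), (2,)]
print(latest_success_old)  # [(1, '2024-01-10')]
print(never_connected)     # [(3,)]
```

Listing 2 was reached five days before the reference date, so only the group-level filter excludes it correctly.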

II. Python Basics

1

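The output is as follows:

The function's second parameter has a default value. The list used as that default is created only once, when the function is defined, and it is shared by every call that omits the argument. The first call appends 10, so 10 is stored in that shared list. The third call appends 'a' to the same list, which already holds 10. After all three calls have run, printing the list shows both 10 and 'a'.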
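
The question's code is not reproduced in this answer. As a minimal sketch, assuming it resembles the usual mutable-default-argument exercise (the names extend_list and extend_list_safe and the three calls are assumptions, not the original code), the behavior can be reproduced like this:

```python
# Assumed reconstruction of the exercise; not the original question code.
def extend_list(val, items=[]):
    # The default list is created once, at definition time, and then reused.
    items.append(val)
    return items


list1 = extend_list(10)       # uses the shared default list
list2 = extend_list(123, [])  # uses its own, freshly passed list
list3 = extend_list("a")      # uses the shared default list again

print(list1)           # [10, 'a']
print(list2)           # [123]
print(list3)           # [10, 'a']
print(list1 is list3)  # True


# Common way to avoid the sharing: use None as the default and build the list inside.
def extend_list_safe(val, items=None):
    if items is None:
        items = []
    items.append(val)
    return items
```

With None as the sentinel, each call that omits the argument gets its own list.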

2

2.1

2.2

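Declare num as global inside the function definition, so that the assignment rebinds the module-level variable instead of creating a local one.

def f1():
    global num  # refer to the module-level num
    num = 20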
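
The effect of the global statement can be seen by comparing two otherwise identical functions. This is a small sketch; the initial value 10 and the names f_local and f_global are assumptions made for illustration, since the question's code is not shown here.

```python
num = 10  # assumed initial value


def f_local():
    num = 20  # creates a new local name; the module-level num is untouched


def f_global():
    global num
    num = 20  # rebinds the module-level num


f_local()
after_local = num
f_global()
after_global = num

print(after_local, after_global)  # 10 20
```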

2.3

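They are not the same.

One result is
[{num:1}, {num:2}, {num:3}, {num:4}, ...]
and the other is
[{num:9}, {num:9}, {num:9}, {num:9}, ...]

This is because Python names and list elements hold references to objects, not copies. The first version creates a new dictionary on every iteration and appends it to the list, so each element is a different object. The second version creates one dictionary up front, then modifies its value and appends it again on each iteration. Every element of the list therefore refers to the same dictionary, and changing that dictionary changes what every element shows.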
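
The two patterns can be compared side by side. The loops below are an assumed reconstruction of the question (three iterations instead of the original count, and the variable names are invented), intended only to show the aliasing effect:

```python
# Pattern 1: a new dict per iteration, so every element is a distinct object.
fresh = []
for i in range(3):
    fresh.append({"num": i})

# Pattern 2: one dict, mutated and appended repeatedly, so every element is the same object.
shared = []
d = {"num": 0}
for i in range(3):
    d["num"] = i
    shared.append(d)

print(fresh)                          # [{'num': 0}, {'num': 1}, {'num': 2}]
print(shared)                         # [{'num': 2}, {'num': 2}, {'num': 2}]
print(len({id(x) for x in shared}))   # 1

# Appending a copy breaks the aliasing.
copied = []
for i in range(3):
    d["num"] = i
    copied.append(d.copy())
```

Appending d.copy() stores a snapshot of the dictionary at that moment, so later changes to d no longer affect earlier elements.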

III. Advanced Python

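1. Does Python pass arguments by value or by reference?

By reference, or more precisely by object reference: the function receives a reference to the same object the caller holds. Mutating that object in place is visible to the caller; rebinding the parameter name inside the function is not.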
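
A short sketch makes the distinction concrete; the function names mutate and rebind are invented for illustration.

```python
def mutate(items):
    items.append(99)  # changes the object the caller also refers to


def rebind(items):
    items = [0]       # only rebinds the local name; the caller's list is untouched


data = [1, 2]
mutate(data)
after_mutate = list(data)
rebind(data)
after_rebind = list(data)

print(after_mutate)  # [1, 2, 99]
print(after_rebind)  # [1, 2, 99]
```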

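2. How can you check whether one column of a DataFrame contains elements of another column in the same DataFrame? (Give an example.)

Following the figure above, convert both columns to lists. Create an empty list to hold the elements that also occur in the other column, then loop over one list and use the in operator to collect those elements. Finally, check whether the result list is non-empty.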
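
The loop described above can be written directly, and pandas also offers Series.isin for the same membership test. The following is a sketch on invented data; the column names col_a and col_b are hypothetical.

```python
import pandas as pd

# Hypothetical example frame.
df = pd.DataFrame({
    "col_a": ["x", "y", "z", "w"],
    "col_b": ["y", "q", "x", "r"],
})

# Loop version: collect the col_b elements that also occur in col_a.
list_a = df["col_a"].tolist()
looped = []
for value in df["col_b"].tolist():
    if value in list_a:
        looped.append(value)

# Vectorized version: isin returns one boolean per row of col_b.
mask = df["col_b"].isin(df["col_a"])
contained = df.loc[mask, "col_b"].tolist()

print(looped)            # ['y', 'x']
print(contained)         # ['y', 'x']
print(bool(mask.any()))  # True
```

isin avoids the explicit loop and keeps the result aligned with the DataFrame's index, which makes it easy to select the matching rows.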

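3. Express the following SQL statement in pandas (hint: combine the DataFrame groupby, agg, and filter functions).

Condition1 is applied as a boolean mask on the rows and Condition2 as a per-group predicate. groupby(...).filter(...) returns the surviving rows as an ungrouped DataFrame, so the rows are grouped again before agg.

df_sometable = df_sometable[Condition1].groupby(['Column1','Column2']).filter(Condition2).groupby(['Column1','Column2']).agg({'Column3':'mean','Column4':'sum'}).reset_index()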
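
The SQL statement from the question is not reproduced here. As a sketch, assuming it has the shape select ... where Condition1 group by Column1, Column2 having Condition2, the chain can be tried on invented data; the concrete conditions (Column4 > 5 and at least two rows per group) are placeholders chosen for illustration.

```python
import pandas as pd

# Hypothetical data; column names follow the answer above.
df = pd.DataFrame({
    "Column1": ["a", "a", "a", "b", "b", "b"],
    "Column2": ["x", "x", "x", "x", "y", "y"],
    "Column3": [1.0, 3.0, 5.0, 2.0, 4.0, 6.0],
    "Column4": [5, 20, 30, 40, 50, 60],
})
keys = ["Column1", "Column2"]

result = (
    df[df["Column4"] > 5]              # placeholder Condition1: row filter
    .groupby(keys)
    .filter(lambda g: len(g) >= 2)     # placeholder Condition2: group filter
    .groupby(keys)                     # filter returned plain rows, so group again
    .agg({"Column3": "mean", "Column4": "sum"})
    .reset_index()
)

print(result)
```

Here the group (b, x) has a single row and is removed by the group filter, leaving one aggregated row each for (a, x) and (b, y).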

2020-06-27 Question 4