不知不觉还有两个月又要十一小长假了,对于我和闺蜜们,这是十分难得可以一起出游的机会。每年为了这个短途的国内游,我们都要头疼上一阵子,最主要的原因是我们在不同的城市工作,需要找一个大家都能到,也比较方便,又都没去过的地方。
正好最近在学sql,于是我自告奋勇,想先从12306下手,看看能不能帮我做做筛选。说干就干,我找到了一个开源的爬虫,可以拿到12306的车次信息。所以第一件事,我爬了所有从青岛、北京、上海三个地方出发的车次信息,一共16493条。
下一步需要筛选出来,三个地方都能够到达的地方,发现我们这三个城市都能坐火车到达的有64个车站。
select distinct c.to_station
from
(
select a.from_station,a.arrive_time,b.from_station as from_station2,b.arrive_time as arrive_time2,a.to_station
from tb_train_5 a
join tb_train_5 b on a.to_station = b.to_station
and a.from_station like '%上海%'
and b.from_station like '%青岛%') c
join tb_train_5 on tb_train_5.to_station = c.to_station
and tb_train_5.from_station like '%北京%'
64个车站还是无从下手,于是我又把我们三个城市分别的到达时间列出来,发现,虽然这些车站我们都能到达,但是有一些车站三个人的到达时间差太大甚至到了一整个白天,本来时间就很紧张经不起这么大段的浪费。
select distinct c.from_station,c.arrive_time,c.from_station2,c.arrive_time2,tb_train_5.from_station as from_station3,tb_train_5.arrive_time,c.to_station
from
(
select a.from_station,a.arrive_time,b.from_station as from_station2,b.arrive_time as arrive_time2,a.to_station
from tb_train_5 a
join tb_train_5 b on a.to_station = b.to_station
and a.from_station like '%上海%'
and b.from_station like '%青岛%') c
join tb_train_5 on tb_train_5.to_station = c.to_station
and tb_train_5.from_station like '%北京%'
跟闺蜜商量了一下,一致决定,我们希望能够在2个小时只能都到齐,然后一起打车出发去宾馆。于是我又对到达时间做了限制,得到了39个车站。
select distinct c.to_station
from
(
select a.from_station,a.arrive_time,b.from_station as from_station2,b.arrive_time as arrive_time2,a.to_station
from tb_train_5 a
join tb_train_5 b on a.to_station = b.to_station
and a.from_station like '%上海%'
and b.from_station like '%青岛%'
and timestampdiff(minute,a.arrive_time,b.arrive_time)<=120
and timestampdiff(minute,b.arrive_time,a.arrive_time)<=120) c
join tb_train_5 on tb_train_5.to_station = c.to_station
and tb_train_5.from_station like '%北京%'
and timestampdiff(minute,tb_train_5.arrive_time,c.arrive_time)<=120
and timestampdiff(minute,c.arrive_time,tb_train_5.arrive_time)<=120
and timestampdiff(minute,c.arrive_time2,tb_train_5.arrive_time)<=120
and timestampdiff(minute,tb_train_5.arrive_time,c.arrive_time2)<=120
好啦,我要拿着这39个车站去找闺蜜啦。也只是最近在学sql就想着用一个真实案例来模拟一下,这个过程应该有很多种sql的写法,如果小伙伴们有更好的方法欢迎与我交流啦。