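To pull a random sample of rows in Hive, use the rand() function: sort the rows by rand() and take the top N.
To sample randomly within groups, for example picking 10 students at random from each class, treat it as a top-N-per-group problem: number the rows of each group in random order and keep those with rn <= N, using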
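As a minimal sketch of this simple case, the query below takes 100 random rows from a table. The table tmp_mod and its columns are borrowed from the example later in this note, and the limit of 100 is only an illustrative value.

```sql
-- Sketch: take 100 random rows from tmp_mod (the limit value is illustrative).
-- `date` is backtick-quoted because DATE is a reserved keyword in newer Hive versions.
select `date`, imei
from tmp_mod
order by rand()
limit 100;
```

Note that order by imposes a global sort, which Hive typically runs on a single reducer, so this can be slow on very large tables.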
row_number() over(partition by fieldx order by rand()) as rn
Example (up to 1000 random rows for each sp_modify value):
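-- Number the rows of each sp_modify group in random order, then keep the first 1000.
-- `date` is backtick-quoted because DATE is a reserved keyword in newer Hive versions.
select `date`, imei
from (
    select `date`, imei,
           row_number() over (partition by sp_modify order by rand()) as rn
    from tmp_mod
) mod
where mod.rn <= 1000;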