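Table A left join table B:
(1) A's record arrives, but B's never does
(2) A's record arrives, and B's arrives within the allowed time
(3) A's record arrives, and B's arrives after the allowed time
How should each case be handled?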
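Flink's DataStream API has no ready-made left join or right join operator: the window join only emits pairs that match on both sides (an inner join), so an outer join cannot be expressed with join. It can be implemented with coGroup instead: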
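import org.apache.flink.api.common.functions.CoGroupFunction;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.api.java.tuple.Tuple5;
import org.apache.flink.util.Collector;

// Nested inside the job class. Input tuples are (key, value, timestamp);
// output tuples are (key, leftValue, rightValue, leftTimestamp, rightTimestamp).
public static class LeftJoin implements CoGroupFunction<
        Tuple3<String, String, Long>,
        Tuple3<String, String, Long>,
        Tuple5<String, String, String, Long, Long>> {

    // Called with all elements that share the same key and fall into the same window.
    @Override
    public void coGroup(Iterable<Tuple3<String, String, Long>> first,
                        Iterable<Tuple3<String, String, Long>> second,
                        Collector<Tuple5<String, String, String, Long, Long>> out) throws Exception {
        for (Tuple3<String, String, Long> leftElem : first) {
            boolean hadElements = false;
            // If the right stream has elements for this key and window,
            // `second` is non-empty and the inner loop runs.
            for (Tuple3<String, String, Long> rightElem : second) {
                // Emit the joined record.
                out.collect(new Tuple5<>(leftElem.f0, leftElem.f1, rightElem.f1, leftElem.f2, rightElem.f2));
                hadElements = true;
            }
            if (!hadElements) {
                // No match: fill the right side with placeholders (the string "null" and -1).
                out.collect(new Tuple5<>(leftElem.f0, leftElem.f1, "null", leftElem.f2, -1L));
            }
        }
    }
}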
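
To make the window semantics concrete, here is a minimal plain-Java sketch (no Flink dependency) that mimics what a tumbling-window coGroup does: records are bucketed by key and window start, and the left-join logic runs once per bucket. The class name WindowedLeftJoinDemo, the Event type and the 5-second window size are assumptions made for illustration; real Flink also takes watermarks and allowed lateness into account, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowedLeftJoinDemo {
    // Hypothetical record type: key, payload, event timestamp in milliseconds.
    public static final class Event {
        final String key;
        final String value;
        final long ts;
        public Event(String key, String value, long ts) {
            this.key = key;
            this.value = value;
            this.ts = ts;
        }
    }

    // Start of the tumbling window containing ts (assumes ts >= 0 and no window offset).
    public static long windowStart(long ts, long sizeMs) {
        return ts - (ts % sizeMs);
    }

    // Left join A with B per (key, window) bucket; unmatched A rows get "null" and -1.
    public static List<String> leftJoin(List<Event> a, List<Event> b, long sizeMs) {
        Map<String, List<Event>> rightByBucket = new HashMap<>();
        for (Event e : b) {
            String bucket = e.key + "@" + windowStart(e.ts, sizeMs);
            rightByBucket.computeIfAbsent(bucket, k -> new ArrayList<>()).add(e);
        }
        List<String> out = new ArrayList<>();
        for (Event left : a) {
            String bucket = left.key + "@" + windowStart(left.ts, sizeMs);
            List<Event> matches = rightByBucket.getOrDefault(bucket, Collections.<Event>emptyList());
            if (matches.isEmpty()) {
                out.add(left.key + "," + left.value + ",null," + left.ts + ",-1");
            }
            for (Event right : matches) {
                out.add(left.key + "," + left.value + "," + right.value + "," + left.ts + "," + right.ts);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> a = Arrays.asList(
                new Event("k1", "a1", 1000L),   // case (1): B never arrives
                new Event("k2", "a2", 2000L),   // case (2): B falls into the same window
                new Event("k3", "a3", 3000L));  // case (3): B falls into a later window
        List<Event> b = Arrays.asList(
                new Event("k2", "b2", 4000L),
                new Event("k3", "b3", 6000L));
        for (String line : leftJoin(a, b, 5000L)) {
            System.out.println(line);
        }
    }
}
```

In this sketch only k2 is joined. k1 and k3 both come out with the placeholder right side, which is why cases (1) and (3) need a separate decision beyond the window join itself.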
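For data that arrives normally, just join it directly.
For the other cases, it depends on the business requirements: records that should not be discarded right away can be held in state or in third-party storage while waiting for the other side.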
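
The "hold it and wait" idea can be sketched in plain Java as follows. This is only a model of the bookkeeping, not Flink code: in a real job the map would typically be keyed state and the deadline a registered timer (for example inside a KeyedCoProcessFunction), and none of that wiring is shown. The class name PendingLeftJoin, its method names, the one-pending-A-per-key simplification and the 5-second timeout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PendingLeftJoin {
    // An A record waiting for its B, plus the time after which we stop waiting.
    private static final class Pending {
        final String value;
        final long deadline;
        Pending(String value, long deadline) {
            this.value = value;
            this.deadline = deadline;
        }
    }

    private final long timeoutMs;
    // Stands in for keyed state; assumes at most one pending A record per key.
    private final Map<String, Pending> pendingA = new LinkedHashMap<>();
    public final List<String> joined = new ArrayList<>();
    public final List<String> unmatchedB = new ArrayList<>();

    public PendingLeftJoin(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // A arrives: buffer it and remember how long to wait for B.
    public void onA(String key, String value, long now) {
        pendingA.put(key, new Pending(value, now + timeoutMs));
    }

    // B arrives: join if A is still waiting and the deadline has not passed.
    public void onB(String key, String value, long now) {
        Pending p = pendingA.get(key);
        if (p != null && now <= p.deadline) {
            pendingA.remove(key);
            joined.add(key + "," + p.value + "," + value);
        } else {
            // Too late or no A buffered: what happens next is a business decision.
            unmatchedB.add(key + "," + value);
        }
    }

    // Timer fires: emit every A whose deadline has passed with a placeholder right side.
    public void onTimer(long now) {
        Iterator<Map.Entry<String, Pending>> it = pendingA.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Pending> e = it.next();
            if (now > e.getValue().deadline) {
                joined.add(e.getKey() + "," + e.getValue().value + ",null");
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        PendingLeftJoin j = new PendingLeftJoin(5000L);
        j.onA("k1", "a1", 0L);
        j.onA("k2", "a2", 0L);
        j.onA("k3", "a3", 0L);
        j.onB("k2", "b2", 3000L);  // case (2): within the deadline
        j.onTimer(6000L);          // cases (1) and (3): A times out
        j.onB("k3", "b3", 8000L);  // case (3): B shows up after the timeout
        System.out.println(j.joined);
        System.out.println(j.unmatchedB);
    }
}
```

Whether late B records such as k3 here are dropped, written to external storage, or used to correct an already emitted result is left open, since that depends on the business requirements.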