https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/
Operator | 作用 | 流的转换 |
---|---|---|
map | 将一个元素转换成另外一个元素 | DataStream → DataStream本 |
flapmap | 将几个的一个元素转换为零个,一个或者多个 | DataStream → DataStream |
filter | 保留集合中返回true的元素 | DataStream → DataStream |
keyBy | 对数据流进行逻辑分区,相同的key在同一分区 | DataStream → KeyedStream |
reduce | 遍历集合,依次合并元素最终生产一个元素 | KeyedStream → DataStream |
fold | 遍历结合从第一个元素到最后一个元素依次连接起来 | KeyedStream → DataStream |
Aggregations | emmmm | KeyedStream → DataStream |
Window | 基于已经分区的stream,将元素划分窗口 | KeyedStream → WindowedStream |
WindowAll | 基于未分区的stream,将所有元素集中到一个task | DataStream → AllWindowedStream |
Apply(Window) | 自定义函数处理窗口内所有的元素 | WindowedStream → DataStream AllWindowedStream → DataStream |
Window Reduce | 窗口内所有元素reduce到一个结果 | WindowedStream → DataStream |
Window Fold | 同stream的fold | WindowedStream → DataStream |
Aggregations on windows | 同stream的Aggregations | WindowedStream → DataStream |
Union | 将两个流合并 | DataStream* → DataStream |
Window Join | 两个流join成一个流,指定分区key,在指定window,窗口是必须的 | DataStream,DataStream → DataStream |
Interval Join | 流2 join 流1中一段时间的元素 | KeyedStream,KeyedStream → DataStream |
Window CoGroup | 双流join,指定窗口 | DataStream,DataStream → DataStream |
Connect | 联合两个流,保留各种state | DataStream,DataStream → ConnectedStreams |
CoMap, CoFlatMap | 同map, CoFlatMap | ConnectedStreams → DataStream |
Split | 流拆分 | DataStream → SplitStream |
Select | 从SplitStream分离出DataStream | SplitStream → DataStream |
Iterate | - | DataStream → IterativeStream → DataStream |
- | - | - |
Extract Timestamps | 设置event time | DataStream → DataStream |
- map 将每个元素乘以2
DataStream<Integer> dataStream = //...
dataStream.map(new MapFunction<Integer, Integer>() {
@Override
public Integer map(Integer value) throws Exception {
return 2 * value;
}
});
- flatMap 单词分隔
dataStream.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out)
throws Exception {
for(String word: value.split(" ")){
out.collect(word);
}
}
});
- filter 保留value=0的元素
dataStream.filter(new FilterFunction<Integer>() {
@Override
public boolean filter(Integer value) throws Exception {
return value != 0;
}
});
- keyby
dataStream.keyBy("someKey") // Key by field "someKey"
dataStream.keyBy(0) // Key by the first element of a Tuple
- reduce 求和
keyedStream.reduce(new ReduceFunction<Integer>() {
@Override
public Integer reduce(Integer value1, Integer value2)
throws Exception {
return value1 + value2;
}
});
- fold
A fold function that, when applied on the sequence (1,2,3,4,5), emits the sequence "start-1", "start-1-2", "start-1-2-3", ..
DataStream<String> result =
keyedStream.fold("start", new FoldFunction<Integer, String>() {
@Override
public String fold(String current, Integer value) {
return current + "-" + value;
}
});
- Aggregations
keyedStream.sum(0);
keyedStream.sum("key");
keyedStream.min(0);
keyedStream.min("key");
keyedStream.max(0);
keyedStream.max("key");
keyedStream.minBy(0);
keyedStream.minBy("key");
keyedStream.maxBy(0);
keyedStream.maxBy("key");
- Window Join
dataStream.join(otherStream)
.where(<key selector>).equalTo(<key selector>)
.window(TumblingEventTimeWindows.of(Time.seconds(3)))
.apply (new JoinFunction () {...});
- Interval Join
// this will join the two streams so that
// key1 == key2 && leftTs - 2 < rightTs < leftTs + 2
keyedStream.intervalJoin(otherKeyedStream)
.between(Time.milliseconds(-2), Time.milliseconds(2)) // lower and upper bound
.upperBoundExclusive(true) // optional
.lowerBoundExclusive(true) // optional
.process(new IntervalJoinFunction() {...});
- Split
SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() {
@Override
public Iterable<String> select(Integer value) {
List<String> output = new ArrayList<String>();
if (value % 2 == 0) {
output.add("even");
}
else {
output.add("odd");
}
return output;
}
});