Jobs and Scheduling

Scheduling

  • 1、Flink的Execution resources是通过Task Slots定义的。每个TaskManager 都有一个或多个Task Slots,每个Task Slot都可以运行one pipeline of parallel tasks。一个pipeline由多个successive tasks组成,例如一个MapFunction的第n个并行实例和一个ReduceFunction的第n个并行实例。请注意,Flink总是并发执行successive tasks:对于Streaming programs来说,在任何情况下都会发生,对于批处理程序来说,它也经常发生。
  • 2、Flink通过SlotSharingGroup和CoLocationGroup来定义Task可以共享一个Task Slots。
  • 3、在下图表明了,在默认情况下,一个Task Slot可以运行一个由多个连续Task组成的Pipline。注意和Operator Chains区分(Flink会尽可能地将operator的subtask链接(chain)在一起形成task。每个task在一个线程中执行。虽然subTask会共享Task Slot,但还是一个个独立的Task,而不是链接成一个Task,在TaskSlot中的subTasks还是会被独立的线程执行,TaskSlot目前只隔离内存)。


    image.png

JobManager Data Structures

During job execution, the JobManager keeps track of distributed tasks, decides when to schedule the next task (or set of tasks), and reacts to finished tasks or execution failures.

The JobManager receives the JobGraph, which is a representation of the data flow consisting of operators (JobVertex) and intermediate results (IntermediateDataSet). Each operator has properties, like the parallelism and the code that it executes. In addition, the JobGraph has a set of attached libraries, that are necessary to execute the code of the operators.

The JobManager transforms the JobGraph into an ExecutionGraph. The ExecutionGraph is a parallel version of the JobGraph: For each JobVertex, it contains an ExecutionVertex per parallel subtask. An operator with a parallelism of 100 will have one JobVertex and 100 ExecutionVertices. The ExecutionVertex tracks the state of execution of a particular subtask. All ExecutionVertices from one JobVertex are held in an ExecutionJobVertex, which tracks the status of the operator as a whole. Besides the vertices, the ExecutionGraph also contains the IntermediateResult and the IntermediateResultPartition. The former tracks the state of the IntermediateDataSet, the latter the state of each of its partitions.

image.png

Each ExecutionGraph has a job status associated with it. This job status indicates the current state of the job execution.

A Flink job is first in the created state, then switches to running and upon completion of all work it switches to finished. In case of failures, a job switches first to failing where it cancels all running tasks. If all job vertices have reached a final state and the job is not restartable, then the job transitions to failed. If the job can be restarted, then it will enter the restarting state. Once the job has been completely restarted, it will reach the created state.

In case that the user cancels the job, it will go into the cancelling state. This also entails the cancellation of all currently running tasks. Once all running tasks have reached a final state, the job transitions to the state cancelled.

Unlike the states finished, canceled and failed which denote a globally terminal state and, thus, trigger the clean up of the job, the suspended state is only locally terminal. Locally terminal means that the execution of the job has been terminated on the respective JobManager but another JobManager of the Flink cluster can retrieve the job from the persistent HA store and restart it. Consequently, a job which reaches the suspended state won’t be completely cleaned up.

image.png

During the execution of the ExecutionGraph, each parallel task goes through multiple stages, from created to finished or failed. The diagram below illustrates the states and possible transitions between them. A task may be executed multiple times (for example in the course of failure recovery). For that reason, the execution of an ExecutionVertex is tracked in an Execution. Each ExecutionVertex has a current Execution, and prior Executions.
image.png

FROM:

https://ci.apache.org/projects/flink/flink-docs-release-1.6/internals/job_scheduling.html

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容

  • “今天吃什么?” “随意吧!” “今天穿什么?” “随意吧!” “你喜欢看什么书?” “比较随意!“ ”你未来怎么...
    是京京呀阅读 414评论 0 1
  • 2017 12 25 星期一 阴 今天的天气特别的冷,让猫咪都一直蜷在他的窝里。没事的人可以在家...
    郑宇雅芯妈阅读 163评论 0 0