DolphinScheduler 2.0.0 Source Code Analysis (02)

The previous article was:

DolphinScheduler 2.0.0 Source Code Analysis (01)

 

Let's pick up the analysis where the previous article left off:


 

OK, first some screenshots of the project we have set up so far and of the backend database:


 

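Now let's click the Run button to run the task once. On the backend I only start ApiApplicationServer and MasterServer for now and leave WorkerServer stopped, to see whether any errors are reported.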


 

Here is the MasterServer log at this point:

[INFO] 2021-11-24 11:29:50.248 org.apache.dolphinscheduler.server.master.runner.MasterSchedulerService:[243] - find command 79, slot:1 :
[INFO] 2021-11-24 11:29:50.249 org.apache.dolphinscheduler.server.master.runner.MasterSchedulerService:[186] - find one command: id: 79, type: START_PROCESS
[INFO] 2021-11-24 11:29:50.300 org.apache.dolphinscheduler.server.master.runner.MasterSchedulerService:[209] - handle command end, command 79 process 79 start...
[INFO] 2021-11-24 11:29:50.338 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[1146] - add task to stand by list: test_shell
[INFO] 2021-11-24 11:29:50.347 org.apache.dolphinscheduler.service.process.ProcessService:[1093] - start submit task : test_shell, instance id:79, state: RUNNING_EXECUTION
[INFO] 2021-11-24 11:29:50.376 org.apache.dolphinscheduler.service.process.ProcessService:[1106] - end submit task to db successfully:82 test_shell state:SUBMITTED_SUCCESS complete, instance id:79 state: RUNNING_EXECUTION  
[INFO] 2021-11-24 11:29:50.376 org.apache.dolphinscheduler.server.master.runner.task.CommonTaskProcessor:[120] - task ready to submit: TaskInstance{id=82, name='test_shell', taskType='SHELL', processInstanceId=79, processInstanceName='null', state=SUBMITTED_SUCCESS, firstSubmitTime=Wed Nov 24 11:29:50 CST 2021, submitTime=Wed Nov 24 11:29:50 CST 2021, startTime=null, endTime=null, host='null', executePath='null', logPath='null', retryTimes=0, alertFlag=NO, processInstance=null, processDefine=null, pid=0, appLink='null', flag=YES, dependency='null', duration=null, maxRetryTimes=0, retryInterval=1, taskInstancePriority=MEDIUM, processInstancePriority=MEDIUM, dependentResult='null', workerGroup='default', environmentCode=-1, environmentConfig='null', executorId=1, executorName='null', delayTime=0, dryRun=0}
[INFO] 2021-11-24 11:29:50.438 org.apache.dolphinscheduler.server.master.runner.task.CommonTaskProcessor:[130] - master submit success, task : test_shell
[ERROR] 2021-11-24 11:29:50.448 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=19, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=19, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[INFO] 2021-11-24 11:29:50.458 org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[1162] - remove task from stand by list, id: 82 name:test_shell
[ERROR] 2021-11-24 11:29:53.468 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=20, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=20, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:29:56.488 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=21, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=21, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:29:59.509 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=22, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=22, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:02.530 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=23, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=23, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:05.550 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=24, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=24, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:08.612 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=25, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=25, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:11.630 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=26, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=26, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:14.642 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=27, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=27, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:17.651 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=28, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=28, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:20.670 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=29, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=29, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:23.698 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=30, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=30, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)
[ERROR] 2021-11-24 11:30:26.717 org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer:[140] - dispatch error: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=31, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
org.apache.dolphinscheduler.server.master.dispatch.exceptions.ExecuteException: fail to execute : Command [type=TASK_EXECUTE_REQUEST, opaque=31, bodyLen=1673] due to no suitable worker, current task needs worker group default to execute
    at org.apache.dolphinscheduler.server.master.dispatch.ExecutorDispatcher.dispatch(ExecutorDispatcher.java:89)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.dispatch(TaskPriorityQueueConsumer.java:137)
    at org.apache.dolphinscheduler.server.master.consumer.TaskPriorityQueueConsumer.run(TaskPriorityQueueConsumer.java:100)

As you can see, the dispatch error keeps repeating. Let's analyze this log first and explain it line by line:

The first line is: find command 79, slot:1 :

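Meaning: command 79 was found, in slot 1. What does that mean? When we save a workflow, its tasks are saved to the t_ds_task_definition table. When a run is requested, either by a timed schedule or manually (as with the Run button here), a record is written to the t_ds_command table. Each record in t_ds_command represents one workflow run that is waiting to be executed, not an individual task instance.

Next, the MasterServers decide which of them is responsible for supervising the command from start to finish. This is what the slot is for: the decision is computed from the command id and the number of MasterServers, and the master whose slot matches takes the command. Once a command is assigned to a MasterServer, that MasterServer supervises its execution throughout; the actual work is done by a WorkerServer process that the MasterServer contacts. When the WorkerServer has finished, it reports back to the MasterServer, which then updates the instance state to finished.

The command table usually looks empty when you inspect it, because as soon as a MasterServer takes a command it deletes that record from the command table and creates the corresponding running records: a process instance (id 79 in this log) in t_ds_process_instance and its task instances (id 82 in this log) in t_ds_task_instance. So when execution completes, the tables the MasterServer updates are these instance tables, not the command table.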
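The slot assignment described above can be sketched roughly as follows. This is only a simplified model, assuming the rule is "command id modulo the number of masters equals this master's slot"; the class SlotSketch and the methods ownsCommand and claim are hypothetical names for illustration, not DolphinScheduler's actual API, and the master count of 2 is just an example value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SlotSketch {

    // Hypothetical helper: true when the master holding `slot` should take this command.
    // Assumed rule: commandId % masterCount == slot.
    static boolean ownsCommand(int commandId, int masterCount, int slot) {
        if (masterCount <= 0) {
            return false; // no master registered, so nothing can be claimed
        }
        return commandId % masterCount == slot;
    }

    // Returns the ids from `commandIds` that the master in `slot` would pick up.
    static List<Integer> claim(List<Integer> commandIds, int masterCount, int slot) {
        List<Integer> mine = new ArrayList<>();
        for (int id : commandIds) {
            if (ownsCommand(id, masterCount, slot)) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Integer> pending = Arrays.asList(78, 79, 80, 81);
        // Assumed cluster of two masters occupying slots 0 and 1.
        System.out.println("slot 0 takes " + claim(pending, 2, 0));
        System.out.println("slot 1 takes " + claim(pending, 2, 1));
    }
}
```

The point of such a rule is that every command belongs to exactly one slot, so two masters never pick up the same command and no extra coordination is needed at claim time.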

 

The second line is: find one command: id: 79, type: START_PROCESS

One command was found; its id is 79 and its type is START_PROCESS.

 

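The third line is: handle command end, command 79 process 79 start...

Handling of the command is finished: command 79 has been turned into process instance 79, which now starts running.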
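The three log lines above correspond to a command being consumed and replaced by a running process instance. A minimal in-memory sketch of that lifecycle follows. The two maps merely stand in for the t_ds_command and t_ds_process_instance tables, and the class and method names are hypothetical; the real MasterServer code does considerably more than this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CommandLifecycleSketch {

    // Stand-in for t_ds_command: command id -> command type.
    final Map<Integer, String> commandTable = new LinkedHashMap<>();
    // Stand-in for t_ds_process_instance: process instance id -> state.
    final Map<Integer, String> processInstanceTable = new LinkedHashMap<>();

    // Models the API side writing a command when a run is requested.
    void submit(int commandId, String type) {
        commandTable.put(commandId, type);
    }

    // Models the master consuming one command: the command row is deleted
    // and a running process instance is recorded in its place.
    // Returns the new instance id, or -1 if the command no longer exists.
    int handleCommand(int commandId, int processInstanceId) {
        String type = commandTable.remove(commandId);
        if (type == null) {
            return -1; // already taken, or never submitted
        }
        processInstanceTable.put(processInstanceId, "RUNNING_EXECUTION");
        return processInstanceId;
    }

    public static void main(String[] args) {
        CommandLifecycleSketch db = new CommandLifecycleSketch();
        db.submit(79, "START_PROCESS");
        db.handleCommand(79, 79); // ids taken from the log: command 79, process 79
        System.out.println("commands left: " + db.commandTable.size());
        System.out.println("process instances: " + db.processInstanceTable);
    }
}
```

This is why the command table tends to look empty when inspected: the row only exists in the short window between submission and pickup.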

 
