DolphinScheduler scheduling Spark tasks: a record of pitfalls
2022-04-23 13:42:00 【Ruo Xiaoyu】
1. Deploying a worker for Spark scheduling
My test DolphinScheduler deployment runs in cluster mode: two machines host the master, two host the worker, while Hadoop and Spark are deployed on other machines. I was unsure how to point the dolphinscheduler_env.sh file at the Spark environment. The first problem during test scheduling was that the spark-submit file could not be found:
command: line 5: /bin/spark-submit: No such file or directory
The scheduling log makes the cause clear: DS locates the spark-submit script through the SPARK_HOME1 variable configured in dolphinscheduler_env.sh. Since Spark is not installed on the worker machines, the variable apparently expands to nothing, and the resulting path /bin/spark-submit does not exist there.
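Before changing anything, this is easy to confirm on the affected worker by sourcing the env file DS uses and checking the path it would resolve. A minimal check, assuming a default installation under /opt/soft/dolphinscheduler (the env file location is an assumption, adjust it to your deployment):

# Sketch: check what spark-submit path the worker would resolve
source /opt/soft/dolphinscheduler/conf/env/dolphinscheduler_env.sh   # assumed env file location
echo "SPARK_HOME1=${SPARK_HOME1}"            # empty output explains the bare /bin/spark-submit above
ls -l "${SPARK_HOME1}/bin/spark-submit"      # fails on a worker that has no local Spark install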

So I considered two solutions:
1. Copy the Spark installation onto the worker machines; however, this may also drag in the Hadoop YARN client configuration.
2. Deploy another DolphinScheduler worker on the machine where the Spark client is installed, so that only DS's own configuration needs to be considered.
I finally chose option two.
Besides copying the installation files from an existing worker node to the machine that hosts the Spark client, pay attention to the relevant steps from the installation guide:
- Create the same user as on the other nodes, e.g. dolphinscheduler
- Grant ownership of the DS installation directory to the dolphinscheduler user
- Update the /etc/hosts file on every node
- Set up passwordless SSH between the nodes
- Update the dolphinscheduler/conf/common.properties configuration on every DS node
- Create the directories referenced by the configuration files and grant permissions, e.g. the /tmp/dolphinscheduler directory
- Reconfigure dolphinscheduler_env.sh on the new worker node and add the SPARK_HOME path (see the sketch after this list)
- Restart the cluster.
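For reference, a minimal sketch of what the SPARK_HOME entries in dolphinscheduler_env.sh can look like on the new worker; the /opt/soft/spark path is an assumption for a default layout, so substitute your own Spark client path:

# dolphinscheduler_env.sh additions (sketch; adjust paths to your own installation)
export SPARK_HOME1=/opt/soft/spark          # assumed Spark install path on this worker
export SPARK_HOME2=/opt/soft/spark          # point both at the same install if only one Spark exists
export PATH=$SPARK_HOME1/bin:$PATH          # keep any existing PATH entries from the original file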
2. spark-submit execution permission
My Spark test task also reads and writes HDFS, so the tenant that runs it is bigdata, a user that holds the HDFS permissions.
Running the Spark task failed with:
/opt/soft/spark/bin/spark-submit: Permission denied
At first I suspected the wrong tenant had been chosen, but bigdata was deployed together with Hadoop and also has Spark permissions, so it was clearly not a user problem. The real issue is the execute permission on spark-submit itself, so grant the user execute permission:
- chmod 755 spark
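If it is unclear which file is missing the execute bit, a quick check narrows it down. A sketch, with /opt/soft/spark taken from the error message above and the recursive variant being my own broader choice:

ls -l /opt/soft/spark/bin/spark-submit       # look for the missing x bit for the tenant user
chmod 755 /opt/soft/spark/bin/spark-submit   # grant execute on the script itself
chmod -R 755 /opt/soft/spark/bin             # or grant it on the whole bin directory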
3. The Spark task clearly succeeds, but the DS interface still shows failure
While the task was running, I could see that the Spark job had already written the processed files to the HDFS directory, which matches the job's logic. The log also shows the task succeeded, but there is one error:
[ERROR] 2021-11-15 16:16:26.012 - [taskAppId=TASK-3-43-72]:[418] - yarn applications: application_1636962361310_0001 , query status failed, exception:{
}
java.lang.Exception: yarn application url generation failed
at org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationUrl(HadoopUtils.java:208)
at org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:418)
at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404)

From this error we can see that DS queries the state of the application under some YARN URL and uses that state to display the execution result; here it failed to get the state. So we need to see where it goes to fetch the state and whether that address can be configured.
Checking the source code, I found the HadoopUtils.getApplicationUrl method. The appAddress it builds is read from the yarn.application.status.address configuration parameter.

The default value in the source keeps ds1 as the host; the accompanying comment says the default can be kept when ResourceManager HA is enabled, but my YARN is not installed on ds1, so the host has to be changed to my own YARN ResourceManager address.
Configure this parameter in the /opt/soft/dolphinscheduler/conf/common.properties file on the worker node that schedules the Spark task:
# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
yarn.application.status.address=http://ds1:%s/ws/v1/cluster/apps/%s
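After replacing ds1 with the real host, the address can be sanity-checked from the worker with a plain HTTP request against the YARN ResourceManager REST API. A sketch, where 8088 is the default RM web UI port and the application id is the one from the log above:

# Verify the ResourceManager status endpoint is reachable from the worker
curl "http://your-rm-host:8088/ws/v1/cluster/apps/application_1636962361310_0001"
# A JSON response containing "state" and "finalStatus" means DS can query the application status here.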
Copyright notice
This article was written by [Ruo Xiaoyu]. Please keep the original link when reposting. Thanks.
https://yzsam.com/2022/04/202204230602186775.html