DolphinScheduler + DataX configuration pitfall records
2022-04-23 13:41:00 【Ruo Xiaoyu】
1、Failed to create files under /tmp/dolphinscheduler/exec/process
When DolphinScheduler dispatches a DataX task, it needs to create a series of temporary directories and files under /tmp/dolphinscheduler/exec/process, but the worker's run log /opt/soft/dolphinscheduler/logs/dolphinscheduler-worker.log showed the creation failing:
[taskAppId=TASK-1-10-13]:[178] - datax task failure
java.io.IOException: Directory '/tmp/dolphinscheduler/exec/process/1/1/10/13' could not be created

The directory turned out to be owned by root, while my DolphinScheduler is installed under the dolphin user, so the ownership of the directory on that machine had to be changed:
$ sudo chown -R dolphin:dolphin /tmp/dolphinscheduler
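The ownership and writability check behind this fix can be sketched without root on a scratch directory; /tmp/demo_ds below is a stand-in for /tmp/dolphinscheduler, and in production the owning user should come out as dolphin:

```shell
# Sketch: verify who owns the exec directory and whether it is writable.
# /tmp/demo_ds stands in for /tmp/dolphinscheduler so this runs without root.
mkdir -p /tmp/demo_ds/exec/process
stat -c '%U' /tmp/demo_ds/exec/process          # prints the owning user
touch /tmp/demo_ds/exec/process/probe && echo "writable"
rm -rf /tmp/demo_ds
```

If the touch fails instead of printing "writable", the worker user still cannot create its temporary files there.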
2、DataX environment variable configuration problem
When dispatching a DataX task with DolphinScheduler, the data source and the task could be created successfully, but every run failed and the log could not be viewed directly. Logging in to the worker machine that ran it and checking the log file /opt/soft/dolphinscheduler/logs/dolphinscheduler-worker.log showed the error:
[INFO] 2021-11-09 11:25:35.446 - [taskAppId=TASK-1-11-14]:[138] - -> python2.7: can't open file '/opt/soft/datax/bin/datax.py/bin/datax.py': [Errno 20] Not a directory

This means the DataX path is misconfigured, so the file cannot be found.
Check the environment file: vim /opt/soft/dolphinscheduler/conf/env/

This path was the old official default; it no longer needs to point all the way down to the launch script under bin, only to the installation directory.
Change the path
export DATAX_HOME=/opt/soft/datax/bin/datax.py
to
export DATAX_HOME=/opt/soft/datax
After saving, rerun the task.
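Why the old value fails is visible in the error itself: the worker appends /bin/datax.py to DATAX_HOME when assembling the launch command, so pointing the variable at the script produces the doubled path from the log. A quick sketch:

```shell
# Wrong: DATAX_HOME points at the launcher script itself.
DATAX_HOME=/opt/soft/datax/bin/datax.py
echo "python ${DATAX_HOME}/bin/datax.py"   # the broken doubled path from the worker log

# Right: DATAX_HOME points at the install root.
DATAX_HOME=/opt/soft/datax
echo "python ${DATAX_HOME}/bin/datax.py"   # resolves to the real launcher
```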

The rerun succeeded.

3、Dispatching DataX with DolphinScheduler for MySQL-to-Hive data exchange. Because the built-in data source choices only cover MySQL-type relational databases, you need to choose a custom template and configure the connection address and other details yourself in JSON.

Job template (this is the configuration of my final working version; some parameters must be adjusted to your own environment):
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": ["*"],
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:mysql://xx.xx.xx.xx:3306/datatest?useUnicode=true&characterEncoding=utf8&useSSL=false"
                                ],
                                "table": ["test_table_info"]
                            }
                        ],
                        "password": "cloud",
                        "username": "root",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [
                            {"name": "order_id", "type": "string"},
                            {"name": "str_cd", "type": "string"},
                            {"name": "gds_cd", "type": "string"},
                            {"name": "pay_amnt", "type": "string"},
                            {"name": "member_id", "type": "string"},
                            {"name": "statis_date", "type": "string"}
                        ],
                        "compress": "",
                        "defaultFS": "hdfs://<your namenode address>:9000",
                        "fieldDelimiter": ",",
                        "fileName": "hive_test_table_info",
                        "fileType": "text",
                        "path": "/hive/hive.db/hive_test_table_info",
                        "writeMode": "append"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}
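A custom template like this fails fast on JSON syntax errors, so it is worth validating the file before handing it to DolphinScheduler. A self-contained sketch; in practice you would point json.tool at your real template file instead of the minimal stand-in job written here:

```shell
# Sketch: catch JSON syntax errors before DataX does.
# A minimal stand-in job is written so the check runs anywhere.
cat > /tmp/job_check.json <<'EOF'
{"job": {"content": [], "setting": {"speed": {"channel": "1"}}}}
EOF
python3 -m json.tool /tmp/job_check.json >/dev/null && echo "json ok"
rm -f /tmp/job_check.json
```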
3.1、The first problem during task execution: an IO exception when establishing a connection with HDFS
[job-0] ERROR Engine - DataX's intelligent analysis: the most likely cause of this task's error is:
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-06], Description:[IO exception occurred while establishing a connection with HDFS.].
- java.net.ConnectException: Call From xxxxx/10.xx.xx.xx to 10.xx.1xx.1xx:8020 failed on connection exception: java.net.ConnectException: Connection refused;
- For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Because I had configured HDFS port 8020 following an online template, while our cluster actually uses port 9000, I changed 8020 to 9000 here.
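Rather than copying a port from an online template, the authoritative value can be read from fs.defaultFS in Hadoop's core-site.xml. The extraction below is demonstrated on a stand-in file; your real core-site.xml path will differ (commonly $HADOOP_HOME/etc/hadoop/core-site.xml):

```shell
# Sketch: read the NameNode RPC port from core-site.xml instead of guessing.
# Stand-in file for demonstration; point sed at your real core-site.xml.
cat > /tmp/core-site.xml <<'EOF'
<property><name>fs.defaultFS</name><value>hdfs://namenode:9000</value></property>
EOF
sed -n 's|.*<value>hdfs://[^:]*:\([0-9]*\)</value>.*|\1|p' /tmp/core-site.xml
rm -f /tmp/core-site.xml
```

Whatever port this prints is what belongs in the defaultFS field of the hdfswriter configuration.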
3.2 The second problem: again an IO exception when establishing a connection with HDFS, but with different content
ERROR JobContainer - Exception when job run
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-06], Description:[IO exception occurred while establishing a connection with HDFS.]. - org.apache.hadoop.security.AccessControlException: Permission denied: user=developer01, access=WRITE, inode="/hive/hive.db/hive_test_table_info":bigdata:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)

The tenant I used to run the task was developer01, while write permission on the HDFS directory behind the Hive table belongs to the bigdata user, so before running the task you should configure and select a tenant with the required permissions.
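The drwxr-xr-x string in the error message is the whole story: mode 755, so only the owner (bigdata) holds the write bit, and group/other users such as developer01 are refused. The same mode can be reproduced on a local scratch directory:

```shell
# Sketch: drwxr-xr-x is mode 755 - group and others lack the write bit,
# which is why developer01 could not write into bigdata's directory.
mkdir -p /tmp/perm_demo
chmod 755 /tmp/perm_demo
stat -c '%a %A' /tmp/perm_demo   # octal mode and symbolic form
rm -rf /tmp/perm_demo
```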
Configure the bigdata tenant under the Security Center – Tenant Manage menu.

Edit this DataX task and select the bigdata tenant when saving.

3.3 The third problem during execution was a MySQL connection failure. I had also hit this when running DataX MySQL-to-MySQL jobs; with this custom JSON I had not paid attention to it, and only realized what it was when I saw the log.
ERROR RetryUtil - Exception when calling callable, error message: DataX cannot connect to the corresponding database, probably because: 1) the configured ip/port/database/jdbc is wrong and no connection can be made; 2) the configured username/password is wrong and authentication failed. Please confirm the database connection information with your DBA.
The fix here is simply to append useSSL=false to the JDBC connection URL.
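Appending the flag is plain query-string concatenation; note that the ampersand must appear literally in the JSON (no escaping). A sketch using the placeholder host from the template above:

```shell
# Sketch: append useSSL=false to an existing JDBC query string.
# Host and database are the placeholders from the template, not real values.
url='jdbc:mysql://xx.xx.xx.xx:3306/datatest?useUnicode=true&characterEncoding=utf8'
url="${url}&useSSL=false"
echo "$url"
```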

Copyright notice
This article was written by [Ruo Xiaoyu]; please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204230602186898.html