Dolphin scheduler configuring dataX pit records
2022-04-23 13:41:00 【Ruo Xiaoyu】
1. Failed to create files under /tmp/dolphinscheduler/exec/process
When DolphinScheduler dispatches a DataX task, it needs to create a series of temporary directories and files under /tmp/dolphinscheduler/exec/process, but the worker run log /opt/soft/dolphinscheduler/logs/dolphinscheduler-worker.log showed a creation failure:
[taskAppId=TASK-1-10-13]:[178] - datax task failure
java.io.IOException: Directory '/tmp/dolphinscheduler/exec/process/1/1/10/13' could not be created
That directory turned out to be owned by root, while my DolphinScheduler is installed under the dolphin user, so I changed the ownership of the machine's tmp directory:
$ sudo chown -R dolphin:dolphin tmp
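As a hedged sketch (the paths and the dolphin user are the ones from this article; scoping the chown to the dolphinscheduler subtree rather than all of tmp is my choice, not the article's exact command), the check and fix look like:

```shell
# The worker must be able to create per-task subdirectories under the
# exec directory; check who owns it first.
EXEC_DIR=/tmp/dolphinscheduler/exec/process
mkdir -p "$EXEC_DIR"
ls -ld "$EXEC_DIR"   # if the owner shown here is root, the dolphin user
                     # cannot create task directories inside it
# Actual fix on the worker machine (needs sudo; user/group from this article):
#   sudo chown -R dolphin:dolphin /tmp/dolphinscheduler
```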
2. DataX environment variable problem
When dispatching a DataX task with DolphinScheduler, the data source and the task could be created successfully, but the run always failed and no log was visible in the UI. Logging in to the worker machine and checking the log file /opt/soft/dolphinscheduler/logs/dolphinscheduler-worker.log showed the error:
[INFO] 2021-11-09 11:25:35.446 - [taskAppId=TASK-1-11-14]:[138] - -> python2.7: can't open file '/opt/soft/datax/bin/datax.py/bin/datax.py': [Errno 20] Not a directory
This means the DataX path is misconfigured and the file cannot be found.
Check the environment config: vim /opt/soft/dolphinscheduler/conf/env/
The value below used to be the official default; it should not point all the way down to bin and the launcher script, only to the install directory. Change
export DATAX_HOME=/opt/soft/datax/bin/datax.py
to
export DATAX_HOME=/opt/soft/datax
After saving, rerun the task; it now succeeds.
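The doubled path in the error can be reproduced with plain string concatenation. The sketch below assumes DolphinScheduler builds the launch command as $DATAX_HOME/bin/datax.py, which is what the error message suggests:

```shell
# Old (wrong) value: DATAX_HOME already points at the launcher script.
DATAX_HOME=/opt/soft/datax/bin/datax.py
WRONG="$DATAX_HOME/bin/datax.py"
echo "$WRONG"    # /opt/soft/datax/bin/datax.py/bin/datax.py  <- the path in the error

# Corrected value: DATAX_HOME is just the install root.
DATAX_HOME=/opt/soft/datax
RIGHT="$DATAX_HOME/bin/datax.py"
echo "$RIGHT"    # /opt/soft/datax/bin/datax.py
```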
3. Dispatching a DataX job from MySQL to Hive with DolphinScheduler. Because the built-in data-source selection only covers relational databases such as MySQL, you have to choose the custom template and write the connection address and other details into the JSON yourself.
Configuration template (this is my final working version; some parameters must be filled in with your own information):
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"*"
],
"connection": [
{
"jdbcUrl": [
"jdbc:mysql://xx.xx.xx.xx:3306/datatest?useUnicode=true&characterEncoding=utf8&useSSL=false"
],
"table": [
"test_table_info"
]
}
],
"password": "cloud",
"username": "root",
"where": ""
}
},
"writer": {
"name": "hdfswriter",
"parameter": {
"column": [
{
"name": "order_id",
"type": "string"
},
{
"name": "str_cd",
"type": "string"
},
{
"name": "gds_cd",
"type": "string"
},
{
"name": "pay_amnt",
"type": "string"
},
{
"name": "member_id",
"type": "string"
},
{
"name": "statis_date",
"type": "string"
}
],
"compress": "",
"defaultFS": "hdfs://your-hdfs-namenode-address:9000",
"fieldDelimiter": ",",
"fileName": "hive_test_table_info",
"fileType": "text",
"path": "/hive/hive.db/hive_test_table_info",
"writeMode": "append"
}
}
}
],
"setting": {
"speed": {
"channel": "1"
}
}
}
}
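Before pasting a hand-written template into the task editor, validating the JSON locally can save a failed run. This sketch uses python3's stdlib json.tool on a trimmed-down skeleton of the template above (the file path is arbitrary):

```shell
# Minimal skeleton of the custom template, written to a scratch file.
cat > /tmp/datax_job.json <<'EOF'
{
  "job": {
    "content": [{"reader": {"name": "mysqlreader"},
                 "writer": {"name": "hdfswriter"}}],
    "setting": {"speed": {"channel": "1"}}
  }
}
EOF
# json.tool exits non-zero on malformed JSON, so a typo is caught here
# instead of at task run time.
python3 -m json.tool /tmp/datax_job.json >/dev/null && echo "valid JSON"
```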
3.1 The first problem hit during execution: an IO exception when establishing a connection to HDFS
[job-0] ERROR Engine - DataX intelligent analysis: the most likely cause of this task's error is:
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-06], Description:[IO exception when establishing a connection to HDFS.].
- java.net.ConnectException: Call From xxxxx/10.xx.xx.xx to 10.xx.1xx.1xx:8020 failed on connection exception: java.net.ConnectException: Connection refused;
- For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I had configured HDFS port 8020 by following an online template, but our cluster actually listens on 9000, so I changed 8020 to 9000.
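Rather than copying a port from an online template, read it from your own cluster config: on a machine with a Hadoop client, `hdfs getconf -confKey fs.defaultFS` prints the NameNode URI. A small sketch (the fs.defaultFS value below is an assumption, hard-coded so it runs without a cluster) that pulls the port out for the hdfswriter defaultFS field:

```shell
# On a cluster node you would use:
#   FS_DEFAULT="$(hdfs getconf -confKey fs.defaultFS)"
# Hard-coded example value here:
FS_DEFAULT="hdfs://namenode-host:9000"
PORT="${FS_DEFAULT##*:}"     # strip everything up to the last ':'
echo "$PORT"                 # this is the port to put in "defaultFS"
```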
3.2 The second problem: again an IO exception when connecting to HDFS, but with different content
ERROR JobContainer - Exception when job run
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-06], Description:[IO exception when establishing a connection to HDFS.]. - org.apache.hadoop.security.AccessControlException: Permission denied: user=developer01, access=WRITE, inode="/hive/hive.db/hive_test_table_info":bigdata:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
The tenant I executed with was developer01, but write permission on the Hive table's HDFS directory belongs to the bigdata user, so before running the task you have to configure and select a tenant with the corresponding permissions:
In Security Center -> Tenant Management, create the bigdata tenant.
Edit the DataX task and select the bigdata tenant when saving.
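The mode string in the exception already tells the story: drwxr-xr-x means only the owner (bigdata) has the write bit, so developer01 is denied WRITE. A local illustration of that mode follows; the HDFS commands in the comments assume an HDFS client and are not the article's exact steps:

```shell
# 755 is the octal form of drwxr-xr-x: owner rwx, group r-x, others r-x,
# i.e. nobody but the owner can write.
mkdir -p /tmp/perm_demo
chmod 755 /tmp/perm_demo
stat -c '%A' /tmp/perm_demo   # prints drwxr-xr-x (GNU stat)
# On the cluster, the equivalent check would be:
#   hdfs dfs -ls -d /hive/hive.db/hive_test_table_info
# The DolphinScheduler-side fix is selecting the bigdata tenant, so the
# task is submitted to HDFS as the directory's owner.
```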
3.3 The third problem: a MySQL connection error. I had also run into this when executing DataX MySQL-to-MySQL jobs; with this custom JSON I didn't pay attention to it until I saw the log:
ERROR RetryUtil - Exception when calling callable, error message: DataX cannot connect to the target database, probably because: 1) the configured ip/port/database/jdbc is wrong and no connection can be made; 2) the configured username/password is wrong and authentication fails. Please confirm the database connection information with the DBA.
It is enough to add useSSL=false to the JDBC connection URL.
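A quick way to make sure the flag ends up on the URL whether or not it already carries query parameters (the URL value is the one from the template above, minus the flag):

```shell
JDBC_URL="jdbc:mysql://xx.xx.xx.xx:3306/datatest?useUnicode=true&characterEncoding=utf8"
case "$JDBC_URL" in
  *useSSL=*) : ;;                               # already present, nothing to do
  *\?*) JDBC_URL="${JDBC_URL}&useSSL=false" ;;  # has a query string: append with &
  *)    JDBC_URL="${JDBC_URL}?useSSL=false" ;;  # bare URL: start the query string
esac
echo "$JDBC_URL"
```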
Copyright notice
This article was written by [Ruo Xiaoyu]; please keep the original link when reposting. Thanks.
https://yzsam.com/2022/04/202204230602186898.html