PySpark on YARN: A Collection of Errors
2022-08-10 15:25:00 【不吃天鹅肉】
Error 1: a non-zero exit code 13. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err :
This error plagued me for two full days. Submitting in client mode worked fine, but as soon as I switched to cluster mode the job failed:
Application application_1659522163899_1043 failed 2 times due to AM Container for appattempt_1659522163899_1043_000002 exited with exitCode: 13
Failing this attempt.Diagnostics: [2022-08-04 17:47:23.976]Exception from container-launch.
Container id: container_e207_1659522163899_1043_02_000001
Exit code: 13
[2022-08-04 17:47:24.174]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
22/08/04 17:47:22 INFO util.SignalUtils: Registering signal handler for TERM
22/08/04 17:47:22 INFO util.SignalUtils: Registering signal handler for HUP
22/08/04 17:47:22 INFO util.SignalUtils: Registering signal handler for INT
22/08/04 17:47:22 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
22/08/04 17:47:22 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
22/08/04 17:47:22 INFO spark.SecurityManager: Changing view acls groups to:
22/08/04 17:47:22 INFO spark.SecurityManager: Changing modify acls groups to:
22/08/04 17:47:22 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
22/08/04 17:47:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/08/04 17:47:23 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1659522163899_1043_000002
22/08/04 17:47:23 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
22/08/04 17:47:23 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
22/08/04 17:47:23 ERROR yarn.ApplicationMaster: User application exited with status 1
22/08/04 17:47:23 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: User application exited with status 1)
22/08/04 17:47:23 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:509)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:273)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:913)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:912)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:912)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.SparkUserAppException: User application exited with 1
at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:103)
at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
22/08/04 17:47:23 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://cluster/user/hdfs/.sparkStaging/application_1659522163899_1043
22/08/04 17:47:23 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
22/08/04 17:47:23 INFO util.ShutdownHookManager: Shutdown hook called
For more detailed output, check the application tracking page: http://wj-hdp-3:8088/cluster/app/application_1659522163899_1043 Then click on links to logs of each attempt.
. Failing the application.
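I searched everywhere and found nobody online with exactly the same problem; the only two similar questions had no answers. It was maddening: no logs were visible in the YARN UI, and the Spark web UI did not even list the application. Finally I tried yarn logs -applicationId application_1659522163899_1651
and took a look. It turned out to be a failed module import, even though I had clearly installed the module: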
End of LogType:stderr
***********************************************************************
Container: container_e207_1659522163899_1651_02_000001 on wj-hdp-9_45454_1659674465064
LogAggregationType: AGGREGATED
======================================================================================
LogType:stdout
LogLastModifiedTime:Fri Aug 05 12:41:05 +0800 2022
LogLength:151
LogContents:
Traceback (most recent call last):
File "pyspark_test.py", line 3, in <module>
import findspark
ModuleNotFoundError: No module named 'findspark'
End of LogType:stdout
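Then I noticed the "on wj-hdp-" in the Container line and it clicked. Because I was running PySpark in cluster mode, the driver script is launched inside a YARN container on whichever node the cluster picks, so installing the module only on the node I submit from is useless; it has to be installed on every node. That is also why client mode worked: there the driver runs on the submitting node, which does have the module, while the other nodes used in cluster mode did not. After installing the module on all nodes, the job succeeded. The key takeaway is to read the logs with yarn logs; the brief error message shown above is of no help.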
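To make this kind of failure easier to spot next time, a small preflight check can go at the very top of the driver script, before any third-party import. This is only a sketch: the function names `missing_modules` and `preflight` are my own, not part of PySpark, and the module list is whatever your script actually needs. It uses only the standard library, so it runs even on a node where nothing extra is installed.

```python
import importlib.util
import socket
import sys


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found here.

    Only top-level names are supported (e.g. "findspark", not "a.b"),
    because find_spec imports parent packages for dotted names.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def preflight(names):
    """Report the host and interpreter whenever a required module is absent."""
    missing = missing_modules(names)
    if missing:
        # This line ends up in the container's stderr, which yarn logs shows.
        print(
            f"[{socket.gethostname()}] {sys.executable} is missing: {missing}",
            file=sys.stderr,
        )
    return missing


if __name__ == "__main__":
    # "findspark" is the module from the traceback above.
    if preflight(["findspark"]):
        sys.exit(1)
```

With this in place, the aggregated log names the node and the Python interpreter that lacks the module, which makes it obvious that the problem is per-node installation rather than the script itself.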
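Error 2: Hive table or view not found
This one is simple: copy Hive's hive-site.xml into Spark's conf directory and it works. Alternatively, add the settings in code with config: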
spark = SparkSession.builder.config('','').enableHiveSupport().getOrCreate()
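Fill in the key and value copied from hive-site.xml, or ship hive-site.xml at submit time with --files.
One open problem remains: my --files option does not seem to take effect. I will look into it later.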
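For the in-code route, here is a rough sketch of reading the properties out of a hive-site.xml and passing each one to config instead of typing them by hand. The helper names `hive_site_properties` and `build_session` are my own, and the sample XML with its host `example-host` is made up for illustration. I am also assuming the relevant settings (typically hive.metastore.uris) are picked up when set this way; whether every property takes effect may depend on the Spark version.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real hive-site.xml has many more properties.
SAMPLE_HIVE_SITE = """
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://example-host:9083</value>
  </property>
</configuration>
"""


def hive_site_properties(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip malformed entries that lack a name or a value.
        if name and value is not None:
            props[name.strip()] = value.strip()
    return props


def build_session(props):
    """Create a Hive-enabled SparkSession with the given properties applied."""
    # Imported lazily so the parser above works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in props.items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()


if __name__ == "__main__":
    print(hive_site_properties(SAMPLE_HIVE_SITE))
```

In a real job you would read the file contents from wherever hive-site.xml lives and call `build_session(hive_site_properties(text))`. In practice only a few of the properties usually matter, so passing just those is a reasonable alternative to copying the whole file.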