Spark FAQ roundup: a must-read before interviews
2022-04-23 04:41:00 【Z-hhhhh】
1. What is the relationship between job, stage, and task?
- A job can contain more than one stage.
- A stage contains multiple tasks.
2. How are jobs, stages, and tasks divided?
- A job is created every time work is submitted, that is, every time an action operator is called. (An operator whose return value is not an RDD can be classified as an action operator.)
- Stages are divided according to wide and narrow dependencies: each wide dependency starts a new stage.
- The number of tasks in a stage is the number of partitions.
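The division rules above can be sketched with a toy model in plain Python. This is not Spark's scheduler: the lineage format (a list of `(operator, is_wide)` pairs) and the names `count_stages` and `count_tasks` are assumptions made for illustration, and the model assumes a linear lineage in which every stage has the same number of partitions.

```python
def count_stages(lineage):
    """Count stages for a linear lineage.

    lineage: list of (operator_name, is_wide) pairs in execution order.
    A job always has one stage; each wide dependency adds one more.
    """
    return 1 + sum(1 for _name, is_wide in lineage if is_wide)


def count_tasks(lineage, num_partitions):
    """Total tasks, assuming every stage keeps the same partition count."""
    return count_stages(lineage) * num_partitions


# Hypothetical word-count lineage: only reduceByKey needs a shuffle.
word_count = [
    ("flatMap", False),
    ("map", False),
    ("reduceByKey", True),
]

stages = count_stages(word_count)
tasks = count_tasks(word_count, num_partitions=4)
print(stages, tasks)  # 2 8
```

With 4 partitions, the single wide dependency splits the job into 2 stages of 4 tasks each.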
3. What are wide and narrow dependencies?
- If a partition of a parent RDD is used by multiple child RDD partitions, it is a wide dependency (mnemonic: multiple children).
- If a partition of a parent RDD is used by only one child RDD partition, it is a narrow dependency (mnemonic: only child).
- abstract class Dependency
- abstract class NarrowDependency extends Dependency and declares an abstract method, getParents()
- class OneToOneDependency extends NarrowDependency and implements the abstract method getParents()
- class RangeDependency extends NarrowDependency and implements the abstract method getParents()
- class ShuffleDependency extends Dependency
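The two definitions can be checked mechanically. The sketch below is plain Python, not Spark code: the edge-list representation and the function name `classify_dependency` are assumptions for illustration. It labels a dependency as wide as soon as one parent partition feeds more than one child partition.

```python
from collections import defaultdict


def classify_dependency(edges):
    """Classify a dependency from (parent_partition, child_partition) edges.

    Returns "wide" if any parent partition feeds more than one child
    partition, otherwise "narrow".
    """
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    if any(len(c) > 1 for c in children.values()):
        return "wide"
    return "narrow"


# map-like: parent partition i feeds only child partition i.
one_to_one = [(0, 0), (1, 1), (2, 2)]

# shuffle-like: every parent partition feeds every child partition.
shuffle = [(p, c) for p in range(3) for c in range(2)]

print(classify_dependency(one_to_one))  # narrow
print(classify_dependency(shuffle))     # wide
```

Note that several parent partitions feeding one child partition still counts as narrow under this rule, because each parent partition has only one consumer.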
4. What are action operators and transformation operators? List some.
- An action operator creates a job and is executed immediately. Examples: take, first, collect, foreach, foreachPartition.
- A transformation is not executed immediately; it only records the dependencies and the functions to apply. Examples: map, filter, flatMap, reduceByKey, groupByKey, and so on.
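The difference between recording and executing can be illustrated with a toy class in plain Python. `LazyDataset` is a hypothetical stand-in for an RDD, not a Spark API: its `map` and `filter` only record a step, and `collect` plays the role of the action that runs everything.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, not run."""

    def __init__(self, data, ops=None):
        self._data = list(data)
        self._ops = ops or []  # recorded (kind, function) pairs

    # Transformations: return a new dataset with one more recorded step.
    def map(self, fn):
        return LazyDataset(self._data, self._ops + [("map", fn)])

    def filter(self, fn):
        return LazyDataset(self._data, self._ops + [("filter", fn)])

    # Action: runs the recorded steps and returns a plain list.
    def collect(self):
        rows = self._data
        for kind, fn in self._ops:
            if kind == "map":
                rows = [fn(x) for x in rows]
            else:
                rows = [x for x in rows if fn(x)]
        return rows


calls = []


def double(x):
    calls.append(x)  # record that the function actually ran
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # nothing has run yet
result = ds.collect()             # the action triggers the computation
print(calls_before_action, result)  # 0 [6, 8]
```

Here `collect` returns a list rather than another `LazyDataset`, which mirrors the rule of thumb that an operator not returning an RDD is an action.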
5. What is the difference between reduceByKey and groupByKey?
reduceByKey: before the results are sent to the reducers, reduceByKey merges the values for each key locally on every mapper, much like a combiner in MapReduce. The advantage is that after one reduce pass on the map side the amount of data drops sharply, which reduces network transfer and lets the reduce side compute the result faster.
groupByKey: groupByKey aggregates the values for each key in the RDD into a sequence (Iterator). This happens on the reduce side, so all of the data has to be transferred over the network, which causes unnecessary waste. If the amount of data is very large, it may also cause an OutOfMemoryError.
Conclusion:
When running a reduce over a large amount of data, reduceByKey is recommended. It not only improves speed but also avoids the memory overflow that groupByKey can cause.
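The effect of map-side merging on transfer volume can be shown with a small simulation in plain Python. This is a toy model, not Spark's shuffle: the partition lists and the functions `reduce_by_key` and `group_by_key` are assumptions for illustration, and "shuffled records" simply counts how many records would leave the map side.

```python
from collections import defaultdict


def reduce_by_key(partitions, fn):
    """Combine per key inside each partition first, then merge across them."""
    shuffled = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = fn(local[key], value) if key in local else value
        shuffled.extend(local.items())  # only combined records move
    result = {}
    for key, value in shuffled:
        result[key] = fn(result[key], value) if key in result else value
    return result, len(shuffled)


def group_by_key(partitions):
    """Move every record, then build the full value list per key."""
    shuffled = [record for part in partitions for record in part]
    groups = defaultdict(list)
    for key, value in shuffled:
        groups[key].append(value)
    return dict(groups), len(shuffled)


# Two hypothetical map-side partitions of (word, 1) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("a", 1), ("b", 1)],
]

sums, reduce_shuffled = reduce_by_key(partitions, lambda x, y: x + y)
groups, group_shuffled = group_by_key(partitions)
grouped_sums = {key: sum(values) for key, values in groups.items()}

print(sums, reduce_shuffled)         # {'a': 4, 'b': 3} 4
print(grouped_sums, group_shuffled)  # {'a': 4, 'b': 3} 7
```

Both approaches give the same sums, but the reduceByKey-style path moves 4 records instead of 7, and the groupByKey-style path must also hold every value for a key in memory before summing.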
6. The five properties of an RDD
7. Spark's architecture and job submission process
Copyright notice
This article was written by [Z-hhhhh]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204220559122610.html