Using the sortBy operator in Spark
2022-04-23 15:48:00 【Uncle flying against the wind】
Preface
As the name suggests, sortBy sorts data. In Spark, you can use sortBy to sort the elements of an RDD. The elements are not limited to numbers: they can also be tuples or any other type for which an ordering of the sort key is available.
sortBy
Function signature
def sortBy[K](f: (T) => K, ascending: Boolean = true, numPartitions: Int = this.partitions.length)(implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[T]
Function description
This operation sorts the data. Before sorting, each element is passed through the function f, and the elements are then ordered by the results that f returns; the elements themselves are not changed. The default order is ascending. By default, the new RDD produced by the sort has the same number of partitions as the original RDD. The operation involves a shuffle.
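The parameters described above can be sketched as follows. This is a minimal illustration, not part of the original article: the object name SortByParamsSketch, its helper methods, and the sample data are assumptions made for the example. It shows a key function that differs from the element itself, and the ascending and numPartitions parameters from the signature.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only.
object SortByParamsSketch {
  // Strings are ordered by their numeric value, but the elements stay strings.
  def sortedByNumericValue(sc: SparkContext): Array[String] = {
    val rdd = sc.makeRDD(List("11", "2", "1"), 2)
    rdd.sortBy(s => s.toInt).collect()
  }

  // ascending = false reverses the order; numPartitions = 1 overrides the default partition count.
  def sortedDescending(sc: SparkContext): Array[Int] = {
    val rdd = sc.makeRDD(List(6, 2, 4, 5, 3, 1), 2)
    rdd.sortBy(num => num, ascending = false, numPartitions = 1).collect()
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("SortByParamsSketch")
    val sc = new SparkContext(sparkConf)
    try {
      println(sortedByNumericValue(sc).mkString(",")) // 1,2,11
      println(sortedDescending(sc).mkString(","))     // 6,5,4,3,2,1
    } finally {
      sc.stop()
    }
  }
}
```

Sorting the same strings with the identity function would instead give 1,11,2, because strings are compared lexicographically.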
Examples
First, sort the numbers in a collection and save the result to a local directory
import org.apache.spark.{SparkConf, SparkContext}

object SortBy_Test {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)
    val rdd = sc.makeRDD(List(1,2,3,4,5,6,7,9), 2)
    // sortBy returns a new RDD and leaves rdd unchanged, so keep the result
    val sortedRdd = rdd.sortBy(num => num)
    // Save the sorted RDD; the output directory must not already exist
    sortedRdd.saveAsTextFile("E:\\output")
    sc.stop()
  }
}
Run the code above, and two part files are generated in the local output directory, one per partition

Open the two files, and you can see that the sorted data is split across them: each file is sorted, and the values in the first file come before those in the second
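The same partition layout can be inspected without writing files. The sketch below is an assumption-laden illustration rather than the article's own code: the object name SortByPartitionsSketch and the unsorted input list are hypothetical, chosen so that the effect of the sort is visible. It uses glom() to gather each partition into an array.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only.
object SortByPartitionsSketch {
  // Returns the contents of each partition after sorting, in partition order.
  def sortedPartitions(sc: SparkContext): Array[Array[Int]] = {
    // Unsorted input (an assumption for this sketch)
    val rdd = sc.makeRDD(List(7, 2, 9, 4, 1, 6, 3, 5), 2)
    // glom() turns every partition into a single array
    rdd.sortBy(num => num).glom().collect()
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("SortByPartitionsSketch")
    val sc = new SparkContext(sparkConf)
    try {
      // Print one line per partition; where the split falls is decided by Spark
      sortedPartitions(sc).zipWithIndex.foreach { case (part, i) =>
        println(s"partition $i: ${part.mkString(",")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Reading the partitions in order yields the fully sorted sequence, which is why the part files written by saveAsTextFile can be read one after another.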

Next, sort the tuples in a collection by key and print the result
import org.apache.spark.{SparkConf, SparkContext}

object SortBy_Test {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)
    val rddStr = sc.makeRDD(List(
      ("a",3),("d",2),("e",7)
    ),2)
    // Sort by the first tuple element (the key) and keep the returned RDD
    val sortedRdd = rddStr.sortBy(t => t._1)
    // Collect the sorted result to the driver and print it
    sortedRdd.collect().foreach(println)
    sc.stop()
  }
}
Run the code above and observe the console output: the tuples are printed in ascending key order
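The key function can select any part of the tuple. As a hedged variation on the example above (the object name SortByTupleValueSketch and its helper are hypothetical, not from the original article), the same tuples can be sorted by their second element in descending order:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names are illustrative only.
object SortByTupleValueSketch {
  // Sorts the article's sample tuples by value (second element), largest first.
  def sortedByValueDesc(sc: SparkContext): Array[(String, Int)] = {
    val rddStr = sc.makeRDD(List(("a", 3), ("d", 2), ("e", 7)), 2)
    rddStr.sortBy(t => t._2, ascending = false).collect()
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("SortByTupleValueSketch")
    val sc = new SparkContext(sparkConf)
    try {
      // Prints (e,7), (a,3), (d,2), one tuple per line
      sortedByValueDesc(sc).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```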

Copyright notice
This article was written by [Uncle flying against the wind]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204231544587328.html