Using the Spark sortBy operator
2022-04-23 15:48:00 【Uncle flying against the wind】
Preface
As the name suggests, sortBy sorts data. In Spark, you can use sortBy to sort the elements of a dataset. The data is not limited to numbers; it can also be tuples or other types.
sortBy
Function signature
def sortBy[K](f: (T) => K , ascending: Boolean = true , numPartitions: Int = this.partitions.length)(implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[T]
Function description
This operation sorts the data. Before sorting, each element is passed through the function f, and the elements are then ordered by the results of f. The default order is ascending. By default, the new RDD produced by the sort has the same number of partitions as the original RDD. The operation involves a shuffle.
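As a minimal sketch of the parameters described above, the snippet below sorts in descending order, sorts by a derived key, and overrides the partition count. The object name `SortByOptions`, the helper method names, and the sample data are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only
object SortByOptions {
  // Sort integers in descending order by passing ascending = false
  def descending(sc: SparkContext): Array[Int] =
    sc.makeRDD(List(5, 1, 9, 3), 2).sortBy(num => num, ascending = false).collect()

  // Sort strings by their numeric value: f maps each element to its sort key,
  // but the elements themselves are returned unchanged
  def byNumericValue(sc: SparkContext): Array[String] =
    sc.makeRDD(List("11", "2", "1"), 2).sortBy(s => s.toInt).collect()

  // Request a specific partition count for the sorted RDD via numPartitions
  def partitionCount(sc: SparkContext): Int =
    sc.makeRDD(List(5, 1, 9, 3), 2).sortBy(num => num, numPartitions = 1).getNumPartitions

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("SortByOptions"))
    println(descending(sc).mkString(","))     // 9,5,3,1
    println(byNumericValue(sc).mkString(",")) // 1,2,11
    println(partitionCount(sc))               // 1
    sc.stop()
  }
}
```

Note that in the second case the strings come back as strings; f only decides the order, it does not transform the data.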
Examples
First, sort the data in a collection and save the result to a local directory
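import org.apache.spark.{SparkConf, SparkContext}

object SortBy_Test {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)

    // Create an RDD with 2 partitions
    val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 6, 7, 9), 2)
    // sortBy returns a new RDD; the original rdd is left unchanged
    val sortedRdd = rdd.sortBy(num => num)
    // Writes one part file per partition; the output directory must not already exist
    sortedRdd.saveAsTextFile("E:\\output")

    sc.stop()
  }
}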
Running the code above generates two part files in the local output directory, one per partition.
Opening the two files shows that the sorted data is split across them.
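To check how sorted data is split across partitions without opening the output files, one option is `glom()`, which gathers each partition into an array. The sketch below is one way to inspect this and is not part of the original example; the object name `SortByPartitions`, the method name, and the unsorted sample data are hypothetical. The exact split point between partitions depends on how Spark samples the data, so no specific output is shown.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only
object SortByPartitions {
  // Returns the contents of each partition of the sorted RDD, in partition order
  def sortedPartitions(sc: SparkContext): Array[Array[Int]] =
    sc.makeRDD(List(7, 2, 9, 4, 1, 6, 3, 5), 2)
      .sortBy(num => num)
      .glom()
      .collect()

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("SortByPartitions"))
    // Print one line per partition; each partition holds a contiguous sorted range
    sortedPartitions(sc).zipWithIndex.foreach { case (part, idx) =>
      println(s"partition $idx: ${part.mkString(",")}")
    }
    sc.stop()
  }
}
```

Reading the partitions in order yields the fully sorted sequence, which is why the part files written by saveAsTextFile together form one sorted result.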
Next, sort the tuples in a collection by key and print the result
import org.apache.spark.{SparkConf, SparkContext}

object SortBy_Test {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)

    // Create an RDD of (key, value) tuples with 2 partitions
    val rddStr = sc.makeRDD(List(
      ("a", 3), ("d", 2), ("e", 7)
    ), 2)
    // Sort by the first tuple element (the key); sortBy returns a new RDD
    val sortedRdd = rddStr.sortBy(t => t._1)
    sortedRdd.collect().foreach(println)

    sc.stop()
  }
}
Run the code above and check the console output.
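The same tuples can also be ordered by value, or by a composite key. The following is a minimal sketch under assumed names (`SortByTuples` and its methods are hypothetical, and the extra tuple `("b", 2)` is added only to show tie-breaking); it relies on the implicit `Ordering` that Scala provides for tuples.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only
object SortByTuples {
  // Sort by the second tuple element (the value), largest first
  def byValueDesc(sc: SparkContext): Array[(String, Int)] =
    sc.makeRDD(List(("a", 3), ("d", 2), ("e", 7)), 2)
      .sortBy(t => t._2, ascending = false)
      .collect()

  // Sort by value first, then by key to break ties, using a tuple as the sort key
  def byValueThenKey(sc: SparkContext): Array[(String, Int)] =
    sc.makeRDD(List(("a", 3), ("d", 2), ("e", 7), ("b", 2)), 2)
      .sortBy(t => (t._2, t._1))
      .collect()

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("SortByTuples"))
    println(byValueDesc(sc).mkString(" "))    // (e,7) (a,3) (d,2)
    println(byValueThenKey(sc).mkString(" ")) // (b,2) (d,2) (a,3) (e,7)
    sc.stop()
  }
}
```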
Copyright notice
This article was written by [Uncle flying against the wind]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204231544587328.html