Spark Operators: Using distinct
2022-04-23 15:45:00 【逆风飞翔的小叔】
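Introduction
Anyone who has used MySQL will be familiar with the DISTINCT keyword in SQL, which removes duplicate rows from a query result. The distinct operator in Spark can be understood in much the same way.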
Function signatures
def distinct(): RDD[T]
def distinct(numPartitions: Int)(implicit ord: Ordering[T] = null): RDD[T]
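Description
Removes duplicate elements from the dataset, returning a new RDD in which each distinct element appears exactly once.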
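To make the behaviour concrete, the sketch below deduplicates an RDD by hand with a keyed shuffle, which is roughly how distinct has traditionally worked inside Spark (newer versions may take a different path when the RDD is already partitioned). The object name `DistinctByHand`, the helper `manualDistinct`, and the sample data are illustrative assumptions, not part of Spark's API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object DistinctByHand {
  // Hypothetical helper: pair each element with a dummy value, keep one
  // pair per key during the shuffle, then drop the dummy value again.
  def manualDistinct[T: ClassTag](rdd: RDD[T], numPartitions: Int): RDD[T] =
    rdd.map(x => (x, null))
      .reduceByKey((x, _) => x, numPartitions)
      .map(_._1)

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("DistinctByHand")
    val sc = new SparkContext(sparkConf)
    try {
      val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2))
      // Sort on the driver so both results can be compared line by line.
      val byHand = manualDistinct(rdd, rdd.getNumPartitions).collect().sorted
      val builtIn = rdd.distinct().collect().sorted
      println(byHand.mkString(","))  // 1,2,3,4,5
      println(builtIn.mkString(",")) // 1,2,3,4,5
    } finally {
      sc.stop()
    }
  }
}
```

Both lines print the same elements, which suggests that distinct can be thought of as "group identical elements together, then keep one per group". The grouping step is a shuffle, so distinct is not a free operation on large datasets.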
Example: removing duplicates from a collection of numbers
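import org.apache.spark.{SparkConf, SparkContext}

object Distinct_Test {
  def main(args: Array[String]): Unit = {
    // Run locally, using as many worker threads as there are cores
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)

    // The input contains duplicates of 2, 3 and 5
    val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2))

    // Remove the duplicates, bring the result to the driver and print it
    rdd.distinct().collect().foreach(println)

    sc.stop()
  }
}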
Run the program above and check the console output: every element that was duplicated in the input is printed only once.
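Because distinct shuffles the data, the order of the collected elements is not guaranteed. The following sketch, which assumes the same local setup as the example above, sorts the result for a stable printout and uses the numPartitions overload to control how many partitions the deduplicated RDD has. The object name `DistinctPartitions_Test` and the partition count of 2 are illustrative choices.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DistinctPartitions_Test {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)
    try {
      val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2))

      // Deduplicate into 2 partitions (an arbitrary value for illustration).
      val deduped = rdd.distinct(2)
      println(deduped.getNumPartitions) // 2

      // Sort on the driver so the printed order is deterministic.
      println(deduped.collect().sorted.mkString(", ")) // 1, 2, 3, 4, 5
    } finally {
      sc.stop()
    }
  }
}
```

Sorting after collect() is only reasonable for small results like this one; for a large RDD, sortBy on the RDD itself would be the more usual choice.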
Copyright notice
This article was written by [逆风飞翔的小叔]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/congge_study/article/details/124356121