Using the distinct operator in Spark
2022-04-23 15:48:00 【Uncle flying against the wind】
Preface
Anyone who has written SQL for MySQL will be familiar with the DISTINCT keyword, which removes duplicate rows from a query result. The distinct operator in Spark can be understood in much the same way.
Function signature
def distinct(): RDD[T]
def distinct(numPartitions: Int)(implicit ord: Ordering[T] = null): RDD[T]
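The overload that takes numPartitions lets the caller choose how many partitions the deduplicated RDD has. The following is a minimal sketch of that overload, assuming a local Spark setup like the one used elsewhere in this article; the object name DistinctPartitionsSketch, the helper dedupInto, and the partition counts 4 and 2 are illustrative choices, not part of Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DistinctPartitionsSketch {
  // Hypothetical helper: deduplicate and place the result in `parts` partitions.
  def dedupInto(rdd: RDD[Int], parts: Int): RDD[Int] = rdd.distinct(parts)

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("DistinctPartitions")
    val sc = new SparkContext(sparkConf)
    // Start from 4 input partitions (an arbitrary choice for this sketch).
    val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2), 4)
    val deduped = dedupInto(rdd, 2)
    println(deduped.getNumPartitions)                // 2
    println(deduped.collect().sorted.mkString(", ")) // 1, 2, 3, 4, 5
    sc.stop()
  }
}
```

Calling distinct() with no argument keeps the partition count of the source RDD.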
Function description
Removes duplicate elements from a dataset and returns a new RDD in which each distinct element appears once.
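One way to reason about this behaviour is to express deduplication with other RDD operators: key every element by itself, keep one entry per key, then drop the placeholder value. The sketch below compares that approach with distinct(). It is an illustration only and is not copied from the Spark source; manualDistinct and DistinctByReduceSketch are hypothetical names.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DistinctByReduceSketch {
  // Hypothetical helper, not part of Spark's API.
  def manualDistinct(rdd: RDD[Int]): RDD[Int] =
    rdd
      .map(x => (x, 1))               // use each element as a key; the value 1 is a placeholder
      .reduceByKey((kept, _) => kept) // collapse all pairs that share a key into one
      .map(_._1)                      // drop the placeholder and keep the key

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("DistinctByReduce")
    val sc = new SparkContext(sparkConf)
    val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2))
    val viaOperator = rdd.distinct().collect().sorted
    val viaReduce = manualDistinct(rdd).collect().sorted
    println(viaOperator.sameElements(viaReduce)) // true
    sc.stop()
  }
}
```

Because reduceByKey groups equal keys across partitions, this approach shuffles data, which is worth keeping in mind when deduplicating large datasets.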
Example: removing duplicates from a list of numbers
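import org.apache.spark.{SparkConf, SparkContext}

object Distinct_Test {
  def main(args: Array[String]): Unit = {
    // Run locally, using all available cores
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
    val sc = new SparkContext(sparkConf)

    // Input list with repeated values (2, 3 and 5 appear more than once)
    val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2))

    // Deduplicate, bring the result to the driver, and print each element
    rdd.distinct().collect().foreach(println)

    sc.stop()
  }
}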
Run the program and check the console output: each value that was repeated in the input is printed only once.
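The values printed by the program are not guaranteed to appear in the same order as the input list, because the elements are redistributed across partitions during deduplication. If a stable order is needed, one option is to sort after deduplicating. This is a minimal sketch under the same local setup; DistinctSortedSketch and distinctSorted are hypothetical names introduced for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object DistinctSortedSketch {
  // Hypothetical helper: deduplicate, then sort ascending before collecting.
  def distinctSorted(rdd: RDD[Int]): Array[Int] =
    rdd.distinct().sortBy(x => x).collect()

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[*]").setAppName("DistinctSorted")
    val sc = new SparkContext(sparkConf)
    val rdd = sc.makeRDD(List(1, 2, 3, 4, 5, 3, 5, 2, 2))
    println(distinctSorted(rdd).mkString(", ")) // 1, 2, 3, 4, 5
    sc.stop()
  }
}
```

For a small result like this one, sorting the collected array on the driver with .sorted works equally well; sortBy is the distributed alternative for larger data.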

Copyright notice
This article was written by [Uncle flying against the wind]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204231544587441.html