当前位置:网站首页>Distinct use of spark operator
Distinct use of spark operator
2022-04-23 15:48:00 【Uncle flying against the wind】
Preface
Believed to have been used mysql Yes sql In the sentence distinct Keywords are not unfamiliar , Use distinct Keyword can be used to de duplicate the queried data , stay Spark in , You can make a similar understanding ;
Function signature
def distinct()(implicit ord: Ordering[T] = null): RDD[T]def distinct( numPartitions: Int )(implicit ord: Ordering[T] = null): RDD[T]
Function description
Remove duplicate data from a dataset
Case study : De duplication of a set of numbers in a set
import org.apache.spark.{SparkConf, SparkContext}
object Distinct_Test {
def main(args: Array[String]): Unit = {
val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Operator")
val sc = new SparkContext(sparkConf)
val rdd = sc.makeRDD(List(1,2,3,4,5,3,5,2,2))
rdd.distinct().collect().foreach(println)
sc.stop()
}
}
Run the above program , Observe the console output , It can be found that duplicate elements are finally output only once

版权声明
本文为[Uncle flying against the wind]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204231544587441.html
边栏推荐
猜你喜欢

携号转网最大赢家是中国电信,为何人们嫌弃中国移动和中国联通?

Timing model: gated cyclic unit network (Gru)

Configuration of multi spanning tree MSTP

Config组件学习笔记

c语言---指针进阶

Import address table analysis (calculated according to the library file name: number of imported functions, function serial number and function name)

IronPDF for .NET 2022.4.5455

大型互联网为什么禁止ip直连

一刷314-剑指 Offer 09. 用两个栈实现队列(e)

王启亨谈Web3.0与价值互联网“通证交换”
随机推荐
一刷314-剑指 Offer 09. 用两个栈实现队列(e)
Leetcode-396 rotation function
Modèle de Cluster MySQL et scénario d'application
PHP function
Codejock Suite Pro v20. three
Go语言条件,循环,函数
Vision of building interstellar computing network
utils. Deprecated in35 may be cancelled due to upgrade. What should I do
Redis master-slave replication process
Config learning notes component
cadence SPB17. 4 - Active Class and Subclass
布隆过滤器在亿级流量电商系统的应用
腾讯Offer已拿,这99道算法高频面试题别漏了,80%都败在算法上
Spark 算子之partitionBy
北京某信护网蓝队面试题目
mysql乐观锁解决并发冲突
Cookie&Session
【AI周报】英伟达用AI设计芯片;不完美的Transformer要克服自注意力的理论缺陷
R语言中绘制ROC曲线方法二:pROC包
Mumu, go all the way