当前位置:网站首页>Installation and deployment of Flink and wordcount test
Installation and deployment of Flink and wordcount test
2022-04-23 04:41:00 【Z-hhhhh】
One 、 Local mode
Simulate locally in a multithreaded manner Flink Multiple roles in .( The development environment does not )
Download address :https://flink.apache.org/downloads.html
Here is the download :flink-1.13.0-bin-scala_2.12.tgz
Upload to common location , Then decompress .
start-up :
Switch to flink Of bin Under the table of contents , perform ./start-cluster.sh, Then view the process .

Two 、Standalone Independent cluster mode

( If you do the first step first , Remember to stop the service first ,stop-cluster.sh)
- Upload 、 decompression tar package .
- Modify the configuration file
cd conf/flink-conf.yaml
Appoint Flink colony JobManager RPC mailing address .
( My environment here is three machines , Respectively master,slave1,slave2)
jobmanager.rpc.address: master
Be careful :flink-conf.yaml Middle configuration key/value It's time “:” You need a space after , Otherwise, the configuration will not take effect .
Explanation of other important configurations :
# jobmanager and taskmanager Communication port number
jobmanager.rpc.port: 6123
# JobManager Total process memory size
jobmanager.memory.process.size: 1600m
# At present taskmanager How much memory does the whole process occupy
taskmanager.memory.process.size: 1728m
# Every taskmanager What can be provided slots
taskmanager.numberOfTaskSlots: 1
# The default parallelism
parallelism.default: 1
#jobmanager Failover strategy ,1.9 New properties that appear after , Regional recovery strategy
jobmanager.execution.failover-strategy: region
#flink web The port number of the interface
rest.port: 8081
vi conf/workers
Appoint Flink colony TaskManager.
slave1
slave2
- Send to the other two nodes .
scp -r flink-1.13.0 slave1:$PWD
scp -r flink-1.13.0 slave2:$PWD
- Configure environment variables
vi /etc/profile
export FLINK_HOME=/usr/software/flink-1.13.0
export PATH=$PATH:$FLINK_HOME/bin
source /etc/profile
-
Start cluster

-
see Flink Node process

Can pass master:8081 visit Flink Of web Interface

3、 ... and 、Standalone HA

From the previous architecture, we can find that JobManager There is an obvious single point problem (SPOF,single point of failure).JobManager Responsible for task scheduling and resource allocation , once JobManager There was an accident , The consequences can be imagined .
We have the help of Zookeeper With the help of the , One Standalone Of Flink The cluster will have multiple live JobManager, Only one of them is working , Others are in Standby state . When at work JobManager After losing the connection ( Such as downtime or Crash),Zookeeper From Standby Choose a new one JobManager To take over Flink colony .
-
Cluster planning
master:JobManager+TaskManager
slave1: JobManager+TaskManager
slave2: JobManager -
Modify the configuration file
vi masters
master:8081
slave1:8081
vi worksers
master
slave1
slave2
vi flink-conf.yaml
Add the following
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs://master:9000/flink-checkpoints
high-availability: zookeeper
high-availability.storageDir: hdfs://master:9000/flink/ha/
high-availability.zookeeper.quorum: master:2181,slave1:2181,slave2:2181
- Sync flink Profile directory conf To the other two nodes
scp -r conf slave1:$PWD
scp -r conf slave2:$PWD
-
modify slave1 Upper jobmanager mailing address
vi flink-conf.yaml
jobmanager.rpc.address: slave1 -
start-up
start-up HDFS
start-up Zookeeper
take flink-shaded-hadoop-2-uber-2.7.5-10.0.jar copy to flink Of lib Under the table of contents , All three nodes are copied .
start-up flink

- Check the process

7. test
yum install -y nc
nc -lk 1234
Four 、Flink on yarn
4.1.1、flink And yarn Interaction
Use... In the actual development process Flink on yarn There are many modes

4.1.2、 To configure
close yarn Memory check for ,yarn-site.xml. And distribute it to other nodes .
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
distribution
scp -r yarn-site.xml slave1:$PWD
scp -r yarn-site.xml slave2:$PWD
Whether to start a thread to check the amount of virtual memory that each task is using , If the task exceeds the assigned value , Kill it directly , The default is true. In this, we need to shut down , Because for flink Use yarn In mode , It's easy to have memory overruns , This is the time yarn Will automatically kill job.
Restart all services
dhfs
yarn
zookeeper
flink
4.2、Session Pattern

- characteristic : You need to apply for resources in advance , start-up JobManager and TaskManger
- advantage : You don't need to apply for resources every time you submit an assignment , Instead, use resources that have been applied for , So as to improve the execution efficiency
- shortcoming : After the job execution is completed , Resources will not be released , Therefore, it will always occupy system resources
- Application scenarios : It is suitable for situations where homework submission is frequent , Scenes with more small assignments
4.2.1、 Case study
yarn-session.sh( Open up resources ) + flink run( Submit tasks )
stay yarn Previous start one Flink conversation , Execute the following command :
yarn-session.sh -n 2 -tm 800 -s 1 -d flink run /usr/software/flink-1.13.0/examples/batch/WordCount.jar
Parameter description :
# -n To apply for 2 A container , This is how many taskmanager
# -tm Represent each TaskManager The memory size of
# -s Represent each TaskManager Of slots Number
# -d Indicates that the later program mode is running
stay yarn Of 8088 The interface can see Flink session cluster Running .

Task to complete

4.3、 Per-job Pattern

- characteristic : Every time you submit an assignment, you need to apply for a resource
- advantage : Job run completed , Resources will be released immediately , It will not occupy system resources all the time
- shortcoming : Every time you submit an assignment, you need to apply for resources , It will affect the execution efficiency , Because it takes time to apply for resources
- Application scenarios : Suitable for scenes with less homework 、 Big homework scene
4.3.1、 Case study
Submit directly job
flink run -m yarn-cluster -yjm 1024 -ytm 1024 /usr/software/flink-1.13.0/examples/batch/WordCount.jar
Parameter description
# -m jobmanager The address of
# -yjm 1024 Appoint jobmanager Memory information
# -ytm 1024 Appoint taskmanager Memory information
Check the specific parameter description :
flink --help
5、 ... and 、Flink test WordCount
-
install netcat
yum install -y nc -
Turn on nc
nc -lk 1314 -
Count the word frequency input by the port
flink run /usr/software/flink-1.13.0/examples/streaming/SocketWindowWordCount.jar --hostname master --port 1314
stay 1314 Random input at terminal

ctrl+c After that , The other end prompts that the task is completed

- Check the statistics
stay Flink Of TaskManager Node log Under the table of contents . View to .out Final document .

版权声明
本文为[Z-hhhhh]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204220559122651.html
边栏推荐
- QML进阶(四)-绘制自定义控件
- 记录一下盲注脚本
- 优麒麟 22.04 LTS 版本正式发布 | UKUI 3.1开启全新体验
- Programmers complain: I really can't live with a salary of 12000. Netizen: how can I say 3000
- IEEE Transactions on industrial information (TII)
- The 14th issue of HMS core discovery reviews the long article | enjoy the silky clip and release the creativity of the video
- SQL statement for adding columns in MySQL table
- Migrate from MySQL database to AWS dynamodb
- 229. Find mode II
- C language: Advanced pointer
猜你喜欢

Fusobacterium -- symbiotic bacteria, opportunistic bacteria, oncobacterium

Key points of AWS eks deployment and differences between console and eksctl creation

2021数学建模国赛一等奖经验总结与分享

Ali's ten-year technical experts jointly created the "latest" jetpack compose project combat drill (with demo)

无线键盘全国产化电子元件推荐方案

520.检测大写字母

Go反射法则

Small volume Schottky diode compatible with nsr20f30nxt5g

Recommended scheme for national production of electronic components of wireless keyboard

那些年我面试过的Android开发岗总结(附面试题+答案解析)
随机推荐
Logger and zap log Library in go language
[AI vision · quick review of robot papers today, issue 32] wed, 20 APR 2022
shell wc (统计字符数量)的基本使用
Summary of Android development posts I interviewed in those years (attached test questions + answer analysis)
520.检测大写字母
Open the past and let's start over.
做数据可视化应该避免的8个误区
程序员抱怨:1万2的工资我真的活不下去了,网友:我3千咋说
leetcode003--判断一个整数是否为回文数
eksctl 部署AWS EKS
Summary of MySQL de duplication methods
A new method for evaluating the quality of metagenome assembly - magista
HMS Core Discovery第14期回顾长文|纵享丝滑剪辑,释放视频创作力
补充番外14:cmake实践项目笔记(未完待续4/22)
MYSQL50道基础练习题
Supplement 14: cmake practice project notes (to be continued 4 / 22)
KVM error: Failed to connect socket to ‘/var/run/libvirt/libvirt-sock‘
【论文阅读】【3d目标检测】Improving 3D Object Detection with Channel-wise Transformer
leetcode006--查找字符串数组中的最长公共前缀
SQL statement for adding columns in MySQL table