Yarn core parameter configuration
2022-04-23 10:12:00 【zhaojiew】
I have no environment to test this in, so these are notes from the Atguigu (尚硅谷) Hadoop course material; the gaps are left to be filled in later.
Yarn configuration case and related parameters
Requirement: count the number of occurrences of each word in 1 GB of data. There are 3 servers, each with 4 GB of memory and a 4-core, 4-thread CPU.
1G / 128m = 8 MapTasks; 1 ReduceTask; 1 mrAppMaster. That is 10 tasks over 3 servers, so each node runs about 3 tasks on average (4, 3, 3).
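The arithmetic above can be sketched in a few lines of Java. This is only an illustration: the class and method names (`TaskEstimate`, `mapTasks`, `distribute`) are made up for this example, and it assumes one MapTask per 128 MB split, as the text does, while ignoring finer points of how input splits are actually computed.

```java
import java.util.Arrays;

// Sketch of the task-count estimate for the word-count case.
// Assumption: one MapTask per full or partial 128 MB split.
final class TaskEstimate {
    /** Number of MapTasks = ceil(inputMb / splitMb). */
    static long mapTasks(long inputMb, long splitMb) {
        return (inputMb + splitMb - 1) / splitMb;
    }

    /** MapTasks + ReduceTasks + one MrAppMaster. */
    static long totalTasks(long inputMb, long splitMb, long reduceTasks) {
        return mapTasks(inputMb, splitMb) + reduceTasks + 1;
    }

    /** Spread the tasks as evenly as possible over the nodes. */
    static long[] distribute(long total, int nodes) {
        long[] perNode = new long[nodes];
        for (int i = 0; i < nodes; i++) {
            // The first (total % nodes) nodes take one extra task.
            perNode[i] = total / nodes + (i < total % nodes ? 1 : 0);
        }
        return perNode;
    }

    public static void main(String[] args) {
        long total = totalTasks(1024, 128, 1);
        System.out.println(total);                                  // prints 10
        System.out.println(Arrays.toString(distribute(total, 3)));  // prints [4, 3, 3]
    }
}
```

With at most 4 tasks on a node that has 4 GB of memory and 4 cores, containers of roughly 1 GB and 1 core each are a natural fit, which is what the parameters below aim for.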
Modify yarn-site.xml
<!-- Select the scheduler; the default is the Capacity Scheduler -->
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<!-- Number of threads the ResourceManager uses to handle scheduler requests; default 50. If more than 50 tasks are submitted, this value can be increased, but it must not exceed 3 servers * 4 threads = 12 threads (in practice no more than 8 once other applications are accounted for) -->
<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>8</value>
</property>
<!-- Whether to let Yarn auto-detect the hardware configuration; default false. If the node runs many other applications, manual configuration is recommended; if it runs none, auto-detection can be used -->
<property>
<description>Enable auto-detection of node capabilities such as memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>
<!-- Whether to count logical processors (virtual cores) as CPU cores; default false, meaning physical CPU cores are counted -->
<property>
<description>Flag to determine if logical processors(such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>
<!-- Multiplier for converting physical cores to virtual cores; default 1.0 -->
<property>
<description>Multiplier to determine how to convert physical cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1(which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true.
The number of vcores will be calculated as number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
<!-- Amount of memory the NodeManager may use; default 8G, changed to 4G -->
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated(in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<!-- Number of CPU cores for the NodeManager; defaults to 8 when not auto-detected from the hardware, changed to 4 -->
<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>4</value>
</property>
<!-- Minimum container memory; default 1G -->
<property>
<description>The minimum allocation for every container request at the
RM in MBs. Memory requests lower than this will be set to the value of
this property. Additionally, a node manager that is configured to have
less memory than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<!-- Maximum container memory; default 8G, changed to 2G -->
<property>
<description>The maximum allocation for every container request at the
RM in MBs. Memory requests higher than this will throw an
InvalidResourceRequestException.
</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
<!-- Minimum number of CPU cores per container; default 1 -->
<property>
<description>The minimum allocation for every container request at the
RM in terms of virtual CPU cores. Requests lower than this will be set to
the value of this property. Additionally, a node manager that is configured
to have fewer virtual cores than this value will be shut down by the
resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<!-- Maximum number of CPU cores per container; default 4, changed to 2 -->
<property>
<description>The maximum allocation for every container request at the
RM in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
</property>
<!-- Virtual memory check; enabled by default, changed to disabled -->
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<!-- Ratio of virtual memory to physical memory; default 2.1 -->
<property>
<description>Ratio between virtual memory to physical memory when
setting memory limits for containers. Container allocations are
expressed in terms of physical memory, and virtual memory usage is
allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
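To see how the memory and vcore limits above interact, here is a rough Java sketch. It is a simplified model, not the ResourceManager's actual code: the class `ContainerLimits` and its methods are hypothetical, and the rule of rounding a request up to a multiple of the minimum allocation is an assumption that may differ from the real scheduler depending on version and resource calculator.

```java
// Simplified model of the container limits configured in yarn-site.xml.
// All names here are illustrative; rounding behavior is an assumption.
final class ContainerLimits {
    /** Raise a request to the minimum, round up to a multiple of it, reject above the maximum. */
    static int normalizeMemory(int requestMb, int minMb, int maxMb) {
        if (requestMb > maxMb) {
            // The real scheduler throws InvalidResourceRequestException here.
            throw new IllegalArgumentException(
                "request " + requestMb + " MB exceeds maximum " + maxMb + " MB");
        }
        int atLeastMin = Math.max(requestMb, minMb);
        int rounded = ((atLeastMin + minMb - 1) / minMb) * minMb;
        return Math.min(rounded, maxMb);
    }

    /** How many containers of a given size fit on one node (memory and vcores both limit). */
    static int containersPerNode(int nodeMb, int nodeVcores, int containerMb, int containerVcores) {
        return Math.min(nodeMb / containerMb, nodeVcores / containerVcores);
    }

    /** Virtual memory allowed for a container; only enforced when the vmem check is enabled. */
    static long vmemLimitMb(int containerMb, double ratio) {
        return (long) (containerMb * ratio);
    }

    public static void main(String[] args) {
        System.out.println(normalizeMemory(512, 1024, 2048));      // prints 1024
        System.out.println(containersPerNode(4096, 4, 1024, 1));   // prints 4
        System.out.println(vmemLimitMb(1024, 2.1));                // prints 2150
    }
}
```

With these values a node can hold four minimum-sized containers, which matches the busiest node in the (4, 3, 3) estimate.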
Capacity Scheduler multi-queue configuration
Requirement 1: the default queue gets 40% of total memory, with a maximum resource capacity of 60%; the hive queue gets 60% of total memory, with a maximum resource capacity of 80%.
Requirement 2: configure queue priority
Since the Capacity Scheduler is the default, it does not need to be selected explicitly; its configuration file is capacity-scheduler.xml.
Configure capacity-scheduler.xml as follows:
Modify the existing configuration
<!-- Specify multiple queues; add the hive queue -->
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>default,hive</value>
<description>The queues at this level (root is the root queue).
</description>
</property>
<!-- Lower the rated capacity of the default queue to 40%; default 100% -->
<property>
<name>yarn.scheduler.capacity.root.default.capacity</name>
<value>40</value>
</property>
<!-- Lower the maximum capacity of the default queue to 60%; default 100% -->
<property>
<name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
<value>60</value>
</property>
Add the required properties for the new queue
These also go in capacity-scheduler.xml. Copy the default queue's properties and modify them; in effect the same properties are written twice, once per queue.
<!-- Specify the rated resource capacity of the hive queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.capacity</name>
<value>60</value>
</property>
<!-- How much of the queue's resources a single user may use, as a multiple of the queue capacity; 1 means at most the queue's configured capacity -->
<property>
<name>yarn.scheduler.capacity.root.hive.user-limit-factor</name>
<value>1</value>
</property>
<!-- Specify the maximum resource capacity of the hive queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.maximum-capacity</name>
<value>80</value>
</property>
<!-- Start the hive queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.state</name>
<value>RUNNING</value>
</property>
<!-- Which users may submit jobs to the queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.acl_submit_applications</name>
<value>*</value>
</property>
<!-- Which users have administrator rights on the queue (view / kill) -->
<property>
<name>yarn.scheduler.capacity.root.hive.acl_administer_queue</name>
<value>*</value>
</property>
<!-- Which users may set the priority of submitted tasks -->
<property>
<name>yarn.scheduler.capacity.root.hive.acl_application_max_priority</name>
<value>*</value>
</property>
<!-- Setting a task's timeout: yarn application -appId appId -updateLifetime Timeout -->
<!-- If an application specifies a timeout, the timeout of an application submitted to this queue cannot exceed this value. -->
<property>
<name>yarn.scheduler.capacity.root.hive.maximum-application-lifetime</name>
<value>-1</value>
</property>
<!-- If an application does not specify a timeout, default-application-lifetime is used as the default -->
<property>
<name>yarn.scheduler.capacity.root.hive.default-application-lifetime</name>
<value>-1</value>
</property>
Distribute the configuration file, then either restart Yarn or refresh the queues
yarn rmadmin -refreshQueues
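As a sanity check on the percentages in requirement 1, the following Java sketch converts them to megabytes. It assumes the cluster total is simply 3 NodeManagers * 4096 MB = 12288 MB, as configured earlier; the class `QueueCapacity` and its methods are hypothetical helpers for this example and not part of any Yarn API.

```java
// Rough conversion of Capacity Scheduler percentages into megabytes.
// Assumption: cluster memory = 3 nodes * 4096 MB; names are illustrative only.
final class QueueCapacity {
    /** Share of the cluster in MB for a given percentage (rounded down). */
    static long shareMb(long clusterMb, int percent) {
        return clusterMb * percent / 100;
    }

    /** The capacities of the queues under one parent must add up to 100. */
    static boolean capacitiesValid(int... percents) {
        int sum = 0;
        for (int p : percents) {
            sum += p;
        }
        return sum == 100;
    }

    public static void main(String[] args) {
        long clusterMb = 3 * 4096L;
        System.out.println("default rated: " + shareMb(clusterMb, 40) + " MB"); // 4915 MB
        System.out.println("default max:   " + shareMb(clusterMb, 60) + " MB"); // 7372 MB
        System.out.println("hive rated:    " + shareMb(clusterMb, 60) + " MB"); // 7372 MB
        System.out.println("hive max:      " + shareMb(clusterMb, 80) + " MB"); // 9830 MB
        System.out.println(capacitiesValid(40, 60));                             // true
    }
}
```

The rated capacities (40 + 60) sum to 100, while the maximum capacities may overlap because they only bound how far a queue can borrow idle resources.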

Submit a task to the hive queue
Via the shell
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount -D mapreduce.job.queuename=hive /input /output
Via a jar package
By default, tasks are submitted to the default queue. To submit a task to another queue, declare the queue in the Driver:
Configuration conf = new Configuration();
conf.set("mapreduce.job.queuename","hive");
Job job = Job.getInstance(conf);
Task priority
The Capacity Scheduler supports task priorities: when resources are tight, higher-priority tasks obtain resources first. By default, Yarn limits the priority of all tasks to 0; to use task priorities, this limit must be lifted.
Modify yarn-site.xml and add the following parameter
<property>
<name>yarn.cluster.max-application-priority</name>
<value>5</value>
</property>
Distribute the configuration and restart Yarn
Simulate a resource-constrained environment by submitting pi calculation tasks
hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 5 2000000
While they are running, submit a higher-priority task; it can be seen to jump the queue
hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi -D mapreduce.job.priority=5 5 2000000
Modify the priority of a running task
yarn application -appId <ApplicationID> -updatePriority priority
yarn application -appId application_1611133087930_0009 -updatePriority 5
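The queue-jumping behavior can be pictured with a toy model in Java. This is not how the scheduler is implemented: `PrioritySketch` and its members are invented for this example, and it assumes two things, namely that a requested priority above yarn.cluster.max-application-priority is capped at that maximum, and that pending applications are served highest priority first with submission order as the tie-breaker.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model of priority ordering among pending applications.
// All names are illustrative; this is not the Capacity Scheduler's code.
final class PrioritySketch {
    /** A pending application: name, priority, and submission sequence number. */
    static final class App {
        final String name;
        final int priority;
        final int seq;

        App(String name, int priority, int seq) {
            this.name = name;
            this.priority = priority;
            this.seq = seq;
        }
    }

    /** Cap a requested priority at the cluster-wide maximum. */
    static int clamp(int requested, int clusterMax) {
        return Math.min(requested, clusterMax);
    }

    /** Order in which pending apps are served: higher priority first, then earlier submission. */
    static List<String> order(List<App> pending) {
        Comparator<App> byPriorityThenSeq = (a, b) ->
            a.priority != b.priority
                ? Integer.compare(b.priority, a.priority)
                : Integer.compare(a.seq, b.seq);
        PriorityQueue<App> queue = new PriorityQueue<>(byPriorityThenSeq);
        queue.addAll(pending);
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.poll().name);
        }
        return result;
    }
}
```

For example, with two pending pi jobs at priority 0 and a later one submitted with priority 5, `order` returns the priority-5 job first, mirroring the behavior observed above.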
Fair Scheduler multi-queue configuration
Create two queues, test and atguigu (named after the groups the users belong to). The expected behavior is as follows: if the user specifies a queue when submitting a task, the task runs in that queue; if no queue is specified, tasks submitted by the user test run in the root.group.test queue, and tasks submitted by atguigu run in the root.group.atguigu queue.
Configuring the Fair Scheduler involves two files: yarn-site.xml and the Fair Scheduler queue allocation file fair-scheduler.xml (the file name can be customized).
Modify yarn-site.xml to select the Fair Scheduler and specify the location of its configuration file
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
<description>Use the Fair Scheduler</description>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/opt/module/hadoop-3.1.3/etc/hadoop/fair-scheduler.xml</value>
<description>Path of the Fair Scheduler queue allocation file</description>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>false</value>
<description>Disable resource preemption between queues</description>
</property>
Configure fair-scheduler.xml
<?xml version="1.0"?>
<allocations>
<!-- Maximum proportion of a single queue's resources that Application Masters may occupy; range 0-1, enterprises typically use 0.1 -->
<queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
<!-- Default maximum resources for a single queue (test, atguigu, default) -->
<queueMaxResourcesDefault>4096mb,4vcores</queueMaxResourcesDefault>
<!-- Add a queue test -->
<queue name="test">
<!-- Minimum queue resources -->
<minResources>2048mb,2vcores</minResources>
<!-- Maximum queue resources -->
<maxResources>4096mb,4vcores</maxResources>
<!-- Maximum number of applications running simultaneously in the queue; default 50, set according to the number of threads -->
<maxRunningApps>4</maxRunningApps>
<!-- Maximum proportion of the queue's resources that Application Masters may occupy -->
<maxAMShare>0.5</maxAMShare>
<!-- Resource weight of the queue; default 1.0 -->
<weight>1.0</weight>
<!-- Resource allocation policy within the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<!-- Add a queue atguigu -->
<queue name="atguigu" type="parent">
<!-- Minimum queue resources -->
<minResources>2048mb,2vcores</minResources>
<!-- Maximum queue resources -->
<maxResources>4096mb,4vcores</maxResources>
<!-- Maximum number of applications running simultaneously in the queue; default 50, set according to the number of threads -->
<maxRunningApps>4</maxRunningApps>
<!-- Maximum proportion of the queue's resources that Application Masters may occupy -->
<maxAMShare>0.5</maxAMShare>
<!-- Resource weight of the queue; default 1.0 -->
<weight>1.0</weight>
<!-- Resource allocation policy within the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<!-- Queue placement policy for tasks; multiple rules can be configured, and they are tried in order from the first rule until one matches -->
<queuePlacementPolicy>
<!-- Use the queue specified at submission; if no queue is specified, continue to the next rule. create="false" means: if the specified queue does not exist, it must not be created automatically -->
<rule name="specified" create="false"/>
<!-- Submit to the root.group.username queue; if root.group does not exist, it must not be created automatically; if root.group.user does not exist, it may be created automatically -->
<rule name="nestedUserQueue" create="true">
<rule name="primaryGroup" create="false"/>
</rule>
<!-- The last rule must be reject or default. reject means the submission is rejected and fails; default means the task is submitted to the default queue -->
<rule name="reject" />
</queuePlacementPolicy>
</allocations>
Distribute the configuration, restart Yarn, and test
hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi -Dmapreduce.job.queuename=root.test 1 1
If no queue is specified, the task is submitted to the queue matching the user name
Copyright notice
This article was written by [zhaojiew]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204230949430904.html