How to Allocate Resources in a YARN Cluster

Tags: Hadoop

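How to allocate cluster resources (how to configure YARN)

Total resources

The hardware configuration of each machine in the cluster (RAM, CPU, disks, network interfaces)

Reserved resources

Total resources minus the resources needed by the services running on the cluster (operating system, DataNode, NodeManager, HBase, Hive, ZooKeeper, Impala, ...); what remains can be given to YARN.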
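
The subtraction above can be sketched in a few lines of Python. This is only an illustration: the per-service figures are hypothetical placeholders rather than recommendations, and the names `RESERVED_MB` and `yarn_usable_mb` are invented for this example.

```python
# Hypothetical per-node figures, chosen only to illustrate the subtraction.
NODE_TOTAL_MB = 48 * 1024  # physical RAM on one node

# Memory set aside for everything that is not a YARN container (assumed values).
RESERVED_MB = {
    "os": 2 * 1024,
    "datanode": 1 * 1024,
    "nodemanager": 1 * 1024,
    "hbase_regionserver": 2 * 1024,
}


def yarn_usable_mb(total_mb, reserved_mb):
    """Return the memory left for YARN containers after reservations."""
    reserved = sum(reserved_mb.values())
    if reserved >= total_mb:
        raise ValueError("reservations exceed the node's total memory")
    return total_mb - reserved


print(yarn_usable_mb(NODE_TOTAL_MB, RESERVED_MB))
```

The result is a candidate value for yarn.nodemanager.resource.memory-mb; the same subtraction applies to CPU cores.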

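Configuring the cluster

Main parameters for the resources YARN may use on each node:

yarn.nodemanager.resource.memory-mb  memory available to YARN on each node
yarn.nodemanager.resource.cpu-vcores  virtual CPU cores available to YARN on each node

Main parameters for YARN scheduler allocation:

yarn.scheduler.minimum-allocation-mb  minimum memory per container
yarn.scheduler.maximum-allocation-mb  maximum memory per container (caps the size of a single allocation)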
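
To make the role of the two scheduler limits concrete, here is a simplified model of how a container memory request relates to them. It assumes requests are rounded up to a multiple of the minimum allocation and rejected above the maximum; the exact behavior in a real cluster depends on the scheduler and resource calculator in use, so treat this as a sketch. The function name `normalize_request_mb` and its default values are made up for the example.

```python
import math


def normalize_request_mb(requested_mb, min_alloc_mb=2048, max_alloc_mb=43008):
    """Simplified model: round a request up to a multiple of the minimum."""
    if requested_mb > max_alloc_mb:
        # Assumed behavior: a request above the maximum cannot be satisfied.
        raise ValueError("request exceeds yarn.scheduler.maximum-allocation-mb")
    # Every container gets at least one minimum-sized unit.
    units = max(1, math.ceil(requested_mb / min_alloc_mb))
    return min(units * min_alloc_mb, max_alloc_mb)


print(normalize_request_mb(1000))
print(normalize_request_mb(3000))
```

Under this model a job that asks for 3000 MB occupies 4096 MB of the node's budget, which is why the minimum allocation should be chosen with typical container sizes in mind.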

Determine HDP Memory Configuration Settings

Documentation link

python hdp-configuration-utils.py -c 12 -m 48 -d 12 -k False

 Using cores=12 memory=48GB disks=12 hbase=False
 Profile: cores=12 memory=43008MB reserved=6GB usableMem=42GB disks=12
 Num Container=21
 Container Ram=2048MB
 Used Ram=42GB
 Unused Ram=6GB
 ***** mapred-site.xml *****
 mapreduce.map.memory.mb=2048
 mapreduce.map.java.opts=-Xmx1536m
 mapreduce.reduce.memory.mb=2048
 mapreduce.reduce.java.opts=-Xmx1536m
 mapreduce.task.io.sort.mb=768
 ***** yarn-site.xml *****
 yarn.scheduler.minimum-allocation-mb=2048
 yarn.scheduler.maximum-allocation-mb=43008
 yarn.nodemanager.resource.memory-mb=43008
 yarn.app.mapreduce.am.resource.mb=2048
 yarn.app.mapreduce.am.command-opts=-Xmx1536m
 ***** tez-site.xml *****
 tez.am.resource.memory.mb=2048
 tez.am.java.opts=-Xmx1536m
 ***** hive-site.xml *****
 hive.tez.container.size=2048
 hive.tez.java.opts=-Xmx1536m
 hive.auto.convert.join.noconditionaltask.size=402653000
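
The container count and size in this output can be reproduced with the sizing rule described in the HDP documentation: containers = min(2 * cores, 1.8 * disks, usable RAM / minimum container size). The sketch below is a reconstruction under that assumption, not the script's actual source; the function names and the exact boundaries of the minimum-container-size table are my own reading.

```python
import math


def min_container_size_mb(total_gb):
    """Recommended minimum container size by total RAM per node (assumed table)."""
    if total_gb < 4:
        return 256
    if total_gb < 8:
        return 512
    if total_gb <= 24:
        return 1024
    return 2048


def plan_containers(cores, disks, usable_mb, total_gb):
    """Return (number of containers, RAM per container in MB)."""
    min_size = min_container_size_mb(total_gb)
    containers = int(min(2 * cores,
                         math.ceil(1.8 * disks),
                         usable_mb // min_size))
    container_mb = max(min_size, usable_mb // containers)
    return containers, container_mb


# 12 cores, 12 disks, 48 GB total with 6 GB reserved, so 42 GB usable.
containers, container_mb = plan_containers(12, 12, 42 * 1024, 48)
print(containers, container_mb, containers * container_mb)
```

With these inputs memory is the limiting term (21 containers of 2048 MB), and the product 43008 matches the yarn.nodemanager.resource.memory-mb value printed above.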

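Other notes

Virtual and physical memory checks

The NodeManager can monitor each container's virtual and physical memory usage.
The virtual memory check is usually turned off (yarn.nodemanager.vmem-check-enabled=false).

Set -Xmx in each container's java-opts to 0.8 * (container memory allocation).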
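
As a small illustration of the rule, the helper below derives a heap flag from a container size. The name `java_heap_opts` is hypothetical; the 0.8 factor comes from the sentence above, while the sample output earlier pairs 2048 MB containers with -Xmx1536m, which corresponds to a factor of 0.75.

```python
def java_heap_opts(container_mb, heap_fraction=0.8):
    """Build a -Xmx flag that leaves headroom for non-heap JVM memory."""
    heap_mb = int(container_mb * heap_fraction)
    return f"-Xmx{heap_mb}m"


print(java_heap_opts(2048))        # using the 0.8 rule
print(java_heap_opts(2048, 0.75))  # the ratio seen in the sample output
```

Either way, the gap between the heap and the container size is what keeps the JVM's off-heap usage from tripping the NodeManager's physical memory check.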

Bottleneck resource

There are three types of resources (memory, CPU, and disk), and containers from different jobs may request them in different proportions, so one resource can become the bottleneck while the others sit idle. Suppose a cluster has a capacity of (1000 GB RAM, 16 cores, 16 disks) and each mapper container needs (10 GB RAM, 1 core, 0.5 disks): at most 16 mappers can run in parallel, because CPU cores are the bottleneck.
That leaves (840 GB RAM, 8 disks) unused. If you run into this situation, check the RM UI at http://:8088/cluster/nodes to see which resource is the bottleneck. You can then give the leftover resources to jobs that would benefit from them; for example, allocate more memory to sort-heavy jobs that previously spilled to disk.
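
The arithmetic in this example can be checked with a short sketch. The function name `find_bottleneck` and the dictionary keys are invented for illustration; the numbers are the ones from the paragraph above.

```python
def find_bottleneck(capacity, per_container):
    """Return (bottleneck resource, max containers, leftover per resource)."""
    # How many containers each resource could support on its own.
    fits = {name: int(capacity[name] // per_container[name])
            for name in per_container}
    bottleneck = min(fits, key=fits.get)
    count = fits[bottleneck]
    leftover = {name: capacity[name] - count * per_container[name]
                for name in per_container}
    return bottleneck, count, leftover


capacity = {"ram_gb": 1000, "cores": 16, "disks": 16}
mapper = {"ram_gb": 10, "cores": 1, "disks": 0.5}

bottleneck, count, leftover = find_bottleneck(capacity, mapper)
print(bottleneck, count, leftover)
```

Cores allow 16 containers, while memory would allow 100 and disks 32, so cores bind first and 840 GB of RAM plus 8 disks remain unallocated.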

Author: 阿武z
Link: https://www.jianshu.com/p/81b6b19f9c11
