How to solve the problem of too many small files when collecting data with Flume
2022-04-22 08:50:00 【The figure under the stars】
- What is the impact of storing a large number of small files in HDFS?
(1) At the metadata level: every small file has its own metadata, including the file path, file name, owner, group, permissions, and creation time. All of this is held in NameNode memory, so too many small files consume a large share of that memory, which hurts the NameNode's performance and shortens its service life.
(2) At the computation level: by default, MapReduce (MR) starts one map task for each small file, which greatly reduces computing performance. Reading many small files also increases disk seek time.
- How to solve the problem of too many small files
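Set three parameters on the HDFS sink in the Flume configuration:
(1) hdfs.rollInterval: roll to a new file once the current file has been open for this many seconds (default 30; 0 disables time-based rolling)
(2) hdfs.rollSize: roll to a new file once the current file reaches this many bytes (default 1024; 0 disables size-based rolling)
(3) hdfs.rollCount: roll to a new file once this many events have been written to the current file (default 10; 0 disables count-based rolling)
A file is rolled as soon as any one of the enabled conditions is met. The default values are small, so leaving them unchanged produces many small files; raise them, or set to 0 the ones you do not want to trigger a roll.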
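As a rough illustration of the three parameters above, an HDFS sink could be configured along the following lines. This is only a sketch: the agent, sink, and channel names (a1, k1, c1), the HDFS path, and the chosen thresholds (one hour, about 128 MB) are assumptions for the example, not values from the original article, and should be adapted to your own cluster and data volume.

```properties
# Hypothetical agent "a1" with HDFS sink "k1" reading from channel "c1".
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1

# Assumed target path; the date escapes need a timestamp, taken here from local time.
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.fileType = DataStream

# Roll to a new file after the current one has been open for 3600 seconds.
a1.sinks.k1.hdfs.rollInterval = 3600

# Roll to a new file once the current one reaches 134217728 bytes (128 MB).
a1.sinks.k1.hdfs.rollSize = 134217728

# 0 disables rolling based on the number of events.
a1.sinks.k1.hdfs.rollCount = 0
```

With these settings a file is closed when it is an hour old or reaches about 128 MB, whichever comes first, so the event count no longer forces a new file every few events.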
Copyright notice
This article was written by [The figure under the stars]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204220750015418.html








