GFS distributed file system (Theory)
2022-04-23 15:25:00 【C chord~】
One. GFS overview
1、 File system
①、 Components of a file system
- The file system interface (API)
- The software that manages objects
- The objects and their attributes
②、 The role of the file system
- From a system perspective, a file system organizes and allocates the space on a file storage device
- It is the system responsible for storing files and for protecting and retrieving the files it stores
- Specifically, it creates, stores, reads, modifies, and dumps files for users, and controls access to files
③、 GFS terminology
- Brick (block storage server): the server that actually stores user data
- Volume: the logical unit of storage, comparable to a "partition" in a local file system
- FUSE: Filesystem in Userspace (comparable in role to EXT4). It is a "pseudo file system" and acts as the switching module on the client
- VFS (virtual interface): the kernel's virtual file system. The user submits a request to VFS, VFS hands it to FUSE, FUSE passes it to the GFS client, and the client finally sends it to remote storage
- Glusterd (service): the process that runs on each storage node (the client runs the gluster client). Throughout the use of GFS, the interaction between GFS components is carried out by the gluster client and glusterd
④、 GFS characteristics
- Scalability and high performance: the cluster scales out by adding nodes, and performance improves by spreading work over multiple nodes
- High availability: no single point of failure. There is a backup mechanism, similar to the disaster recovery that RAID provides
- Global unified namespace: centralized management, comparable to the concept of an API. A namespace is an isolated, independent area of the system identified by its name. The unified namespace interacts with clients and routes their requests to the block data servers at the back end
- Elastic volume management: makes it easier to expand capacity and to manage and maintain the back-end storage cluster, although it is relatively complex
- Based on standard protocols: built on standardized file access protocols, which makes GFS compatible with CentOS
Two. How GFS works
- When an external request arrives through the mount point, the Linux kernel sends the request to FUSE through the VFS interface
- FUSE hands the data to the /dev/fuse device file, which passes it on to the GFS client
- The GFS client processes the data and transfers it over the network (for example TCP or InfiniBand) to the GFS server
- After the GFS server receives the data, it passes the data through its own VFS interface to be written to local storage
Three. GFS volume types
- To reduce the complexity of indexing and locating data in a distributed file system, GFS uses a hash algorithm
- Benefits of distributed (evenly spread) storage:
- ① As the amount of data grows, each storage node holds roughly the same amount of data (in probability)
- ② For the single point of failure problem: when data is stored on a storage node, GFS's backup mechanism keeps additional copies on other nodes (3 copies is a typical replica count, and it applies to the volume types that replicate data), so GFS itself adds redundancy to the data and removes the single point of failure
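The even-distribution claim in ① can be checked with a small sketch. It assumes a simplified placement rule (SHA-256 of the file name, modulo the number of bricks), which is only a stand-in for the hash algorithm GFS actually uses; the brick names and the `pick_brick` helper are hypothetical.

```python
import hashlib
from collections import Counter


def pick_brick(filename, bricks):
    """Map a file name to one brick by hashing the name.

    Assumption: hash modulo brick count, used here only for illustration.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]


# Hypothetical brick names for a four-node cluster.
bricks = ["node1:/data/b1", "node2:/data/b1", "node3:/data/b1", "node4:/data/b1"]

# Place 10,000 files and count how many land on each brick.
counts = Counter(pick_brick("file-%d.dat" % i, bricks) for i in range(10_000))
for brick in bricks:
    print(brick, counts[brick])
```

With a well-mixed hash, each brick should receive close to a quarter of the files, and the same name always maps to the same brick, so no central index is needed to find a file again.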
①、 Volume types
- Distributed volume
- Striped volume
- Replicated volume
- Distributed striped volume
- Distributed replicated volume
- Striped replicated volume
- Distributed striped replicated volume
Distributed volume
Files are distributed across all brick servers by the hash algorithm. This is GlusterFS's default volume type. Whole files are hashed to different bricks, so the volume only expands the available disk space: if a disk is damaged, the data on it is lost. This is file-level RAID 0, with no fault tolerance. In this mode a file is not split into blocks; it is stored whole on one server node. Because each file is stored directly on a local file system, access efficiency does not improve, and it may even drop because of network communication.
Characteristics
- Files are distributed across different servers, with no redundancy
- The volume can be expanded easily and cheaply
- A single point of failure can cause data loss
- Data protection relies on the underlying layer
Striped volume
Similar to RAID 0. A file is divided into data blocks that are distributed round-robin across multiple brick servers. Storage is block-based, which supports large files: the larger the file, the more efficient reads are. There is no redundancy.
Characteristics
- Data is divided into smaller pieces and distributed to different stripes in the brick server cluster
- Distributing the pieces spreads the load, and the smaller pieces speed up access
- No data redundancy
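The round-robin block placement described above can be sketched as follows. This is a minimal in-memory model, not GlusterFS code: the `stripe` and `reassemble` helpers, the block size, and the brick count are all assumptions chosen for illustration.

```python
def stripe(data, brick_count, block_size):
    """Split data into fixed-size blocks and deal them to bricks round-robin."""
    bricks = [[] for _ in range(brick_count)]
    for block_no, offset in enumerate(range(0, len(data), block_size)):
        bricks[block_no % brick_count].append(data[offset:offset + block_size])
    return bricks


def reassemble(bricks):
    """Read the blocks back in round-robin order to rebuild the file."""
    out = []
    depth = max((len(blocks) for blocks in bricks), default=0)
    for row in range(depth):
        for blocks in bricks:
            if row < len(blocks):
                out.append(blocks[row])
    return b"".join(out)


data = bytes(range(10))                      # a 10-byte "file"
layout = stripe(data, brick_count=3, block_size=2)
restored = reassemble(layout)                # all bricks present: file is intact
damaged = reassemble(layout[:2])             # one brick missing: blocks are lost
```

Because every brick holds only some of the blocks, losing any one brick leaves gaps in every large file, which is why a striped volume has no redundancy.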
Replicated volume
Files are synchronized to multiple bricks so that several copies of each file exist. This is file-level RAID 1, with fault tolerance. Because copies are available on multiple bricks, read performance improves significantly, but write performance drops. A replicated volume is redundant: even if one node is damaged, the data can still be used normally. Because the copies have to be stored, disk utilization is low.
Characteristics
- Every server in the volume keeps a complete copy
- The number of copies is chosen when the volume is created
- Requires at least two brick servers
- Provides redundancy
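The trade-off above (reads survive a failed node, but every copy costs disk space) can be modelled with a small in-memory sketch. The `ReplicatedVolume` class is hypothetical, and it simplifies by writing synchronously to every brick.

```python
class ReplicatedVolume:
    """Toy model of a replicated volume: every brick holds a full copy."""

    def __init__(self, replica_count):
        if replica_count < 2:
            raise ValueError("a replicated volume needs at least two bricks")
        self.bricks = [dict() for _ in range(replica_count)]
        self.online = [True] * replica_count

    def write(self, name, data):
        # Each write goes to all replicas, which is why writes are slower.
        for brick in self.bricks:
            brick[name] = data

    def fail(self, index):
        """Mark one brick as unavailable."""
        self.online[index] = False

    def read(self, name):
        # Any online replica can serve the read.
        for brick, up in zip(self.bricks, self.online):
            if up and name in brick:
                return brick[name]
        raise IOError("no online replica holds %r" % name)

    def utilization(self):
        """Fraction of raw disk space that holds unique data."""
        return 1 / len(self.bricks)


vol = ReplicatedVolume(replica_count=2)
vol.write("report.txt", b"hello")
vol.fail(0)                                  # one node is damaged
survivor_copy = vol.read("report.txt")       # still readable from the other brick
```

With two replicas the utilization is 0.5: half of the raw capacity is spent on the second copy.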
Distributed striped volume
The number of brick servers is a multiple of the stripe count (the number of bricks that a file's blocks are spread across). It combines the characteristics of distributed and striped volumes and is mainly used for accessing large files. Creating a distributed striped volume requires at least 4 servers.
Distributed replicated volume
Distributed replicated volume (Distribute Replica volume): the number of brick servers is a multiple of the mirror count (the number of data copies). It combines the characteristics of distributed and replicated volumes and is mainly used when redundancy is required.
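The "multiple of the mirror count" rule can be illustrated with a short sketch: bricks are grouped into replica sets, and each file is hashed to one set and copied to every brick in it. The helper names are hypothetical, grouping consecutive bricks into a set is an assumption of this sketch, and the SHA-256 modulo rule again stands in for the real hash algorithm.

```python
import hashlib


def build_replica_sets(bricks, replica_count):
    """Group consecutive bricks into replica sets of equal size."""
    if replica_count < 2 or len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica_count]
            for i in range(0, len(bricks), replica_count)]


def place(filename, replica_sets):
    """Hash the file name to one replica set; every brick in it gets a copy."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return replica_sets[int.from_bytes(digest[:8], "big") % len(replica_sets)]


# Four hypothetical bricks with 2 copies of each file: two replica sets.
replica_sets = build_replica_sets(["b1", "b2", "b3", "b4"], replica_count=2)
targets = place("video.mp4", replica_sets)
print(replica_sets, targets)
```

Distribution happens between the sets and replication happens inside each set, so the volume gains capacity by adding sets while each file still has `replica_count` copies.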
Striped replicated volume
- Striped replicated volume (Stripe Replica volume): similar to RAID 10. It has the characteristics of both striped and replicated volumes.
- Distributed striped replicated volume (Distribute Stripe Replica volume): a composite of the three basic volume types, usually used for MapReduce-like applications
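The relationship between brick count, stripe count, and replica count across the seven volume types can be summarized in a small helper. This is a sketch of the naming rules as described in this article, not a GlusterFS API; the `volume_type` function and its parameters are hypothetical.

```python
def volume_type(brick_count, stripe=1, replica=1):
    """Name the volume type implied by a brick count and stripe/replica counts."""
    if brick_count < 1 or stripe < 1 or replica < 1:
        raise ValueError("counts must be positive")
    group = stripe * replica          # bricks needed for one stripe/replica group
    if brick_count % group != 0:
        raise ValueError("brick count must be a multiple of stripe * replica")
    parts = []
    # More than one group (or no striping/replication at all) means
    # files are distributed across groups by hash.
    if brick_count > group or group == 1:
        parts.append("distributed")
    if stripe > 1:
        parts.append("striped")
    if replica > 1:
        parts.append("replicated")
    return " ".join(parts) + " volume"


print(volume_type(4))                         # plain hash distribution
print(volume_type(4, stripe=2))               # 2 stripe groups of 2 bricks
print(volume_type(4, stripe=2, replica=2))    # exactly one stripe x replica group
print(volume_type(8, stripe=2, replica=2))    # two such groups
```

The fourth call matches the composite type in the last bullet: 8 bricks form two groups, each of which stripes a file over 2 bricks and keeps 2 copies of every stripe.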
Copyright notice
This article was written by [C chord~]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204231406063645.html