In one article: a high-performance, predictable data center network
2022-08-11 03:51:00 【Lingyun moment】
At the recently concluded first China Computing Conference, Alibaba Cloud's Panjiu infrastructure attracted a lot of attention. During the event, "how can a high-performance network run efficiently and stably?" was the question customers asked most often. This article describes the core technology behind the Panjiu Predictable Network.
Panjiu Predictable Network
In recent years the artificial intelligence industry has grown rapidly, but the growth of GPU computing power has not kept up with the needs of AI applications, so distributed machine learning has become the norm in the industry. Making a huge number of heterogeneous computing resources work together efficiently is not easy, and a high-performance network is the key enabling technology.
Panjiu Predictable Network is a high-performance, predictable data center network developed by Alibaba Cloud. It is application-centric, and it realizes a high-performance, predictable network system by combining Alibaba Cloud's full-stack self-developed components with end-network integration technology.
The system is built on a technical base of Alibaba Cloud's self-developed switches, self-developed network cards, and self-developed high-performance network protocol stack. End-network integration technology lets these self-developed components collaborate efficiently. The result offers large scale, high bandwidth, low latency, high reliability, and predictable performance, and provides a solid network base for Alibaba Cloud's ultra-large-scale computing and storage clusters.
Picture | Panjiu Predictable Network exhibition site
Showcase of three core technologies
High-performance network architecture
To get the best computing power and energy efficiency, Alibaba Cloud has developed the High Performance Network (HPN) architecture. It adopts a 2-layer Clos structure that is non-converged (that is, not oversubscribed) with dual-plane forwarding, and can support more than 10,000 A100 GPUs. It achieves the theoretical minimum static forwarding delay between any two points in a 10,000-GPU cluster. More forwarding links also keep the probability of hash congestion as low as possible, so the cluster as a whole achieves optimal computing performance.
In addition, the dual-plane design ensures that the failure of a single device or a single network plane does not affect the entire cluster network. Coupled with dual-uplink service access in the stack, the network cluster is stable and reliable: it can provide users with continuous network service, and users do not have to worry about the impact of data center network software and hardware failures.
Figure | High-performance predictable data center network architecture
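The two architectural claims above (a non-oversubscribed 2-layer Clos scales with switch port count, and more forwarding links lower the chance of hash collisions) can be illustrated with a little arithmetic. The following is a minimal sketch under textbook assumptions, not a description of the actual HPN design: the function names are hypothetical, the switch radix values are arbitrary examples rather than Alibaba Cloud's hardware, and flows are assumed to hash uniformly and independently across equal-cost links.

```python
from math import prod


def two_tier_clos_endpoints(radix: int) -> int:
    """Max endpoint ports in a non-oversubscribed 2-tier leaf-spine fabric.

    Assumes every switch has `radix` ports. Each leaf uses half its ports
    for endpoints and half for uplinks (1:1), and each spine port connects
    to a distinct leaf, so there can be at most `radix` leaves.
    """
    if radix < 2 or radix % 2 != 0:
        raise ValueError("radix must be a positive even number")
    endpoint_ports_per_leaf = radix // 2
    max_leaves = radix
    return endpoint_ports_per_leaf * max_leaves


def collision_probability(flows: int, links: int) -> float:
    """Probability that at least two of `flows` land on the same link.

    Assumes each flow is hashed uniformly and independently onto one of
    `links` equal-cost links (the birthday-problem model).
    """
    if flows > links:
        return 1.0
    no_collision = prod((links - i) / links for i in range(flows))
    return 1.0 - no_collision


if __name__ == "__main__":
    for radix in (64, 128):
        print(radix, two_tier_clos_endpoints(radix))
    for links in (8, 64):
        print(links, round(collision_probability(4, links), 3))
```

Under this model, the endpoint count grows with the square of the switch radix, and for a fixed number of concurrent flows the collision probability falls as the number of equal-cost links grows, which is the intuition behind the "more forwarding links" remark.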
Full-stack self-developed end-network integration
Self-developed switch
All network equipment and optical interconnection components in the high-performance network cluster are developed in-house. The AliNOS-based software system connects monitoring, management, and control capabilities at both the single-device and whole-network level, integrating them while still allowing new functions to iterate rapidly. The self-developed hardware is modular and designed for Alibaba Cloud's scenarios, giving Alibaba Cloud control over cost, supply, and operation and maintenance.
Figure | Full-stack self-development of end-network integration
Self-developed high-performance protocol stack
The most widely used high-performance protocol stacks in the industry today are IB (InfiniBand) and RoCEv2, but both have shortcomings in large-scale deployments. IB equipment is expensive and cannot interoperate with Ethernet, so users often need to build an expensive private IB network. RoCEv2 relies on PFC, which brings significant stability risks and limits scale.
After several years of large-scale practice with RoCEv2, Alibaba Cloud began developing its own high-performance network protocol, Solar-RDMA, in 2019. Using HPCC, Alibaba's self-developed end-network integrated congestion control algorithm, Solar-RDMA significantly reduces switch queue jitter and achieves high bandwidth and low latency in a PFC-free deployment. This ensures that data is transmitted between nodes in the shortest possible time, and so keeps the output of computing power continuously at its maximum.
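To give a feel for how a congestion control scheme can use switch feedback to keep queues short, here is a simplified sketch based on the publicly described HPCC idea: the sender estimates the utilization of each link on the path from the queue length and transmit rate reported by switches, then scales its window so the busiest link sits just below full utilization. This is an illustration under stated assumptions, not Solar-RDMA's implementation. All names, the target utilization `eta`, and the additive step `w_ai_bytes` are hypothetical choices, and details of the published algorithm such as smoothing and staged increases are omitted.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LinkSample:
    """Telemetry for one link on the path (hypothetical field names)."""
    qlen_bytes: float      # bytes queued at the egress port
    tx_rate_bps: float     # measured output rate of the port
    bandwidth_bps: float   # capacity of the port


def link_utilization(sample: LinkSample, base_rtt_s: float) -> float:
    """Normalized in-flight load: queue share of one BDP plus rate share."""
    bdp_bytes = sample.bandwidth_bps * base_rtt_s / 8
    return sample.qlen_bytes / bdp_bytes + sample.tx_rate_bps / sample.bandwidth_bps


def next_window(
    ref_window_bytes: float,
    samples: Sequence[LinkSample],
    base_rtt_s: float,
    eta: float = 0.95,           # assumed target utilization
    w_ai_bytes: float = 80.0,    # assumed additive-increase step
    max_window_bytes: float = 1e9,
) -> float:
    """Scale the window by eta / (utilization of the most loaded link)."""
    u = max(link_utilization(s, base_rtt_s) for s in samples)
    if u <= 0.0:
        return max_window_bytes  # idle path: nothing to back off from
    return min(ref_window_bytes / (u / eta) + w_ai_bytes, max_window_bytes)


if __name__ == "__main__":
    rtt = 10e-6
    busy = LinkSample(qlen_bytes=125_000, tx_rate_bps=100e9, bandwidth_bps=100e9)
    print(next_window(100_000.0, [busy], rtt))
```

Because the window shrinks in proportion to how far the busiest link is above the target, queues can drain quickly without pausing upstream ports, which is the property that makes a PFC-free deployment plausible.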
Self-developed high-performance network card
To truly achieve high performance, Alibaba Cloud began designing a hardware offload solution for the Solar-RDMA protocol in 2020, and in 2021 successfully developed FIC (Fusion Intelligence Card), a high-performance network card that carries the protocol. The FIC card has now been deployed at scale.
Platform Services
Efficient and stable operation of the high-performance network is always the core requirement of customers.
To meet it, Alibaba Cloud has developed its own NUSA (Network Unified Service Architecture) service platform, which provides end-to-end network automation services across R&D, testing, delivery, operation, and change.
Based on the end-network integration technology system, NUSA provides automatic provisioning of the high-performance network, automatic network performance measurement and diagnosis, automatic network fault monitoring, alarm, and location, network-wide resource management, and high-performance network virtualization services.
By integrating key end-host and network technologies, Alibaba Cloud has opened up a new era of predictable data center networks, providing the underlying network guarantee for the continuous and stable output of cluster computing power.
In the future, Alibaba Cloud will continue to evolve towards richer communication semantics, higher bandwidth, lower latency, and better usability.