ASCII, Unicode and UTF-8
2022-08-10 22:31:00 【TABE_】
Encoding
Standard ASCII
Standard ASCII, also known as Basic ASCII, uses 7 bits (the remaining bit of the byte is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English.
Because ASCII uses only 7 bits, the first (highest) bit is always 0 when a character is stored in one byte. One byte is enough if only English text has to be represented, but representing all the characters in the world requires multiple bytes.
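As a quick illustration of the point above, the following sketch prints a few ASCII characters as 8-bit values. The helper name `to_byte_bits` is made up for this example and is not part of any library.

```python
def to_byte_bits(ch: str) -> str:
    """Return a single ASCII character as an 8-bit binary string."""
    code = ord(ch)
    if code > 0x7F:
        # Anything above 127 does not fit in 7 bits, so it is not ASCII.
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "08b")


for ch in ("A", "a", "0", "~"):
    # The leftmost bit of every printed value is 0.
    print(ch, ord(ch), to_byte_bits(ch))
```

For `"A"` (code 65) this prints `01000001`: seven significant bits with a leading 0.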
Unicode
Unicode was created so that all of the world's text can be represented on a computer. It assigns a unified, unique number (a code point) to every character in every language, which meets the needs of cross-language and cross-platform text conversion and processing. Note that Unicode is only a character set: it specifies the code point of each symbol, but not how that code point should be stored as bytes.
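The difference between a code point and its stored form can be seen with a small sketch. It uses only Python's built-in `ord` and `str.encode`; the character "中" and the three encodings are arbitrary choices for illustration (`bytes.hex` with a separator needs Python 3.8 or later).

```python
ch = "\u4e2d"  # the character "中"

# The code point is fixed by Unicode, whatever the storage format.
code_point = ord(ch)
print(f"U+{code_point:04X}")  # U+4E2D

# The same code point is stored as different byte sequences
# depending on the encoding that is chosen.
encodings = ("utf-8", "utf-16-be", "utf-32-be")
stored = {name: ch.encode(name).hex(" ") for name in encodings}
for name, hex_bytes in stored.items():
    print(f"{name:10} {hex_bytes}")
```

One code point, three byte sequences: `e4 b8 ad` in UTF-8, `4e 2d` in UTF-16 (big-endian), and `00 00 4e 2d` in UTF-32 (big-endian).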
UTF-8
UTF-8 is the most widely used Unicode implementation on the Internet. It is a variable-length encoding that uses 1 to 4 bytes per symbol, with the byte length depending on the symbol.
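A minimal check of the variable length, using Python's built-in encoder. The four sample characters are chosen here only because they fall into the 1-, 2-, 3- and 4-byte ranges.

```python
# "A", "é", "中" and an emoji, written as escapes to avoid ambiguity.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

# Map each character to the number of bytes its UTF-8 form occupies.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```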
UTF-8 encoding rules:
- For a single-byte character, the first bit is set to 0 and the remaining 7 bits hold the character's Unicode code point. For code points 0 to 127 the result is therefore exactly the same as ASCII, which means documents from the ASCII era can be opened as UTF-8 without any problems.
- For a character that needs N bytes (N > 1), the first N bits of the first byte are set to 1 and bit N + 1 is set to 0. The first two bits of each of the remaining N - 1 bytes are set to 10. All other bits are filled with the character's Unicode code point.
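The two rules above can be sketched as a hand-written encoder. This is an illustrative sketch, not a replacement for `str.encode`: the function name `utf8_encode` is made up here, and the range boundaries (0x80, 0x800, 0x10000) come from the UTF-8 definition rather than from the text above.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by following the bit rules."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")

    # Rule 1: a single byte, leading bit 0, then the 7-bit code point.
    if code_point < 0x80:
        return bytes([code_point])

    # Rule 2: choose N, the number of bytes needed.
    if code_point < 0x800:
        n = 2
    elif code_point < 0x10000:
        n = 3
    else:
        n = 4

    # Each continuation byte is 10xxxxxx and carries 6 bits,
    # taken from the low end of the code point.
    tail = []
    for _ in range(n - 1):
        tail.append(0b10000000 | (code_point & 0b00111111))
        code_point >>= 6

    # The first byte starts with N ones followed by a zero;
    # the bits that are left over fill the rest of it.
    lead_prefix = (0xFF << (8 - n)) & 0xFF
    return bytes([lead_prefix | code_point] + tail[::-1])


# "中" is U+4E2D and needs three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
encoded = utf8_encode(0x4E2D)
print(" ".join(format(b, "08b") for b in encoded))
```

For U+4E2D the output is `11100100 10111000 10101101`, the same bytes (`e4 b8 ad`) that `"中".encode("utf-8")` produces.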