当前位置:网站首页>How cursors work in Pulsar
How cursors work in Pulsar
2022-08-10 05:00:00 【SparkSql】
Computer failure on the client or server side with strong fault tolerance's messaging system ensures that no messages are lost or recovered.To provide this capability, a messaging system needs an accurate tracking mechanism for message consumption and acknowledgment.Apache Pulsar uses cursors for this purpose.
How cursors work
- The agent sends a message to the consumer.
- On receipt of this message, the consumer will acknowledge the message and send an acknowledgement back to the broker.
- The agent receives the confirmation and moves the cursor, updating the cursor ledger stored in BookKeeper.
Since proxies are stateless and cursors are stateful, Pulsar stores consumer location information in BookKeeper.When the broker receives an acknowledgment from the consumer, it updates the cursor ledger of the subscription the consumer is bound to.In the event of a consumer failure, message consumption and acknowledgment information remains safe, and no messages are recomputed when consumption is resumed.This is because cursor ledger data is securely stored on bookies according to the configured replication policy (i.e. the values of Ensemble Size, Write Quorum and Ack Quorum).
Note: A small subset of cursor metadata is stored in ZooKeeper, not in BookKeeper, such as cursorLedger index informationp>
Data partition
The data written to the topic may be a few megabytes or a few terabytes.So, the throughput of the topic is low in some cases and high in other cases, completely depends on the number of consumers.So what to do when some topics have high throughput and some are low?To solve this problem, Pulsar distributes the data of a topic to multiple machines, which is the so-called Partitions.
Partitioning is a very common method to ensure high throughput when dealing with massive data.By default, Pulsar topics are not partitioned, but can be easily Create Partitioned Topics, and specify the number of partitions.
After creating a partition topic, Pulsar can automatically partition data without affecting producers and consumers.That is, an application writes data to a topic, and after the topic is partitioned, the application code does not need to be modified.Partitioning is just an operation and maintenance operation, and the application does not need to care how the partitioning is performed.
Topic partitioning is handled by a process called a broker, and each node in a Pulsar cluster runs its own broker.
Cursor position
To know the exact position of the cursor in the topic, we can check the markDeletePosition
property of the cursor, which marks the position of the acknowledged message before the oldest unacknowledged message.Since this message and all of its predecessors have been acknowledged, it can be deleted.
Note: You can check the details of a topic with the command pulsar-admin topics stats
, the output of which contains the sameCursor-related information, such as markDeletePosition
, cursorLedger
, and individuallyDeletedMessages
.
In summary, whether the agent moves the cursor is related to the attribute markDeletePosition
Please note that multiple subscription cursors may be located on differentLocation.
How TTL affects cursors
By default, Pulsar permanently stores all unacknowledged messages.You might wonder if the cursor remains static if the message goes unacknowledged for a long time for some reason.In fact, we should avoid situations where there are a lot of unacknowledged messages, as this can mean a huge pressure on disk space.In this case, whether the cursor moves or not depends on the time-to-live (TTL) in the pulsar.
By setting a TTL policy, we can define the retention period for unacknowledged messages.After the configured time frame, Pulsar will automatically acknowledge these messages, forcing the cursor to move.It also means that these messages are ready to be deleted.
边栏推荐
- 大佬们,mysql cdc(2.2.1跟之前的版本)从savepoint起有时出现这种情况,有没有什
- canvas 画布绘制时钟
- Flutter开发:报错The following assertion was thrown resolving an image codec:Unable to…的解决方法
- 请教一下各位大佬。CDC社区中FlinkCDC2.2.0版本有说明支持的sqlserver版本 ,请
- Rpc接口压测
- 用 PySpark ML 构建机器学习模型
- openvino 安装(01)
- mysql常用命令有什么
- #【软件STM32cubeIDE下F103配置uart3+DMA收发+简单数据解析-基础样例】
- 2022年R2移动式压力容器充装考试题库模拟考试平台操作
猜你喜欢
随机推荐
MySQL事务的保证机制
添加路由的2种方式--router
Ueditor editor arbitrary file upload vulnerability
【sql】不同库查询前几条记录用法
【u-boot】u-boot驱动模型分析(02)
虚假新闻检测论文阅读(七):A temporal ensembling based semi-supervised ConvNet for the detection of fake news
智能锁控板的主要功能有哪些?如何使用?
Nexus_仓库类型
Using the DatePicker date control, Prop being mutated: "placement" error occurs
【LeetCode】41、 缺失的第一个正数
Ueditor编辑器任意文件上传漏洞
基于 EasyCV 复现 DETR 和 DAB-DETR,Object Query 的正确打开方式
链表的定义和使用
leetcode每天5题-Day13
I have a dream for Career .
【心理学·人物】第二期(学术X综艺)
解决“File has been changed outside the editor, reload?”提示
redis basic data types
Pulsar中游标的工作原理
域名DNS解析工具ping/nslookup/dig/host