当前位置:网站首页>Minutes of OpenMLDB Meetup No.2
Minutes of OpenMLDB Meetup No.2
2022-08-11 06:34:00 【Fourth Paradigm Developer Community】
1. Meeting Contents
The OpenMLDB community held the second meetup on April 16, 2022. The related videos and materials of the meeting are as follows:
● Co-founder of StreamNative Zhai Jia - facing the upstream data ecology of OpenMLDB, in-depth analysis of cloud native newsStreaming platform Apache Pulsar.
https://www.zhihu.com/zvideo/1499462777765244928
https://pan.baidu.com/s/1VNtnVvGhEpWNzecGklbqNQ
● Lu Mian, head of OpenMLDB R&D - for real-time feature computing scenarios, introduces the feature development process based on OpenMLDB and the architecture of the machine learning feature computing platform.
https://www.zhihu.com/zvideo/1499468557670461440 https://pan.baidu.com/s/1kBPJWnak254i_VdlupqRpA
● Huang Wei, OpenMLDB R&D Architect - Practical drill of OpenMLDB Pulsar Connector, taking you to efficiently connect real-time data to feature engineering.
https://www.zhihu.com/zvideo/1499478121748914176 https://pan.baidu.com/s/18-EhQMqhYNrP2IxSglGGdw
(Baidu network disk data collection password is open)
2. Discussion
During the meeting, several guests discussed with the community. Here we show some Q&A as follows:
Q1: In addition to computational logic, does OpenMLDB have a mechanism to ensure data consistency offline?
A: The offline and online data of OpenMLDB are separate storage engines.In most cases, the data used in offline development and the data used in online computing are not the same. Online data will continue to introduce new data only for real-time reasoning over time.So from this point of view, it is not necessary to maintain the consistency of offline and online data.
Q2: Is OpenMLDB suitable for IoT analytics?
A: Many data in the Internet of Things are time series data with time stamps.For this kind of time series data, it is theoretically very suitable to use OpenMLDB for analysis, including feature calculation.If you have related needs, you are welcome to interact and discuss with us in the community.
Q3: Which languages does OpenMLDB provide SDKs?
A: Currently OpenMLDB SDK can support Python, Java, and REST APIs.
Q4: If Flink is used for real-time inference, how can it be consistent with batch training Spark?
A: If Flink is used for real-time reasoning, it is currently difficult to achieve computational consistency with our Spark distribution.The two main ones do not use OpenMLDB's consistency engine to generate a completely consistent execution plan for computational logic.Therefore, it is recommended that you directly use the complete process of OpenMLDB to ensure the consistency of online and offline.
Q5: Can the algorithm of feature engineering be extended by UDF in SQL?
A: UDF will be supported in version 0.5.0 which will be released this month.Currently, C/C++ UDFs will be supported first, and Python UDFs will be supported in a later version.
Q6: What are the essential differences between OpenMLDB and Mysql?
A: MySQL is an OLTP database, and its positioning is very different from OpenMLDB.MySQL may also be able to complete some online feature computing tasks, but it does not have an online and offline consistent design architecture. In addition, for some important feature computing operations (such as OpenMLDB optimized cross-window aggregation, etc.)Sexual optimization.
Q7: What are the advantages of OpenMLDB compared with similar products or open source tools on the market?
A: At present, there are Feature Store products on the market, which are similar in positioning to OpenMLDB. They all provide feature platforms for machine learning.However, most Feature Store products, such as the most famous open source project Feast, do not provide real-time feature computing capabilities, and do not ensure the consistency of online and offline at the computing layer.They are more about opening up the features of offline computing and the ability to share online.The commercial version of Tecton provides similar real-time computing capabilities, but it is still pushed to Spark as described, so it is expected that the performance of real-time computing has not been optimized.
Q8: Feature ops refers to the feature itself, or the algorithm used to extract the feature, such as pca, fm, etc.What is the difference between feature ops and model ops
A: The feature ops here refers to the computational logic for extracting the feature itself, rather than some feature processing algorithms you mentioned.You can basically think of it as a data processing logic similar to database SQL.
3. OpenMLDB Community
Thank you for your strong support for this meetup. If you want to learn more about OpenMLDB or participate in community technical exchanges, you can get relevant information and interaction through the following channels.
● Github: https://link.zhihu.com/?target=https%3A//github.com/4paradigm/OpenMLDB
● Official website: https://openmldb.ai
● Email:mailto:[email protected]
● https://www.zhihu.com/people/xdai-xia-ke
● https://link.zhihu.com/?target=https%3A//memark.io/wp-content/uploads/2021/12/OpenMLDB-group.png
边栏推荐
猜你喜欢
随机推荐
net6 的Web MVC项目中事务功能的应用
C语言中switch的嵌套
Thesis unscramble TransFG: A Transformer Architecture for Fine - grained Recognition
Use c language to implement tic-tac-toe chess (with source code, you can run it directly)
构建面向特征工程的数据生态 ——拥抱开源生态,OpenMLDB全面打通MLOps生态工具链
JS进阶网页特效(pink老师笔记)
第四范式OpenMLDB优化创新论文被国际数据库顶会VLDB录用
10 个超好用的 DataGrip 快捷键,快加入收藏! | 实用技巧
开源之夏 2022 火热来袭 | 欢迎报名 OpenMLDB 社区项目~
mount命令--挂载出现只读,解决方案
promise 改变状态的方法和promise 的then方法
实时特征计算平台架构方法论和基于 OpenMLDB 的实践
支付牌照是什么意思
自定义形状seekbar学习
CMT2380F32模块开发4-UART例程
开源机器学习数据库OpenMLDB贡献者计划全面启动
Tinker接入全流程---编译篇
系统性能及并发数的一些计算公式
js写四位随机数能有多少种可能性?并列出所有可能性
智能风控中台设计与落地









