AI Flow is an open source framework that bridges big data and artificial intelligence.

Related tags

Deep Learningai-flow
Overview

Flink AI Flow

Introduction

Flink AI Flow is an open source framework that bridges big data and artificial intelligence. It manages the entire machine learning project lifecycle as a unified workflow, including feature engineering, model training, model evaluation, model service, model inference, monitoring, etc. Throughout the entire workflow, Flink is used as the general purpose computing engine.

In addition to the capability of orchestrating a group of batch jobs, by leveraging an event-based scheduler(enhanced version of Airflow), Flink AI Flow also supports workflows that contain streaming jobs. Such capability is quite useful for complicated real-time machine learning systems as well as other real-time workflows in general.

Features

You can use Flink AI Flow to do the following:

  1. Define the machine learning workflow including batch/stream jobs.

  2. Manage metadata(generated by the machine learning workflow) of date sets, models, artifacts, metrics, jobs etc.

  3. Schedule and run the machine learning workflow.

  4. Publish and subscribe events

To support online machine learning scenarios, notification service and event-based schedulers are introduced. Flink AI Flow's current components are:

  1. SDK: It defines how to build a machine learning workflow and includes the api of the Flink AI Flow.

  2. Notification Service: It provides event listening and notification functions.

  3. Meta Service: It saves the meta data of the machine learning workflow.

  4. Event-Based Scheduler: It is a scheduler that triggered jobs by some events happened.

Documentation

QuickStart

You can use Flink AI Flow according to the guidelines of Quick Start. Besides, you can also take a look at our Tutorial for writing your own workflow.

API

Please refer to the API Documentation to find the API supported by Flink AI Flow.

Design

If you are interested in design principle of Flink AI Flow, please see the Design Documentation for more details.

Examples

You can refer to some examples of Flink AI Flow to have a better understanding of how to write a workflow. Please see the Examples directory.

Reporting bugs

If you encounter any issues please open an issue in github and we encourage you to provide a patch through github pull request as well.

Contributing

We happily welcome contributions to Flink AI Flow. Please see our contribution guide for details.

Contact Us

For more information, you can join the Flink AI Flow Users Group on DingTalk to contact us. The number of the DingTalk group is 35876083.

You can also join the group by scanning the QR code below:

Comments
  • [Notification] Support CLI tool for notification server

    [Notification] Support CLI tool for notification server

    What is the purpose of the change

    Support CLI tool for notification server

    Brief change log

    (for example:)

    • Support CLI tool for notification server

    Verifying this change

    (Please pick either of the following options)

    This change added tests.

    opened by jiangxin369 2
  • [Notification] Add notification cli server command

    [Notification] Add notification cli server command

    What is the purpose of the change

    Add notification cli server command #170

    Brief change log

    • Add notification cli server command

    Verifying this change

    This change added tests.

    opened by Sxnan 2
  • [Airflow]Fix remote log handler bug

    [Airflow]Fix remote log handler bug

    What is the purpose of the change

    Fix #172

    Brief change log

    • Change the log_relative_path in remote log handler like S3, GCS, Azure when they init the context.
    • Add get_provider_info.py to pass airflow test cases.
    • Modified old test case in providers.

    Verifying this change

    This change has modified old test case to fit these new remote log handlers.

    opened by aqua7regia 2
  • [Airflow] The remote logging supports the current log mechanism

    [Airflow] The remote logging supports the current log mechanism

    FileTaskHandler, a python log handler that handles and reads task instance logs, supports the current log mechanism which uses seq_num and try_number to name the log file. The remote log handler like S3TaskHandler should support the log mechanism.

    bug Airflow 
    opened by SteNicholas 2
  • [hotfix] fix typo of spark operator

    [hotfix] fix typo of spark operator

    What is the purpose of the change

    (For example: This pull request makes user can stop workflow externally.)

    Brief change log

    (for example:)

    • Send StopDagEvent to Airflow scheduler via notification service
    • Once received StopDagEvent, Airflow scheduler kills all running dag runs

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    (or)

    This change is already covered by existing tests, such as (please describe tests).

    (or)

    This change added tests.

    opened by jiangxin369 1
  • Daily uploading packages to PyPI

    Daily uploading packages to PyPI

    What is the purpose of the change

    (For example: This pull request makes user can stop workflow externally.)

    Brief change log

    (for example:)

    • Send StopDagEvent to Airflow scheduler via notification service
    • Once received StopDagEvent, Airflow scheduler kills all running dag runs

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    (or)

    This change is already covered by existing tests, such as (please describe tests).

    (or)

    This change added tests.

    opened by jiangxin369 1
  • Add Tutorial Example

    Add Tutorial Example

    Brief change log

    (for example:)

    • Add tutorial example and docs
    • Each workflow execution should have its own working directory to avoid file overriding

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Add documentations

    Add documentations

    What is the purpose of the change

    (For example: This pull request makes user can stop workflow externally.)

    Brief change log

    (for example:)

    • Send StopDagEvent to Airflow scheduler via notification service
    • Once received StopDagEvent, Airflow scheduler kills all running dag runs

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    (or)

    This change is already covered by existing tests, such as (please describe tests).

    (or)

    This change added tests.

    opened by jiangxin369 1
  • Refactor Notification Event

    Refactor Notification Event

    What is the purpose of the change

    Refactor Notification Event

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Fix unittests about existed namespace

    Fix unittests about existed namespace

    What is the purpose of the change

    Fix unittests about existed namespace

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Add docs about concepts

    Add docs about concepts

    What is the purpose of the change

    Add docs about concepts

    Verifying this change

    (Please pick either of the following options)

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Unifying TaskStatusEvent and TaskStatusChangedEvent

    Unifying TaskStatusEvent and TaskStatusChangedEvent

    Describe the bug

    Your environment

    Operating system

    Database

    Python version

    To Reproduce

    Steps to reproduce the behavior (if you can):

    1. Submit a '...'
    2. Click on '....'
    3. See error

    Expected behavior

    Actual behavior

    Screenshots

    Additional context

    opened by jiangxin369 0
  • Workflow Execution Status Incorrect

    Workflow Execution Status Incorrect

    Describe the bug

    task1 action_on_task_status(task2, success) when task1 failed, task2 would not be scheduled, but the status of workflow execution is still running. I think the status shoud be failed

    Your environment

    Operating system

    Database

    Python version

    To Reproduce

    Steps to reproduce the behavior (if you can):

    1. Submit a '...'
    2. Click on '....'
    3. See error

    Expected behavior

    Actual behavior

    Screenshots

    Additional context

    opened by jiangxin369 0
  • AIFlow support python3.6

    AIFlow support python3.6

    Describe the feature

    AIFlow support python3.6

    Describe the solution you'd like

    Describe alternatives you've considered

    Additional context

    When run aiflow with python3.6, notification server blocks wich following logs:

    [2022-07-25 14:07:07,682 - server.py:68 [MainThread] - INFO: Notification server started.
    [2022-07-25 14:07:18,780 - server.py:189 [Thread-1] - ERROR: Lock is not acquired.
    Traceback (most recent call last):
      File "/root/venv_for_aiflow/lib64/python3.6/site-packages/notification_service/server.py", line 185, in _call_behavior_async
        return await behavior(argument, context), True
      File "/usr/lib64/python3.6/asyncio/coroutines.py", line 225, in coro
        res = yield from await_meth()
      File "/root/venv_for_aiflow/lib64/python3.6/site-packages/notification_service/service.py", line 221, in _list_all_events
        pass
      File "/usr/lib64/python3.6/asyncio/coroutines.py", line 212, in coro
        res = func(*args, **kw)
      File "/usr/lib64/python3.6/asyncio/locks.py", line 86, in __aexit__
        self.release()
      File "/usr/lib64/python3.6/asyncio/locks.py", line 207, in release
        raise RuntimeError('Lock is not acquired.')
    RuntimeError: Lock is not acquired.
    
    opened by jiangxin369 0
  • action_on_event_received only process events sent by current workflow execution

    action_on_event_received only process events sent by current workflow execution

    Describe the feature

    Currently, the event is broadcasting, once the wanted event is sent, all workflow executions of this workflow would be triggered.

    Describe the solution you'd like

    As the scheduler dispatcher would inspect the context of each event to figure out if it contains the workflow execution id, we can inject the runtime context of each task execution to the user-defined event to make sure the event will only effect on the current workflow execution.

    We need the following changes:

    • Add a global variable _CURRENT_TASK_CONTEXT in context.py, it is used to store the runtime context of each task execution.
    • To read and write the _CURRENT_TASK_CONTEXT, add get_runtime_task_context and set_runtime_task_context functions.
    • Add a public API called wrap_execution_context to inject the runtime info to the context of the event before sending it.
    _CURRENT_TASK_CONTEXT: TaskExecutionContext = None
    
    
    def set_runtime_task_context(context: TaskExecutionContext):
        global _CURRENT_TASK_CONTEXT
        _CURRENT_TASK_CONTEXT = context
    
    
    def get_runtime_task_context():
        return _CURRENT_TASK_CONTEXT
    
    
    def wrap_execution_info_to_context(event: Event):
        """
      The event whose context is wrapped with workflow execution info would only be processed by specific workflow execution.
      """
        pass
    

    How to use it?

    def func():
        notification_client = get_notification_client()
        event = Event(event_key=EVENT_KEY, message='This is a custom message.')
        
        // wrap event with context
        wrap_execution_info_to_context(event)
        // send event
        notification_client.send_event(event)
    
    with Workflow(name='workflow') as w1:
        task = PythonOperator(name='task', python_callable=func)
    

    Describe alternatives you've considered

    Additional context

    opened by jiangxin369 0
  • [NotificationService] Cannot use one EmbeddedNotificationClient instance in multiple threads

    [NotificationService] Cannot use one EmbeddedNotificationClient instance in multiple threads

    Describe the bug

    Cannot use one EmbeddedNotificationClient instance in multiple threads.

    Your environment

    Operating system

    Database

    Python version

    To Reproduce

        def test_create_client_multiple_threads(self):
            import threading
            from notification_service.embedded_notification_client import EmbeddedNotificationClient
            from notification_service.event import Event, EventKey
    
            client = EmbeddedNotificationClient(server_uri="localhost:50052", namespace='default', sender='sender')
    
            def send_event():
                event = Event(event_key=EventKey('key1'), message='a')
                client.send_event(event)
                print(client.sequence_number)
            threads = []
            for i in range(100):
                thread = threading.Thread(target=send_event)
                threads.append(thread)
            for t in threads:
                t.start()
            for t in threads:
                t.join()
    

    Expected behavior

    100 events sent.

    Actual behavior

    less than 100 events sent.

    Screenshots

    Additional context

    opened by jiangxin369 0
Releases(release-0.3.1)
  • release-0.3.1(Feb 22, 2022)

    Features

    1. Flink job plugin supports the multiple Flink versions #236
    2. Support stopping/resuming job scheduling #241
    3. Support log view of job execution for frontend #251
    4. Improve documentation

    Bug Fixes

    1. Airflow operator context does not set correctly #250
    2. Failed to trigger workflow execution #260
    3. Fix update the deployed model version multiple times #269
    4. Fix register_model_version sending wrong event type bug #290

    Welcome to use this release version and give us the feedback of the AIFlow.

    Source code(tar.gz)
    Source code(zip)
  • release-0.3.0(Dec 16, 2021)

    Features

    AIFlow

    1. Introduces the command-line interface to help operation.
    2. Support database version migration.
    3. Supports the worflow development on the Jupyter Notebook #140

    Notification Service

    1. Introduces the command-line interface to help operation.
    2. Support database version migration.

    Bug Fixes

    1. The remote logging supports the current log mechanism #172
    2. Unittest failed cause by init_ai_flow_context #185
    3. AIFlow Webserver cannot sort model version by version #207

    Welcome to use this release version and give us the feedback of the AIFlow.

    Source code(tar.gz)
    Source code(zip)
    examples.tar.gz(32.95 MB)
  • release-0.2.2(Nov 19, 2021)

    Features

    AIFlow

    1. Add the documents of the AIFlow. #117

    Airflow

    1. Allow LocalExecutor to run with SQLite. #41
    2. Airflow webserver supports the Airflow and Notification databases. #49

    Notification Service

    1. Introduces the countEvents interface. #11

    Bug Fixes

    1. Fix oss blob manager download concurrently. #3
    2. Deepcopy tasks in celery executor to avoid race condition. #12
    3. Fix periodic workflow cannot run. #77
    4. Fix HDFSBlobManager failed to download existed file. #87
    5. Removes the uncompleted api action_on_dataset_event. #104
    6. The workflow directory is set incorrect in AIFlow runtime. #111

    Welcome to use this release version and give us the feedback of the AIFlow.

    Source code(tar.gz)
    Source code(zip)
    examples.tar.gz(32.93 MB)
  • release-0.2.0(Feb 9, 2022)

    Features

    AIFlow

    1. Add workflow execution on event and ContextExtractor API #476
    2. Add task execution restful api #478
    3. AIFlow add WorkflowEventManager to listen and handle events #492
    4. Support start new workflow execution with context #479
    5. Introduce the workflow frontend of the AIFlow UI #509
    6. Add FlinkSqlProcessor #527
    7. Support job execution label #529
    8. Frontend support metadata ui #533
    9. Notification service supports the idempotence #553
    10. Add read-only job plugin #555

    Airflow

    1. Support celery executor on event based scheduler #482
    2. Add AirFlowScheduler with airflow restful api #486

    Bug Fixes

    1. Make AI Flow be able to use Notification Service with HA enabled #510
    2. Duplicated entry when create dagrun #570
    3. EventBaseScheduler catches and prints exceptions #586
    4. Scheduler should find schedulable tasks once dagrun finished #587
    5. EventBaseScheduler would trigger task multiple times incorrectly #598
    Source code(tar.gz)
    Source code(zip)
    ai-flow-examples.tar.gz(32.93 MB)
  • release-0.1.0(Feb 9, 2022)

Owner
A neutral organization to host ecosystem projects for Apache Flink
Deep Learning ❤️ OneFlow

Deep Learning with OneFlow made easy 🚀 ! Carefree? carefree-learn aims to provide CAREFREE usages for both users and developers. User Side Computer V

21 Oct 27, 2022
a reimplementation of LiteFlowNet in PyTorch that matches the official Caffe version

pytorch-liteflownet This is a personal reimplementation of LiteFlowNet [1] using PyTorch. Should you be making use of this work, please cite the paper

Simon Niklaus 365 Dec 31, 2022
PyTorch ,ONNX and TensorRT implementation of YOLOv4

PyTorch ,ONNX and TensorRT implementation of YOLOv4

4.2k Jan 01, 2023
Pytorch GUI(demo) for iVOS(interactive VOS) and GIS (Guided iVOS)

GUI for iVOS(interactive VOS) and GIS (Guided iVOS) GUI Implementation of CVPR2021 paper "Guided Interactive Video Object Segmentation Using Reliabili

Yuk Heo 13 Dec 09, 2022
✂️ EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video.

EyeLipCropper EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video. The whole process consists of three parts: frame extracti

Zi-Han Liu 9 Oct 25, 2022
A tensorflow=1.13 implementation of Deconvolutional Networks on Graph Data (NeurIPS 2021)

GDN A tensorflow=1.13 implementation of Deconvolutional Networks on Graph Data (NeurIPS 2021) Abstract In this paper, we consider an inverse problem i

4 Sep 13, 2022
Code for "Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo"

Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo This repository includes the source code for our CVPR 2021 paper on multi-view mult

Jiahao Lin 66 Jan 04, 2023
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

Bo Wan 32 Dec 23, 2022
202 Jan 06, 2023
TJU Deep Learning & Neural Network

Deep_Learning & Neural_Network_Lab 实验环境 Python 3.9 Anaconda3(官网下载或清华镜像都行) PyTorch 1.10.1(安装代码如下) conda install pytorch torchvision torchaudio cudatool

St3ve Lee 1 Jan 19, 2022
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region. This repository provides the codebase and dataset for our work WORD: Revisiting Or

Healthcare Intelligence Laboratory 71 Jan 07, 2023
Plug and play transformer you can find network structure and official complete code by clicking List

Plug-and-play Module Plug and play transformer you can find network structure and official complete code by clicking List The following is to quickly

8 Mar 27, 2022
Bytedance Inc. 2.5k Jan 06, 2023
Unoffical reMarkable AddOn for Firefox.

reMarkable for Firefox (Download) This repo converts the offical reMarkable Chrome Extension into a Firefox AddOn published here under the name "Unoff

Jelle Schutter 45 Nov 28, 2022
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

Vítor Albiero 519 Dec 29, 2022
Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Scribble-Supervised LiDAR Semantic Segmentation Dataset and code release for the paper Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORA

102 Dec 25, 2022
Differentiable Surface Triangulation

Differentiable Surface Triangulation This is our implementation of the paper Differentiable Surface Triangulation that enables optimization for any pe

61 Dec 07, 2022
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.

HAWQ: Hessian AWare Quantization HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform

Zhen Dong 293 Dec 30, 2022
Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

WangRui 14 Sep 14, 2022
Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond

CRF - Conditional Random Fields A library for dense conditional random fields (CRFs). This is the official accompanying code for the paper Regularized

Đ.Khuê Lê-Huu 21 Nov 26, 2022