AI Flow is an open source framework that bridges big data and artificial intelligence.

Flink AI Flow

Introduction

Flink AI Flow is an open source framework that bridges big data and artificial intelligence. It manages the entire machine learning project lifecycle as a unified workflow, including feature engineering, model training, model evaluation, model service, model inference, monitoring, etc. Throughout the entire workflow, Flink is used as the general purpose computing engine.

In addition to orchestrating groups of batch jobs, Flink AI Flow supports workflows that contain streaming jobs by leveraging an event-based scheduler (an enhanced version of Airflow). This capability is useful for complex real-time machine learning systems and for real-time workflows in general.

Features

You can use Flink AI Flow to do the following:

  1. Define the machine learning workflow including batch/stream jobs.

  2. Manage the metadata (generated by the machine learning workflow) of data sets, models, artifacts, metrics, jobs, etc.

  3. Schedule and run the machine learning workflow.

  4. Publish and subscribe to events.

To support online machine learning scenarios, a notification service and an event-based scheduler are introduced. Flink AI Flow's current components are:

  1. SDK: It defines how to build a machine learning workflow and includes the API of Flink AI Flow.

  2. Notification Service: It provides event listening and notification functions.

  3. Meta Service: It saves the metadata of the machine learning workflow.

  4. Event-Based Scheduler: It is a scheduler that triggers jobs when specific events occur.
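
The interplay between a notification service and an event-based scheduler can be illustrated with a minimal, self-contained sketch. This is not the Flink AI Flow API: the `Event`, `NotificationService`, and `EventBasedScheduler` classes below are hypothetical stand-ins written only to show the idea of jobs being triggered by published events.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a notification event."""
    key: str
    message: str = ""


class NotificationService:
    """Toy in-memory publish/subscribe hub (not the real Notification Service)."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, key: str, listener: Callable[[Event], None]) -> None:
        self._listeners[key].append(listener)

    def publish(self, event: Event) -> None:
        # Deliver the event to every listener registered for its key.
        for listener in list(self._listeners.get(event.key, [])):
            listener(event)


class EventBasedScheduler:
    """Toy scheduler that starts a job whenever a matching event arrives."""

    def __init__(self, service: NotificationService) -> None:
        self._service = service
        self.started_jobs: List[str] = []

    def run_job_on(self, event_key: str, job_name: str,
                   job: Callable[[Event], None]) -> None:
        def on_event(event: Event) -> None:
            self.started_jobs.append(job_name)
            job(event)

        self._service.subscribe(event_key, on_event)


# Usage: the training job runs only when a "data_ready" event is published.
service = NotificationService()
scheduler = EventBasedScheduler(service)
trained_on: List[str] = []
scheduler.run_job_on("data_ready", "train_model",
                     lambda event: trained_on.append(event.message))
service.publish(Event("data_ready", "batch-1"))
service.publish(Event("unrelated"))  # no job is subscribed to this key
```

Because jobs react to events rather than to a fixed schedule, the same mechanism can in principle serve both batch jobs and long-running streaming jobs.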

Documentation

QuickStart

To get started with Flink AI Flow, follow the Quick Start guide. You can also read our Tutorial to learn how to write your own workflow.

API

Please refer to the API Documentation for the APIs supported by Flink AI Flow.

Design

If you are interested in the design principles of Flink AI Flow, please see the Design Documentation for more details.

Examples

To better understand how to write a workflow, see the examples in the Examples directory.

Reporting bugs

If you encounter any issues, please open an issue on GitHub. We also encourage you to provide a patch through a GitHub pull request.

Contributing

We happily welcome contributions to Flink AI Flow. Please see our contribution guide for details.

Contact Us

For more information, you can join the Flink AI Flow Users Group on DingTalk to contact us. The number of the DingTalk group is 35876083.

Comments
  • [Notification] Support CLI tool for notification server

    What is the purpose of the change

    Support CLI tool for notification server

    Brief change log

    • Support CLI tool for notification server

    Verifying this change

    This change added tests.

    opened by jiangxin369 2
  • [Notification] Add notification cli server command

    What is the purpose of the change

    Add notification cli server command #170

    Brief change log

    • Add notification cli server command

    Verifying this change

    This change added tests.

    opened by Sxnan 2
  • [Airflow]Fix remote log handler bug

    What is the purpose of the change

    Fix #172

    Brief change log

    • Change the log_relative_path in remote log handlers such as S3, GCS, and Azure when they initialize the context.
    • Add get_provider_info.py to pass the Airflow test cases.
    • Modify the old test cases in providers.

    Verifying this change

    This change modifies the old test cases to fit the new remote log handlers.

    opened by aqua7regia 2
  • [Airflow] The remote logging supports the current log mechanism

    FileTaskHandler, a python log handler that handles and reads task instance logs, supports the current log mechanism which uses seq_num and try_number to name the log file. The remote log handler like S3TaskHandler should support the log mechanism.

    bug Airflow 
    opened by SteNicholas 2
  • [hotfix] fix typo of spark operator

    opened by jiangxin369 1
  • Daily uploading packages to PyPI

    opened by jiangxin369 1
  • Add Tutorial Example

    Brief change log

    • Add tutorial example and docs
    • Each workflow execution should have its own working directory to avoid file overriding

    Verifying this change

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Add documentations

    opened by jiangxin369 1
  • Refactor Notification Event

    What is the purpose of the change

    Refactor Notification Event

    Verifying this change

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Fix unittests about existed namespace

    What is the purpose of the change

    Fix unittests about existed namespace

    Verifying this change

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Add docs about concepts

    What is the purpose of the change

    Add docs about concepts

    Verifying this change

    This change is a trivial rework / code cleanup without any test coverage.

    opened by jiangxin369 1
  • Unifying TaskStatusEvent and TaskStatusChangedEvent

    opened by jiangxin369 0
  • Workflow Execution Status Incorrect

    Describe the bug

    Given task1 action_on_task_status(task2, success): when task1 fails, task2 is not scheduled, but the status of the workflow execution is still running. I think the status should be failed.

    opened by jiangxin369 0
  • AIFlow support python3.6

    Describe the feature

    AIFlow support python3.6

    Additional context

    When running AIFlow with Python 3.6, the notification server blocks with the following logs:

    [2022-07-25 14:07:07,682 - server.py:68 [MainThread] - INFO: Notification server started.
    [2022-07-25 14:07:18,780 - server.py:189 [Thread-1] - ERROR: Lock is not acquired.
    Traceback (most recent call last):
      File "/root/venv_for_aiflow/lib64/python3.6/site-packages/notification_service/server.py", line 185, in _call_behavior_async
        return await behavior(argument, context), True
      File "/usr/lib64/python3.6/asyncio/coroutines.py", line 225, in coro
        res = yield from await_meth()
      File "/root/venv_for_aiflow/lib64/python3.6/site-packages/notification_service/service.py", line 221, in _list_all_events
        pass
      File "/usr/lib64/python3.6/asyncio/coroutines.py", line 212, in coro
        res = func(*args, **kw)
      File "/usr/lib64/python3.6/asyncio/locks.py", line 86, in __aexit__
        self.release()
      File "/usr/lib64/python3.6/asyncio/locks.py", line 207, in release
        raise RuntimeError('Lock is not acquired.')
    RuntimeError: Lock is not acquired.
    
    opened by jiangxin369 0
  • action_on_event_received only process events sent by current workflow execution

    Describe the feature

    Currently, events are broadcast: once the expected event is sent, all workflow executions of the workflow are triggered.

    Describe the solution you'd like

    Because the scheduler dispatcher inspects the context of each event to check whether it contains a workflow execution ID, we can inject the runtime context of each task execution into the user-defined event so that the event affects only the current workflow execution.

    We need the following changes:

    • Add a global variable _CURRENT_TASK_CONTEXT in context.py to store the runtime context of each task execution.
    • Add get_runtime_task_context and set_runtime_task_context functions to read and write _CURRENT_TASK_CONTEXT.
    • Add a public API called wrap_execution_info_to_context to inject the runtime info into the context of the event before sending it.
    _CURRENT_TASK_CONTEXT: TaskExecutionContext = None
    
    
    def set_runtime_task_context(context: TaskExecutionContext):
        global _CURRENT_TASK_CONTEXT
        _CURRENT_TASK_CONTEXT = context
    
    
    def get_runtime_task_context():
        return _CURRENT_TASK_CONTEXT
    
    
    def wrap_execution_info_to_context(event: Event):
        """
        An event whose context is wrapped with workflow execution info is
        only processed by that specific workflow execution.
        """
        pass
    

    How to use it?

    def func():
        notification_client = get_notification_client()
        event = Event(event_key=EVENT_KEY, message='This is a custom message.')
        
        # Wrap the event with the current execution context.
        wrap_execution_info_to_context(event)
        # Send the event.
        notification_client.send_event(event)
    
    with Workflow(name='workflow') as w1:
        task = PythonOperator(name='task', python_callable=func)
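
The proposal leaves the body of wrap_execution_info_to_context unimplemented. As a rough sketch of how it might work, the block below uses hypothetical stand-in classes rather than the real AIFlow types. The `workflow_execution_id` field, the JSON encoding of `context`, and the `is_for_execution` dispatcher check are all assumptions made for illustration only.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskExecutionContext:
    """Stand-in for the runtime context; the field name is an assumption."""
    workflow_execution_id: int


@dataclass
class Event:
    """Stand-in for the notification Event; `context` is assumed to be a string."""
    event_key: str
    message: str
    context: Optional[str] = None


_CURRENT_TASK_CONTEXT: Optional[TaskExecutionContext] = None


def set_runtime_task_context(context: Optional[TaskExecutionContext]) -> None:
    global _CURRENT_TASK_CONTEXT
    _CURRENT_TASK_CONTEXT = context


def get_runtime_task_context() -> Optional[TaskExecutionContext]:
    return _CURRENT_TASK_CONTEXT


def wrap_execution_info_to_context(event: Event) -> Event:
    """Tag the event with the current workflow execution ID, if there is one."""
    task_context = get_runtime_task_context()
    if task_context is None:
        # Outside a task execution: leave the event as a broadcast.
        return event
    event.context = json.dumps(
        {"workflow_execution_id": task_context.workflow_execution_id})
    return event


def is_for_execution(event: Event, workflow_execution_id: int) -> bool:
    """Hypothetical dispatcher check: untagged events match every execution."""
    if event.context is None:
        return True
    tagged_id = json.loads(event.context).get("workflow_execution_id")
    return tagged_id == workflow_execution_id


# Usage: an event sent from execution 7 is ignored by execution 8.
broadcast_event = wrap_execution_info_to_context(Event("key", "no task context"))
set_runtime_task_context(TaskExecutionContext(workflow_execution_id=7))
scoped_event = wrap_execution_info_to_context(Event("key", "from execution 7"))
```

In this sketch, events sent outside a task keep the current broadcast behavior, so existing workflows would not be affected by the change.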
    

    opened by jiangxin369 0
  • [NotificationService] Cannot use one EmbeddedNotificationClient instance in multiple threads

    Describe the bug

    Cannot use one EmbeddedNotificationClient instance in multiple threads.

    To Reproduce

        def test_create_client_multiple_threads(self):
            import threading
            from notification_service.embedded_notification_client import EmbeddedNotificationClient
            from notification_service.event import Event, EventKey
    
            client = EmbeddedNotificationClient(server_uri="localhost:50052", namespace='default', sender='sender')
    
            def send_event():
                event = Event(event_key=EventKey('key1'), message='a')
                client.send_event(event)
                print(client.sequence_number)
            threads = []
            for i in range(100):
                thread = threading.Thread(target=send_event)
                threads.append(thread)
            for t in threads:
                t.start()
            for t in threads:
                t.join()
    

    Expected behavior

    100 events sent.

    Actual behavior

    Fewer than 100 events sent.
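
This report does not establish the root cause. One common source of lost sends when a client is shared across threads is an unsynchronized read-modify-write of shared state such as a sequence number. The sketch below uses a hypothetical `CountingClient` (not the real EmbeddedNotificationClient) to show how guarding that state with a `threading.Lock` keeps all 100 sends.

```python
import threading
from typing import List, Tuple


class CountingClient:
    """Hypothetical stand-in for a notification client shared by many threads."""

    def __init__(self) -> None:
        self.sequence_number = 0
        self.sent: List[Tuple[int, str]] = []
        self._lock = threading.Lock()

    def send_event(self, message: str) -> None:
        # The lock makes "increment, then record" a single atomic step,
        # so no two threads can observe or reuse the same sequence number.
        with self._lock:
            self.sequence_number += 1
            self.sent.append((self.sequence_number, message))


client = CountingClient()
threads = [threading.Thread(target=client.send_event, args=("a",))
           for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If the real client is not meant to be shared, creating one client instance per thread would be an alternative workaround.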

    opened by jiangxin369 0
Releases(release-0.3.1)
  • release-0.3.1(Feb 22, 2022)

    Features

    1. Flink job plugin supports multiple Flink versions #236
    2. Support stopping/resuming job scheduling #241
    3. Support log view of job execution for frontend #251
    4. Improve documentation

    Bug Fixes

    1. Airflow operator context is not set correctly #250
    2. Failed to trigger workflow execution #260
    3. Fix updating the deployed model version multiple times #269
    4. Fix register_model_version sending the wrong event type #290

    You are welcome to try this release and send us feedback on AIFlow.

    Source code(tar.gz)
    Source code(zip)
  • release-0.3.0(Dec 16, 2021)

    Features

    AIFlow

    1. Introduces a command-line interface to help with operations.
    2. Supports database version migration.
    3. Supports workflow development in Jupyter Notebook #140

    Notification Service

    1. Introduces a command-line interface to help with operations.
    2. Supports database version migration.

    Bug Fixes

    1. The remote logging supports the current log mechanism #172
    2. Unit test failure caused by init_ai_flow_context #185
    3. AIFlow Webserver cannot sort model versions by version #207

    You are welcome to try this release and send us feedback on AIFlow.

    Source code(tar.gz)
    Source code(zip)
    examples.tar.gz(32.95 MB)
  • release-0.2.2(Nov 19, 2021)

    Features

    AIFlow

    1. Add documentation for AIFlow. #117

    Airflow

    1. Allow LocalExecutor to run with SQLite. #41
    2. Airflow webserver supports the Airflow and Notification databases. #49

    Notification Service

    1. Introduces the countEvents interface. #11

    Bug Fixes

    1. Fix concurrent downloads in the OSS blob manager. #3
    2. Deepcopy tasks in the Celery executor to avoid a race condition. #12
    3. Fix periodic workflows failing to run. #77
    4. Fix HDFSBlobManager failing to download an existing file. #87
    5. Remove the incomplete API action_on_dataset_event. #104
    6. The workflow directory is set incorrectly in the AIFlow runtime. #111

    You are welcome to try this release and send us feedback on AIFlow.

    Source code(tar.gz)
    Source code(zip)
    examples.tar.gz(32.93 MB)
  • release-0.2.0(Feb 9, 2022)

    Features

    AIFlow

    1. Add workflow execution on event and ContextExtractor API #476
    2. Add task execution RESTful API #478
    3. AIFlow add WorkflowEventManager to listen and handle events #492
    4. Support start new workflow execution with context #479
    5. Introduce the workflow frontend of the AIFlow UI #509
    6. Add FlinkSqlProcessor #527
    7. Support job execution label #529
    8. Frontend supports metadata UI #533
    9. Notification service supports idempotence #553
    10. Add read-only job plugin #555

    Airflow

    1. Support celery executor on event based scheduler #482
    2. Add AirFlowScheduler with Airflow RESTful API #486

    Bug Fixes

    1. Make AI Flow be able to use Notification Service with HA enabled #510
    2. Duplicated entry when creating a dagrun #570
    3. EventBaseScheduler catches and prints exceptions #586
    4. Scheduler should find schedulable tasks once dagrun finished #587
    5. EventBaseScheduler would trigger task multiple times incorrectly #598
    Source code(tar.gz)
    Source code(zip)
    ai-flow-examples.tar.gz(32.93 MB)
  • release-0.1.0(Feb 9, 2022)

Owner
A neutral organization to host ecosystem projects for Apache Flink