Airflow Operator for running Soda SQL scans

Overview

Soda SQL Airflow Operator

Airflow Operator for running Soda SQL scans

Example Usage

# src/soda/scans/my_scan.yml
# Note that the Airflow rendered templates are accessible (e.g. {{ params.client_id }})
table_name: tmp_{{ params.client_id }}_{{ ds_nodash }}
sql_metrics:
  - sql: |
      SELECT
        SUM(value1) AS staged_value1,
        SUM(value2) AS staged_value2
      FROM tmp_{{ params.client_id }}_{{ ds_nodash }}
  - sql: |
      SELECT
        SUM(value1) AS final_value1,
        SUM(value2) AS final_value2
      FROM final_table
      WHERE
        date = '{{ ds }}'
        AND client_id = {{ params.client_id }}
tests:
  - staged_value1 == final_value1
  - staged_value2 > final_value2
# my_airflow_dag.py
from pathlib import Path
from soda_util import build_soda_warehouse, convert_templated_yml_to_dict

SODA_PATH = Path(os.getenv("PYTHON_PATH", "/code/src")) / "/soda/scans/"  # Matches where my_scan.yml is saved

validate_staged_data = SodaSqlOperator(
    task_id="validate_staged_data",
    warehouse=build_soda_warehouse("warehouse_name", "database_name"),  # Could also pass a file path to a yml file
    scan=convert_templated_yml_to_dict(SODA_PATH, "my_scan.yml"),
    params={"client_id": 12345},  # Params are rendered by Airflow and accessible in the yaml file
)

Notes

  • Unlike Soda itself, a builder pattern is not used to define the warehouse and scan argument. Rather, the warehouse and scan parameters are instance checked and the relevant Soda methods are set. This provides a much simpler API, where we can just pass in the args to the Operator
  • As we are passing over all rendering of Jinga templates to Airflow, the native Soda templates are not accessible. So always use Airflow templates
  • Soft failures (i.e. the Airflow task doesn't fail, it just alerts) have been implemented, but alerting of soft failures has not. So soft failures will essentially just mean the Airflow task passes. Alerting to be implemented
Owner
Todd de Quincey
Data Engineer, Chartered Accountant and all round nice guy (or so I like to think). I believe that quality, simplicity and focus are the keys to success
Todd de Quincey
Odoo modules related to website/webshop

Website Apps related to Odoo it's website/webshop features: webshop_public_prices: allow configuring to hide or show product prices and add to cart bu

Yenthe Van Ginneken 9 Nov 04, 2022
LOL英雄联盟云顶之弈挂机刷代币脚本,全自动操作,智能逻辑,功能齐全。

LOL云顶之弈挂机刷代币脚本 这是2019年全球总决赛写的一个云顶挂机脚本,python完成的。 功能: 自动拿牌卖牌 策略是高星策略,非固定阵容 自动登陆账号、打码、异常重启 战利品截图上传百度云 web中控发号,改密码,查看信息等 代码是三天赶出来的,所以有点混乱,WEB中控代码也不知道扔哪去了

77 Oct 10, 2022
Pylexa - Artificial Assistant made with Python

Pylexa - Artificial Assistant made with Python Alexa is a famous artificial assistant used massively across the world. It is a substitute of Alexa whi

\_PROTIK_/ 4 Nov 03, 2021
A python trivium implemention

A python trivium implemention

tnt2402 1 Nov 12, 2021
Node editor view image node

A Blender addon to quickly view images from image nodes in Blender's image viewer.

5 Nov 27, 2022
Estimate the Market Size for Electic and Plug-In Hybrid Vehicles In Africa

Estimate the Market Size for Electic and Plug-In Hybrid Vehicles In Africa The goal of this repository is to use open data repositories to answer the

Leonce Nshuti 0 Feb 21, 2022
Blender addon that enables exporting of xmodels from blender. Great for custom asset creation for cod games

Birdman's XModel Tools For Blender Greetings everyone in the custom cod community. This blender addon should finally enable exporting of custom assets

wast 2 Jul 02, 2022
One Ansible Module for using LINE notify API to send notification. It can be required in the collection list.

Ansible Collection - hazel_shen.line_notify Documentation for the collection. ansible-galaxy collection install hazel_shen.line_notify --ignore-certs

Hazel Shen 4 Jul 19, 2021
p5 is a Python package based on the core ideas of Processing.

p5 p5 is a Python library that provides high level drawing functionality to help you quickly create simulations and interactive art using Python. It c

p5py 645 Jan 04, 2023
Repository for 2021 Computer Vision Class @ Chulalongkorn University

2110443 - Computer Vision (2021/2) Computer Vision @ Chulalongkorn University Anaconda Download Link https://www.anaconda.com/download/ Miniconda and

Chula PIC Lab 5 Jul 19, 2022
Earth-to-orbit ballistic trajectories with atmospheric resistance

Earth-to-orbit ballistic trajectories with atmospheric resistance Overview Space guns are a theoretical technology that reduces the cost of getting bu

1 Dec 03, 2021
My solutions to Advent of Code 2021 (written in Python)

Advent of Code 2021 This repository contains my solutions for the 2021 edition of Advent of Code. Please do not expect perfectly polished solutions, m

Nils 2 May 29, 2022
You will need to install a few python packages for this one.

Features Bait support Auto repair will repair every 10 catches Anti detection (still a work in progress) but using random times and click positions Pr

12 Sep 21, 2022
This is a calculator of strike price distance for options.

Calculator-of-strike-price-distance-for-options This is a calculator of strike price distance for options. Options are a type of derivative. One strat

André Luís Lopes da Silva 4 Dec 30, 2022
A few of my adventures with Devito.

Devito-playbox A few of my adventures with Devito. This repository contains a few notebooks and scripts that will lead me in the road of learning this

Átila Saraiva Quintela Soares 1 Feb 08, 2022
Un Assistente Vocale scritto in Python e altamente personalizzabile

Un Assistente Vocale scritto in Python e altamente personalizzabile

Marco 2 May 06, 2022
Python program that generates random user from API

RandomUserPy Author kirito sate #modules used requests, json, tkinter, PIL, urllib, io, install requests and PIL modules from pypi pip install Pillow

kiritosate 1 Jan 05, 2022
Glyph Metadata Palette

This plugin for Glyphs3 allows you to associate arbitrary structured metadata to each glyph in your font.

Simon Cozens 4 Jan 26, 2022
Python library to natively send files to Trash (or Recycle bin) on all platforms.

Send2Trash -- Send files to trash on all platforms Send2Trash is a small package that sends files to the Trash (or Recycle Bin) natively and on all pl

Andrew Senetar 224 Jan 04, 2023
Discord's own Dumbass made for shits n' Gigs!

FWB3 Discord's own Dumbass made for shits n' Gigs! Please note: This bot is made to be stupid and funny, If you want to get into bot development you'r

1 Dec 06, 2021