whylogs Workshop

The code from the whylogs workshop held at DataTalks.Club on 29 March 2022

whylogs - The open source standard for data logging (Don't forget to give it a star!)

Workshop

In this hands-on workshop, you’ll learn how to set up a system that monitors your data pipelines, checks data quality, and detects changes in your data.

Without data monitoring, you can’t guarantee to your stakeholders that the data they use for analytics and machine learning is trustworthy. A data observability system gives you visibility into the health of your data pipelines, which builds your customers’ trust in your work.

We’ll cover the following:

Introduction to data observability and monitoring
whylogs - the open source standard for data logging
How to monitor batch Python or Spark data pipelines with whylogs
How to monitor Kafka streaming pipelines with whylogs

By the end of this workshop, you’ll be able to set up such a system yourself.
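
To make the idea of data logging concrete, here is a minimal standard-library sketch that profiles one column of a batch and compares it against a reference profile. It does not use the whylogs API; the helper names `profile_column` and `drifted`, the sample values, and the 25% tolerance are all assumptions chosen for illustration.

```python
import math
from statistics import mean


def profile_column(values):
    """Summarize one column: row count, missing count, and basic numeric stats."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": mean(present) if present else None,
    }


def drifted(reference, current, tolerance=0.25):
    """Flag a batch whose mean moved more than `tolerance` relative to the reference."""
    if reference["mean"] is None or current["mean"] is None:
        return True  # nothing to compare, so treat the batch as suspicious
    if reference["mean"] == 0:
        return current["mean"] != 0
    change = abs(current["mean"] - reference["mean"]) / abs(reference["mean"])
    return change > tolerance


# Hypothetical data: a reference batch and a newer batch with one missing value.
reference = profile_column([10.0, 12.0, 11.0, 9.0])
todays_batch = profile_column([20.0, None, 22.0, 21.0])

print(todays_batch)
print("drift detected:", drifted(reference, todays_batch))
```

A profile like this is small compared with the raw data, so it can be stored for every batch and compared over time. Libraries such as whylogs produce much richer profiles than this toy summary.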

Code

This repository contains files that are needed for the workshop:

ccloud_lib.py - helper for connecting to Confluent Cloud
confluent_credentials.txt - configuration template (put your credentials there, but don't commit them!)
producer.py - the code for producing events to Kafka
requirements.txt - all the dependencies for the workshop
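
As a rough illustration of how a credentials file such as `confluent_credentials.txt` might be consumed, the sketch below parses `key=value` lines into a dictionary. The file format, the `read_config` helper, and the sample keys and values are assumptions; the actual template in this repository and the helpers in `ccloud_lib.py` may differ.

```python
def read_config(lines):
    """Parse `key=value` lines into a dict, skipping blanks and `#` comments."""
    config = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that have no '='
        config[key.strip()] = value.strip()
    return config


# Hypothetical placeholder values, not real credentials.
sample = """\
# Connection settings
bootstrap.servers=example-broker:9092
security.protocol=SASL_SSL
sasl.username=YOUR_API_KEY
sasl.password=YOUR_API_SECRET
""".splitlines()

config = read_config(sample)
print(sorted(config))
```

With a real file, the same function can be called as `read_config(open(path))`, since a file object yields lines. Keep the filled-in copy out of version control.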

Confluent Cloud

For this workshop, you'll need:

An account in Deepnote
An account in Confluent Cloud (instructions)
