Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

Overview

time-series-kafka-demo

img

Mock stream producer for time series data using Kafka.

I walk through this tutorial and others here on GitHub and on my Medium blog. Here is a friend link for open access to the article on Towards Data Science: Make a mock “real-time” data stream with Python and Kafka. I'll always add friend links on my GitHub tutorials for free Medium access if you don't have a paid Medium membership (referral link).

If you find any of this useful, I always appreciate contributions to my Saturday morning fancy coffee fund!

This repo demos how to convert a csv file of timestamped data into a real-time stream useful for testing streaming analytics. An example input file with random time series data and a script for generating the file are included in the data directory.

The producer and consumer Python scripts use Confluent's Kafka client for Python, which is installed in the Docker image built with the accompanying Dockerfile, if you choose to use it.

Requires Docker and Docker Compose.

Usage

Clone repo and cd into directory.

git clone https://github.com/mtpatter/time-series-kafka-demo.git
cd time-series-kafka-demo

Start the Kafka broker

docker compose up --build

Build a Docker image (optionally, for the producer and consumer)

From the main root directory:

docker build -t "kafkacsv" .

If you want to use Docker for the python scripts, this should now work:

docker run -it --rm kafkacsv python bin/sendStream.py -h

Start a consumer

To start a consumer for printing all messages in real-time from the stream "my-stream":

python bin/processStream.py my-stream

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/processStream.py my-stream

Produce a time series stream

Send time series from data/data.csv to topic “my-stream”, and speed it up by a factor of 10.

python bin/sendStream.py data/data.csv my-stream --speed 10

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/sendStream.py data/data.csv my-stream --speed 10

Shut down and clean up

Stop the consumer with Return and Ctrl+C.

Shutdown Kafka broker system:

docker compose down
Project documentation with Markdown.

MkDocs Project documentation with Markdown. View the MkDocs documentation. Project release notes. Visit the MkDocs wiki for community resources, inclu

MkDocs 15.6k Jan 02, 2023
300+ Python Interview Questions

300+ Python Interview Questions

Pradeep Kumar 1.1k Jan 02, 2023
A Python Package To Generate Strong Passwords For You in Your Projects.

shPassGenerator Version 1.0.6 Ready To Use Developed by Shervin Badanara (shervinbdndev) on Github Language and technologies used in This Project Work

Shervin 11 Dec 19, 2022
In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqtrade so much yet.

My Freqtrade stuff In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqt

Simon Kebekus 104 Dec 31, 2022
A markdown wiki and dashboarding system for Datasette

datasette-notebook A markdown wiki and dashboarding system for Datasette This is an experimental alpha and everything about it is likely to change. In

Simon Willison 19 Apr 20, 2022
A website for courses of Major Computer Science, NKU

A website for courses of Major Computer Science, NKU

Sakura 0 Oct 06, 2022
Template repo to quickly make a tested and documented GitHub action in Python with Poetry

Python + Poetry GitHub Action Template Getting started from the template Rename the src/action_python_poetry package. Globally replace instances of ac

Kevin Duff 89 Dec 25, 2022
Parser manager for parsing DOC, DOCX, PDF or HTML files

Parser manager Description Parser gets PDF, DOC, DOCX or HTML file via API and saves parsed data to the database. Implemented in Ruby 3.0.1 using Acti

Эдем 4 Dec 04, 2021
Data science on SDGs - Udemy Online Course Material: Data Science on Sustainable Development Goals

Data Science on Sustainable Development Goals (SDGs) Udemy Online Course Material: Data Science on Sustainable Development Goals https://bit.ly/data_s

Frank Kienle 1 Jan 04, 2022
Swagger UI is a collection of HTML, JavaScript, and CSS assets that dynamically generate beautiful documentation from a Swagger-compliant API.

Introduction Swagger UI allows anyone — be it your development team or your end consumers — to visualize and interact with the API’s resources without

Swagger 23.2k Dec 29, 2022
PowerApps-docstring is a console based, pipeline ready application that automatically generates user and technical documentation for Power Apps.

powerapps-docstring PowerApps-docstring is a console based, pipeline ready application that automatically generates user and technical documentation f

Sebastian Muthwill 30 Nov 23, 2022
Sms Bomber, Tool Encryptor

ɴᴏʙɪᴛᴀシ︎ ғᴏʀ ᴀɴʏ ʜᴇʟᴘシ︎ Install pkg install git -y pkg install python -y pip install requests git clone https://github.com/AK27HVAU/akash Run cd Akash

ɴᴏʙɪᴛᴀシ︎ 4 May 23, 2022
Some code that takes a pipe-separated input and converts that into a table!

tablemaker A program that takes an input: a | b | c # With comments as well. e | f | g h | i |jk And converts it to a table: ┌───┬───┬────┐ │ a │ b │

CodingSoda 2 Aug 30, 2022
Use Brainf*ck with python!

Brainfudge Run Brainf*ck code with python! Classes Interpreter(array_len): encapsulate all functions into class __init__(self, array_len: int=30000) -

1 Dec 14, 2021
A course-planning, course-map rendering and GPA-calculation web service, designed for the SFU (Simon Fraser University) student.

SFU Course Planner What is the overall goal of the project (i.e. what does it do, or what problem is it solving)? As the title suggests, this project

Ash Peng 1 Oct 21, 2021
Flask-Rebar combines flask, marshmallow, and swagger for robust REST services.

Flask-Rebar Flask-Rebar combines flask, marshmallow, and swagger for robust REST services. Features Request and Response Validation - Flask-Rebar reli

PlanGrid 223 Dec 19, 2022
Data-science-on-gcp - Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

data-science-on-gcp Source code accompanying book: Data Science on the Google Cloud Platform, 2nd Edition Valliappa Lakshmanan O'Reilly, Jan 2022 Bran

Google Cloud Platform 1.2k Dec 28, 2022
The blazing-fast Discord bot.

Wavy Wavy is an open-source multipurpose Discord bot built with pycord. Wavy is still in development, so use it at your own risk. Tools and services u

Wavy 7 Dec 27, 2022
Speed up Sphinx builds by selectively removing toctrees from some pages

Remove toctrees from Sphinx pages Improve your Sphinx build time by selectively removing TocTree objects from pages. This is useful if your documentat

Executable Books 8 Jan 04, 2023
Documentation generator for C++ based on Doxygen and mosra/m.css.

mosra/m.css is a Doxygen-based documentation generator that significantly improves on Doxygen's default output by controlling some of Doxygen's more unruly options, supplying it's own slick HTML+CSS

Mark Gillard 109 Dec 07, 2022