Stream-Kafka-ELK-Stack - Weather data streaming using Apache Kafka and Elastic Stack.

Last update: Jan 20, 2022

Overview

Streaming Data Pipeline - Kafka + ELK Stack

Streaming weather data using Apache Kafka and Elastic Stack.

Data source: https://openweathermap.org/api

Objectives: Develop a streaming data pipeline that extracts weather data from the OpenWeather API using Apache Kafka, Logstash, Elasticsearch and Kibana (Kafka + ELK Stack).

To summarize, a Kafka producer written in Python requests weather data from the OpenWeather API every minute and sends it as a message to Apache Kafka. Logstash, acting as a Kafka consumer, reads the messages and stores them in Elasticsearch. Kibana reads the data from Elasticsearch to display the dashboard.
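The producer side of this flow could look roughly like the sketch below. It is not the repository's weather_kfk_producer.py. The topic name, broker address, city, selected message fields and all helper names are assumptions made for illustration. The Kafka client used here is the kafka-python package, and the HTTP call targets OpenWeather's current weather endpoint.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed values for illustration; the repository may use different ones.
OPENWEATHER_URL = "https://api.openweathermap.org/data/2.5/weather"
TOPIC = "weather"              # hypothetical topic name
BOOTSTRAP = "localhost:9092"   # hypothetical broker address


def fetch_weather(city, api_key):
    """Request the current weather for one city and return the parsed JSON."""
    query = urllib.parse.urlencode({"q": city, "appid": api_key, "units": "metric"})
    with urllib.request.urlopen(f"{OPENWEATHER_URL}?{query}", timeout=10) as resp:
        return json.load(resp)


def build_message(raw):
    """Keep a few fields of the API response (the field choice is an assumption)."""
    return {
        "city": raw.get("name"),
        "timestamp": raw.get("dt"),
        "temperature": raw.get("main", {}).get("temp"),
        "humidity": raw.get("main", {}).get("humidity"),
        "description": (raw.get("weather") or [{}])[0].get("description"),
    }


def run(fetch, send, interval_s=60, iterations=None, sleep=time.sleep):
    """Call fetch() and hand the resulting message to send() once per interval."""
    count = 0
    while iterations is None or count < iterations:
        send(build_message(fetch()))
        count += 1
        if iterations is None or count < iterations:
            sleep(interval_s)
    return count


def main(city="London", api_key="YOUR_API_KEY"):
    """Wire the loop to a real Kafka producer (requires kafka-python and a broker)."""
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    run(
        fetch=lambda: fetch_weather(city, api_key),
        send=lambda msg: producer.send(TOPIC, msg),
    )
```

Calling main() starts the one-minute loop against a running broker. Passing fetch and send as plain callables keeps the loop testable without Kafka or network access.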

Kibana Weather Dashboard

Steps:

1. bash elk/start_elastic_docker.sh
2. bash kafka/start_kafka_docker.sh
3. Create a topic using Kafka Manager: localhost:9000
4. Start Logstash (Logstash must be installed locally): $LOGSTASH_HOME/bin/logstash -f $LOGSTASH_HOME/config/pipeline.conf
5. Set the API key inside the weather_api_key.ini file, then run the Kafka producer: python3 weather_kfk_producer.py
6. Access Kibana: localhost:5601
7. Create an index pattern; it must match the index name set in pipeline.conf.
8. Develop your dashboard.
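For the API key step, a producer could read weather_api_key.ini with Python's standard configparser module and fail early when the key is missing. This is a minimal sketch: the section name `openweather` and the option name `api_key` are assumptions, since the layout of the repository's file is not shown here, and `load_api_key` is a hypothetical helper.

```python
import configparser


def load_api_key(path="weather_api_key.ini", section="openweather", option="api_key"):
    """Read the OpenWeather API key from an INI file; raise if it is missing."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it could parse; empty means not found.
    if not parser.read(path):
        raise FileNotFoundError(f"config file not found: {path}")
    # Section and option names are assumed, not taken from the repository.
    key = parser.get(section, option, fallback="").strip()
    if not key:
        raise ValueError(f"no '{option}' set in section [{section}] of {path}")
    return key
```

Under these assumptions the file would contain a `[openweather]` section with a line such as `api_key = <your key>`.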

Owner

Felipe Demenech Vasconcelos
