Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Overview

Modern Data Lake Storage Layers

This repository contains supporting assets for my research in modern Data Lake storage layers like Apache Hudi, Apache Iceberg, and Delta Lake.

Specifically, there's a CloudFormation template to create an EMR cluster and EMR Studio with the necessary requirements and Jupyter notebooks with the example walkthroughs.

You can view the corresponding blog post and video

Pre-requisites

You'll need an AWS Account in which you have administrator privileges and the ability to deploy a CloudFormation template. The template will create an EMR Cluster and S3 bucket that will incur charges - be sure to either shut down the cluster when done or delete the CloudFormation stack. In order to delete the CloudFormation stack, you'll need to:

  • Manually delete any EMR Studio Workspaces you created
  • Manually empty the S3 bucket created by CloudFormation
  • Manually delete the VPC created by CloudFormation due to auto-created rules

Overview

The included CloudFormation template creates a new VPC and EMR Cluster for you to be able to run the notebooks. An EMR Studio is also created and you can find the Studio URL in the Outputs tab of your CloudFormation Stack.

Once the stack is done creating, you'll need to navigate to EMR Studio and create a new workspace attached to the "data-lakes" cluster.

Inside the workspace you either upload each notebook individually from the notebooks/ folder or simply connect to this repository by using the "Git" icon on the left-hand side.

An Open-Source Discord bot created to provide basic functionality which should be in every discord guild. We use this same bot with additional configurations for our guilds.

A Discord bot completely written to be taken from the source and built according to your own custom needs. This bot supports some core features and is

Tesseract Coding 14 Jan 11, 2022
🦊 Powerfull Discord Nitro Generator

🦊 Follow me here 🦊 Discord | YouTube | Github β˜• Usage πŸ’» Downloading git clone https://github.com/KanekiWeb/Nitro-Generator/new/main pip insta

Kaneki 104 Jan 02, 2023
Multi-Branch CI/CD Pipeline using CDK Pipelines.

Using AWS CDK Pipelines and AWS Lambda for multi-branch pipeline management and infrastructure deployment. This project shows how to use the AWS CDK P

AWS Samples 36 Dec 23, 2022
A Discord bot that generates inspirational quotes & motivating messages whenever a user is sad

Encourage bot is a discord bot that allows users to randomly get Inspirational quotes messages and gives motivational encouragements whenever someone says that he's sad/depressed.

1 Nov 25, 2021
ARKHAM X GOD MULTISPAM BOT

ARKHAM-X-GOD-MULTISPAM-BOT π——π—˜π—£π—Ÿπ—’π—¬ 𝗨𝗣𝗧𝗒 30 𝗕𝗒𝗧𝗦 π—œπ—‘ 𝗔 π—¦π—œπ—‘π—šπ—Ÿ?

ArkhamXGod 2 Jan 08, 2022
A collection of discord tools I've made.

Discord A collection of discord tools i've made. What's in here? Basically every discord related project i've worked on can be found here, i'll try an

?? ?? ?? 6 Nov 13, 2021
Python client library for Bigcommerce API

Bigcommerce API Python Client Wrapper over the requests library for communicating with the Bigcommerce v2 API. Install with pip install bigcommerce or

BigCommerce 81 Dec 26, 2022
A taskbar clock for secondary taskbars on Windows 11

ElevenClock A taskbar clock for secondary taskbars on Windows 11. When microsoft's engineers were creating Windows 11, they forgot to add a clock on t

MartΓ­ Climent 1.7k Jan 07, 2023
Bringing Ethereum Virtual Machine to StarkNet at warp speed!

Warp Warp brings EVM compatible languages to StarkNet, making it possible to transpile Ethereum smart contracts to Cairo, and use them on StarkNet. Ta

Nethermind 700 Dec 26, 2022
discord token grabber using python

Discord Token Grabber A Discord token grabber written in Python 3. This version of the grabber only supports Windows. Features No local caching Transf

1 Oct 28, 2021
Discord Crypto Payment Cards Selfbot

A Discord selfbot that serves the purpose of displaying text and QR versions of your BTC, LTC & ETH payment information for easy and simple commercial or personal transactions.

2 Apr 12, 2022
Discord Bot for League of Legends live match tracker

SABot Dicord Bot for League of Legends match auto tracker Features: Search Summoners statistics in League of Legends. Auto-notifications provide when

Jungyu Choi 4 Sep 27, 2022
A discord webhook client written in Python.

DiscordWebhook A discord webhook client written in Python. Installation pip install webhook-client Example from webhook_client import WebhookClient, E

Elijah 4 Nov 28, 2022
Live Weather Updates using Flask and OpenWeather

AuraX Live Weather Updates using Flask and OpenWeather Installation To setup this project on your local machine, first clone this repository and insta

Ayush Gupta 3 Nov 02, 2021
Bavera is an extensive and extendable Python 3.x library for the Discord API

Bavera is an extensive and extendable Python 3.x library for the Discord API. Bavera boasts the following major features: Expressive, functiona

Bavera 1 Nov 17, 2021
A Python Client to View F1TV Content the right way

F1Hub is a terminal application running directly on your computer -- no connection to the website needed* *In theory. As of now, the F1TV website is needed for some content

kodos 3 Jun 14, 2022
Discord-Bot - Bot using nextcord for beginners

Discord-Bot Bot using nextcord for beginners! Requirements: 1 :- Install nextcord by typing "pip install nextcord" Thats it! You can use this code any

INFINITE_. 3 Jan 10, 2022
Framework to collect and process weather data from wttr.in.

Weathercrawler Automatic extraction and processing framework for weather data from wttr.in Installation tested with: Python 3.7.3 Python 3.9.4 git clo

Maurice GΓΌnder 0 Jul 26, 2021
Busty - A bot for the Busty Discord server

Busty Discord bot used for the Busty server. Install You'll need at least Python

Andrew Morgan 7 Dec 05, 2022
Home Assistant custom integration for controlling Powered by Tuya (PBT) devices using Tuya Open API, officially maintained by the Tuya Developer Team.

Tuya Home Assistant Integration Home Assistant custom integration for controlling Powered by Tuya (PBT) devices using Tuya Open API, officially mainta

Tuya 704 Jan 03, 2023