Investigating EV charging data

Introduction:

Over the last 6 months I had the opportunity to work with a home monitoring technology company whose goal was to help people understand their household energy consumption, know what is going on in their homes, and ultimately reduce their energy footprint.

With the company's business goal in mind, I formulated the following problem statement: help equip customers with insights about their EVs' energy consumption and help predict future charging behavior.

The dataset provided covered 1200 households across the USA, each with one or more EVs of various brands. Hourly energy consumption was provided for each EV. The dataset lacked consistency and was not uniformly populated for each EV over the time range: some EVs had data for 6 months, while others had data for just 3-4 hours.
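To make the coverage problem concrete, here is a minimal sketch of how per-EV coverage could be measured and very sparse EVs set aside. The (ev_id, timestamp, kwh) tuple layout, the function names and the 24-hour threshold are all assumptions for illustration, not the actual schema or cleaning rule used in the project.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def coverage_hours(readings):
    """Count distinct hourly timestamps per EV.

    `readings` is an iterable of (ev_id, timestamp, kwh) tuples. This layout
    is a hypothetical stand-in for the real dataset schema.
    """
    hours = defaultdict(set)
    for ev_id, ts, _kwh in readings:
        # Truncate to the hour so repeated rows within one hour count once.
        hours[ev_id].add(ts.replace(minute=0, second=0, microsecond=0))
    return {ev_id: len(stamps) for ev_id, stamps in hours.items()}


def filter_sparse(readings, min_hours):
    """Keep only readings from EVs with at least `min_hours` of data."""
    readings = list(readings)
    counts = coverage_hours(readings)
    return [r for r in readings if counts[r[0]] >= min_hours]


# Toy data: one EV with 48 hours of readings, one with only 3.
start = datetime(2021, 1, 1)
sample = [("ev_a", start + timedelta(hours=h), 1.5) for h in range(48)]
sample += [("ev_b", start + timedelta(hours=h), 0.7) for h in range(3)]

counts = coverage_hours(sample)
kept = filter_sparse(sample, min_hours=24)  # threshold is an arbitrary example
```

A check like this helps decide which EVs have enough history to model at all before any clustering or forecasting is attempted.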

Given this inconsistency, I started by deep diving into the dataset with two EDAs to understand it better and to explore factors that could be impacting the energy consumption of an individual car.

Some interesting insights from the EDA performed:

EDA Part 1: https://public.flourish.studio/story/1113717/

EDA Part 2: https://public.flourish.studio/story/1113715/

Please refer to the attached EV Consumption PDF for the EDA summary and modeling approach.

I equipped customers with insights about their EVs' hourly energy consumption and helped predict future charging behavior. I created clusters based on energy consumption and an LSTM model for future consumption insights, and designed sample dashboard views with insights and recommendations for customers.
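A sequence model such as an LSTM is usually trained on fixed-length windows of past values paired with the value(s) that follow. The sketch below shows one common way to frame an hourly consumption series that way. The function name, the lookback of 4 hours and the toy numbers are assumptions for illustration; the actual preprocessing is described in the PDF, not here.

```python
def make_windows(series, lookback, horizon=1):
    """Split an hourly series into (input window, target) pairs.

    Each input window holds `lookback` consecutive readings and each target
    holds the `horizon` readings that immediately follow it.
    """
    pairs = []
    last_start = len(series) - lookback - horizon
    for i in range(last_start + 1):
        x = series[i : i + lookback]
        y = series[i + lookback : i + lookback + horizon]
        pairs.append((x, y))
    return pairs


# Toy hourly kWh readings for a single EV (made-up values).
hourly_kwh = [0.0, 0.0, 3.2, 6.8, 7.1, 2.4, 0.0, 0.0]
pairs = make_windows(hourly_kwh, lookback=4, horizon=1)
```

Each pair can then be converted to an array and fed to whichever deep learning library is used. A series shorter than lookback + horizon yields no pairs, which is one more reason EVs with only a few hours of data need separate handling.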

Owner

Yash

Two-phase pipeline + Streamlit
