We're Team Arson and we're using the power of predictive modeling to combat wildfires.

Overview

We're Team Arson and we're using the power of predictive modeling to combat wildfires.

Arson Map

Inspiration

There have been many wildfires in California in recent years, and many of the most recent ones have been uncontained. The government does not have the capacity to fight so many wildfires at once, so it has to choose which fires to bring under control. That choice should be based on wildfire and wind data in order to minimize the damage caused. We should also prioritize fires that can spread across many counties or that have a high chance of spreading further.

What it does

Our project consists of a web app with an interactive map. We represent a wildfire as an MDP (Markov Decision Process) and determine how at risk each county is based on the fire location(s).

How we built it

We split the project into two main parts: the web app and the AI.

Website

Artificial Intelligence

  • Represent the wildfire as an MDP (Markov Decision Process)
    • States: Counties
    • Actions: Traversing Counties
    • Probability distribution: generated from wind data
    • Transition Model: generated from wind data
    • Reward function: uniform for every county burned (likely to change if the model is scaled up)
  • Use the Bellman equation to iterate through counties and propagate the fire
    • Utility values ranging between 0 and 1 represent how at risk a county is
    • Run until utility values reach equilibrium or until 100 iterations are run
    • Gamma = 0.8
  • Represent the map as a graph
    • Counties are nodes
    • Edges are weighted by wind speed
    • Assign each county a risk (for the reward function)
    • Spawn fires on specific counties
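
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's actual implementation: the function names, the county names "A", "B", and "C", the wind speeds, and the convergence tolerance are all hypothetical. It also assumes one particular reading of the model, in which burning counties are pinned at a utility of 1, each county's outgoing wind speeds are normalized into transition probabilities, and every other county takes the discounted maximum risk carried in from its neighbors (a Bellman-style update rather than the textbook form).

```python
GAMMA = 0.8            # discount factor stated in the write-up
MAX_ITERATIONS = 100   # iteration cap stated in the write-up
TOLERANCE = 1e-6       # assumed threshold for "equilibrium"


def transition_probabilities(wind):
    """Normalize each county's outgoing wind speeds into probabilities."""
    probs = {}
    for county, neighbors in wind.items():
        total = sum(neighbors.values())
        probs[county] = {
            neighbor: (speed / total if total > 0 else 0.0)
            for neighbor, speed in neighbors.items()
        }
    return probs


def county_risk(wind, burning):
    """Return a utility in [0, 1] per county; higher means more at risk.

    wind:    dict mapping county -> {neighbor: wind speed toward neighbor}
    burning: set of counties where a fire has been spawned
    """
    probs = transition_probabilities(wind)
    counties = set(wind) | {n for nbrs in wind.values() for n in nbrs}
    utility = {c: (1.0 if c in burning else 0.0) for c in counties}

    for _ in range(MAX_ITERATIONS):
        updated = {}
        for county in counties:
            if county in burning:
                updated[county] = 1.0  # fire sources stay at maximum risk
                continue
            # Risk carried in from every county with an edge into this one.
            incoming = [
                probs[src][county] * utility[src]
                for src in probs
                if county in probs[src]
            ]
            updated[county] = GAMMA * max(incoming, default=0.0)

        delta = max(abs(updated[c] - utility[c]) for c in counties)
        utility = updated
        if delta < TOLERANCE:  # utilities have stopped changing
            break
    return utility


if __name__ == "__main__":
    # Illustrative graph: wind blows from A toward B and C, and from B toward C.
    example_wind = {"A": {"B": 30.0, "C": 10.0}, "B": {"C": 20.0}, "C": {}}
    for county, risk in sorted(county_risk(example_wind, {"A"}).items()):
        print(f"{county}: {risk:.2f}")
```

Taking the maximum over incoming edges keeps every utility at or below gamma for counties that are not burning, so the values stay between 0 and 1 without any extra clamping. Summing over incoming edges instead would be an equally plausible choice, but would need a cap to stay in that range.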

Challenges we ran into

Our project has a pretty large scope: we needed to develop a model and integrate it with a web app. This required extensive knowledge of AWS services and crisp communication between team members. The machine learning portion of the hackathon was difficult because we had to decide what type of model to use for the wildfire and how to assign reward and utility values.

Accomplishments that we're proud of

We were able to integrate the web app with the model really quickly. This was surprising, since connecting the pieces together usually surfaces a lot of bugs. It was also Austin, Raaj, and Romuz's first hackathon, and this was a fairly complex project compared to a standard web app.

What we learned

This hackathon was a first for many of us. This was the first time any of us had implemented a machine learning model and connected it to a web app.

This was my first time at a hackathon and I couldn't have asked for better teammates than Jerry, Raaj, and Romuz. I learned so much over the last two days about machine learning, data science, React, and working as a team to help tackle some of California's greatest challenges. - Austin Rivard

As a first-year student, I learned a lot of new skills while working with this team. I was happy to be a member of such an agile team. I learned numerous new concepts, such as working with AWS, writing algorithms, and graph data structures. - Romuz Abdulhamidov

What's next for Arson

  • Scale up to all of California to generate a better map during wildfire season
  • Generate more accurate reward values for each county burned
  • Incorporate type 2 rewards based on R(state, action)
    • Wildfire gets bigger as it burns more land
    • Wildfire gets smaller in the presence of firefighters
  • Automatically train and deploy models by integrating real-time data for wind and wildfires

Demo

Screenshot

Owner
Jerry Lee
software engineer