This is an analysis and prediction project for house prices in King County, USA, based on certain features of each house.

Last update: Jan 21, 2022

Overview

This project analyzes and estimates house prices in King County, USA.
The .csv file contains the house data, and the .ipynb file contains the analysis and code.
The project was developed in a Jupyter notebook.
It uses Linear Regression and Pipeline() to fit the model and predict prices.
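
The fit-and-predict step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's Pipeline and LinearRegression, uses synthetic data in place of the .csv file, and the feature names (sqft, bedrooms) and the scaling step are hypothetical choices made only for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the house data (assumption: the real .csv
# and its column names are not reproduced here).
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(500, 4000, n)      # hypothetical feature
bedrooms = rng.integers(1, 6, n)      # hypothetical feature
price = 150.0 * sqft + 10000.0 * bedrooms + rng.normal(0, 5000, n)

X = np.column_stack([sqft, bedrooms])
y = price

# Hold out a quarter of the rows to check predictions on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Chain feature scaling and the regression model into one estimator.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LinearRegression()),
])
pipe.fit(X_train, y_train)

predictions = pipe.predict(X_test)
r2 = pipe.score(X_test, y_test)       # coefficient of determination
print(f"R^2 on held-out data: {r2:.3f}")
```

Wrapping the steps in a Pipeline means the scaler is fitted only on the training rows and then reused unchanged when predicting, which avoids leaking information from the held-out data.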

Owner

Amit Prakash

CSE Student at SRM Institute of Science and Technology
