This project analyzes and estimates house prices in King County, USA. The .csv file contains the house data, and the .ipynb file contains the analysis and code. The project was built in Jupyter Notebook and uses Linear Regression and Pipeline() to fit the model and predict prices.
This is an analysis and prediction project for house prices in King County, USA, based on certain features of the house.
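As a rough illustration of fitting and predicting with Linear Regression inside a Pipeline(), here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the two features (`sqft`, `bedrooms`), the synthetic price rule, the `StandardScaler` step, and the 150/50 split are all assumptions made so the example runs without the .csv file.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the house data (hypothetical features, not the real CSV).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, size=200)
bedrooms = rng.integers(1, 6, size=200)
X = np.column_stack([sqft, bedrooms])

# Hypothetical price rule plus noise, only so there is something to fit.
y = 150.0 * sqft + 10_000.0 * bedrooms + rng.normal(0, 5_000, size=200)

# Chain preprocessing and the regressor so both are fitted together.
model = Pipeline([
    ("scale", StandardScaler()),
    ("regress", LinearRegression()),
])

# Fit on the first 150 rows, then predict and score on the remaining 50.
model.fit(X[:150], y[:150])
predictions = model.predict(X[150:])
r2 = model.score(X[150:], y[150:])
print(f"R^2 on held-out rows: {r2:.3f}")
```

With the real data, `X` and `y` would instead come from the columns of the .csv file, for example loaded with pandas.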
Overview
Spaghetti: an open-source Python library for the analysis of network-based spatial data
pysal/spaghetti SPAtial GrapHs: nETworks, Topology, & Inference Spaghetti is an open-source Python library for the analysis of network-based spatial d
vartests is a Python library to perform some statistic tests to evaluate Value at Risk (VaR) Models
gg I wasn't satisfied with any of the other available Gemini clients, so I wrote my own. Requires Python 3.9 (maybe older, I haven't checked) and opti
Get mutations in cluster by querying from LAPIS API
Cluster Mutation Script Get mutations appearing within user-defined clusters. Usage Clusters are defined in the clusters dict in main.py: clusters = {
Picka: A Python module for data generation and randomization.
Picka: A Python module for data generation and randomization. Author: Anthony Long Version: 1.0.1 - Fixed the broken image stuff. Whoops What is Picka
A Big Data ETL project in PySpark on the historical NYC Taxi Rides data
Processing NYC Taxi Data using PySpark ETL pipeline Description This is a project to extract, transform, and load large amount of data from NYC Taxi
Vaex library for Big Data Analytics of an Airline dataset
Vaex-Big-Data-Analytics-for-Airline-data A Python notebook (ipynb) created in Jupyter Notebook, which utilizes the Vaex library for Big Data Analytics
Short notebooks summarizing the main functions of GeoPandas
Short notebooks summarizing the main functions of GeoPandas, plus 2 notebooks written as exercises to apply these functions.
Generate lookml for views from dbt models
dbt2looker Use dbt2looker to generate Looker view files automatically from dbt models. Features Column descriptions synced to looker Dimension for eac
Python utility to extract differences between two pandas dataframes.
AWS Glue ETL Code Samples
AWS Glue ETL Code Samples This repository has samples that demonstrate various aspects of the new AWS Glue service, as well as various AWS Glue utilit
A DSL for data-driven computational pipelines
"Dataflow variables are spectacularly expressive in concurrent programming" Henri E. Bal , Jennifer G. Steiner , Andrew S. Tanenbaum Quick overview Ne
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
This tool parses log data and lets you define analysis pipelines for anomaly detection.
logdata-anomaly-miner This tool parses log data and lets you define analysis pipelines for anomaly detection. It was designed to run the analysis wit
The micro-framework to create dataframes from functions.
SparseLasso: Sparse Solutions for the Lasso
SparseLasso: Sparse Solutions for the Lasso Introduction SparseLasso provides a Scikit-Learn based estimation of the Lasso with cross-validation tunin
Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production
Numerics Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production Use procedure: Initialise a new i
Pipeline and Dataset helpers for complex algorithm evaluation.
tpcp - Tiny Pipelines for Complex Problems A generic way to build object-oriented datasets and algorithm pipelines and tools to evaluate them pip inst
Analyse the limit order book in seconds. Zoom to tick level or get yourself an overview of the trading day.
Analyse the limit order book in seconds. Zoom to tick level or get yourself an overview of the trading day. Correlate the market activity with the Apple Keynote presentations.
Airflow ETL With EKS EFS Sagemaker
Airflow ETL With EKS EFS & Sagemaker (in development) Solution diagram Imp
MeSH2Matrix - A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications
A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications