Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared

Overview

Feature-Engineering

Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared.

When the dataset is passed through this script, the modeling starts. expected to be ready.

Dataset Story

The data set is the data set of the people who were in the Titanic shipwreck. It consists of 768 observations and 12 variables. The target variable is specified as "Survived"; 1: one's survival, 0: indicates the person's inability to survive.

Variables

PassengerId: ID of the passenger

  • Survived: Survival status (0: not survived, 1: survived)
  • Pclass: Ticket class (1: 1st class (upper), 2: 2nd class (middle), 3: 3rd class(lower))
  • Name: Name of the passenger
  • Sex: Gender of the passenger (male, female)
  • Age: Age in years
  • Sibsp: Number of siblings/spouses aboard the Titanic
    • Sibling = Brother, sister, stepbrother, stepsister
    • Spouse = Husband, wife (mistresses and fiances were ignored) Parch: Number of parents/children aboard the Titanic
    • Parent = Mother, father
    • Child = Daughter, son, stepdaughter, stepson
    • Some children travelled only with a nanny , therefore Parch = 0 for them.
  • Ticket: Ticket number
  • Fare: Passenger fare
  • Cabin: Cabin number
  • Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)

REFERENCE: Data Science and ML Boot Camp, 2021, Veri Bilimi Okulu (https://www.veribilimiokulu.com/)

Owner
kemalgunay
Ph.D | Data Science Researcher
kemalgunay
Uplift modeling and causal inference with machine learning algorithms

Disclaimer This project is stable and being incubated for long-term support. It may contain new experimental code, for which APIs are subject to chang

Uber Open Source 3.7k Jan 07, 2023
Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.

Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.

2 Jun 14, 2022
CobraML: Completely Customizable A python ML library designed to give the end user full control

CobraML: Completely Customizable What is it? CobraML is a python library built on both numpy and numba. Unlike other ML libraries CobraML gives the us

Sriram Govindan 14 Dec 19, 2021
Client - πŸ”₯ A tool for visualizing and tracking your machine learning experiments

Weights and Biases Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to produ

Weights & Biases 5.2k Jan 03, 2023
Extreme Learning Machine implementation in Python

Python-ELM v0.3 --- ARCHIVED March 2021 --- This is an implementation of the Extreme Learning Machine [1][2] in Python, based on scikit-learn. From

David C. Lambert 511 Dec 20, 2022
Tangram makes it easy for programmers to train, deploy, and monitor machine learning models.

Tangram Website | Discord Tangram makes it easy for programmers to train, deploy, and monitor machine learning models. Run tangram train to train a mo

Tangram 1.4k Jan 05, 2023
Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.

SDK: Overview of the Kubeflow pipelines service Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on

Kubeflow 3.1k Jan 06, 2023
2D fluid simulation implementation of Jos Stam paper on real-time fuild dynamics, including some suggested extensions.

Fluid Simulation Usage Download this repo and store it in your computer. Open a terminal and go to the root directory of this folder. Make sure you ha

Mariana Ávalos Arce 5 Dec 02, 2022
MLR - Machine Learning Research

Machine Learning Research 1. Project Topic 1.1. Exsiting research Benmark: https://paperswithcode.com/sota ACL anthology for NLP papers: http://www.ac

Charles 69 Oct 20, 2022
Decision Weights in Prospect Theory

Decision Weights in Prospect Theory It's clear that humans are irrational, but how irrational are they? After some research into behavourial economics

Cameron Davidson-Pilon 32 Nov 08, 2021
Data Efficient Decision Making

Data Efficient Decision Making

Microsoft 197 Jan 06, 2023
Scikit-Garden or skgarden is a garden for Scikit-Learn compatible decision trees and forests.

Scikit-Garden or skgarden (pronounced as skarden) is a garden for Scikit-Learn compatible decision trees and forests.

260 Dec 21, 2022
BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

Model Serving Made Easy BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models. Supports multi

BentoML 4.4k Jan 04, 2023
Learn how to responsibly deliver value with ML.

Made With ML Applied ML Β· MLOps Β· Production Join 30K+ developers in learning how to responsibly deliver value with ML. πŸ”₯ Among the top MLOps reposit

Goku Mohandas 32k Dec 30, 2022
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

Benedek Rozemberczki 1.8k Jan 03, 2023
PyNNDescent is a Python nearest neighbor descent for approximate nearest neighbors.

PyNNDescent PyNNDescent is a Python nearest neighbor descent for approximate nearest neighbors. It provides a python implementation of Nearest Neighbo

Leland McInnes 699 Jan 09, 2023
Python-based implementations of algorithms for learning on imbalanced data.

ND DIAL: Imbalanced Algorithms Minimalist Python-based implementations of algorithms for imbalanced learning. Includes deep and representational learn

DIAL | Notre Dame 220 Dec 13, 2022
A simple machine learning python sign language detection project.

SST Coursework 2022 About the app A python application that utilises the tensorflow object detection algorithm to achieve automatic detection of ameri

Xavier Koh 2 Jun 30, 2022
Traingenerator πŸ§™ A web app to generate template code for machine learning ✨

Traingenerator πŸ§™ A web app to generate template code for machine learning ✨ πŸŽ‰ Traingenerator is now live! πŸŽ‰

Johannes Rieke 1.2k Jan 07, 2023
Distributed deep learning on Hadoop and Spark clusters.

Note: we're lovingly marking this project as Archived since we're no longer supporting it. You are welcome to read the code and fork your own version

Yahoo 1.3k Dec 28, 2022