Diabetes Prediction with Logistic Regression

Overview

Diabetes Prediction with Logistic Regression

  1. Exploratory Data Analysis
  2. Data Preprocessing
  3. Model & Prediction
  4. Model Evaluation
  5. Model Validation: Holdout
  6. Model Validation: 10-Fold Cross Validation
  7. Prediction for A New Observation

Business Problem

Characteristics of people with diabetes will be able to predict whether they have a patient or not it is desirable to develop a machine learning model.

Dataset Story

The data set is part of a large data set maintained at the National Institutes of Diabetes-dIgestive-Kidney Diseases in the United States. this data used for a diabetes study conducted on Pima Indian women aged 21 years and older living in the city of Phoenix, which is their city. The data consists of 768 observations and 8 numerical independent variables. The target variable is specified as "output";

1 diabetes test result is positive, 0 indicates that it is negative.

Variables

  • Pregnancies: Number of pregnancies
  • Glucose: 2 Hours plasma glucose concentration in the oral glucose tolerance test
  • Blood Pressure: mm Hg
  • SkinThickness:
  • Insulin: 2 Hours serum insulin (mu U/ml)
  • DiabetesPedigreeFunction
  • Age: years
  • Outcome: Having diabete (1) or not (0)

In this study, the diabetes data set was reviewed and it was tried to predict whether a person has diabetes with a Logistic Regression model. Firstly, the dependent variable "outcome" was reviewed in the study. In the last step, new variables were produced and the success of the model was tried to be increased. The accuracy rate and F1 score of the established model were determined as 0.63 and the AUC value was determined as 0.84. Finally, it was estimated by the established model whether a randomly selected person has diabetes or not.

Owner
AZİZE SULTAN PALALI
Doping Hafıza | Data Analyst | Data Science and Machine Learning Bootcamp Participant at Veri Bilimi Okulu
AZİZE SULTAN PALALI
Responsible Machine Learning with Python

Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.

ph_ 624 Jan 06, 2023
ETNA – time series forecasting framework

ETNA Time Series Library Predict your time series the easiest way Homepage | Documentation | Tutorials | Contribution Guide | Release Notes ETNA is an

Tinkoff.AI 675 Jan 08, 2023
CorrProxies - Optimizing Machine Learning Inference Queries with Correlative Proxy Models

CorrProxies - Optimizing Machine Learning Inference Queries with Correlative Proxy Models

ZhihuiYangCS 8 Jun 07, 2022
This is a Machine Learning model which predicts the presence of Diabetes in Patients

Diabetes Disease Prediction This is a machine Learning mode which tries to determine if a person has a diabetes or not. Data The dataset is in comma s

Edem Gold 4 Mar 16, 2022
InfiniteBoost: building infinite ensembles with gradient descent

InfiniteBoost Code for a paper InfiniteBoost: building infinite ensembles with gradient descent (arXiv:1706.01109). A. Rogozhnikov, T. Likhomanenko De

Alex Rogozhnikov 183 Jan 03, 2023
Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort

Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort

2.3k Jan 04, 2023
LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms

LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms Based on the work by Smith et al. (2021) Query

5 Aug 06, 2022
Simple Machine Learning Tool Kit

Getting started smltk (Simple Machine Learning Tool Kit) package is implemented for helping your work during data preparation testing your model The g

Alessandra Bilardi 1 Dec 30, 2021
The Ultimate FREE Machine Learning Study Plan

The Ultimate FREE Machine Learning Study Plan

Patrick Loeber (Python Engineer) 2.5k Jan 05, 2023
A GitHub action that suggests type annotations for Python using machine learning.

Typilus: Suggest Python Type Annotations A GitHub action that suggests type annotations for Python using machine learning. This action makes suggestio

40 Sep 18, 2022
Factorization machines in python

Factorization Machines in Python This is a python implementation of Factorization Machines [1]. This uses stochastic gradient descent with adaptive re

Corey Lynch 892 Jan 03, 2023
pandas, scikit-learn, xgboost and seaborn integration

pandas, scikit-learn and xgboost integration.

299 Dec 30, 2022
Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms

Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms

Mert Sezer Ardal 1 Jan 31, 2022
Pytools is an open source library containing general machine learning and visualisation utilities for reuse

pytools is an open source library containing general machine learning and visualisation utilities for reuse, including: Basic tools for API developmen

BCG Gamma 26 Nov 06, 2022
Combines MLflow with a database (PostgreSQL) and a reverse proxy (NGINX) into a multi-container Docker application

Combines MLflow with a database (PostgreSQL) and a reverse proxy (NGINX) into a multi-container Docker application (with docker-compose).

Philip May 2 Dec 03, 2021
A data preprocessing package for time series data. Design for machine learning and deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

Allen Chiang 152 Jan 07, 2023
Repositório para o #alurachallengedatascience1

1° Challenge de Dados - Alura A Alura Voz é uma empresa de telecomunicação que nos contratou para atuar como cientistas de dados na equipe de vendas.

Sthe Monica 16 Nov 10, 2022
Official code for HH-VAEM

HH-VAEM This repository contains the official Pytorch implementation of the Hierarchical Hamiltonian VAE for Mixed-type Data (HH-VAEM) model and the s

Ignacio Peis 8 Nov 30, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learn

Vowpal Wabbit 8.1k Dec 30, 2022
Machine Learning Algorithms

Machine-Learning-Algorithms In this project, the dataset was created through a survey opened on Google forms. The purpose of the form is to find the p

Göktuğ Ayar 3 Aug 10, 2022