Credit Card Fraud Detection

Overview

Credit Card Fraud Detection

image


For this project, I used the datasets from the kaggle competition called IEEE-CIS Fraud Detection. The competition aims to improve fraud prevention system by building fraud detection models based on Vesta Corporation's real-world e-commerce transactional data, which contains information from device type to product features. My personal goal for this project is to not only explore the data and build models, but to also build an API server with retrainable model. To achieve this goal, I used fastapi, lightgbm and ray tune.

Also, I decided to develop this project to be the same as how data-related projects are developed in real-world scenarios, wherein the end goal of development is a project that is feasible for production. Therefore, I have put efforts on creating:

  1. Exploratory Data Analysis (EDA) in the notebooks/ folder;
  2. An API Server inside the api/ folder;
  3. Files for deployment such as Dockerfile and docker-compose.yml;
  4. Documentations in the docs/ folder; and
  5. Some necessary scripts in scripts/ folder.

Services

I have two services: app and script.

App

image The app service is a machine learning API that is open on port 8000. I used fastapi for the API server, so you can check it on http://localhost:8000/docs after you run the app service.

Scripts

image The script sends the request for the predictions on new sets of data, such as the kaggle testing data. After the script get all the responses, files will be written on /tmp/submission.csv (on host and container), but this part can take a lot of time. It is suggested to use docker logs -f lightgbm-project-demo_script_1 to check the progress of the process.

How to run this demo

1. Install the requirements

  • docker
  • docker-compose
  • make

2. Download the datasets

Option 1.

Setup kaggle API and use

make init-data

Option 2.

  1. Create a data folder: i.e.
mkdir data
  1. Download the data from kaggle IEEE-CIS Fraud Detection.

  2. Put the ieee-fraud-detection.zip inside the data/ folder.

  3. Unzip ieee-fraud-detection.zip.

3. Build the image

make build

4. Start the services

Start both the two services

docker-compose up

or only start the app service using

docker-compose up app

image

Here are some documentations

How to set up the working environment for this project

API example

Note

If you want to change the hyperparameters search space, you can go to config.py. Or you even want to use other framework the build the model, I think this demo is detail enough as a reference for your project.

About the hyperparameter search algorithm, I am using the random search for this demo but if you want to try other searching algorithm, you can change train.py.

I have another project use Tensorflow as the back bone model. Take a look about the project lightgbm-project-demo

Owner
RayWu
RayWu
a really simple bot that send you memes from reddit to whatsapp

a really simple bot that send you memes from reddit to whatsapp want to use use it? install the dependencies with pip3 install -r requirements.txt the

pai 10 Nov 28, 2021
A simple language for new programmers and a toy language ;)

Yell An extremely simple, yet powerful language for new programmers, as well as a toy language ;) Explore the docs ยป Report Bug ยท Request Feature Yell

Yell 4 Dec 28, 2021
Random pass word generator made with python. PyQt5 module is used to design GUI.

Differences in this GUI program : Default titlebar removed Custom Minimize,Maximize and Close Buttons Drag & move window from any point Program work l

Dimuth De Zoysa 1 Jan 26, 2022
This repo created to complete the task HACKTOBER 2021, contribute now and get your special T-Shirt & Sticker. TO SUPPORT OWNER PLEASE PRESS STAR BUTTON

โค THIS REPO WILL CLOSED IN 31 OCT 00:00 โค This repository will automatically assign the hacktoberfest and hacktoberfest-accepted labels to all submitt

Rajendra Rakha 307 Dec 27, 2022
banking system with python, beginner friendly, preadvanced level

banking-system-python banking system with python, beginner friendly, preadvanced level Used topics Functions else/if/elif dicts methods parameters hol

Razi Falah 1 Feb 03, 2022
Data Science Course at Dept. of Computer Engineering, Chula 2022

2110446 Data Science Course at Chula 2022 Short links for exercises: Week1: Intro to Numpy, Pandas Numpy: https://colab.research.google.com/github/kao

Kao Panboonyuen 17 Nov 27, 2022
Paimon is a pixie (or script) who was made for anyone from {EPITECH} who are struggling with the Coding Style.

Paimon Paimon is a pixie (or script) who was made for anyone from {EPITECH} who are struggling with the Coding Style. Her goal is to assist you in you

Lyy 2 Oct 17, 2021
A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

Marshall Conover 2 Jan 12, 2022
A Klipper plugin for accurate Z homing

Stable Z Homing for Klipper A Klipper plugin for accurate Z homing This plugin provides a new G-code command, STABLE_Z_HOME, which homes Z repeatedly

Matthew Lloyd 24 Dec 28, 2022
Semantic Data Management - Property Graphs ๐Ÿ“ˆ

SDM - Lab 1 @ UPC ๐Ÿ‘จ๐Ÿปโ€๐Ÿ’ป Table of contents Introduction Property Graph Dataset 1. Introduction This repo is all about what we have done in SDM lab 1

Mohammad Zain Abbas 1 Mar 20, 2022
A inspector to be able to view and edit Qt style sheet while an application is running

Qt Style Sheet Inspector An inspector widget to view and modify the style sheet of a Qt app at runtime. Usage In order to use the inspector widget on

ESSS 46 Dec 10, 2022
Cirq is a Python library for writing, manipulating, and optimizing quantum circuits and running them against quantum computers and simulators

Cirq is a Python library for writing, manipulating, and optimizing quantum circuits and running them against quantum computers and simulators. Install

quantumlib 3.6k Jan 07, 2023
Implemented Exploratory Data Analysis (EDA) using Python.Built a dashboard in Tableau and found that 45.87% of People suffer from heart disease.

Heart_Disease_Diagnostic_Analysis Objective ๐ŸŽฏ The aim of this project is to use the given data and perform ETL and data analysis to infer key metrics

Sultan Shaikh 4 Jan 28, 2022
A python package to adjust the bias of probabilistic forecasts/hindcasts using "Mean and Variance Adjustment" method.

Documentation A python package to adjust the bias of probabilistic forecasts/hindcasts using "Mean and Variance Adjustment" method. Read documentation

1 Feb 02, 2022
Python3 Interface to numa Linux library

py-libnuma is python3 interface to numa Linux library so that you can set task affinity and memory affinity in python level for your process which can help you to improve your code's performence.

Dalong 13 Nov 10, 2022
Set named timers for cooking, watering plants, brewing tea and more.

Timer Set named timers for cooking, watering plants, brewing tea and more. About Use Mycroft when your hands are messy or you need more that the one t

OpenVoiceOS 3 Nov 02, 2022
Simple Python tool to check if there is an Office 365 instance linked to a domain.

o365chk.py Simple Python script to check if there is an Office365 instance linked to a particular domain.

Steven Harris 37 Jan 02, 2023
Blender-3D-SH-Dma-plugin - Import and export Sonic Heroes Delta Morph animations (.anm) into Blender 3D

io_scene_sonic_heroes_dma This plugin for Blender 3D allows you to import and ex

Psycrow 3 Mar 22, 2022
Solutions for the Advent of Code 2021 event.

About ๐Ÿ“‹ This repository holds all of the solution code for the Advent of Code 2021 event. All solutions are done in Python 3.9.9 and done in non-real

robert yin 0 Mar 21, 2022
Vaccine for STOP/DJVU ransomware, prevents encryption

STOP/DJVU Ransomware Vaccine Prevents STOP/DJVU Ransomware from encrypting your files. This tool does not prevent the infection itself. STOP ransomwar

Karsten Hahn 16 May 31, 2022