Retail-Sim is a Python package for easily creating synthetic datasets of a retail store.

Overview

Retailer's Sale Data Simulation

Simulation Model

The simulator consists of an environment (env) that generates simulated retail store data.

Modelling Plan

Products

Create fake products and the relationships between them. Relationships between products (between categories, to be more precise) consist of "exchangeability" and "complementarity". Products have many attributes, such as:

  • Base Price
  • Base Cost
  • Volume
  • Attractiveness
  • Category
  • Price elasticity
  • Relative Consumption rate
  • Loyalty

Volume indicates how much satisfaction a product provides to the customer (how much of a need it subtracts). Volume is proportional to price, and this relationship can be set with vol_price_corr.
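As a rough illustration of the price and volume relationship, the sketch below draws both attributes from a correlated log-normal distribution. It is not Retail-Sim's actual generator: the helper name make_products, the reading of vol_price_corr as a correlation coefficient, and the cost margin range are all assumptions made for this example.

```python
import numpy as np


def make_products(n_products, vol_price_corr=0.8, seed=0):
    """Hypothetical helper: draw base prices and volumes with a target correlation."""
    rng = np.random.default_rng(seed)
    # Assumption: vol_price_corr is the correlation between log-price and log-volume.
    cov = [[1.0, vol_price_corr], [vol_price_corr, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n_products)
    base_price = np.exp(z[:, 0])  # log-normal, so always positive
    volume = np.exp(z[:, 1])
    # Assumption: cost is a random 50-90% share of the base price.
    base_cost = base_price * rng.uniform(0.5, 0.9, size=n_products)
    return {"base_price": base_price, "base_cost": base_cost, "volume": volume}


products = make_products(1000, vol_price_corr=0.8)
```

With a value of vol_price_corr close to 1, expensive products almost always satisfy more of a need; with a value near 0, price says little about volume.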

Products are grouped into discrete categories. Each category has the attributes "consumption rate", "general trend", and "seasonal trend". In real life, products such as fresh food, tissues, and bottled water would have a high consumption rate. The general trend is a random, roughly linear trend; the seasonal trend is a sales trend with a period of one year. In real life, a product like ice cream would have a summer-oriented seasonal trend.
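One way to picture a category's trend is as a demand multiplier made of a linear term plus a cosine with a 365-day period. This is only a sketch under that assumption; the function name category_trend and the parameters slope, seasonal_amp, and peak_day are hypothetical and not taken from the package.

```python
import numpy as np


def category_trend(days, slope=0.0005, seasonal_amp=0.3, peak_day=200):
    """Hypothetical demand multiplier: linear general trend plus a yearly seasonal term."""
    days = np.asarray(days, dtype=float)
    general = 1.0 + slope * days  # roughly linear drift over time
    # The cosine peaks at peak_day and repeats every 365 days.
    seasonal = seasonal_amp * np.cos(2 * np.pi * (days - peak_day) / 365.0)
    return general + seasonal


multiplier = category_trend(np.arange(730))  # two simulated years
```

Shifting peak_day moves the season in which the category sells best, while slope sets whether the category grows or declines over the years.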

Customers

Every customer has a random set of "needs". Just as in real life, you might need shampoo, a pair of scissors, and some spaghetti sauce (each of these is considered one category). Customers will try to fill those needs. As happens in real life, customers are encouraged to buy products that both satisfy their needs and rank high in their preferences.

Product's Total Attractiveness

Every product comes with the Attractiveness attribute. If it has higher attractiveness, it is more likely to sell. However,

  • If the product is on discount, it will become more attractive.
  • If the product is on discount and the discount is advertised, it will become even more attractive.
  • If the product has high loyalty, it will have very high attractiveness to some customers.
  • There might be some general trend in attractiveness.

Therefore during simulation, total attractiveness will be defined as:

$$\text{Total} = \max\left(\text{Attractiveness} + \text{elasticity} \times \text{discounted rate},\ B(\text{loyalty}) \times \infty\right)$$
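The formula can be sketched in NumPy as follows. Two assumptions are made here that the text does not state: B(loyalty) is read as a Bernoulli draw that succeeds with probability loyalty, and the product 0 × ∞ is taken to be 0 so that non-loyal draws fall back to the discount-adjusted score. The function name total_attractiveness is hypothetical.

```python
import numpy as np


def total_attractiveness(attractiveness, elasticity, discount_rate, loyalty, rng):
    """Hypothetical sketch of max(attractiveness + elasticity * discount, B(loyalty) * inf)."""
    base = attractiveness + elasticity * discount_rate
    # Assumption: B(loyalty) is a Bernoulli draw, one per product.
    is_loyal = rng.random(np.shape(base)) < loyalty
    # Assumption: 0 * inf is treated as 0, so np.where is used instead of multiplying.
    loyal_term = np.where(is_loyal, np.inf, 0.0)
    return np.maximum(base, loyal_term)


rng = np.random.default_rng(0)
scores = total_attractiveness(
    attractiveness=np.array([1.0, 2.0]),
    elasticity=np.array([0.5, 0.5]),
    discount_rate=np.array([0.2, 0.0]),
    loyalty=np.array([0.0, 0.0]),
    rng=rng,
)
```

An infinite score means a loyal customer always picks that product, so any code that turns scores into purchase probabilities has to handle infinite values explicitly.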

Customer's state transition

Each customer shops with a budget n, where n is Pareto-distributed across all customers. They randomly pick a category according to their current need distribution. They then buy a product in that category, chosen based on the products' total attractiveness. Buying that product reduces the customer's need for that category by the product's Volume.
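The state transition described above can be sketched as a loop for a single customer. This is a minimal illustration, not the package's implementation: the names draw_budgets and shopping_trip, the Pareto shape and scale, and the stopping rule (stop when the chosen category has nothing affordable) are assumptions, and attractiveness values are assumed to be finite and positive.

```python
import numpy as np


def draw_budgets(n_customers, rng, shape=2.0, scale=20.0):
    """Hypothetical helper: classical Pareto budgets with minimum value `scale`."""
    # rng.pareto samples a Lomax distribution; adding 1 and scaling gives a classical Pareto.
    return (rng.pareto(shape, size=n_customers) + 1.0) * scale


def shopping_trip(needs, product_category, price, volume, attractiveness, budget, rng):
    """Hypothetical single-customer trip; returns purchased product indices and remaining needs."""
    needs = np.array(needs, dtype=float)
    basket = []
    while needs.sum() > 0:
        # Pick a category with probability proportional to the remaining need.
        category = rng.choice(len(needs), p=needs / needs.sum())
        candidates = np.flatnonzero((product_category == category) & (price <= budget))
        if candidates.size == 0:
            break  # simplification: stop when nothing in the chosen category is affordable
        # Pick a product with probability proportional to its total attractiveness.
        weights = attractiveness[candidates]
        product = rng.choice(candidates, p=weights / weights.sum())
        basket.append(int(product))
        budget -= price[product]
        # Buying subtracts the product's volume from the need, floored at zero.
        needs[category] = max(needs[category] - volume[product], 0.0)
    return basket, needs
```

A trip ends either when every need reaches zero or when the chosen category has no product the remaining budget can cover, so customers with larger budgets tend to fill more of their needs.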

Owner
Corca AI
AI B2B Consulting Company