The Master's in Data Science Program run by the Faculty of Mathematics and Information Science

Last update: Jun 17, 2022

Overview

In August 2020, I was granted the Ignacy Łukasiewicz Scholarship for Master's studies by the Polish National Agency for Academic Exchange (NAWA). I am currently in Warsaw, pursuing a Master's degree in Data Science at Warsaw University of Technology. https://sites.google.com/view/amir-ali

The Master's in Data Science Program run by the Faculty of Mathematics and Information Science is among the first European programs in Data Science and is fully focused on data engineering and data analytics.

Semester 0 (Summer 2021)

Jupyter Markdown
Programming
- Python
- R
NumPy
Pandas
Matplotlib
Seaborn
Data Preprocessing

Semester 1 (Winter 2021/22)

Group Project
Data Transmission
Computer Statistics
Electronic Principles
UNIX Fundamentals
Business Intelligence Analyst
Data Processing in R and Python
Introduction to Machine Learning
Introduction to Image Processing and Computer Vision

Owner

Amir Ali

Scientific Researcher

GitHub Repository https://sites.google.com/view/amir-ali

Calculate multilateral price indices in Python (with Pandas and PySpark).

IndexNumCalc Calculate multilateral price indices using the GEKS-T (CCDI), Time Product Dummy (TPD), Time Dummy Hedonic (TDH), Geary-Khamis (GK) metho

3 Apr 27, 2022

The micro-framework to create dataframes from functions.

762 Jan 07, 2023

This module is used to create Convolutional AutoEncoders for Variational Data Assimilation

VarDACAE This module is used to create Convolutional AutoEncoders for Variational Data Assimilation. A user can define, create and train an AE for Dat

23 Dec 16, 2022

Employee Turnover Analysis

Employee Turnover Analysis Submission to the DataCamp competition "Can you help reduce employee turnover?"

1 Feb 13, 2022

Random dataframe and database table generator

Random database/dataframe generator Authored and maintained by Dr. Tirthajyoti Sarkar, Fremont, USA Introduction Often, beginners in SQL or data scien

249 Jan 08, 2023

Fully automated data pipeline using Docker images

Create postgres tables from CSV files This first section relates only to creating tables from CSV files using the postgres container alone. Just one of

1 Nov 21, 2021

Panda exercises using Pandas

Readme Below we add configuration details to locally test your application To co

1 Jan 22, 2022

Data processing with Pandas.

Processing-data-with-python This is a simple example showing how to use Pandas to create a dataframe and then process the data with Python. The jupyter

1 Jan 23, 2022

Intercepting proxy + analysis toolkit for Second Life compatible virtual worlds

Hippolyzer Hippolyzer is a revival of Linden Lab's PyOGP library targeting modern Python 3, with a focus on debugging issues in Second Life-compatible

6 Sep 01, 2022

NumPy aware dynamic Python compiler using LLVM

Numba A Just-In-Time Compiler for Numerical Functions in Python Numba is an open source, NumPy-aware optimizing compiler for Python sponsored by Anaco

8.2k Jan 07, 2023

Cleaning and analysing aggregated UK political polling data.

Analysing aggregated UK polling data The tweet collection & storage pipeline used in email-service is also used to collect tweets from @britainelects.

0 Dec 22, 2021

Includes all files needed to satisfy hw02 requirements

HW 02 Data Sets Mean Scale Score for Asian and Hispanic Students, Grades 3 - 8 This dataset provides insights into the New York City education system

7 Oct 28, 2021

PyClustering is a Python, C++ data mining library.

pyclustering is a Python, C++ data mining library (clustering algorithm, oscillatory networks, neural networks). The library provides Python and C++ implementations (C++ pyclustering library) of each

1k Jan 05, 2023

Active Learning demo using two small datasets

ActiveLearningDemo How to run step one put the dataset folder and use command below to split the dataset to the required structure run utils.py For ea

3 Nov 10, 2021

BioMASS - A Python Framework for Modeling and Analysis of Signaling Systems

Mathematical modeling is a powerful method for the analysis of complex biological systems. Although much research has been devoted to produ

22 Dec 27, 2022

The repo for mlbtradetrees.com. Analyze any trade in baseball history!

7 Nov 20, 2022

PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x faster (by @firmai)

PandaPy "I came across PandaPy last week and have already used it in my current project. It is a fascinating Python library with a lot of potential to

527 Jan 02, 2023

Investigating EV charging data

Investigating EV charging data Introduction: Got an opportunity to work with a home monitoring technology company over the last 6 months whose goal wa

2 Apr 07, 2022

Extract data from a wide range of Internet sources into a pandas DataFrame.

pandas-datareader Up to date remote data access for pandas, works for multiple versions of pandas. Installation Install using pip pip install pandas-d

2.5k Jan 09, 2023

Aggregating gridded data (xarray) to polygons

A package to aggregate gridded data in xarray to polygons in geopandas using area-weighting from the relative area overlaps between pixels and polygons. Check out the binder link above for a sample c

42 Nov 09, 2022