Hitchhikers-guide - The Hitchhiker's Guide to Data Science for Social Good

Overview

Welcome to the Hitchhiker's Guide to Data Science for Social Good.

What is the Data Science for Social Good Fellowship?

The Data Science for Social Good Fellowship (DSSG) is a hands-on and project-based summer program that launched in 2013 at the University of Chicago and has now expanded to multiple locations globally. It brings data science fellows, typically graduate students, from across the world to work on machine learning, artificial intelligence, and data science projects that have a social impact. From a pool of typically over 800 applicants, 20-40 fellows are selected from diverse computational and quantitative disciplines including computer science, statistics, math, engineering, psychology, sociology, economics, and public policy.

The fellows work in small, cross-disciplinary teams on social good projects spanning education, health, energy, transportation, criminal justice, social services, economic development and international development in collaboration with global government agencies and non-profits. This work is done under close and hands-on mentorship from full-time, dedicated data science mentors as well as dedicated project managers, with industry experience. The result is highly trained fellows, improved data science capacity of the social good organization, and a high quality data science project that is ready for field trial and implementation at the end of the program.

In addition to hands-on project-based training, the summer program also consists of workshops, tutorials, and ethics discussion groups based on our data science for social good curriculum designed to train the fellows in doing practical data science and artificial intelligence for social impact.

Who is this guide for?

The primary audience for this guide is the set of fellows coming to DSSG but we want everything we create to be open and accessible to larger world. We hope this is useful to people beyond the summer fellows coming to DSSG.

If you are applying to the program or have been accepted as a fellow, check out the manual to see how you can prepare before arriving, what orientation and training will cover, and what to expect from the summer.

If you are interested in learning at home, check out the tutorials and teach-outs developed by our staff and fellows throughout the summer, and to suggest or contribute additional resources.

*Another one of our goals is to encourage collaborations. Anyone interested in doing this type of work, or starting a DSSG program, to build on what we've learned by using and contributing to these resources.

What is in this guide?

Our number one priority at DSSG is to train fellows to do data science for social good work. This curriculum includes many things you'd find in a data science course or bootcamp, but with an emphasis on solving problems with social impact, integrating data science with the social sciences, discussing ethical implications of the work, as well as privacy, and confidentiality issues.

We have spent many (sort of) early mornings waxing existential over Dunkin' Donuts while trying to define what makes a "data scientist for social good," that enigmatic breed combining one part data scientist, one part consultant, one part educator, and one part bleeding heart idealist. We've come to a rough working definition in the form of the skills and knowledge one would need, which we categorize as follows:

  • Programming, because you'll need to tell your computer what to do, usually by writing code.
  • Computer science, because you'll need to understand how your data is - and should be - structured, as well as the algorithms you use to analyze it.
  • Math and stats, because everything else in life is just applied math, and numerical results are meaningless without some measure of uncertainty.
  • Machine learning, because you'll want to build predictive or descriptive models that can learn, evolve, and improve over time.
  • Social science, because you'll need to know how to design experiments to validate your models in the field, and to understand when correlation can plausibly suggest causation, and sometimes even do causal inference.
  • Problem and Project Scoping, because you'll need to be able to go from a vague and fuzzy project description to a problem you can solve, understand the goals of the project, the interventions you are informing, the data you have and need, and the analysis that needs to be done.
  • Project management, to make progress as a team, to work effectively with your project partner, and work with a team to make that useful solution actually happen.
  • Privacy and security, because data is people and needs to be kept secure and confidential.
  • Ethics, fairness, bias, and transparency, because your work has the potential to be misused or have a negative impact on people's lives, so you have to consider the biases in your data and analyses, the ethical and fairness implications, and how to make your work interpretable and transparent to the users and to the people impacted by it.
  • Communications, because you'll need to be able to tell the story of why what you're doing matters and the methods you're using to a broad audience.
  • Social issues, because you're doing this work to help people, and you don't live or work in a vacuum, so you need to understand the context and history surrounding the people, places and issues you want to impact.

All material is licensed under CC-BY 4.0 License: CC BY 4.0

Table of Contents

The links below will help you find things quickly.

DSSG Manual

Summer Overview

This sections covers general information on projects, working with partners, presentations, orientation information, and the following schedules:

Conduct, Culture, and Communications

This section details the DSSG anti-harassment policy, goals of the fellowship, what we hope fellows get out of the experience, the expectations of the fellows, and the DSSG environment. A slideshow version of this can also be found here.

Curriculum

This section details the various topics we will be covering throughout the summer. This includes:

Wiki

In the wiki, you will find a bunch of helpful information and instructions that people have found helpful along the way. It covers topics like:

  • Accessing S3 from the command line
  • Creating an alias to make Python3 your default (rather than python2)
  • Installing RStudio on your EC2
  • Killing your query
  • Creating a custom jupyter setup
  • Mounting box from ubuntu
  • Pretty Print psql and less output
  • Remotely editing text files in your favorite text editor
  • SQL Server to Postgres
  • Using rpy2
  • VNC Viewer
Owner
Data Science for Social Good
Data Science for Social Good
Bitflip Fault Simulation Platform by Daniele Rizzieri (2021)

SEE Injection Framework 2021 This repository contains two Single Event Effect (SEE) injection platforms. The first one is called BFSP - "Bitflip Fault

Daniele Rizzieri 2 Nov 05, 2022
A simple flashcard app built as a final project for a databases class.

CS2300 Final Project - Flashcard app 'FlashStudy' Tech stack Backend Python (Language) Django (Web framework) SQLite (Database) Frontend HTML/CSS/Java

Christopher Spencer 2 Feb 03, 2022
Simple cash register system made with guizero

Eje-Casher なにこれ guizeroで作った簡易レジシステムです。実際にコミケで使う予定です。 これを誰かがそのまま使うかどうかというよりは、guiz

Akira Ouchi 4 Nov 07, 2022
PwnDatas-DB-Project(PDDP)

PwnDatas-DB-Project PwnDatas-DB-Project(PDDP) 安裝依賴: pip3 install pymediawiki 使用: cd /opt git https://github.com/JustYoomoon/PwnDatas-DB-Project.git c

21 Jul 16, 2021
An Airflow operator to call the main function from the dbt-core Python package

airflow-dbt-python An Airflow operator to call the main function from the dbt-core Python package Motivation Airflow running in a managed environment

Tomás Farías Santana 93 Jan 08, 2023
Render your templates using .txt files

PizzaX About Run Run tests To run the tests, open your terminal and type python tests.py (WIN) or python3 tests.py (UNX) Using the function To use the

Marcello Belanda 2 Nov 24, 2021
Bring A Trailer(BAT) is a popular online auction website for enthusiast cars. This traverse auction results and saves them as CSV

BaT Data Grabber Bring A Trailer(BAT) is a popular online auction website for enthusiast cars. This traverse auction results and saves them as CSV Bri

Elliot Weil 2 Oct 31, 2021
CountBoard 是一个基于Tkinter简单的,开源的桌面日程倒计时应用。

CountBoard 是一个基于Tkinter简单的,开源的桌面日程倒计时应用。 基本功能 置顶功能 是否使窗体一直保持在最上面。 简洁模式 简洁模式使窗体更加简洁。 此模式下不可调整大小,请提前在普通模式下调整大小。 设置功能 修改主窗体背景颜色,修改计时模式。 透明设置 调整窗体的透明度。 修改

gaoyongxian 130 Dec 01, 2022
Earth-to-orbit ballistic trajectories with atmospheric resistance

Earth-to-orbit ballistic trajectories with atmospheric resistance Overview Space guns are a theoretical technology that reduces the cost of getting bu

1 Dec 03, 2021
This is a method to build your own qgis configuration packages using osgeo4W.

This is a method to build your own qgis configuration packages using osgeo4W. Then you can automate deployment in your organization with a controled and trusted environnement.

Régis Haubourg 26 Dec 05, 2022
Some ideas and tools to develop Python 3.8 plugins for GIMP 2.99.4

gimp-python-development Some ideas and tools to develop Python 3.8 plugins for GIMP 2.99.4. GIMP 2.99.4 is the latest unstable pre-release of GIMP 3.

Ismael Benito 53 Sep 25, 2022
XlvnsScriptTool - Tool for decompilation and compilation of scripts .SDT from the visual novel's engine xlvns

XlvnsScriptTool English Dual languaged (rus+eng) tool for decompiling and compiling (actually, this tool is more than just (dis)assenbler, but less th

Tester 3 Sep 15, 2022
Small tool to use hero .json files created with Optolith for The Dark Eye/ Das Schwarze Auge 5 to perform talent probes.

DSA5-ProbeMaker A little tool for The Dark Eye 5th Edition (Das Schwarze Auge 5) to load .json from Optolith character generation and easily perform t

2 Jan 06, 2022
a really simple bot that send you memes from reddit to whatsapp

a really simple bot that send you memes from reddit to whatsapp want to use use it? install the dependencies with pip3 install -r requirements.txt the

pai 10 Nov 28, 2021
Customisable coding font with alternates, ligatures and contextual positioning

Guide Ligature Support Links Log License Guide Live Preview + Download larsenwork.com/monoid Install Quit your editor/program. Unzip and open the fold

Andreas Larsen 7.6k Dec 30, 2022
Python-Course-V1 - This Repo contains a series of Python Jupyter Notebooks and assignments

This Repo contains a series of Python Jupyter Notebooks and assignments. The assignments are taken from Python Crash Course book by Eric Matthes.

2 Nov 15, 2022
Originally used during Marketplace.tf's open period, this program was used to get the profit of items bought with keys and sold for dollars.

Originally used during Marketplace.tf's open period, this program was used to get the profit of items bought with keys and sold for dollars. Practically useless for me now, but can be used as an exam

BoggoTV 1 Dec 11, 2021
Random pass word generator made with python. PyQt5 module is used to design GUI.

Differences in this GUI program : Default titlebar removed Custom Minimize,Maximize and Close Buttons Drag & move window from any point Program work l

Dimuth De Zoysa 1 Jan 26, 2022
Python MQTT v5.0 async client

gmqtt: Python async MQTT client implementation. Installation The latest stable version is available in the Python Package Index (PyPi) and can be inst

Gurtam 306 Jan 03, 2023
This is a simple web interface for SimplyTranslate

SimplyTranslate Web This is a simple web interface for SimplyTranslate List of Instances You can find a list of instances here: SimplyTranslate Projec

4 Dec 14, 2022