Improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Overview

data-preprocessing_toogoodtogo_threatlines

We're the hackathon leftovers, but we are Too Good To Go ;-). A repo by Lukas Schubotz, Stef van Buuren, and Raymon van Dinter. We aim to improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Synchronous visualisation of email threads

Publications from the FTM "Dossier SHELL papers" (https://www.ftm.nl/dossier/shell-papers) suggest that the timing of events is critical in the interactions between actors. It would therefore be useful to visualise the mail exchanges over time.

The idea is to visualise threads of mail exchanges between actors over time. When this is done for multiple threads, the display would give rapid insight into the structure and timing of exchanges between actors. For example, suppose we are able to construct a single thread from "RE:" and "FW:" mails in the data. A simple visualisation would be a single thread line that shows the mails of that thread in order along a time axis.

See https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.9825&rep=rep1&type=pdf for variations on this display, for example adding arcs that show the interactions between the actors, or re-sorting the mails according to actor pairs.

A generalisation to multiple simultaneous threads would stack multiple lines, similar to a dot plot. Such a design calls for relatively simple thread displays that are synchronised in time. We will therefore concentrate on a simple thread line that plots mail chronology against calendar time.
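The stacked layout can be sketched as a small coordinate computation: each thread gets its own row, and each mail is placed at its send time on a shared time axis. This is only an illustration in Python under stated assumptions: the function name `thread_line_points`, the thread labels, and the timestamps are hypothetical and do not come from the WOB data.

```python
from datetime import datetime


def thread_line_points(threads):
    """Return (row, sent, thread_id) tuples for a stacked thread-line plot.

    `threads` maps a thread id to a list of send times. Threads are
    ordered by their first mail, so row 0 is the earliest thread.
    """
    ordered = sorted(threads.items(), key=lambda item: min(item[1]))
    points = []
    for row, (thread_id, times) in enumerate(ordered):
        for sent in sorted(times):  # chronological order within a thread
            points.append((row, sent, thread_id))
    return points


# Hypothetical example data (not taken from the real data set).
threads = {
    "tax ruling": [
        datetime(2018, 5, 11, 8, 0),
        datetime(2018, 5, 10, 16, 0),
    ],
    "gas field meeting": [
        datetime(2017, 3, 1, 9, 0),
        datetime(2017, 3, 2, 14, 15),
        datetime(2017, 3, 1, 11, 30),
    ],
}
points = thread_line_points(threads)
```

Because all rows share the same x values (calendar time), any plotting library can draw the points as one horizontal line per row, which keeps the threads synchronised.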

A somewhat grander idea would be to create a "film of events". The user would place a cursor on the time axis, and scroll through time. The new information per mail is displayed as the cursor passes the send time of the email.
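The cursor logic behind such a "film of events" could look roughly like the sketch below: given mails sorted by send time, a binary search finds how many mails the cursor has already passed. The function name `mails_up_to`, the summaries, and the timestamps are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def mails_up_to(mails, cursor):
    """Return the mails sent at or before `cursor`.

    `mails` is a list of (sent, summary) tuples, sorted by `sent`.
    """
    send_times = [sent for sent, _ in mails]
    # bisect_right counts every send time <= cursor.
    return mails[:bisect_right(send_times, cursor)]


# Hypothetical thread, already sorted by send time.
mails = [
    (datetime(2017, 3, 1, 9, 0), "A -> B: proposes meeting"),
    (datetime(2017, 3, 1, 11, 30), "B -> A: accepts"),
    (datetime(2017, 3, 2, 14, 15), "A -> C: forwards thread"),
]
shown = mails_up_to(mails, datetime(2017, 3, 1, 12, 0))
```

As the user scrolls, calling the function again with the new cursor position yields the mails to display; the last element of the result is the most recently revealed mail.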

Issues to resolve

We need complex/advanced text processing. Some of the issues include:

  1. How can we split multiple emails in a RE/FW into a set of elementary mails, each corresponding to just one sender?
  2. How well can we form threads by matching on subject lines?
  3. Do duplicates extracted from RE/FW serve any useful purpose?
  4. What is the percentage of threads for which we can find the parent mail (the mail that started the thread)?
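Issues 2 and 4 can be explored with a small experiment like the Python sketch below. It strips reply/forward prefixes from subject lines, groups mails by the normalised subject, and counts the share of threads whose earliest mail carries no prefix (taken here as a proxy for "parent mail found"). The prefix list (including the Dutch "Antw" and "Doorst"), the field names `subject` and `sent`, and the sample mails are all assumptions for illustration, not the actual structure of the WOB data.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed set of reply/forward prefixes; extend as the data requires.
PREFIX = re.compile(r"^\s*(?:(?:re|fw|fwd|antw|doorst)\s*:\s*)+", re.IGNORECASE)


def normalise_subject(subject):
    """Strip leading RE:/FW: prefixes, collapse whitespace, lowercase."""
    stripped = PREFIX.sub("", subject)
    return " ".join(stripped.split()).lower()


def build_threads(mails):
    """Group mails by normalised subject and sort each thread by send time."""
    threads = defaultdict(list)
    for mail in mails:
        threads[normalise_subject(mail["subject"])].append(mail)
    for thread in threads.values():
        thread.sort(key=lambda m: m["sent"])
    return dict(threads)


def parent_share(threads):
    """Fraction of threads whose earliest mail has no RE/FW prefix."""
    if not threads:
        return 0.0
    found = sum(
        1 for thread in threads.values()
        if PREFIX.match(thread[0]["subject"]) is None
    )
    return found / len(threads)


# Hypothetical sample mails.
mails = [
    {"subject": "RE: Gas field meeting", "sent": datetime(2017, 3, 1, 11, 30)},
    {"subject": "Gas field meeting", "sent": datetime(2017, 3, 1, 9, 0)},
    {"subject": "FW: RE:  Gas field meeting", "sent": datetime(2017, 3, 2, 14, 15)},
    {"subject": "RE: Tax ruling", "sent": datetime(2018, 5, 10, 16, 0)},
    {"subject": "Antw: tax ruling", "sent": datetime(2018, 5, 11, 8, 0)},
]
threads = build_threads(mails)
share = parent_share(threads)
```

Matching on subject alone will merge unrelated threads that happen to share a subject and will split threads whose subject was edited, so the resulting share should be read as a rough estimate.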

Experiment 1

The first design plots all thread lines between 2016 and 2020 on one chart.

Experiment 2

The second design uses trelliscopejs to plot the same information in smaller pieces.

The user can switch between 27 panes, each containing about 20 threads.

Try out the interactive version

Experiment 3

The third design returns to the chart from Experiment 1, now plotted with rbokeh so that we can zoom and use tooltips (this interaction is not supported by GitHub markdown).

Owner
ASReview hackathon for Follow the Money