Improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Overview

data-preprocessing_toogoodtogo_threatlines

We're the hackathon leftovers, but we are Too Good To Go ;-). A repo by Lukas Schubotz, Stef van Buuren, and Raymon van Dinter. We aim to improve current data preprocessing for FTM's WOB data to analyze Shell and Dutch Governmental contacts.

Synchronous visualisation of email threads

Publications from the FTM "Dossier SHELL papers" https://www.ftm.nl/dossier/shell-papers suggest that timing of events is critical in the interactions between actors. It would therefore be useful if we could visualise the mail exchanges in time.

The idea is to visualise threads of mail exchanges between actors over time. When this is done for multiple threads, the display would give rapid insight into the structure and timing of exchanges between actors. For example, suppose we are able to construct a single thread from "RE:" and "FW:" mails in the data. A simple visualisation would be

See https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.9825&rep=rep1&type=pdf for variations on this display, for example by adding the interactions between the actors by fancy arcs and resorting the mails according to actor pairs.

A generalisation to multiple simulataneous threads would stack multiple lines, similar to a dot plot. Such a design calls for relatively simple thread displays that are synchronised in time. Therefore we will concentrate on using a simple thread line that plots mail chronology against calender time.

A somewhat grander idea would be to create a "film of events". The user would place a cursor on the time axis, and scroll through time. The new information per mail is displayed as the cursor passes the send time of the email.

Issues to resolve

We need complex/advanced text processing. Some of the issues include:

  1. How can we split multiple emails in a RE/FW into a set of elementary mails, each corresponding to just one sender?
  2. How well can we form threads by matching on subject lines?
  3. Do duplicates extracted from RE/FW serve any useful purpose?
  4. What is the percentage of threads for which we can find the parent mail (the mail that started the thread)?

Experiment 1

The first design plots all thread lines between 2016 and 2020 on one chart.

Experiment 2

The second design uses trelliscopejs to plot the same information in smaller pieces.

The user can switch between 27 panes, each containing about 20 threads.

Try out the interactive version

Experiment 3

Back to figure 1, but now plotted with rbokeh, so that we may zoom and use tooltips (interaction not supported by GitHub markdown)

Owner
ASReview hackathon for Follow the Money
ASReview hackathon for Follow the Money
Sardana integration into the Jupyter ecosystem.

sardana-jupyter Sardana integration into the Jupyter ecosystem.

Marc Espín 1 Dec 23, 2021
Gives you more advanced math in python.

AdvancedPythonMath Gives you more advanced math in python. Functions .simplex(args: {number}) .circ(args: {raidus}) .pytha(args: {leg_a + leg_2}) .slo

Voidy Devleoper 1 Dec 25, 2021
OCR-ID-Card VietNamese (new id-card)

OCR-ID-Card VietNamese (new id-card) run project: download 2 file weights and pu

12 Jun 15, 2022
A simple Python script for generating a variety of hashes from safe urandom entropy.

Hashgen A simple Python script for generating a variety of hashes from safe urandom entropy. For whenever you need a random hash (e.g. generating an a

Xanspie 1 Feb 17, 2022
Simple Python script I use to manage and build my Reflux themes.

Simple Python script I use to manage and build my Reflux themes. Built for personal use, but anyone can easily fork and tweak to suit thier needs.

Ire 3 Jan 25, 2022
My attempt at this years Advent of Code!

Advent-of-code-2021 My attempt at this years Advent of Code! day 1: ** day 2: ** day 3: ** day 4: ** day 5: ** day 6: ** day 7: ** day 8: * day 9: day

1 Jul 06, 2022
Repo created for the purpose of adding any kind of programs and projects

Programs and Project Repository A repository for adding programs and projects of any kind starting from beginners level to expert ones Contributing to

Unicorn Dev Community 3 Nov 02, 2022
Github Star Tracking app with Streamlit

github-star-tracking-python-app Github Star Tracking app with Streamlit #8daysofstreamlit How to run it locally? Clone or Download & Unzip the Repo En

amrrs 4 Sep 22, 2022
Zotero references script (and app)

A little script (and PyInstaller build) for a very specific, somewhat hack-ish purpose: managing and exporting project references with Zotero and its API.

Marius Rödder 0 Dec 05, 2021
SuperCollider library for Python

SuperCollider library for Python This project is a port of core features of SuperCollider's language to Python 3. It is intended to be the same librar

Lucas Samaruga 65 Dec 22, 2022
decorator

Decorators for Humans The goal of the decorator module is to make it easy to define signature-preserving function decorators and decorator factories.

Michele Simionato 734 Dec 30, 2022
An easy-to-learn, dynamic, interpreted, procedural programming language

Gen Programming Language WARNING!! THIS LANGUAGE IS IN DEVELOPMENT. ANYTHING CAN CHANGE AT ANY MOMENT. Gen is a dynamic, interpreted, procedural progr

Gen Programming Language 7 Oct 17, 2022
Cylc: a workflow engine for cycling systems

Cylc: a workflow engine for cycling systems. Repository master branch: core meta-scheduler component of cylc-8 (in development); Repository 7.8.x branch: full cylc-7 system.

The Cylc Workflow Engine 205 Dec 20, 2022
Open slidebook .sldy files in Python

Work in progress slidebook-python Open slidebook .sldy files in Python To install slidebook-python requires Python = 3.9 pip install slidebook-python

The Institute of Cancer Research 2 May 04, 2022
Supercharge your NFTs with new behaviours and superpowers!

WrapX Supercharge your NFTs with new behaviours and superpowers! WrapX is a collection of Wrappers (currently one - WrapXSet) to decorate your NTFs ad

Emiliano Bonassi 9 Jun 13, 2022
🌌A Python library to exhaustively enumerate a combinatorial space represented by a function

exhaust A Python library to exhaustively enumerate a combinatorial space represented by a function. The API is modelled after Python's random module a

Maik Riechert 1 Dec 05, 2021
Digitales Raumbuch

Helios Digitales Raumbuch Settings Moved to settings. Basic Commands Setting Up Your Users To create a normal user account, just go to Sign Up and fil

1 Nov 19, 2021
Paprika is a python library that reduces boilerplate. Heavily inspired by Project Lombok.

Image courtesy of Anna Quaglia (Photographer) Paprika Paprika is a python library that reduces boilerplate. It is heavily inspired by Project Lombok.

Rayan Hatout 55 Dec 26, 2022
Generate Azure Blob Storage account authentication headers for Munki

Azure Blob Storage Authentication for Munki The Azure Blob Storage Middleware allows munki clients to connect securely, and directly to a munki repo h

Oliver Kieselbach 10 Apr 12, 2022
Comprehensive OpenAPI schema generator for Django based on pydantic

🗡️ Djagger Automated OpenAPI documentation generator for Django. Djagger helps you generate a complete and comprehensive API documentation of your Dj

13 Nov 26, 2022