Code for the DH project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World"

Damast

This repository contains code developed for the digital humanities project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World". The project was funded by the Volkswagen Foundation within the scope of its "Mixed Methods" initiative. It was a collaboration between the Institute for Medieval History II of the Goethe University in Frankfurt/Main, Germany, and the Institute for Visualization and Interactive Systems at the University of Stuttgart, and ran from 2018 to 2021.

The objective of this joint project was to develop a novel visualization approach to gain new insights into the multi-religious landscapes of the Middle East under Muslim rule during the Middle Ages (7th to 14th century). In particular, information on multi-religious communities was researched and made available in a database, accessible both through interactive visualization and through a pilot web-based geo-temporal multi-view system for analyzing and comparing information from multiple sources. A publicly explorable version of the research will soon be available and will be linked here. An export of the data collected in the project can be found in the data repository of the University of Stuttgart (DaRUS) (draft, not yet public).

Database

The historical data is collected in a relational PostgreSQL database. For the project, we used PostgreSQL version 10. Since the project also deals with geographical data, we additionally use the PostGIS extension. A suitable database setup is to use the postgis/postgis:10-3.1 Docker image. An SQL script for creating the database schema is located in util/postgres/schema.sql, and in DaRUS. An overview of how the database tables interact, along with a general explanation, can be found in the docs/ directory.
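For a quick local setup, the database container could be started along these lines (the container name, password, and volume name below are illustrative assumptions, not prescribed by the project):

```shell
# Start a PostGIS 10 container for the database (example values only).
# The named volume persists the database across container restarts.
sudo docker run -d \
    --name damast-db \
    -e POSTGRES_PASSWORD=changeme \
    -v damast-pgdata:/var/lib/postgresql/data \
    -p 5432:5432 \
    postgis/postgis:10-3.1

# Load the schema into the default database
sudo docker exec -i damast-db psql -U postgres \
    < util/postgres/schema.sql
```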

Software

The server is programmed in Python using Flask. Functionality is split into a hierarchy of Flask blueprints; for example, there is a blueprint for the landing page, one for the visualization, and a nested hierarchy of blueprints for the REST API. The server provides multiple pages, as well as an HTTP interface for reading from and writing to the PostgreSQL database. The server is built and deployed as a Docker container that contains all necessary dependencies.

An overview and explanation of the different pages and functionalities is located in the docs/ directory. The web pages consist of HTML, CSS and JavaScript. The HTML content is in most cases served via Jinja2 templates that are processed by Flask. The JavaScript code is compiled from TypeScript source, and the CSS is compiled from SCSS.

Getting Started

Basic knowledge of build tools and Docker is required. The instructions below assume a Linux machine with the Bash shell.

Installing Dependencies

On the build system, Docker and NodeJS need to be installed. If the Makefile is used, the build-essential package is required as well. In the root of the repository, run the following command to install the build dependencies:

$ npm install

Building the Frontend

To build all required files for the frontend (JavaScript, CSS, documentation), the Makefile can be used, or consulted for the appropriate commands:

$ make prod

Building the Docker Image

All frontend content and backend Flask code are bundled into a Docker image, in which the required software dependencies are also installed in the correct versions. A few configuration options, which depend on the setup in which the Docker container will later run, need to be baked into the Docker image on creation. Please refer to the deploy.sh shell script for details and examples, as well as to the section on running the server. The Dockerfile is assembled from parts in util/docker/ and then enriched with runtime information to ensure that certain steps are repeated when data changes. An exemplary creation of a Docker image (fictional values, please refer to deploy.sh before copying) could look as follows:

# calculate hash of server files to determine if the COPY instruction should be repeated
$ fs_hash=$(find dhimmis -type f \
    | xargs sha1sum \
    | awk '{print $1}' \
    | sha1sum - \
    | awk '{print $1}')

# assemble Dockerfile
$ cat util/docker/{base,prod}.in \
    | sed "s/@REBUILD_HASH@/$fs_hash/g" \
    > Dockerfile

# build Dockerfile (warning: dummy parameters!)
$ sudo docker build -t damast:latest \
    --build-arg=USER_ID=50 \
    --build-arg=GROUP_ID=50 \
    --build-arg=DHIMMIS_ENVIRONMENT=PRODUCTION \
    --build-arg=DHIMMIS_VERSION=v1.0.0 \
    --build-arg=DHIMMIS_PORT=8000 \
    .

The resulting Docker image can then be transferred to the host machine, for example, using docker save and docker load. Of course, the image can be built directly on the host machine as well.
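As a sketch, transferring the image via docker save and docker load could look like this (the hostname and paths are placeholders):

```shell
# On the build machine: export the image to a compressed tarball
sudo docker save damast:latest | gzip > damast-latest.tar.gz

# Copy it to the host machine (hostname is a placeholder)
scp damast-latest.tar.gz deploy@damast-host:/tmp/

# On the host machine: import the image
gunzip -c /tmp/damast-latest.tar.gz | sudo docker load
```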

Running the Server

The server infrastructure consists of three components:

  1. The Flask server in its Docker container,
  2. the PostgreSQL database, for example in the form of a postgis/postgis:10-3.1 Docker container, and
  3. a reverse HTTP proxy on the host machine that handles traffic from the outside and SSL.

The util/ directory contains configuration templates for an NGINX reverse proxy, cron, the start script, the systemd configuration, and the user authentication file. The documentation goes into more detail about the setup. A directory on the host machine is mapped as a volume to the /data directory in the Docker container; it contains runtime configuration files (users.db, reports.db) as well as log files. The main Docker container requires some additional runtime configuration, for example the PostgreSQL password, which can be passed as environment variables to Docker using the --env and --env-file flags. The following configuration environment variables exist, although most have a sensible default:

| Environment Variable | Default Value | Description |
| --- | --- | --- |
| DHIMMIS_ENVIRONMENT | | Server environment (PRODUCTION, TESTING, or PYTEST). This decides which PostgreSQL database to connect to (ocn, testing, and pytest (on the Docker container), respectively). This is usually set via the Docker image. |
| DHIMMIS_VERSION | | Software version. This is usually set via the Docker image. |
| DHIMMIS_USER_FILE | /data/users.db | Path to an SQLite3 file with users, passwords, and roles. |
| DHIMMIS_REPORT_FILE | /data/reports.db | File in which reports are stored during generation. |
| DHIMMIS_SECRET_FILE | /dev/null | File with JWT and app secret keys. These are randomly generated if not passed, but that is impractical for testing with hot reload (user sessions do not persist). For a production server, this should be empty. |
| DHIMMIS_PROXYCOUNT | 1 | Number of reverse proxies the server is behind. This is necessary for proper HTTP redirection and cookie paths. |
| DHIMMIS_PROXYPREFIX | / | Reverse proxy prefix. |
| FLASK_ACCESS_LOG | /data/access_log | Path to the access log. |
| FLASK_ERROR_LOG | /data/error_log | Path to the error log. |
| DHIMMIS_PORT | 8000 | Port on which gunicorn serves the content. Note: this is set via the Dockerfile, and is also only used in the Dockerfile. |
| PGHOST | localhost | PostgreSQL hostname. |
| PGPASSWORD | | PostgreSQL password. This is important to set and depends on how the database is set up. |
| PGPORT | 5432 | PostgreSQL port. |
| PGUSER | api | PostgreSQL user. |
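Putting this together, starting the server container could look as follows (the paths, names, and values are illustrative; refer to the templates in util/ and to deploy.sh for the actual setup):

```shell
# env.list holds the runtime configuration, e.g.:
#   PGHOST=damast-db
#   PGPASSWORD=changeme
#   DHIMMIS_PROXYCOUNT=1

# Map the host data directory to /data and pass the environment
sudo docker run -d \
    --name damast \
    --env-file env.list \
    -v /srv/damast/data:/data \
    -p 8000:8000 \
    damast:latest
```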
Releases: v1.1.5+history

Owner: University of Stuttgart Visualization Research Center