Taking the fight to the establishment.

Overview

Throwdown

Taking the fight to the establishment.

Wat?

I wanted a simple markdown interpreter in python and/or javascript to output html for my website. Python does not have a bug-free official distribution, javascript only has things you install through npm and I don't want to have anything to do with node and the 100 MB of dependencies you end up uploading to your FTP server in order to do the most basic tasks.

So writing my own parser it is then eh? I tried to trudge through the commonmark markdown spec and had a heart attack at the complexity. 24722 words and 181 pages of complicated language explaining features I absolutely don't need.

I just want minimal, well defined, syntactical elements with maximum payoff, so here is throwdown. Taking the fight to the establishment to have a stupidly minimal markup language in both definition and capability.

Goals

Keep it a subset for markdown so we can use existing IDEs & plugins. Support HTML tags in line with text. Write a well defined language spec in the manual to ease creating new interpreters for it.

The spec

Tokenization

Given a piece of text we tokenize the following concepts (these are regular expressions using DOTALL and MULTILINE modifiers):

blank_line(s): (\r | \n | \r\n){2,}
html tag: <.*?>
code: ```.*?```
unescaped italic: ^_|(?

any characters inbetween matching tokens are flagged content.

Content itself gets an additional treatment where we replace this regex

\\(?)

for escaped characters, with whatever was in the matched group. I currently do this in the generation stage but it could move to any stage.

I am not sure if in the UTF8 2/3/4 byte characters any of these elements may match, so make sure to perform these single-characetr checks per unicode char, not per byte.

Parsing

We then have a parsing pass that tries to group matching tokens:

In this example:

This *word* is bold but this* is wrong.

We have the following tokens:

content, bold, content, bold, content, bold, content

The parser simply finds any content block surrounded by matching code|italic|bold neighbours, and then 'consumes' these neighbours so they can not be picked up more than once. Reading from left to right this means we get (note we search outwards from content recursively to support *_content_* notations, instead of holding on to the boundary tokens as soon as we encounter them):

content group content bold content

Then, any token outside of a group gets merged into it's content, any consecutive content gets merged into 1 content. The first step reduced the bold into it's left neighbour:

content group content content

The next step reduces the two content blocks into one:

content group content

The above step should include html tags.

A final step is to remove the blank line tokens, but first we must make sure to merge consecutive group and content blocks, because after this any consecutive content and/or group tokens are known unique paragraphs (or headers) so the blank lines are no longer necessary to imply this separation.

Generation

Then there is the generation step. We simply walk the resulting tokens and output a html document.

  • If a content group is preceded by a heading, the node gets wrapped into tags where n is the number of #.
  • Every other content node gets wrapped into

    tags.

  • Every group gets wrapped based on the first and last tokens (which are identical).
    • italic becomes In this case the wrapping is recursive, a bold group in an intalic group may exist.
    • bold becomes In this case the wrapping is recursive, an italic group in a bold group may exist.
    • code becomes

Write
to insert single line breaks manually.

TODO:

Consider bullet points and numbered lists, though the html is not super invasive.

Owner
Trevor van Hoof
I write tools & shaders TropicalTrevor in the Demoscene
Trevor van Hoof
Sabe is a python framework written for easy web server setup.

Sabe is a python framework written for easy web server setup. Sabe, kolay web sunucusu kurulumu için yazılmış bir python çerçevesidir. Öğrenmesi kola

2 Jan 01, 2022
My Solutions to 120 commonly asked data science interview questions.

Data_Science_Interview_Questions Introduction 👋 Here are the answers to 120 Data Science Interview Questions The above answer some is modified based

Milaan Parmar / Милан пармар / _米兰 帕尔马 181 Dec 31, 2022
An extremely configurable markdown reverser for Python3.

🔄 Unmarkd A markdown reverser. Unmarkd is a BeautifulSoup-powered Markdown reverser written in Python and for Python. Why This is created as a StackS

ThatXliner 5 Jun 27, 2022
Integer sets where all subsets have unique sums

Evil Sums Generation of sets of numbers where all constituents are recoverable from a partial sum.

Charlotte 5 Sep 24, 2022
Width-customizer-for-streamlit-apps - Width customizer for Streamlit Apps

🎈 Width customizer for Streamlit Apps As of now, you can only change your Strea

Charly Wargnier 5 Aug 09, 2022
This is the core of the program which takes 5k SYMBOLS and looks back N years to pull in the daily OHLC data of those symbols and saves them to disc.

This is the core of the program which takes 5k SYMBOLS and looks back N years to pull in the daily OHLC data of those symbols and saves them to disc.

Daniel Caine 1 Jan 31, 2022
Density is a open-sourced multi-purpose tool for ROBLOX with some cool

Density is a open-sourced multi-purpose tool for ROBLOX with some cool

ssl 5 Jul 16, 2022
An example repository for how to generate results using PyBaMM

PyBaMM results This repository provides a template for generating results (for example, for a paper) using PyBaMM Installation Install PyBaMM using a

PyBaMM Team 7 Oct 09, 2022
A Python script made for the Python Discord Pixels event.

Python Discord Pixels A Python script made for the Python Discord Pixels event. Usage Create an image.png RGBA image with your pattern. Transparent pi

Stanisław Jelnicki 4 Mar 23, 2022
The first Python 1v1.lol triggerbot working with colors !

1v1.lol TriggerBot Afin d'utiliser mon triggerbot, vous devez activer le plein écran sur 1v1.lol sur votre naviguateur (quelque-soit ce dernier). Vous

Venax 5 Jul 25, 2022
IPO Checker for NEPSE

IPO Checker Checks more than one account for an IPO. Usage: ipo_checker.py [-h] --file FILE IPO Checker for a list. optional arguments: -h, --help

Sagar Tamang 4 Sep 20, 2022
Tools for dos (denial-of-service) website / web server

DoS Attack Tools Tools for dos (denial-of-service) website / web server di buat olah NurvySec How to install on debian / ubuntu $ apt update $ apt ins

nurvy 1 Feb 10, 2022
Automatic certificate unpinning for Android apps

What is this? Script used to perform automatic certificate unpinning of an APK by adding a custom network security configuration that permits user-add

Antoine Neuenschwander 5 Jul 28, 2021
A Company Management System For Python

campany-management Getting started To make it easy for you to get started with GitLab, here's a list of recommended next steps. Already a pro? Just ed

hatice akpınar 3 Aug 29, 2022
A python library for writing parser-based interactive fiction.

About IntFicPy A python library for writing parser-based interactive fiction. Currently in early development. IntFicPy Docs Parser-based interactive f

Rita Lester 31 Nov 23, 2022
Cairo hooks for pre-commit

pre-commit-cairo Cairo hooks for pre-commit. See pre-commit for more details Using pre-commit-cairo with pre-commit Add this to your .pre-commit-confi

Fran Algaba 16 Sep 21, 2022
Originally used during Marketplace.tf's open period, this program was used to get the profit of items bought with keys and sold for dollars.

Originally used during Marketplace.tf's open period, this program was used to get the profit of items bought with keys and sold for dollars. Practically useless for me now, but can be used as an exam

BoggoTV 1 Dec 11, 2021
One destination for all the developer's learning resources.

DevResources One destination for all the developer's learning resources. Find all of your learning resources under one roof and add your own. Live ✨ Y

Gaurav Sharma 33 Oct 21, 2022
Vaccine for STOP/DJVU ransomware, prevents encryption

STOP/DJVU Ransomware Vaccine Prevents STOP/DJVU Ransomware from encrypting your files. This tool does not prevent the infection itself. STOP ransomwar

Karsten Hahn 16 May 31, 2022
Urban Big Data Centre Housing Sensor Project

Housing Sensor Project The Urban Big Data Centre is conducting a study of indoor environmental data in Scottish houses. We are using Raspberry Pi devi

Jeremy Singer 2 Dec 13, 2021