Returns unicode slugs

Overview

Python Slugify

A Python slugify application that handles unicode.

status-image version-image coverage-image

Overview

Best attempt to create slugs from unicode strings while keeping it DRY.

Notice

This module, by default installs and uses text-unidecode (GPL & Perl Artistic) for its decoding needs.

However, there is an alternative decoding package called Unidecode (GPL). It can be installed as python-slugify[unidecode] for those who prefer it.

How to install

easy_install python-slugify |OR| easy_install python-slugify[unidecode]
-- OR --
pip install python-slugify |OR| pip install python-slugify[unidecode]

Options

def slugify(
    text,
    entities=True,
    decimal=True,
    hexadecimal=True,
    max_length=0,
    word_boundary=False,
    separator='-',
    save_order=False,
    stopwords=(),
    regex_pattern=None,
    lowercase=True,
    replacements=()
  ):
  """
  Make a slug from the given text.
  :param text (str): initial text
  :param entities (bool): converts html entities to unicode (foo & bar -> foo-bar)
  :param decimal (bool): converts html decimal to unicode (Ž -> Ž -> z)
  :param hexadecimal (bool): converts html hexadecimal to unicode (Ž -> Ž -> z)
  :param max_length (int): output string length
  :param word_boundary (bool): truncates to end of full words (length may be shorter than max_length)
  :param save_order (bool): if parameter is True and max_length > 0 return whole words in the initial order
  :param separator (str): separator between words
  :param stopwords (iterable): words to discount
  :param regex_pattern (str): regex pattern for allowed characters
  :param lowercase (bool): activate case sensitivity by setting it to False
  :param replacements (iterable): list of replacement rules e.g. [['|', 'or'], ['%', 'percent']]
  :return (str): slugify text
  """

How to use

from slugify import slugify

txt = "This is a test ---"
r = slugify(txt)
self.assertEqual(r, "this-is-a-test")

txt = '影師嗎'
r = slugify(txt)
self.assertEqual(r, "ying-shi-ma")

txt = 'C\'est déjà l\'été.'
r = slugify(txt)
self.assertEqual(r, "c-est-deja-l-ete")

txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
self.assertEqual(r, "nin-hao-wo-shi-zhong-guo-ren")

txt = 'Компьютер'
r = slugify(txt)
self.assertEqual(r, "kompiuter")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=9)
self.assertEqual(r, "jaja-lol")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=15, word_boundary=True)
self.assertEqual(r, "jaja-lol-a")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=20, word_boundary=True, separator=".")
self.assertEqual(r, "jaja.lol.mememeoo.a")

txt = 'one two three four five'
r = slugify(txt, max_length=13, word_boundary=True, save_order=True)
self.assertEqual(r, "one-two-three")

txt = 'the quick brown fox jumps over the lazy dog'
r = slugify(txt, stopwords=['the'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')

txt = 'the quick brown fox jumps over the lazy dog in a hurry'
r = slugify(txt, stopwords=['the', 'in', 'a', 'hurry'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')

txt = 'thIs Has a stopword Stopword'
r = slugify(txt, stopwords=['Stopword'], lowercase=False)
self.assertEqual(r, 'thIs-Has-a-stopword')

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, regex_pattern=regex_pattern)
self.assertEqual(r, "___this-is-a-test___")

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, separator='_', regex_pattern=regex_pattern)
self.assertNotEqual(r, "_this_is_a_test_")

txt = '10 | 20 %'
r = slugify(txt, replacements=[['|', 'or'], ['%', 'percent']])
self.assertEqual(r, "10-or-20-percent")

txt = 'ÜBER Über German Umlaut'
r = slugify(txt, replacements=[['Ü', 'UE'], ['ü', 'ue']])
self.assertEqual(r, "ueber-ueber-german-umlaut")

For more examples, have a look at the test.py file.

Command Line Options

With the package, a command line tool called slugify is also installed.

It allows convenient command line access to all the features the slugify function supports. Call it with -h for help.

The command can take its input directly on the command line or from STDIN (when the --stdin flag is passed):

$ echo "Taking input from STDIN" | slugify --stdin
taking-input-from-stdin
$ slugify taking input from the command line
taking-input-from-the-command-line

Please note that when a multi-valued option such as --stopwords or --replacements is passed, you need to use -- as separator before you start with the input:

$ slugify --stopwords the in a hurry -- the quick brown fox jumps over the lazy dog in a hurry
quick-brown-fox-jumps-over-lazy-dog

Running the tests

To run the tests against the current environment:

python test.py

Contribution

Please read the (wiki) page prior to raising any PRs.

License

Released under a (MIT) license.

Version

X.Y.Z Version

`MAJOR` version -- when you make incompatible API changes,
`MINOR` version -- when you add functionality in a backwards-compatible manner, and
`PATCH` version -- when you make backwards-compatible bug fixes.

Sponsors

Neekware Inc.

Owner
Val Neekman
Val is a Solutions Architect with a passion for Typescript-Angular, Python-Django, UX and Performant Software. He is the Principal Consultant at Neekware Inc.
Val Neekman
A program that looks through entered text and replaces certain commands with mathematical symbols

TextToSymbolConverter A program that looks through entered text and replaces certain commands with mathematical symbols Example: Syntax: Enter text in

1 Jan 02, 2022
text-to-speach bot - You really do NOT have time for read a newsletter? Now you can listen to it

NewsletterReader You really do NOT have time for read a newsletter? Now you can listen to it The Newsletter of Filipe Deschamps is a great place to re

ItanuRomero 8 Sep 18, 2021
Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

SeatGeek 1.2k Jan 01, 2023
This is an AI that is supposed to say you if your text is formal or not

This is an AI that is supposed to say you if your text is formal or not. It's written in Python 3 and has some german examples (because I'm german yk) in the text.json file. This file contains the te

1 Jan 12, 2022
🍋 A Python package to process food

Pyfood is a simple Python package to process food, in different languages. Pyfood's ambition is to be the go-to library to deal with food, recipes, on

Local Seasonal 8 Apr 04, 2022
Implementation of hashids (http://hashids.org) in Python. Compatible with Python 2 and Python 3

hashids for Python 2.7 & 3 A python port of the JavaScript hashids implementation. It generates YouTube-like hashes from one or many numbers. Use hash

David Aurelio 1.4k Jan 02, 2023
基于Pytex的数学建模工具,实现将md文件转换成pdf/tex文档的前后端

Pytex-for-MCM 基于Pytex的数学建模工具,实现将md文件转换成pdf/tex文档的前后端。

3 May 17, 2021
This repository contains scripts to control a RGB text fan attached to a Raspberry Pi.

RGB Text Fan Controller This repository contains scripts to control a RGB text fan attached to a Raspberry Pi. Setup The Raspberry Pi and RGB text fan

Luke Prior 1 Oct 01, 2021
Hotpotato is a recipe portfolio App that assists users to discover and comment new recipes.

Hotpotato Hotpotato is a recipe portfolio App that assists users to discover and comment new recipes. It is a fullstack React App made with a Redux st

Nico G Pierson 13 Nov 05, 2021
Question answering on russian with XLMRobertaLarge as a service

QA Roberta Ru SaaS Question answering on russian with XLMRobertaLarge as a service. Thanks for the model to Alexander Kaigorodov. Stack Flask Gunicorn

Gladkikh Prohor 21 Jul 04, 2022
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt

Life4 3k Jan 02, 2023
Skype export archive to text converter for python

Skype export archive to text converter This software utility extracts chat logs

Roland Pihlakas open source projects 2 Jun 30, 2022
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

Niklas 29 Dec 20, 2022
This repos is auto action which generating a wordcloud made by Twitter.

auto_tweet_wordcloud This repos is auto action which generating a wordcloud made by Twitter. Preconditions Install Python dependencies pip install -r

tubone(Yu Otsubo) 0 Apr 29, 2022
Hspell, the free Hebrew spellchecker and morphology engine.

Hspell, the free Hebrew spellchecker and morphology engine.

16 Sep 15, 2022
Python Q&A for Network Engineers

Q & A I am often asked questions about how to solve this or that problem, and I decided to post these questions and solutions here, in case it is also

Natasha Samoylenko 30 Nov 15, 2022
Paranoid text spacing in Python

pangu.py Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width charact

Vinta Chen 194 Nov 19, 2022
strbind - lapidary text converter for translate an text file to the C-style string

strbind strbind - lapidary text converter for translate an text file to the C-style string. My motivation is fast adding large text chunks to the C co

Mihail Zaytsev 1 Oct 22, 2021
This is a text summarizing tool written in Python

Summarize Written by: Ling Li Ya This is a text summarizing tool written in Python. User Guide Some things to note: The application is accessible here

Marcus Lee 2 Feb 18, 2022
This project aims to test check if your RegExp are being matched by grep.

Bash RegExp This project aims to test check if your RegExp are being matched by grep. It's a local server that starts on the port 8080. It runs the se

Quatrecentquatre 1 Feb 28, 2022