Returns unicode slugs

Overview

Python Slugify

A Python slugify application that handles unicode.

status-image version-image coverage-image

Overview

Best attempt to create slugs from unicode strings while keeping it DRY.

Notice

This module, by default installs and uses text-unidecode (GPL & Perl Artistic) for its decoding needs.

However, there is an alternative decoding package called Unidecode (GPL). It can be installed as python-slugify[unidecode] for those who prefer it.

How to install

easy_install python-slugify |OR| easy_install python-slugify[unidecode]
-- OR --
pip install python-slugify |OR| pip install python-slugify[unidecode]

Options

def slugify(
    text,
    entities=True,
    decimal=True,
    hexadecimal=True,
    max_length=0,
    word_boundary=False,
    separator='-',
    save_order=False,
    stopwords=(),
    regex_pattern=None,
    lowercase=True,
    replacements=()
  ):
  """
  Make a slug from the given text.
  :param text (str): initial text
  :param entities (bool): converts html entities to unicode (foo & bar -> foo-bar)
  :param decimal (bool): converts html decimal to unicode (Ž -> Ž -> z)
  :param hexadecimal (bool): converts html hexadecimal to unicode (Ž -> Ž -> z)
  :param max_length (int): output string length
  :param word_boundary (bool): truncates to end of full words (length may be shorter than max_length)
  :param save_order (bool): if parameter is True and max_length > 0 return whole words in the initial order
  :param separator (str): separator between words
  :param stopwords (iterable): words to discount
  :param regex_pattern (str): regex pattern for allowed characters
  :param lowercase (bool): activate case sensitivity by setting it to False
  :param replacements (iterable): list of replacement rules e.g. [['|', 'or'], ['%', 'percent']]
  :return (str): slugify text
  """

How to use

from slugify import slugify

txt = "This is a test ---"
r = slugify(txt)
self.assertEqual(r, "this-is-a-test")

txt = '影師嗎'
r = slugify(txt)
self.assertEqual(r, "ying-shi-ma")

txt = 'C\'est déjà l\'été.'
r = slugify(txt)
self.assertEqual(r, "c-est-deja-l-ete")

txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
self.assertEqual(r, "nin-hao-wo-shi-zhong-guo-ren")

txt = 'Компьютер'
r = slugify(txt)
self.assertEqual(r, "kompiuter")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=9)
self.assertEqual(r, "jaja-lol")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=15, word_boundary=True)
self.assertEqual(r, "jaja-lol-a")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=20, word_boundary=True, separator=".")
self.assertEqual(r, "jaja.lol.mememeoo.a")

txt = 'one two three four five'
r = slugify(txt, max_length=13, word_boundary=True, save_order=True)
self.assertEqual(r, "one-two-three")

txt = 'the quick brown fox jumps over the lazy dog'
r = slugify(txt, stopwords=['the'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')

txt = 'the quick brown fox jumps over the lazy dog in a hurry'
r = slugify(txt, stopwords=['the', 'in', 'a', 'hurry'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')

txt = 'thIs Has a stopword Stopword'
r = slugify(txt, stopwords=['Stopword'], lowercase=False)
self.assertEqual(r, 'thIs-Has-a-stopword')

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, regex_pattern=regex_pattern)
self.assertEqual(r, "___this-is-a-test___")

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, separator='_', regex_pattern=regex_pattern)
self.assertNotEqual(r, "_this_is_a_test_")

txt = '10 | 20 %'
r = slugify(txt, replacements=[['|', 'or'], ['%', 'percent']])
self.assertEqual(r, "10-or-20-percent")

txt = 'ÜBER Über German Umlaut'
r = slugify(txt, replacements=[['Ü', 'UE'], ['ü', 'ue']])
self.assertEqual(r, "ueber-ueber-german-umlaut")

For more examples, have a look at the test.py file.

Command Line Options

With the package, a command line tool called slugify is also installed.

It allows convenient command line access to all the features the slugify function supports. Call it with -h for help.

The command can take its input directly on the command line or from STDIN (when the --stdin flag is passed):

$ echo "Taking input from STDIN" | slugify --stdin
taking-input-from-stdin
$ slugify taking input from the command line
taking-input-from-the-command-line

Please note that when a multi-valued option such as --stopwords or --replacements is passed, you need to use -- as separator before you start with the input:

$ slugify --stopwords the in a hurry -- the quick brown fox jumps over the lazy dog in a hurry
quick-brown-fox-jumps-over-lazy-dog

Running the tests

To run the tests against the current environment:

python test.py

Contribution

Please read the (wiki) page prior to raising any PRs.

License

Released under a (MIT) license.

Version

X.Y.Z Version

`MAJOR` version -- when you make incompatible API changes,
`MINOR` version -- when you add functionality in a backwards-compatible manner, and
`PATCH` version -- when you make backwards-compatible bug fixes.

Sponsors

Neekware Inc.

Owner
Val Neekman
Val is a Solutions Architect with a passion for Typescript-Angular, Python-Django, UX and Performant Software. He is the Principal Consultant at Neekware Inc.
Val Neekman
CowExcept - Spice up those exceptions with cowexcept!

CowExcept - Spice up those exceptions with cowexcept!

James Ansley 41 Jun 30, 2022
The Scary Story - A Text Adventure

This is a text adventure which I made in python 3. This is one of my first big projects so any feedback would be greatly appreciated.

2 Feb 20, 2022
PyNews 📰 Simple newsletter made with python 🐍🗞️

PyNews 📰 Simple newsletter made with python Install dependencies This project has some dependencies (see requirements.txt) that are not included in t

Luciano Felix 4 Aug 21, 2022
Search for terms(word / table / field name or any) under Snowflake schema names

snowflake-search-terms-in-ddl-views Search for terms(word / table / field name or any) under Snowflake schema names Version : 1.0v How to use ? Run th

Igal Emona 1 Dec 15, 2021
An implementation of figlet written in Python

All of the documentation and the majority of the work done was by Christopher Jones ([emai

Peter Waller 1.1k Jan 02, 2023
ChirpText is a collection of text processing tools for Python 3.

ChirpText is a collection of text processing tools for Python 3. It is not meant to be a powerful tank like the popular NTLK but a small package which

Le Tuan Anh 5 Nov 30, 2022
基于Pytex的数学建模工具,实现将md文件转换成pdf/tex文档的前后端

Pytex-for-MCM 基于Pytex的数学建模工具,实现将md文件转换成pdf/tex文档的前后端。

3 May 17, 2021
Adventura is an open source Python Text Adventure Engine

Adventura Adventura is an open source Python Text Adventure Engine, Not yet uplo

5 Oct 02, 2022
This repository contains scripts to control a RGB text fan attached to a Raspberry Pi.

RGB Text Fan Controller This repository contains scripts to control a RGB text fan attached to a Raspberry Pi. Setup The Raspberry Pi and RGB text fan

Luke Prior 1 Oct 01, 2021
Repositori untuk belajar pemrograman Python dalam bahasa Indonesia

Python Repositori ini berisi kumpulan dari berbagai macam contoh struktur data, algoritma dan komputasi matematika yang diimplementasikan dengan mengg

Bellshade 111 Dec 19, 2022
utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresses and hashtags.

utoken utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresse

Ulf Hermjakob 11 Jan 05, 2023
This script has been created in order to find what are the most common demanded technologies in Data Engineering field.

This is a Python script that given a whole corpus of job descriptions and a file with keywords it extracts the number of number of ocurrences of these keywords and write it to a file. This script it

Antonio Bri Pérez 0 Jul 17, 2022
PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different languages

PyMultiDictionary PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different

Pablo Pizarro R. 19 Dec 26, 2022
Little python script + dictionary to help solve Wordle puzzles

Wordle Solver Little python script + dictionary to help solve Wordle puzzles Usage Usage: ./wordlesolver.py [letters in word] [letters not in word] [p

Luke Stephens (hakluke) 4 Jul 24, 2022
一个可以可以统计群组用户发言,并且能将聊天内容生成词云的机器人

当前版本 v2.2 更新维护日志 更新维护日志 有问题请加群组反馈 Telegram 交流反馈群组 点击加入 演示 配置要求 内存:1G以上 安装方法 使用 Docker 安装 Docker官方安装

机器人总动员 117 Dec 29, 2022
Getting git-style versioning working on RDFlib

Getting git-style versioning working on RDFlib

Gabe Fierro 1 Feb 01, 2022
A simple text editor for linux

wolf-editor A simple text editor for linux Installing using Deb Package Download newest package from releases CD into folder where the downloaded acka

Focal Fossa 5 Nov 30, 2021
Migrates translations to the REDCap native Multi-Language Management system

Automates much of the process of moving translations from the old Multilingual external module to the newer built-in Multi-Language Management (MLM) page.

UCI MIND 3 Sep 27, 2022
Making simplex testing clean and simple

Making Simplex Project Testing - Clean and Simple What does this repo do? It organizes the python stack for the coding project What do I need to do in

Mohit Mahajan 1 Jan 30, 2022