Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
An instagram bot developed in Python with Selenium that helps you get more Instagram followers.

instabot An instagram bot developed in Python with Selenium that helps you get more Instagram followers. Install You’ll need to have: Python Selenium

65 Nov 22, 2022
Definitive Guide to Creating a SQL Database on Cloud with AWS and Python

Definitive Guide to Creating a SQL Database on Cloud with AWS and Python An easy-to-follow comprehensive guide on integrating Amazon RDS, MySQL Workbe

Kenneth Leung 6 Aug 17, 2022
this synchronizes my appearances with my calendar

Josh's Schedule Synchronizer Here's the "problem:" I use a spreadsheet to maintain all my public appearances. I check the spreadsheet as often as poss

Josh Long 2 Oct 18, 2021
Search twitter by address.

Twitter Geolocate Twitter Geolocation is a console app that generates twitter search querries for a certain geolocation and opens them in your standar

David J. Kowalk 28 Dec 06, 2022
An advanced telegram language translator bot

Made with Python3 (C) @FayasNoushad Copyright permission under MIT License License - https://github.com/FayasNoushad/Translator-Bot-V3/blob/main/LICE

Fayas Noushad 19 Dec 24, 2022
칼만 필터는 어렵지 않아(저자 김성필) 파이썬 코드(Unofficial)

KalmanFilter_Python 칼만 필터는 어렵지 않아(저자 김성필) 책을 공부하면서, Matlab 코드를 Python으로 변환한 것입니다. Contents Part01. Recursive Filter Chapter01. Average Filter Chapter0

Donghun Park 20 Oct 28, 2022
Yes, it's true :purple_heart: This repository has 353 stars.

Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serverless! If

510 Dec 28, 2022
Instagram bot that upload images for you which scrape posts from 9gag meme website or other Instagram users , which is 24/7 Automated Runnable.

Autonicgram Automates your Instagram posts by taking images from sites like 9gag or other Instagram accounts and posting it onto your page. Features A

Mastermind 20 Sep 17, 2022
Quickly visualize docker networks with graphviz.

Docker Network Graph Visualize the relationship between Docker networks and containers as a neat graphviz graph. Example Usage usage: docker-net-graph

Leo Verto 43 Dec 12, 2022
Pluggable Telethon - Telegram UserBot

A stable pluggable Telegram userbot, based on Telethon.

Team Ultroid 2.3k Dec 30, 2022
Telegram File Renamer Bot

RENAMER_BOT Telegram File Renamer Bot Configs TG_BOT_TOKEN - Get bot token from @BotFather API_ID - From my.telegram.org API_HASH - From my.telegram.o

Lntechnical 37 Dec 27, 2022
Exporta archivos masivamente del TEC Digital.

TEC Digital Files Exporter Script que permite exportar los archivos de cursos del TEC Digital del Instituto Tecnológico de Costa Rica, debido al borra

Joseph Vargas 22 Apr 08, 2021
Python library for interacting with the Wunderlist 2 REST API

Overview Wunderpy2 is a thin Python library for accessing the official Wunderlist 2 API. What does a thin library mean here? Only the bare minimum of

mieubrisse 24 Dec 29, 2020
An async-ready Python wrapper around FerrisChat's API.

FerrisWheel An async-ready Python wrapper around FerrisChat's API. Installation Instructions Linux: $ python3.9 -m pip install -U ferriswheel Python 3

FerrisChat 8 Feb 08, 2022
A minimalistic, modern Discord bot for roles and polls using dropdowns

DropBot A minimalistic, modern Discord bot for roles and polls using dropdowns Made by ThatOneCalculator Technologies used Instructions Type /, and na

ModernBots 1 Jun 27, 2022
TG-Url-Uploader-Bot - Telegram RoBot to Upload Links

MW-URL-Uploader Bot Telegram RoBot to Upload Links. Features: 👉 Only Auth Users

Aadhi 3 Jun 27, 2022
Bot to notify when vaccine appointments are available

Vaccine Watch Bot to notify when vaccine appointments are available. Supports checking Hy-Vee, Walgreens, CVS, Walmart, Cosentino's stores (KC), and B

Peter Carnesciali 37 Aug 13, 2022
Male' Map Telegram Bot

Male' Map TelegramBot A simple TelegramBot to fetch residential addresses in Male', Maldives. The bot can be queried inline or directly. sample .env f

Naail Abdul Rahman 12 Nov 25, 2022
A multipurpose bot designed to make Discord better for everyone, written in Python.

Hadum A multipurpose bot that makes Discord better for everyone Features A Fully Functional Moderation component: manage your staff, members and permi

1 Jan 25, 2022
A melhor maneira de atender seus clientes no Telegram!

Clientes.Chat Sobre o serviço Configuração Banco de Dados Variáveis de Ambiente Docker Python Heroku Contribuição Sobre o serviço A maneira mais organ

Gabriel R F 10 Oct 12, 2022