Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

Overview

TextBlob: Simplified Text Processing

Homepage: https://textblob.readthedocs.io/

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

from textblob import TextBlob

text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''

blob = TextBlob(text)
blob.tags           # [('The', 'DT'), ('titular', 'JJ'),
                    #  ('threat', 'NN'), ('of', 'IN'), ...]

blob.noun_phrases   # WordList(['titular threat', 'blob',
                    #            'ultimate movie monster',
                    #            'amoeba-like mass', ...])

for sentence in blob.sentences:
    print(sentence.sentiment.polarity)
# 0.060
# -0.341

TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
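
For example, here is a minimal sketch of driving TextBlob's tokenization with an NLTK tokenizer via the documented tokenizer argument (the commented output is illustrative):

from textblob import TextBlob
from nltk.tokenize import BlanklineTokenizer

# Use an NLTK tokenizer to control how the blob is tokenized.
blob = TextBlob("A first paragraph.\n\nA second paragraph.",
                tokenizer=BlanklineTokenizer())
blob.tokens   # WordList(['A first paragraph.', 'A second paragraph.'])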

Features

  • Noun phrase extraction
  • Part-of-speech tagging
  • Sentiment analysis
  • Classification (Naive Bayes, Decision Tree)
  • Tokenization (splitting text into words and sentences)
  • Word and phrase frequencies
  • Parsing
  • n-grams
  • Word inflection (pluralization and singularization) and lemmatization
  • Spelling correction
  • Add new models or languages through extensions
  • WordNet integration
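
A quick sketch of several of these features, using documented TextBlob APIs (the commented outputs are illustrative):

from textblob import TextBlob, Word

blob = TextBlob("TextBlob makes text processing simple. It builds on NLTK and pattern.")

blob.words                    # tokenization -> WordList(['TextBlob', 'makes', ...])
blob.sentences                # sentence splitting
blob.ngrams(n=2)              # n-grams
blob.word_counts['it']        # word frequencies (keys are lowercased)

Word("cat").pluralize()       # 'cats'  -- inflection
Word("went").lemmatize("v")   # 'go'    -- lemmatization
Word("bank").synsets          # WordNet integration
TextBlob("I havv goood speling!").correct()   # spelling correction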

Get it now

$ pip install -U textblob
$ python -m textblob.download_corpora

Examples

See more examples at the Quickstart guide.

Documentation

Full documentation is available at https://textblob.readthedocs.io/.

Requirements

  • Python >= 2.7 or >= 3.5

Project Links

  • PyPI: https://pypi.org/project/textblob/
  • Issues: https://github.com/sloria/TextBlob/issues

License

MIT licensed. See the bundled LICENSE file for more details.

Comments
  • HTTP Error 503: Service Unavailable while using detect_language() and translate() from textblob

    python:3.5 textblob:0.15.1

    It seems this happened before and was fixed in #148.

    The detailed log:

      File "/usr/local/lib/python3.5/site-packages/textblob/blob.py", line 562, in detect_language
        return self.translator.detect(self.raw)
      File "/usr/local/lib/python3.5/site-packages/textblob/translate.py", line 72, in detect
        response = self._request(url, host=host, type_=type_, data=data)
      File "/usr/local/lib/python3.5/site-packages/textblob/translate.py", line 92, in _request
        resp = request.urlopen(req)
      File "/usr/local/lib/python3.5/urllib/request.py", line 163, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/local/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/usr/local/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/local/lib/python3.5/urllib/request.py", line 504, in error
        result = self._call_chain(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 696, in http_error_302
        return self.parent.open(new, timeout=req.timeout)
      File "/usr/local/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/usr/local/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/local/lib/python3.5/urllib/request.py", line 510, in error
        return self._call_chain(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 590, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)

    bug please-help 
    opened by craigchen1990 21
  • Language Detection Not Working (HTTP Error 503: Service Unavailable)

    from textblob import TextBlob
    txt = u"Test Language Detection"
    b = TextBlob(txt)
    b.detect_language()

    It is giving "HTTPError: HTTP Error 503: Service Unavailable"

    Python version: 2.7.6
    TextBlob version: 0.11.1
    OS: Ubuntu 14.04 LTS & CentOS 6.8

    opened by manurajhada 20
  • ModuleNotFoundError: No module named '_sqlite3'

    Hello,

    I'm migrating my script from my Mac to an AWS Linux instance. I upgraded the AWS instance to Python 3.6 before importing packages, including textblob. Now I get this error and cannot find where it's coming from. I'm not the greatest Python programmer, but I did have it running perfectly on my Mac before installing it on AWS.

    Here's the entire Traceback:

    Traceback (most recent call last): File "wikiparser20170801.py", line 8, in from textblob import TextBlob File "/usr/local/lib/python3.6/site-packages/textblob/init.py", line 9, in from .blob import TextBlob, Word, Sentence, Blobber, WordList File "/usr/local/lib/python3.6/site-packages/textblob/blob.py", line 28, in import nltk File "/usr/local/lib/python3.6/site-packages/nltk/init.py", line 137, in from nltk.stem import * File "/usr/local/lib/python3.6/site-packages/nltk/stem/init.py", line 29, in from nltk.stem.snowball import SnowballStemmer File "/usr/local/lib/python3.6/site-packages/nltk/stem/snowball.py", line 26, in from nltk.corpus import stopwords File "/usr/local/lib/python3.6/site-packages/nltk/corpus/init.py", line 66, in from nltk.corpus.reader import * File "/usr/local/lib/python3.6/site-packages/nltk/corpus/reader/init.py", line 105, in from nltk.corpus.reader.panlex_lite import * File "/usr/local/lib/python3.6/site-packages/nltk/corpus/reader/panlex_lite.py", line 15, in import sqlite3 File "/usr/local/lib/python3.6/sqlite3/init.py", line 23, in from sqlite3.dbapi2 import * File "/usr/local/lib/python3.6/sqlite3/dbapi2.py", line 27, in from _sqlite3 import * ModuleNotFoundError: No module named '_sqlite3

    opened by arnieadm35 17
  • correct() returns empty object

    I tried using spell checking, but the correct() method returns an empty object. The following shows the method call in a terminal:

    >>> from textblob import TextBlob
    >>> b = TextBlob("I havv goood speling!")
    >>> b.correct()
    TextBlob("")
    >>> print(b.correct())
    
    >>> 
    

    I couldn't find a fix to this. I'm running Python 2.7.6 on Linux.

    opened by shubhams 14
  • Translation not working - NotTranslated: Translation API returned the input string unchanged.

    Hi, the translation is not working. Thanks in advance.

    In [1]: from textblob import TextBlob

    In [2]: en_blob = TextBlob(u'Simple is better than complex.')

    In [3]: en_blob.translate(to='es')

    NotTranslated                             Traceback (most recent call last)
    in ()
    ----> 1 en_blob.translate(to='es')

    /usr/local/lib/python2.7/dist-packages/textblob-0.11.0-py2.7.egg/textblob/blob.pyc in translate(self, from_lang, to)
        507         from_lang = self.translator.detect(self.string)
        508         return self.__class__(self.translator.translate(self.raw,
    --> 509             from_lang=from_lang, to_lang=to))
        510
        511     def detect_language(self):

    /usr/local/lib/python2.7/dist-packages/textblob-0.11.0-py2.7.egg/textblob/translate.pyc in translate(self, source, from_lang, to_lang, host, type_)
         43         return self.get_translation_from_json5(json5)
         44     else:
    ---> 45         raise NotTranslated('Translation API returned the input string unchanged.')
         46
         47 def detect(self, source, host=None, type=None):

    NotTranslated: Translation API returned the input string unchanged.

    opened by edgaralts 13
  • Add Greedy Average Perceptron POS tagger

    Hi,

    I'm preparing a pull request for you, for a new POS tagger. This is the first time I've tried to contribute to someone else's project, so probably there'll be some weird teething pain stuff. Also I spend all day writing research code, so maybe parts of my style are atrocious :p.

    The two main files are:

    https://github.com/syllog1sm/TextBlob/blob/feature/greedy_ap_tagger/text/taggers.py
    https://github.com/syllog1sm/TextBlob/blob/feature/greedy_ap_tagger/text/_perceptron.py

    I'm not quite done, but it's passing tests and its numbers are much better than the taggers you currently have hooks for:

    NLTKTagger: 94.0 / 3m52
    PatternTagger: 93.5 / 26s
    PerceptronTagger: 96.8 / 16s

    Accuracy figures refer to sections 22-24 of the Wall Street Journal, a common English evaluation. There's a table of some accuracies from the literature here: http://aclweb.org/aclwiki/index.php?title=POS_Tagging_(State_of_the_art) . Speeds refer to time taken to tag the 129,654 words of input, including initialisation, on my Macbook Air.

    If you check out that link, you'll see that the tagger's about 1% short of the pace for state-of-the-art accuracy. My Cython implementation has slightly better results, about 97.1, and it's a fair bit faster too. It's not very difficult to add some of the extra features to the Python implementation, or to improve its efficiency. Or we could hook in the Cython implementation, although that comes with much more baggage.

    I think it's nice having the tagger in ~200 lines of pure Python though, with no dependencies. It should be fairly language independent too --- I'll run some tests to see how it does.
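
    For context, a minimal sketch of how TextBlob exposes its tagger hook today, using the documented pos_tagger argument and the existing NLTKTagger/PatternTagger classes (the commented output is illustrative):

    from textblob import TextBlob
    from textblob.taggers import NLTKTagger, PatternTagger

    text = "The quick brown fox jumps over the lazy dog."

    # Swap in a different tagger implementation via the pos_tagger hook.
    nltk_blob = TextBlob(text, pos_tagger=NLTKTagger())
    pattern_blob = TextBlob(text, pos_tagger=PatternTagger())

    nltk_blob.tags      # e.g. [('The', 'DT'), ('quick', 'JJ'), ...]
    pattern_blob.tags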

    opened by syllog1sm 13
  • Error in translation

    Sometimes I get a URL not found error:

    File "/usr/local/lib/python3.8/dist-packages/textblob/blob.py", line 546, in translate return self.class(self.translator.translate(self.raw, File "/usr/local/lib/python3.8/dist-packages/textblob/translate.py", line 54, in translate response = self.request(url, host=host, type=type_, data=data) File "/usr/local/lib/python3.8/dist-packages/textblob/translate.py", line 92, in _request resp = request.urlopen(req) File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen return opener.open(url, data, timeout) File "/usr/lib/python3.8/urllib/request.py", line 531, in open response = meth(req, response) File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response response = self.parent.error( File "/usr/lib/python3.8/urllib/request.py", line 569, in error return self._call_chain(*args) File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain result = func(*args) File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default raise HTTPError(req.full_url, code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 404: Not Found

    opened by mannan291 12
  • NaiveBayesClassifier taking too long

    Hi, I have a small dataset of 1,000 tweets which I've classified as pos/neg for training. When I try to use it with NaiveBayesClassifier(), it takes about 10-15 minutes to return a result. Is there a way to save the result of the classifier, like a dump, and reuse it for further classifications?

    Thanks
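
    A minimal sketch of one common workaround, assuming training data in the (text, label) format the classifier accepts; persisting the trained classifier with pickle is an illustration rather than a TextBlob-specific feature:

    import pickle
    from textblob.classifiers import NaiveBayesClassifier

    # Made-up training data in the (text, label) format the classifier expects.
    train = [
        ("I love this sandwich.", "pos"),
        ("I do not like this restaurant.", "neg"),
    ]

    cl = NaiveBayesClassifier(train)   # the slow step -- do it once

    with open("classifier.pickle", "wb") as f:   # save the trained classifier
        pickle.dump(cl, f)

    with open("classifier.pickle", "rb") as f:   # reload later without retraining
        cl = pickle.load(f)

    cl.classify("This is an amazing library!")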

    opened by canivel 12
  • Deploying TextBlob on remote server

    Hi,

    I am trying to deploy TextBlob on a remote server hosted on Heroku. Heroku, to my knowledge, uses pip freeze > requirements.txt to understand the dependencies and install them on the remote server.

    The code works perfectly on my local machine, but on the remote server it looks for the NLTK corpora and throws an exception.

    How do I install TextBlob's dependencies on the remote server?

    I am using virtualenv
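
    One possible approach (a sketch, not an official recipe) is to fetch the corpora at build or release time instead of relying on a pre-populated data directory; the corpus list below is an assumption based on what python -m textblob.download_corpora installs:

    # Hypothetical one-off script run during the build/release phase.
    import nltk

    # Assumed list of corpora TextBlob needs; `python -m textblob.download_corpora`
    # remains the canonical way to fetch them.
    for corpus in ["brown", "punkt", "wordnet", "averaged_perceptron_tagger",
                   "conll2000", "movie_reviews"]:
        nltk.download(corpus)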

    opened by seekshreyas 11
  • since 0.7.1 having trouble with the package

    On both my Mac and Linux machines I have the same problem with 0.7.1.

    from text.blob import TextBlob

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "text.py", line 5, in <module>
        from text.blob import TextBlob
    ImportError: No module named blob

    my sys.path does not contain the textblob module

    import sys
    for p in sys.path:
    ...     print p
    ...

    /Library/Python/2.7/site-packages/ipython-2.0.0_dev-py2.7.egg
    /Library/Python/2.7/site-packages/matplotlib-1.3.0-py2.7-macosx-10.8-intel.egg
    /Library/Python/2.7/site-packages/numpy-1.9.0.dev_fde3dee-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/pandas-0.12.0_485_g02612c3-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/pymc-2.3a-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/scikit_learn-0.14_git-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/scipy-0.14.0.dev_4938da3-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/statsmodels-0.6.0-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/readline-6.2.4.1-py2.7-macosx-10.7-intel.egg
    /Library/Python/2.7/site-packages/nose-1.3.0-py2.7.egg
    /Library/Python/2.7/site-packages/six-1.4.1-py2.7.egg
    /Library/Python/2.7/site-packages/pyparsing-1.5.7-py2.7.egg
    /Library/Python/2.7/site-packages/pytz-2013.7-py2.7.egg
    /Library/Python/2.7/site-packages/pyzmq-13.1.0-py2.7-macosx-10.6-intel.egg
    /Library/Python/2.7/site-packages/pika-0.9.13-py2.7.egg
    /Library/Python/2.7/site-packages/Jinja2-2.7.1-py2.7.egg
    /Library/Python/2.7/site-packages/MarkupSafe-0.18-py2.7-macosx-10.8-intel.egg
    /Library/Python/2.7/site-packages/patsy-0.2.1-py2.7.egg
    /Library/Python/2.7/site-packages/Pygments-1.6-py2.7.egg
    /Library/Python/2.7/site-packages/Sphinx-1.2b3-py2.7.egg
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python27.zip
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages
    /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload
    /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/PyObjC
    /Library/Python/2.7/site-packages

    despite it being there. I have uninstalled and reinstalled and tried all sorts of things:

    mbpdar:deaas daren$ ls /Library/Python/2.7/site-packages/te*
    /Library/Python/2.7/site-packages/text:

    /Library/Python/2.7/site-packages/textblob-0.7.1-py2.7.egg-info:

    I've verified that __init__.py doesn't have odd characters. If I change to the /Library/Python/2.7/site-packages/text folder I am able to import:

    mbpdar:deaas daren$ cd /Library/Python/2.7/site-packages/text
    mbpdar:text daren$ python
    Python 2.7.2 (default, Oct 11 2012, 20:14:37)
    [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.

    from text.blob import TextBlob

    I cannot figure out what changed that might cause this.

    Thanks in advance Daren

    opened by darenr 11
  • (after 0.5.1) - AttributeError: 'module' object has no attribute 'compat'

    Traceback (most recent call last): File "sentiment.py", line 1, in from text.blob import TextBlob File "/usr/local/lib/python2.7/dist-packages/text/blob.py", line 149, in @nltk.compat.python_2_unicode_compatible AttributeError: 'module' object has no attribute 'compat'

    bug 
    opened by ghost 11
  • Getting wrong value

    from textblob import TextBlob
    
    text = "Hi, I'm from Canada"
    text2 = TextBlob(text)
    Correct = text2.correct()
    print(Correct)
    

    Hi, when I run the above code I get the output I, I"m from Canada

    which is wrong. Am I doing something wrong here? Please help.

    opened by Mank0o 0
  • Joining TextBlobs / Sentence

    Not sure if this is a bug or a feature request. I do enjoy that I can concatenate your objects like strings. Now I wanted to concatenate a list, but ran into the following:

    " ".join(storedSentences)
    

    Getting: TypeError: sequence item 0: expected str instance, Sentence found

    Unfortunately, the other way does not work either:

    Sentence(" ").join(storedSentences)
    

    TypeError: sequence item 0: expected str instance, Sentence found

    Maybe I am doing it wrong?

    PS: Great library! Especially the TextBlobs Are Like Python Strings! makes things really easy, thanks for implementing that :)
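
    A minimal sketch of one way to join Sentence objects back into a single string, converting them to plain str first (an illustration, not an official API):

    from textblob import TextBlob

    blob = TextBlob("TextBlob is simple. It plays nicely with NLTK.")
    stored_sentences = blob.sentences   # list of Sentence objects

    # str.join expects str items, so convert each Sentence explicitly.
    joined = " ".join(str(s) for s in stored_sentences)
    # equivalently: " ".join(s.raw for s in stored_sentences)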

    opened by thomasf1 0
  • Modify TextBlob sentiment prediction algorithm

    I am trying to work on a use case which requires predicting polarity, but the results are not accurate. Our main focus is on negative inputs, but it is unable to identify them with confidence. I tried to go through the GitHub code base and understand how exactly the sentiment is predicted by the algorithm, but was unable to get a clear picture.

    So I have 3 questions:

    1. Can we modify and retrain the algorithm by passing more training data? If yes, how can we do that? (See the sketch after this list.)

    2. TextBlob does sentiment analysis using Naive Bayes, but what I want to understand is what steps happen after passing the data to tb = TextBlob(data) and then calling tb.sentiment on it. I would really appreciate a detailed list of the steps, including preprocessing, etc.

    3. I am performing the following preprocessing steps before passing the data to TextBlob:

      • removing numbers, dates, months, urls, hashtags, mentions, etc
      • lowercasing,
      • removing punctuation marks
      • stop word removal, and converting negated words like "don't" to just "not" (since "do" is a stop word), etc.

      Can you suggest whether removing or adding any of the above steps will lead to greater confidence and accuracy in polarity prediction?
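
    On question 1, a minimal sketch of the two documented hooks for influencing sentiment and classification: passing a different analyzer to TextBlob, and training textblob.classifiers.NaiveBayesClassifier on your own labelled data (the training examples here are made up):

    from textblob import TextBlob
    from textblob.sentiments import NaiveBayesAnalyzer
    from textblob.classifiers import NaiveBayesClassifier

    # 1) Swap the default PatternAnalyzer for the NaiveBayesAnalyzer
    #    (trained on the NLTK movie reviews corpus).
    blob = TextBlob("The support team never responds.", analyzer=NaiveBayesAnalyzer())
    blob.sentiment   # Sentiment(classification=..., p_pos=..., p_neg=...)

    # 2) Train a classifier on your own labelled examples (made-up data).
    train = [
        ("The delivery was late and the package was damaged.", "neg"),
        ("Great service, arrived on time.", "pos"),
    ]
    cl = NaiveBayesClassifier(train)
    cl.classify("The package never arrived.")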

    opened by Deepankar-98 0
  • Errors occurred when using Naive Bayes for sentiment classification

    1. As in the title, when I use the Naive Bayes classifier for sentiment classification, once the amount of data exceeds 10,000 items the process is automatically killed by the system because of the excessive amount of data; there is no problem when the amount of data is small.

    2. How do you save a trained naïve Bayes model?

    opened by yaoysyao 0
  • Detecting language  / get HTTP Error 400?

    Hello, I am trying to detect the language of a word with this code:

    from textblob import TextBlob
    b = TextBlob("bonjour")
    print(b.detect_language())
    

    But unfortunately I get this error:

    $ python exmplTextBlob.py
    Traceback (most recent call last):
      File "C:\Users\Polzi\Documents\DEV\Python-Diverses\Textblob\exmplTextBlob.py", line 4, in <module>
        print(b.detect_language())
      File "C:\Users\Polzi\Documents\DEV\.venv\test\lib\site-packages\textblob\blob.py", line 597, in detect_language
        return self.translator.detect(self.raw)
      File "C:\Users\Polzi\Documents\DEV\.venv\test\lib\site-packages\textblob\translate.py", line 76, in detect
        response = self._request(url, host=host, type_=type_, data=data)
      File "C:\Users\Polzi\Documents\DEV\.venv\test\lib\site-packages\textblob\translate.py", line 96, in _request
        resp = request.urlopen(req)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 523, in open
        response = meth(req, response)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 632, in http_response
        response = self.parent.error(
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 561, in error
        return self._call_chain(*args)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 641, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 400: Bad Request
    
    

    Why is that? Am I doing anything wrong?

    opened by Rapid1898-code 2
  • Python 3.9 compatibility

    Hello, I want to thank you for the project. Checking on PyPI, I found that it is listed as compatible up to Python 3.8; however, I am on 3.9 and it works properly. I would like to know how this can be updated on PyPI. I also take this opportunity to ask whether TextBlob will have support for 3.11, which will be out soon. Thanks.

    opened by xasg 0
Owner

Steven Loria