Twitter Scraper

Overview

tweety

Twitter's official API is annoying to work with and has many limitations. Luckily, Twitter's frontend (JavaScript) has its own API, which I reverse-engineered: no API rate limits, no restrictions, and it is extremely fast.

Prerequisites

Before you begin, ensure you have met the following requirements:

  • Internet Connection
  • Python 3.6+
  • BeautifulSoup (Python Module)
  • Requests (Python Module)

All Functions

  • get_tweets()
  • get_user_info()
  • get_trends() (can be used without username)
  • search() (can be used without username)
  • tweet_detail() (can be used without username)

Using tweety

Getting Tweets:

Description:

Get 20 Tweets of a Twitter User

Required Parameter:

  • Username or user profile URL, passed when initializing the Twitter object

Optional Parameter:

  • pages : int (default is 1; pass 2 or more to fetch additional pages) -> get the specified number of pages of tweets
  • include_extras : boolean (default is False) -> get page extras such as Topics

Output:

  • Type -> dictionary
  • Structure
    {
      "p-1": {
        "result": {
          "tweets": []
        }
      },
      "p-2": {
        "result": {
          "tweets": []
        }
      }
    }

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> all_tweet = Twitter("Username or URL").get_tweets(pages=2)
>>> for i in all_tweet:
...   print(all_tweet[i])

Getting Trends:

Description:

Get 20 trends for the current locale

Output:

  • Type -> dictionary
  • Structure
", "url":" " }, { "name":" ", "url":" " } ] } ">
  {
    "trends":[
      {
        "name":"
      
       "
      ,
        "url":"
      
       "
      
      },
      {
        "name":"
      
       "
      ,
        "url":"
      
       "
      
      }
    ]
  } 

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> trends = Twitter().get_trends()
>>> for i in trends['trends']:
...   print(i['name'])

Searching a keyword:

Description:

Get 20 Tweets for a specific Keyword or Hashtag

Required Parameter:

  • keyword : str -> the keyword or hashtag to search for

Optional Parameter:

  • latest : boolean (Default is False) -> Get the latest tweets

Output:

  • Type -> list

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> results = Twitter().search("Pakistan")

Getting User Info:

Description:

Get information about the user

Required Parameter:

  • Username or User profile URL while initiating the Twitter Object

Optional Parameter:

  • banner_extensions : boolean (default is False) -> get more information about the user's banner image
  • image_extensions : boolean (default is False) -> get more information about the user's profile image

Output:

  • Type -> dict

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> trends = Twitter("Username or URL").get_user_info()

Getting a Tweet Detail:

Description:

Get the details of a tweet, including its replies

Required Parameter:

  • Identifier of the Tweet -> Either Tweet URL OR Tweet ID

Output:

  • Type -> dict
  • Structure
  {
    "conversation_threads":[],
    "tweet": {}
  }

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> detail = Twitter().tweet_detail("https://twitter.com/Microsoft/status/1442542812197801985")

Updates:

Update 0.1:

  • Get multiple pages of tweets using the pages parameter of get_tweets()
  • The output of get_tweets() has been reworked

Update 0.2:

Update 0.2.1:

  • Fixed Hashtag Search