This program scrapes information and images for movies and TV shows.

Last update: Dec 05, 2021

Media-WebScraper

Summary

For more information on the program, read the WebScrape_help text file (this can also be accessed while running the program).

For a given list of media, the program scrapes and saves general information, images, and any episode information for each item in the list.

General Information (default):

Saved as a .txt file

This will scrape general information:

Title
Release date
Runtime
Genre
Director
Cast
Plot description

Additional information saved:

Source database used for scrape
ID for media in source database
Poster image link
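As a rough illustration of the .txt output, the sketch below renders a set of fields as one "Label: value" line each. The layout, the `format_info` helper and the sample values are assumptions made for this example; the program's actual file format may differ.

```python
def format_info(info):
    """Render a dict of fields as 'Label: value' lines.

    List values (for example genre or cast) are joined with commas.
    """
    lines = []
    for label, value in info.items():
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        lines.append("{}: {}".format(label, value))
    return "\n".join(lines) + "\n"


# Placeholder values, not real scraped data.
sample = {
    "Title": "Example Movie",
    "Genre": ["Drama", "Mystery"],
    "Source": "IMDb",
}
info_text = format_info(sample)
```

The resulting string could then be written to a file named after the media title.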

Images (default):

Saved as a .jpg file

This will scrape the poster.

Episode Information (if specified):

Saved as a .csv file

This will scrape the following information for each episode of a TV show:

Season number
Episode number
Episode title
Episode air date
Episode description
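The episode fields above map naturally onto CSV columns. The following is a minimal sketch using Python's standard `csv` module; the column names, the `write_episodes` helper and the sample row are hypothetical and are not taken from the program's source.

```python
import csv
import io

# Hypothetical column names mirroring the episode fields listed above.
FIELDS = ["season", "episode", "title", "air_date", "description"]


def write_episodes(rows, stream):
    """Write episode dictionaries to a text stream as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Placeholder row, not real scraped data.
episodes = [
    {
        "season": 1,
        "episode": 1,
        "title": "Example Episode",
        "air_date": "2021-01-01",
        "description": "Placeholder description.",
    },
]
buffer = io.StringIO()
write_episodes(episodes, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file instead of an in-memory buffer, open it with `newline=""` as the `csv` module documentation recommends.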

Features:

Multithreaded scraping of the media list, which greatly reduces the time taken to scrape large lists.
Can generate a media list from folders and files in a specified directory or from user input.
Can specify save location for scraped data.
Can specify search tags for media list for a more accurate scrape.
Can choose to scrape all episode information for a TV show.
Can detect data that has already been scraped, which makes scraping new media added to a previously scraped list very efficient.
Can recover missing scraped files without rescraping all data.
Can retry incomplete scrapes before exiting the program (successfully scraped files will not be altered or rescraped).
Currently only supports scraping data from IMDb.
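The multithreading and skip-if-already-scraped features can be sketched with a thread pool from the standard library. This is not the program's actual implementation: `scrape_one`, `scrape_all`, the `fetch` callable and the one-file-per-title layout are all assumptions made for illustration, and the sketch assumes a recent Python 3 with `pathlib` available.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def scrape_one(title, save_dir, fetch):
    """Scrape one title unless its info file already exists.

    `fetch` is a hypothetical callable that takes a title and returns
    the text to save; it stands in for the real network request.
    """
    target = Path(save_dir) / (title + ".txt")
    if target.exists():
        return title, "skipped"  # already scraped, leave the file untouched
    try:
        text = fetch(title)
        target.write_text(text, encoding="utf-8")
        return title, "scraped"
    except Exception:
        return title, "failed"  # candidate for a later retry


def scrape_all(titles, save_dir, fetch, workers=8):
    """Scrape all titles concurrently and return a {title: status} dict."""
    Path(save_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: scrape_one(t, save_dir, fetch), titles)
        return dict(results)
```

Because scraping is dominated by waiting on network responses, threads can overlap those waits. Titles reported as "failed" can be passed to `scrape_all` again to retry them without touching files that were saved successfully.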

Usage:

Currently a terminal-based program.

Running the program using Python:

Requirements: Python 3.2+ (additional libraries: requests, beautifulsoup4)

Running the program from the bundled executable file (created using PyInstaller):

Requirements: Windows 10
While running, the executable creates a 'temp' folder containing extracted libraries and support files in the same location as the program.
- The temporary files are deleted automatically, but if the program is closed abruptly, the files will remain.
- The 'temp' folder can be manually deleted after closing the program.
- (As of PyInstaller v4.7, a one-file bundled executable will leave any temp '_MEIxxxxxx' folders behind if the program is force closed.)

Updates:

For information on version history, read the HISTORY markdown file.

Releases (v1.3.0)

v1.3.0 (Dec 5, 2021)
WebScrape v1.3.0

See version history document for all changes.

Running the program using Python:

Download the source code.

Requirements:

Python 3.2+ (additional libraries: requests, beautifulsoup4)

Running the program from bundled executable:

Download the WebScrape-1.3.0 zip file containing the bundled executable (created using PyInstaller).

Requirements:

Windows 10

