A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items

Last update: Dec 13, 2021

Overview

combined-shop-scraper

A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items.

Features

Define an input file components.json with components to be scraped and the source urls
Find the cheapest order combination including the shipping prices
Get alarm prices when single components are below a defined price
Easily expand for new shops (scraping basic know-how required). Default basic support for notebooksbilliger, cyberport and future-x

Usage

JSON file definition

The default name of the input JSON file is components.json and must be located in the same folder as scraper.py. This is the basic structure of the file:

{
  "component1": {
    "alarm_price": 260,
    "quantity": 1,
    "urls": [
      "https://www.someshop.com/component1",
      "https://www.someshop.com/component1-alternative",
      "https://www.anothershop.com/component1-alternative"]
  },
  "component2": {
    "urls": [
      "https://www.someshop.com/component2",
      "https://www.anothershop.com/component2",
      "https://www.onemoreshop.com/component2"]
  }

The component name and at least one url are mandatory. It is possible to add several urls from the same shop for the same component if there are some alternatives for this. The quantity of each component defaults to 1, the alarm price is optional.

Execution

Just call the script scraper.py from within the folder, so the components.json file can be found. It will print an overview of the ideal order to minimize the overall cost. The program runs just once and does not keep tracking prices in the background. As usual with scraping, be gentle and fair and don't abuse this program.

Addition of new shops

If you want to add a new shop, you need to edit the file shops.py and:

Enter the significant part of the shop url in the method Shop._get_shops_dict and define a new class type (child of Shop)
Implement the methods _process_soup and get_shipping_cost for the new class. Use the existing classes as reference for the data you need to scrap.
Add your new urls to the input file!

License

See the LICENSE for license details.

A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items

Related tags

Overview

combined-shop-scraper

Features

Usage

JSON file definition

Execution

Addition of new shops

License

Owner

Minecraft Item Scraper

Web-Scrapper using Python and Flask

Shopee Scraper - A web scraper in python that extract sales, price, avaliable stock, location and more of a given seller in Brazil

爬取各大SRC当日公告 | 通过微信通知的小工具 | 赏金工具

Introduction to WebScraping Workshop - Semcomp 24 Beta

Dailyiptvlist.com Scraper With Python

HappyScrapper - Google news web scrapper with python

Web Scraping images using Selenium and Python

Crawler do site Fundamentus.com com o uso do framework scrapy, tanto da aba detalhada como a de resumo.

Comment Webpage Screenshot is a GitHub Action that captures screenshots of web pages and HTML files located in the repository

Audio media crawler for lbry.

Web Scraping COVID 19 Meta Portal with Python

Screenhook is a script that captures an image of a web page and send it to a discord webhook.

Scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info

Jobinja.ir jobs scraper.

Automatically download and crop key information from the arxiv daily paper.

淘宝茅台抢购最新优化版本，淘宝茅台秒杀，优化了茅台抢购线程队列

Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.

A Web Scraper built with beautiful soup, that fetches udemy course information. Get udemy course information and convert it to json, csv or xml file

Scraping weather data using Python to receive umbrella reminders