Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

tiptop is a command-line system monitoring tool in the spirit of top.

Command-line system monitoring. tiptop is a command-line system monitoring tool in the spirit of top. It displays various interesting system stats, gr

Nico Schlömer 1.3k Jan 08, 2023
An open-source CLI tool for backing up RDS(PostgreSQL) Locally or to Amazon S3 bucket

An open-source CLI tool for backing up RDS(PostgreSQL) Locally or to Amazon S3 bucket

1 Oct 30, 2021
The Pythone Script will generate a (.)sh file with reverse shell codes then you can execute the script on the target

Pythone Script will generate a (.)sh file with reverse shell codes then you can execute the script on the targetPythone Script will generate a (.)sh file with reverse shell codes then you can execute

Boy From Future 15 Sep 16, 2022
Simple command-line implementation of minesweeper

minesweeper This is a Python implementation of 2-D Minesweeper! Check out the tutorial here: https://youtu.be/Fjw7Lc9zlyU You start a game by running

Kylie 49 Dec 10, 2022
MsfMania is a command line tool developed in Python that is designed to bypass antivirus software on Windows and Linux/Mac in the future

MsfMania MsfMania is a command line tool developed in Python that is designed to bypass antivirus software on Windows and Linux/Mac in the future. Sum

446 Dec 21, 2022
A command-line utility that, given a markdown file, checks whether all its links work.

A command-line utility written in Python that checks validity of links in a markdown file.

Teclado 2 Dec 08, 2021
💻 Physics2Calculator - A simple and powerful calculator for Physics 2

💻 Physics2Calculator A simple and powerful calculator for Physics 2 🔌 Predefined constants pi = 3.14159... k = 8988000000 (coulomb constant) e0 = 8.

Dylan Tintenfich 4 Dec 01, 2021
A simple discord slash command handler for for discord.py.

A simple discord slash command handler for discord.py About ⦿ Installation ⦿ Disclaimer ⦿ Examples ⦿ Documentation ⦿ Discussions Note that master bran

641 Jan 03, 2023
pypinfo is a simple CLI to access PyPI download statistics via Google's BigQuery.

pypinfo: View PyPI download statistics with ease. pypinfo is a simple CLI to access PyPI download statistics via Google's BigQuery. Installation pypin

Ofek Lev 351 Dec 26, 2022
Package installer for python

This is a package that adds a JSON file to your project that records all of the packages used in it and allows people to install it with a single command.

Anmol Malik 1 May 23, 2022
A VIM-inspired filemanager for the console

ranger 1.9.3 ranger is a console file manager with VI key bindings. It provides a minimalistic and nice curses interface with a view on the directory

12.6k Dec 30, 2022
A simple CLI based any Download Tool, that find files and let you stream or download thorugh WebTorrent CLI or Aria or any command tool

Privateer A simple CLI based any Download Tool, that find files and let you stream or download thorugh WebTorrent CLI or Aria or any command tool How

Shreyash Chavan 2 Apr 04, 2022
Commandline Python app to Autodownload mediafire folders and files.

Commandline Python app to Autodownload mediafire folders and files.

Tharuk Renuja 3 May 12, 2022
This CLI give the possibility to do a queries in Star Wars API and returns a JSON in a terminal.

Star Wars CLI (swcli) This CLI give the possibility to do a queries in Star Wars API and returns a JSON in a terminal. Install $ pip install swcli Qu

Pery Lemke 5 Nov 05, 2021
Logic-Sim - A clone of 'Digital Logic Sim' from Sebastian Lague

Logic Simulator This is a clone of 'Digital Logic Sim' from Sebastian Lague. But

Ethan 1 Feb 01, 2022
ICMP Reverse Shell written in Python 3 and with Scapy (backdoor/rev shell)

icmpdoor - ICMP Reverse Shell icmpdoor is an ICMP rev shell written in Python3 and scapy. Tested on Ubuntu 20.04, Debian 10 (Kali Linux), and Windows

Jeroen van Kessel 206 Dec 29, 2022
An interactive aquarium for your terminal.

sipedon An interactive aquarium for your terminal, written using pytermgui. The project got its name from the Common Watersnake, also known as Nerodia

17 Nov 07, 2022
Enlighten Progress Bar is a console progress bar library for Python.

Overview Enlighten Progress Bar is a console progress bar library for Python. The main advantage of Enlighten is it allows writing to stdout and stder

Rockhopper Technologies 265 Dec 28, 2022
A simple command line dumper written in Python 3.

A simple command line dumper written in Python 3.

ImFatF1sh 1 Oct 10, 2021
Detect secret in source code, scan your repo for leaks. Find secrets with GitGuardian and prevent leaked credentials. GitGuardian is an automated secrets detection & remediation service.

GitGuardian Shield: protect your secrets with GitGuardian GitGuardian shield (ggshield) is a CLI application that runs in your local environment or in

GitGuardian 1.2k Jan 06, 2023