An Amazon Product Scraper built using scapy module of python

Overview

Amazon Product Scraper

This is an Amazon Product Scraper built using scapy module of python

Features

it scrape various things

  • Product Title
  • Product Image
  • Product Price
  • Product Rating
  • Product Description
  • Product Reviews
  • Product Brand
  • Product Colour

By default it scrapes Mobile Phones of 5 Pages from Amazon. In case you want to change it to scrape other product, follow the instructions

  1. Open file /amazon_scraper/spiders/amazon_scraper.py
  2. Chnage the urls list at line 16
  3. Update no_of_pages variable to change number of pages to be scraped

Execute Amazon Scraper

there are two ways to execute scraper

First one

you can directly execute run.sh file using shell

sh ./run.sh

Second one

you can execute the following command

scrapy crawl amazon_scraper -o ./data/data.json

It will create data.json file inside the data folder containing all the scraped data in JSON format and all the images will be saved in data/img/full folder.

Sample Data

Already fetched sample data is available in data folder

Troubleshooting

If data.json file doesn't generate in proper format then just delete data.json file and img folder.
Now you good to go ;)

Preresuisites

  • you have to install scrapy
  • you have to install pillow

[MIT]

Owner
Sudhanshu Jha
Sudhanshu Jha
An Auto-Grinding bot made for Pokemeow. Efficient but not many features yet

PokeGrinder šŸ¤– This is an Auto-Grinding bot made for Pokemeow. Efficient but not many features yet. Supported features This bot can currently handle :

Xombie 9 Feb 01, 2022
Celestial - a Python regex Discord chatbot who can talk with you.

Celestial a Python regex Discord chat bot who can talk with you. Invite url: https://discord.com/api/oauth2/authorize?client_id=927573556961869825&per

Jirayu Kaewsing 3 Jan 01, 2023
Discord Token Generator of a project - Some stupids ppl are trying to leak it so i'm leaking faster :)

Original creator: Rolf (dort) HCaptcha Bypasser: h0nde Shark.Solar Discord Token Generator of a project - Some stupids ppl are trying to leak it so i'

Stanley 14 Sep 29, 2021
Python API client library for phpIPAM installations

phpypam: Python API client library for phpIPAM installation As we started to develop phpipam-ansible-modules we used an existing python library for ph

codeaffen 3 Aug 03, 2022
An API wrapper around the pythonanywhere's API.

pyaww An API wrapper around the pythonanywhere's API. The name stands for pythonanywherewrapper. 100% api coverage most of the codebase is documented

7 Dec 11, 2022
Scheduled Block Checker for Cardano Stakepool Operators

ScheduledBlocks Scheduled Block Checker for Cardano Stakepool Operators Lightweight and Portable Scheduled Blocks Checker for Current Epoch. No cardan

SNAKE (Cardano Stakepool) 4 Oct 18, 2022
The best (and now open source) Discord selfbot.

React Selfbot Yes, for real Why am I making this open source? Because can't stop calling my product a rat, tokenlogger and what else not. But there is

30 Nov 13, 2022
A script written in python3 for bruteforcing Gmail accounts.

GmailBruteforce Made for bruteforcing gmail accounts. It needs Less Secure Apps setting turned on in order to work. Installation For windows git clone

Shinero 4 Sep 16, 2022
Instrument asyncio Python for distributed tracing with AWS X-Ray.

xraysink (aka xray-asyncio) Extra AWS X-Ray instrumentation to use distributed tracing with asyncio Python libraries that are not (yet) supported by t

Gary Donovan 12 Nov 10, 2022
Python client library for Google Maps API Web Services

Python Client for Google Maps Services Description Use Python? Want to geocode something? Looking for directions? Maybe matrices of directions? This l

Google Maps 3.8k Jan 01, 2023
The Bot provide Hadith API and fetch content via api.hadith.sutanlab.id

Bot Hadith-API on Telegram The Bot provide Hadith API and fetch content via api.hadith.sutanlab.id Built With Python Asynchronous HTTP protocol client

xMan 12 Feb 19, 2022
Schedule Twitter updates with easy

coo: schedule Twitter updates with easy Coo is an easy to use Python library for scheduling Twitter updates. To use it, you need to first apply for a

wilfredinni 46 Nov 03, 2022
Source code for "Efficient Training of BERT by Progressively Stacking"

Introduction This repository is the code to reproduce the result of Efficient Training of BERT by Progressively Stacking. The code is based on Fairseq

Gong Linyuan 101 Dec 02, 2022
Discord heximals: More colors for your bot

DISCORD-HEXIMALS More colors for your bot ! Support : okimii#0434 COLORS ( 742 )

4 Feb 04, 2022
A chatbot that helps you set price alerts for your amazon products.

Amazon Price Alert Bot Description A Telegram chatbot that helps you set price alerts for amazon products. The bot checks the price of your watchliste

Rittik Basu 24 Dec 29, 2022
⚔ PoC: Hide a c&c botnet in the discord client. (Proof Of Concept)

šŸ‘Øā€šŸ’» Discord Self Bot šŸ‘Øā€šŸ’» A Discord Self-Bot in Python by natrix Installation Run: selfbot.bat Python: version : 3.8 Modules

0хVιcнy#1337 37 Oct 21, 2022
I-Spy is a discord and twitter bot šŸ¤– that keeps a check on usage foul language, hate-speech and NSFW contents

I-Spy is a discord and twitter bot šŸ¤– that keeps a check on usage foul language, hate-speech and NSFW contents. It is the one stop solution to monitor your discord servers and twitter handles against

Tia Saxena 5 Nov 16, 2022
Amanda-A next gen powerful telegram group manager bot for manage your groups and have fun with other cool modules.

Amanda-A next gen powerful telegram group manager bot for manage your groups and have fun with other cool modules.

Team Amanda 4 Oct 21, 2022
Open Source Discord Account Creator

Alter Token Generator Open Source Discord Account Creator This program abuses the discord api and uses the 2Captcha captcha solving service to make di

24 Dec 13, 2022
GitHub Usage Report

github-analytics from github_analytics import analyze pr_analysis = analyze.PRAnalyzer( "organization/repo", "organization", "team-name",

Shrivu Shankar 1 Oct 26, 2021