A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

Last update: Dec 31, 2022

🕳️ CygnusX1

Code by Trong-Dat Ngo.

Overview

🕳️ CygnusX1 is a multithreaded tool 🛠️ for searching and downloading images from popular search engines 🔎. It is straightforward to set up and run!

Key features

🥰 No prior knowledge is required to get up and running.
🚀 Download images using a customizable number of threads.
⛏️ Crawl all available images (search results and recommendations).
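
To illustrate the "customizable number of threads" feature, here is a minimal sketch built on Python's standard `concurrent.futures` module. It is not taken from the CygnusX1 source: the function `download_all` and the `fetch` callable are hypothetical names used for illustration only, and the tool itself may organize its workers differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def download_all(urls: Iterable[str],
                 fetch: Callable[[str], bytes],
                 workers: int = 2) -> Dict[str, bytes]:
    """Fetch every URL using at most `workers` threads.

    `fetch` is a hypothetical callable (URL -> image bytes). URLs whose
    fetch raises an exception are skipped, so one bad link does not stop
    the whole batch.
    """
    results: Dict[str, bytes] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit one task per URL and remember which future maps to which URL.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # Skip images that could not be downloaded.
                continue
    return results
```

In practice, `fetch` could be any function that returns the bytes of one image, and `workers` would play the role of the `--workers` argument.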

Installation

This repository is tested on Python 3.6+ and Selenium 3.141.0+, and it works on macOS, Windows, and Linux.

You should set up and run 🕳️ CygnusX1 in a virtual environment. If you're unfamiliar with Python virtual environments, check out the Python user guide on virtual environments.

First, create a virtual environment with the version of Python you're going to use and activate it. (You can skip this step if you prefer to install directly into the system environment.)

python3 -m venv venv
source venv/bin/activate

Then download 🕳️ CygnusX1 from GitHub:

git clone https://github.com/dat821168/CygnusX1.git

Finally, from inside the cloned directory, install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Run

Use run.py to start the script:

python run.py --keywords "keyword 1, keyword 2" --workers 8 --use_suggestions --headless

Argument details:

--keywords: The keywords or keyphrases to search for. Separate multiple keywords with commas.
--out_dir: Directory where results are saved. Default = './IMAGES'.
--workers: The maximum number of workers used to crawl images. Default = 2.
--use_suggestions: Also crawl the search engine's suggestions/recommendations. Default = False.
--headless: Hide the browser window during scraping. Default = False.
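
The arguments above could be declared with Python's standard `argparse` module roughly as follows. This is a sketch reconstructed from the argument list only, not the actual contents of run.py: the helper names `build_parser` and `split_keywords` are hypothetical, and treating `--keywords` as required is an assumption.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        description="Search and download images from search engines.")
    # Assumption: keywords are mandatory; the README does not say so.
    parser.add_argument("--keywords", type=str, required=True,
                        help="Comma-separated keywords or keyphrases.")
    parser.add_argument("--out_dir", type=str, default="./IMAGES",
                        help="Directory where results are saved.")
    parser.add_argument("--workers", type=int, default=2,
                        help="Maximum number of crawl workers.")
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--use_suggestions", action="store_true",
                        help="Also crawl suggestions/recommendations.")
    parser.add_argument("--headless", action="store_true",
                        help="Hide the browser window during scraping.")
    return parser


def split_keywords(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty keywords."""
    return [part.strip() for part in raw.split(",") if part.strip()]
```

With this sketch, calling `build_parser().parse_args()` on the example command line and passing `args.keywords` to `split_keywords` would yield one search term per comma-separated entry, with surrounding whitespace removed.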
