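A collection of web-scraping case studies, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, VIP and Wanfang, Zlibrary, Oalib, novel sites, bidding sites, procurement sites, and Xiaohongshu.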

Last update: Jan 05, 2023

Overview

lxSpider

A collection of web-scraping case studies covering the sites listed above, among others.

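Introduction:

Time flies, and I have lost count of how many cases I have written. The articles are published on CSDN first, and the code is pushed to GitHub afterwards. Some of the CSDN articles are paid cases; subscribe as you see fit.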

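Disclaimer:

This repository is intended for teaching. Nothing it provides may be used for commercial purposes or in any illegal or non-compliant scenario.
The author accepts no responsibility for any loss or harm of any kind, to the user or to others, arising for any reason from use of the code and strategies provided in this repository.
Any dispute arising from or related to this repository should be resolved by the parties through friendly negotiation; the author bears no responsibility for any consequences if negotiation fails.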

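Columns

Web Scraping Basics: for readers who know Python syntax and are ready to learn scraping

Web Reverse Engineering Basics: scraping experience is enough (includes walkthroughs of the Yuanrenxue scraping challenges)

Android Reverse Engineering Basics: tool introductions, reverse-engineering notes, and case studies

Scraping Case Collection: paid column with classic cases, updated continuously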

Blog

Discussion

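Releases (Kuaishou danmaku collection tool)

Kuaishou danmaku (live comment) collection tool (Jan 30, 2021)
Usage:

1. Launch run.exe in the dist directory.

2. Enter the streamer's uid, your cookie, and the room id.

3. Click Start and wait. Do not click it again.

4. Make sure the streamer is still live.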

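Getting the parameters:

Streamer uid: the last path segment of the URL shown in the browser.

For example, if the URL is: https://live.kuaishou.com/u/yingjia2019

the streamer's uid is: yingjia2019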
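The rule above (take the last segment of the URL) can be sketched in Python as follows. This is only an illustration and not part of the released tool; the function name `extract_streamer_uid` is hypothetical.

```python
from urllib.parse import urlparse


def extract_streamer_uid(url: str) -> str:
    """Return the last non-empty path segment of a live-room URL.

    Hypothetical helper for illustration; the released tool expects
    the uid to be typed in by hand.
    """
    path = urlparse(url).path
    # Drop empty pieces so a trailing slash does not produce "".
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segment found in {url!r}")
    return segments[-1]


if __name__ == "__main__":
    print(extract_streamer_uid("https://live.kuaishou.com/u/yingjia2019"))
```

Parsing with `urlparse` rather than splitting the raw string keeps any query string or fragment out of the result.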

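Your cookie:

1. Open the developer tools: right-click and choose Inspect, or press F12.

2. Click the Network tab.

3. Refresh the page (you can press F5).

4. Find the HTML document whose name matches the streamer uid, then click Headers on the right.

5. Scroll to the bottom and find the cookie line. Copy the did=web_xxxxxxxxxxxxxx; entry from it.

6. The cookie to enter in the tool is web_xxxxxxxxxxxxxx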
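If you copy the whole cookie header instead of picking the entry out by eye, the `did` value can be pulled out with a few lines of Python. This is a minimal sketch under the assumption that the header is a standard `name=value; name=value` string; the function name `extract_did` and the other cookie names in the example are hypothetical.

```python
def extract_did(cookie_header: str) -> str:
    """Return the value of the `did` entry in a raw Cookie header string.

    Hypothetical helper for illustration; it is not part of the tool.
    """
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        # Only accept well-formed `name=value` pairs named exactly "did".
        if sep and name == "did":
            return value
    raise KeyError("did entry not found in cookie header")


if __name__ == "__main__":
    # Placeholder header; only the `did` entry comes from the steps above.
    header = "other=1; did=web_xxxxxxxxxxxxxx; another=2"
    print(extract_did(header))
```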

Room id:

1. Click the Elements tab, press Ctrl+F to open the search box, and enter: live-stream-id

2. Copy live-stream-id="Zo9Upaz8w90"

3. The room id to enter is Zo9Upaz8w90
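The same lookup can be done on saved page source instead of the Elements search box. The sketch below assumes the attribute appears in the HTML exactly as `live-stream-id="..."`, as in the steps above; the function name `extract_room_id` and the surrounding `<div>` in the example are hypothetical.

```python
import re
from typing import Optional

# Matches live-stream-id="VALUE" and captures VALUE.
_LIVE_STREAM_ID = re.compile(r'live-stream-id="([^"]+)"')


def extract_room_id(html: str) -> Optional[str]:
    """Return the first live-stream-id attribute value, or None if absent.

    Hypothetical helper for illustration; it is not part of the tool.
    """
    match = _LIVE_STREAM_ID.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = '<div class="player" live-stream-id="Zo9Upaz8w90"></div>'
    print(extract_room_id(sample))
```

A `None` result is one hint that the streamer may no longer be live, which is why the usage steps ask you to confirm that first.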

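Keep the page open while the tool is running; if you close it, the cookie expires after a while.

This tool is for learning purposes. Do not abuse it.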
Source code(tar.gz)
Source code(zip)
default.rar(21.47 MB)
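Novel downloader (Feb 2, 2021)
Introduction

1. Novel download (advantage: fast, because it fetches a complete txt file directly from the web)

2. Online novel scraping (advantage: broad coverage; almost every published novel can be found)

Special notice:

This script is for testing, learning, and research only. Commercial use is prohibited. Its legality, accuracy, completeness, and effectiveness are not guaranteed; use your own judgment.

No official account or self-media outlet may repost or publish any resource file in this project in any form.

The author accepts no responsibility for any problem with any script in this project, including but not limited to any loss or damage caused by script errors.

Do not use any content of this project for commercial or illegal purposes; otherwise you bear the consequences.

This project is licensed under the GPL-3.0 License. Where this special notice conflicts with the GPL-3.0 License, this special notice prevails.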

Source code(tar.gz)
Source code(zip)
default.zip(44.16 MB)

Owner

lx

Every noble work is at first impossible.

