Haphazard scripts for scraping bitcoin/bitcoin data from GitHub

Last update: Oct 12, 2022

Related tags

Web Crawling bitcoin-github-scrape

Overview

This is a quick-and-dirty tool used to scrape bitcoin/bitcoin pull request and commentary data.

Each output/<pr number> folder contains:

comments.json: an aggregated list of both issue and review comments, in GitHub's original format
commits.json: a list of commit objects corresponding to the PR, in GitHub's original format
pr.json: the pull request object, in GitHub's original format
comments_abbrev.csv: abbreviated representation of each comment in CSV format
pr_abbrev.csv: abbreviated representation of the PR in CSV format
done: a sentinel file recording the datetime at which the PR data was retrieved
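
As a rough illustration of how the folder layout above might be consumed, the following sketch loads the three JSON files and the done sentinel for one PR. The function name load_pr_folder is hypothetical and not part of the tool, and the contents of done are kept as a raw string because their exact format is not specified here.

```python
import json
from pathlib import Path


def load_pr_folder(pr_dir):
    """Load the JSON artifacts from one output/<pr number> folder.

    Hypothetical helper: it only assumes the file names listed above.
    """
    pr_dir = Path(pr_dir)
    data = {}
    for name in ("pr", "comments", "commits"):
        with open(pr_dir / f"{name}.json", encoding="utf-8") as f:
            data[name] = json.load(f)

    # The done sentinel holds the retrieval datetime. Its format is not
    # documented here, so it is returned unparsed (or None if absent).
    done = pr_dir / "done"
    data["done"] = done.read_text(encoding="utf-8").strip() if done.exists() else None
    return data
```

Calling load_pr_folder on one of the output/<pr number> folders would return a dict with the keys pr, comments, commits and done.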

Limitations

Open PRs (or any PRs that are expected to be updated) are not handled properly right now: once the done sentinel is created, the data for that PR is never refreshed. This could be fixed by comparing the relevant timestamps (for example, when the PR was last updated) to the done sentinel and overwriting stale data.
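
The timestamp comparison suggested above could look something like the sketch below. It is not the tool's current behavior. It assumes that the caller has fetched the PR again and passes in its updated_at value (an ISO 8601 string in GitHub's format), and it uses the modification time of the done file as the retrieval time rather than parsing the file's contents. The names needs_refresh and parse_github_time are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def parse_github_time(value):
    """Parse a GitHub-style timestamp such as '2022-10-12T08:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def needs_refresh(pr_dir, remote_updated_at):
    """Return True if the PR folder is missing or older than the remote PR.

    Assumption: the mtime of the done file stands in for the retrieval
    time, so the file's own contents never need to be parsed.
    """
    done = Path(pr_dir) / "done"
    if not done.exists():
        return True  # never scraped, so it must be fetched

    scraped_at = datetime.fromtimestamp(done.stat().st_mtime, tz=timezone.utc)
    # Refresh only when the PR changed after the last scrape.
    return parse_github_time(remote_updated_at) > scraped_at
```

A scraper loop could call needs_refresh before skipping a PR, and rewrite the folder (including done) whenever it returns True. Relying on file modification times is a simplification: they are lost if the output directory is copied without preserving metadata.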

Owner

James O'Beirne

⏬ Dumb downloader that scrapes the web

Simple tool to scrape and download cross-country ski timings and results from live.skidor.com

Script for scraping user data such as "id, username, fullname, followers, tweets, etc." through Twitter's search engine.

A Python script that uses selenium to locate the relevant page elements and then triggers click events to automate the browser.

Simply scrape / download all the media from a Fansly account.

A scheduled HITsz epidemic-reporting script based on GitHub Actions, ready to use out of the box

HappyScrapper - Google News web scraper with Python

An application that, given a URL, crawls a web page, collects all of its words, and sorts and counts them.

Scrapes proxies and saves them to a text file

Library to scrape and clean web pages to create massive datasets.

Web Scraping Framework

A high-level distributed crawling framework.

Crawl the information of a given keyword on Google search engine

Scrapes Every Email Address of Every Society in Every University

Automated data scraper for Thailand COVID-19 data

Minimal set of tools to conduct stealthy scraping.

Creating Scrapy scrapers via the Django admin interface

This spider/bot is written in Python on top of the Scrapy framework and fetches item information from Amazon

A webdriver-based script for reserving Tsinghua badminton courts.

🕷 Phone Crawler with multi-thread functionality