The open-source web scrapers that feed the Los Angeles Times' California coronavirus tracker.

Last update: Dec 14, 2022

Related tags

Web Crawling python scraper news jupyter-notebook journalism california data-journalism coronavirus covid-19 git-scraping

Overview

The open-source web scrapers that feed the Los Angeles Times' California coronavirus tracker.

Processed data ready for analysis is available at datadesk/california-coronavirus-data.

Scrapers

The scrapers are written using Python and Jupyter notebooks, scheduled and run via GitHub Actions and then archived using git.
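
As a rough illustration of the archive-with-git pattern described above, here is a minimal sketch of the step where a scraper writes its output to disk. It is not taken from this repository: the function name `write_snapshot`, the column names, and the file path are all hypothetical. The idea is that a deterministic output file makes git commits small and diffs meaningful.

```python
import csv
import hashlib
from pathlib import Path


def write_snapshot(rows, fieldnames, out_path):
    """Write rows to a CSV file and return True if the file content changed.

    Hypothetical helper: a scheduled job could skip the git commit
    whenever this returns False.
    """
    out_path = Path(out_path)
    # Hash the previous version, if any, so we can detect a change later.
    old_digest = None
    if out_path.exists():
        old_digest = hashlib.sha256(out_path.read_bytes()).hexdigest()

    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        # Sort rows so identical data always yields an identical file.
        ordered = sorted(rows, key=lambda r: [str(r[k]) for k in fieldnames])
        writer.writerows(ordered)

    new_digest = hashlib.sha256(out_path.read_bytes()).hexdigest()
    return new_digest != old_digest
```

A scheduled workflow could call a function like this at the end of each run and commit only when it reports a change, so the git history records each update to the source data.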

module	status	maintainer
bed-surges		Ben Welsh
cases-deaths-demographics		Ben Welsh
cases-deaths-tests		Sean Greene
demographics-age		Sean Greene
demographics-race-by-county		Rahul Mukherjee
demographics-race-statewide		Aida Ylanan
federal-prisons		Iris Lee
homeless-impact		Jennifer Lu
hopkins		Ben Welsh
hospital-patients		Ben Welsh
hospital-capacity		Ben Welsh
hospital-locations		Ben Welsh
ice-detainees		Iris Lee
icu-capacity		Sean Greene
local-adult-detention-facilities		Iris Lee
local-juvenile-detention-facilities		Iris Lee
places		Et al.
probable-cases		Ben Welsh
reopening-tiers	Retired	Ben Welsh
school-reopenings	Retired	Iris Lee
skilled-nursing-facilities		Ben Welsh
skilled-nursing-totals		Ben Welsh
state-prisons		Iris Lee
vaccine-breakthrough-cases		Sean Greene
vaccine-cdc-state-totals		Ben Welsh
vaccine-doses-on-hand		Sean Greene
vaccine-progress		Sean Greene
vaccine-hpi		Sean Greene
vaccine-demographics-by-county		Sean Greene
vaccine-demographics-statewide		Sean Greene
vaccine-shipped-delivered		Sean Greene
variant-proportions-states		Matt Stiles
variant-toplines-ca		Matt Stiles
vaccine-zip-codes		Sean Greene, Matt Stiles

Installation

Clone the repository and install the Python dependencies.

pipenv install

Run all of the scraper commands.

make

Run one of the scraper commands.

make -f vaccine-hpi/Makefile
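
To run a handful of modules in sequence rather than all of them, the single-module command above can be wrapped in a short script. This is a sketch under assumptions: it presumes every module directory listed in the table contains its own Makefile, as `vaccine-hpi` does, and the helper names `make_command` and `run_modules` are hypothetical.

```python
import subprocess


def make_command(module):
    """Build the make invocation for one scraper module."""
    return ["make", "-f", f"{module}/Makefile"]


def run_modules(modules, runner=subprocess.run):
    """Run each module's Makefile in order and collect the exit codes.

    The runner argument defaults to subprocess.run and can be swapped
    out for a stub when testing.
    """
    results = {}
    for module in modules:
        # check=False so one failing scraper does not stop the rest.
        completed = runner(make_command(module), check=False)
        results[module] = completed.returncode
    return results
```

Calling `run_modules(["vaccine-hpi", "hospital-patients"])` from the repository root would run those two scrapers and return a dictionary mapping each module to its exit code.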

Owner

Los Angeles Times Data and Graphics Department

Reporting, editing, computer programming

GitHub Repository https://www.latimes.com/projects/california-coronavirus-cases-tracking-outbreak/

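A scheduled HITsz epidemic reporting script based on GitHub Actions, ready to use out of the box

HITsz Daily Report: a script based on GitHub Actions that automatically submits scheduled reports through the access portal of the "HITsz epidemic system", ready to use out of the box. Thanks to @JellyBeanXiewh for the original script and idea. Thanks to @bugstop for refactoring the script and adding on-campus proxy access via Easy Connect.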

56 Nov 27, 2022

Scraping and visualising India's real-time COVID-19 data from the MOHFW dataset.

COVID19-WEB-SCRAPER Open Source Tech Lab - Project [SEMESTER IV] OSTL Assignments OSTL Assignments - 1 OSTL Assignments - 2 Project COVID19 India Data

8 Apr 28, 2022

🤖 Threaded Scraper to get discord servers from disboard.org written in python3

Disboard-Scraper Threaded Scraper to get discord servers from disboard.org written in python3. Setup. One thread / tag If you want to look for multip

11 Nov 01, 2022

This program will help you to properly scrape all data from a specific website

0 May 15, 2022

A web scraping API for MDL (MyDramaList) for Python.

PyMDL An API for MyDramaList (MDL) based on web scraping for Python. Description An API for MDL to make your life easier in retrieving and working on dat

6 Dec 10, 2022

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par

8.4k Jan 08, 2023

crypto currency scraping

SCRYPTO What ? Crypto currencies scraping (At the moment, only bitcoin and ethereum crypto currencies are supported) How ? A python script is running

15 Sep 01, 2022

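Flash-sale sniping for half-price deals on Taobao and Tmall: grab TVs, grab Moutai, and beat the scalpers

taobao_seckill Flash-sale sniping for half-price deals on Taobao and Tmall: grab TVs, grab Moutai, and beat the scalpers. Dependencies: install the Chrome browser, then download and install the chromedriver that matches the browser version. Web version instructions: 1. Before the sale, calibrate the local clock, then add the items you want to buy to the shopping cart. 2. To package it as an executable, you can use pyinstalle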

2k Jan 05, 2023

Web scraper for getting price quotes on items

WebScrapper This web scraper is developed in Python 3.10.0 to search the Cyber Puerta site for items in its catalog. The program t

1 Oct 27, 2021

Latest optimized version of Taobao Moutai sniping: Taobao Moutai flash sales, with an optimized thread queue for Moutai purchases

118 Dec 16, 2022

Simple library for exploring/scraping the web or testing a website you’re developing

Robox is a simple library with a clean interface for exploring/scraping the web or testing a website you’re developing. Robox can fetch a page, click on links and buttons, and fill out and submit for

79 Nov 27, 2022

🕷 Phone Crawler with multi-thread functionality

Phone Crawler: Phone Crawler with multi-thread functionality Disclaimer: I'm not responsible for any illegal/misuse actions, this program was made for

3 Feb 10, 2022

Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

71 Oct 04, 2022

A simple code to fetch comments below an Instagram post and save them to a csv file

fetch_comments A simple code to fetch comments below an Instagram post and save them to a csv file usage First you have to enter your username and pas

2 Jul 14, 2022

Automatically scrapes all menu items from the Taco Bell website

Automatically scrapes all menu items from the Taco Bell website. Returns as PANDAS dataframe.

2 Jan 15, 2022

Kusonime scraper using python3

Features Scrap from url Scrap from recommendation Search by query Todo [+] Search by genre Example # Get download url from kusonime import Scrap

2 Jan 28, 2022

Python script to check whether there are any differences in an application's responses when the request comes from a search engine's crawler.

crawlersuseragents This Python script can be used to check if there are any differences in responses of an application when the request comes from a se

13 Dec 27, 2022

Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2

Iceberg Locations Antarctic large iceberg positions derived from ASCAT and OSCAT-2. All data collected here are from the NASA SCP website Overview Thi

5 Jul 27, 2022

A web scraping pipeline project that retrieves TV and movie data from two sources, then transforms and stores data in a MySQL database.

New to Streaming Scraper An in-progress web scraping project built with Python, R, and SQL. The scraped data are movie and TV show information. The go

1 Mar 28, 2022

download NCERT books using scrapy

download_ncert_books download NCERT books using scrapy Downloading Books: You can either use the spider by cloning this repo and following the instruc

1 Dec 02, 2022