WebScrapping Project - G1 Latest News

Last update: Feb 13, 2022

Overview

Web Scraping with Python

This project is a script that lets the user search the G1 website for the latest news about any term. It is written in Python and relies on three libraries:

selenium - Used to automate the browser and retrieve the content of the web page.
bs4 (BeautifulSoup) - Used to parse and manipulate the HTML content.
pandas - Used to build and export a dataframe with the collected information.
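
The three libraries above could fit together roughly as follows. This is a minimal sketch, not the project's actual main.py: the function names `fetch_html` and `parse_news`, the "every link is a headline" rule, and the sample HTML are assumptions made for illustration, since this README does not describe the G1 page markup.

```python
from bs4 import BeautifulSoup
import pandas as pd


def fetch_html(url: str) -> str:
    """Load a page in headless Chrome via selenium and return its HTML."""
    # Imported lazily so the parsing helper below works without a browser.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


def parse_news(html: str) -> pd.DataFrame:
    """Collect link text and URLs from the HTML into a DataFrame.

    Assumption: every <a> with an href and non-empty text counts as a
    headline. A real scraper would need a selector specific to the site.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = [
        {"title": a.get_text(strip=True), "link": a["href"]}
        for a in soup.find_all("a", href=True)
        if a.get_text(strip=True)
    ]
    return pd.DataFrame(rows, columns=["title", "link"])


# Hypothetical markup standing in for a search results page.
sample_html = """
<ul>
  <li><a href="https://example.com/a">First headline</a></li>
  <li><a href="https://example.com/b">Second headline</a></li>
</ul>
"""
news = parse_news(sample_html)
print(news)
```

In a real run, `parse_news(fetch_html(url))` would replace the sample string. `fetch_html` needs Chrome and a matching driver available on the machine.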

💻 Prerequisites

Before you begin, make sure you meet the following requirements:

Windows, Linux, or macOS.
Python installed on your machine.
Google Chrome installed on your machine, version 97.0.4692.71.
An Internet connection.

💻 Running

Install the required packages:

$ pip install -r requirements.txt

Run main.py and wait a few seconds. An XLSX spreadsheet and a CSV file containing the results will be generated.
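
The export step could look something like the sketch below. The helper name `export_news`, the `news` file stem, and the `with_excel` switch are hypothetical, because the README does not name the output files. Writing XLSX through pandas also assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path
from typing import List

import pandas as pd


def export_news(df: pd.DataFrame, out_dir: str, stem: str = "news",
                with_excel: bool = True) -> List[Path]:
    """Write the DataFrame to <stem>.csv and, optionally, <stem>.xlsx."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # utf-8-sig adds a byte order mark so spreadsheet tools detect UTF-8
    # and display accented characters correctly.
    csv_path = out / f"{stem}.csv"
    df.to_csv(csv_path, index=False, encoding="utf-8-sig")
    written = [csv_path]

    if with_excel:
        # Assumption: an Excel writer engine (e.g. openpyxl) is available.
        xlsx_path = out / f"{stem}.xlsx"
        df.to_excel(xlsx_path, index=False)
        written.append(xlsx_path)

    return written
```

Calling `export_news(news, "output")` on a DataFrame of headlines would then write both files to an `output` folder.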

License

MIT

Free Software, Hell Yeah!

Owner

Eduardo Henrique

A scrapy pipeline that provides an easy way to store files and images using various folder structures.

Scraping news from Ucsal portal with Scrapy.

Batch download all of a Douyin user's videos without watermarks.

Screenhook is a script that captures an image of a web page and sends it to a Discord webhook.

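Automatic web check-in implemented with Python + selenium, plus daily email delivery, the iCIBA (金山词霸) daily sentence, and "poisonous chicken soup" quotes (running stably since February).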

Telegram Group Scrapper

Generate a repository with mirror links for DriveDroid app

Pyrics is a tool to scrape lyrics, get rhymes, generate relevant lyrics with rhymes.

Screen scraping and web crawling framework

Google Scholar Web Scraping

Xuexi Qiangguo (学习强国) automation: 100% correct, instant quiz answers, worth 45 points.

Library to scrape and clean web pages to create massive datasets.

UdemyBot - A Simple Udemy Free Courses Scrapper

A dead simple crawler to get books information from Douban.

This program will help you to properly scrape all data from a specific website

jd_maotai rpa: a selenium-driven RPA bot for flash purchases on JD.

Free-Game-Scraper is a useful script that allows you to track down free games and DLCs on many platforms.

download NCERT books using scrapy

Extract embedded metadata from HTML markup

script to scrape direct download links (ddls) from google drive index.
