Web-Scrapper using Python and Flask

Overview

Web-Scrapper

"[초급]Python으로 웹 스크래퍼 만들기" 코스
-NomadCoders

기초적인 Python 문법강의부터 시작하여 웹사이트의 html파일에서 원하는 내용을 Scrapping해서 출력, csv 파일로 저장, flask를 이용한 간단한 웹페이지 및 fakedb를 구축하는 방법을 배우는 코스입니다. django를 시작하기 전 간단히 설명해주는 강의가 포함되어 있습니다.

사용 기술

  • Python
  • html

진도

  • 파이썬 기초 강의
  • Indeed에서 구직정보 추출하기
  • StackOverflow에서 구직정보 추출하기
  • CSV파일로 저장하기
  • Flask를 이용하여 웹사이트에서 정보 주고 받기
  • 추출한 구직정보를 웹사이트에 출력하기
  • Fake DB 만들기
  • CSV파일로 저장하기 기능 구현
  • 최종 결과물 업데이트

결과물

  • 추가 예정...

강의에서 배운 Tips

  • urllib3보다 requests가 더 사용하기 편하다.
  • BeautifulSoup를 이용하여 HTML문서에서 원하는 데이터에 간편하게 접근할 수 있다.
  • list[-1]은 list의 마지막 요소를 가리킨다. list[0:-1] 또는 list[:-1]은 list의 첫번째부터 마지막 전까지를 나타낸다.(마지막 요소를 제외한 모든 요소)
    []안의 -1은 'len(list)-1'로 추정된다.(개인적인 생각)

기타

  • package 확인하기
$ pip show 'package-name'
  • package 설치하기
$ pip install 'package-name'
  • 다른 환경에서 작업할 일이 많을 경우 requirements.txt로 패키지를 관리하면 편리하다.
    (이름을 꼭 저걸로 할 필요는 없으나 대부분 저 이름으로 관리)
#파일 생성
$ pip freeze > requirements.txt

#패키지 설치
$ pip install -r requirements.txt
Owner
윤성도
윤성도
Crawler in Python 3.7, 3.8. 3.9. Pypy3

Description Python Crawler written Python 3. (Supports major Python releases Python3.6, Python3.7 and Python 3.8) Installation and Use Setup VirtualEn

Vinit Kumar 2 Mar 12, 2022
Scrapping the data from each page of biocides listed on the BAUA website into a csv file

Scrapping the data from each page of biocides listed on the BAUA website into a csv file

Eric DE MARIA 1 Nov 30, 2021
ChromiumJniGenerator - Jni Generator module extracted from Chromium project

ChromiumJniGenerator - Jni Generator module extracted from Chromium project

allenxuan 4 Jun 12, 2022
Luis M. Capdevielle 1 Jan 14, 2022
Pseudo API for Google Trends

pytrends Introduction Unofficial API for Google Trends Allows simple interface for automating downloading of reports from Google Trends. Only good unt

General Mills 2.6k Dec 28, 2022
Scrape all the media from an OnlyFans account - Updated regularly

Scrape all the media from an OnlyFans account - Updated regularly

CRIMINAL 3.2k Dec 29, 2022
PS5 bot to find a console in france for chrismas 🎄🎅🏻 NOT FOR SCALPERS

Une PS5 pour Noël Python + Chrome --headless = une PS5 pour noël MacOS Installer chrome Tweaker le .yaml pour la listes sites a scrap et les criteres

Olivier Giniaux 3 Feb 13, 2022
Google Scholar Web Scraping

Google Scholar Web Scraping This is a python script that asks for a user to input the url for a google scholar profile, and then it writes publication

Suzan M 1 Dec 12, 2021
A tool to easily scrape youtube data using the Google API

YouTube data scraper To easily scrape any data from the youtube homepage, a youtube channel/user, search results, playlists, and a single video itself

7 Dec 03, 2022
Web scrapping

Project Setup Table of Contents Project Setup Table of Contents Run project locally Install Requirements Run script Run project locally Install Requir

Charles 3 Feb 04, 2022
NASA APOD Discord Bot - Fetches information from NASA APOD site.

NASA APOD Discord Bot - Fetches information from NASA APOD site.

Astronomy Club IITK 4 Apr 23, 2022
京东茅台抢购

截止 2021/2/1 日,该项目已无法使用! 京东:约满即止,仅限京东实名认证用户APP端抢购,2月1日10:00开始预约,2月1日12:00开始抢购(京东APP需升级至8.5.6版本及以上) 写在前面 本项目来自 huanghyw - jd_seckill,作者的项目地址我找不到了,找到了再贴上

abee 73 Dec 03, 2022
A Python Oriented tool to Scrap WhatsApp Group Link using Google Dork it Scraps Whatsapp Group Links From Google Results And Gives Working Links.

WaGpScraper A Python Oriented tool to Scrap WhatsApp Group Link using Google Dork it Scraps Whatsapp Group Links From Google Results And Gives Working

Muhammed Rizad 27 Dec 18, 2022
A tool can scrape product in aliexpress: Title, Price, and URL Product.

Scrape-Product-Aliexpress A tool can scrape product in aliexpress: Title, Price, and URL Product. Usage: 1. Install Python 3.8 3.9 padahal halaman ins

Rahul Joshua Damanik 1 Dec 30, 2021
Scraping Thailand COVID-19 data from the DDC's tableau dashboard

Scraping COVID-19 data from DDC Dashboard Scraping Thailand COVID-19 data from the DDC's tableau dashboard. Data is updated at 07:30 and 08:00 daily.

Noppakorn Jiravaranun 5 Jan 04, 2022
Python script to check if there is any differences in responses of an application when the request comes from a search engine's crawler.

crawlersuseragents This Python script can be used to check if there is any differences in responses of an application when the request comes from a se

Podalirius 13 Dec 27, 2022
Incredibly fast crawler designed for OSINT.

Photon Incredibly fast crawler designed for OSINT. Photon Wiki • How To Use • Compatibility • Photon Library • Contribution • Roadmap Key Features Dat

Somdev Sangwan 9.3k Jan 02, 2023
Minimal set of tools to conduct stealthy scraping.

Stealthy Scraping Tools Do not use puppeteer and playwright for scraping. Explanation. We only use the CDP to obtain the page source and to get the ab

Nikolai Tschacher 88 Jan 04, 2023
🥫 The simple, fast, and modern web scraping library

About gazpacho is a simple, fast, and modern web scraping library. The library is stable, actively maintained, and installed with zero dependencies. I

Max Humber 692 Dec 22, 2022
抢京东茅台脚本,定时自动触发,自动预约,自动停止

jd_maotai 抢京东茅台脚本,定时自动触发,自动预约,自动停止 小白信用 99.6,暂时还没抢到过,朋友 80 多抢到了一瓶,所以我感觉是跟信用分没啥关系,完全是看运气的。

Aruelius.L 117 Dec 22, 2022