Using Selenium with Python to Web Scrape Popular YouTube Tech Channels.

Last update: Aug 18, 2021

Overview

Web Scraping Popular YouTube Tech Channels with Selenium

Data Mining, Data Wrangling, and Exploratory Data Analysis

About the Data

Web scraping was performed on the top 10 tech channels on YouTube using Selenium, an automated browser (driver) controlled from Python that is often used for web scraping and web testing. The channels to scrape were chosen from a Top 10 Tech Youtubers list on blog.bit.ai.
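To make the Selenium step concrete, here is a minimal sketch of collecting video titles from one channel page. It is not the project's actual scraper: the function names, the `a#video-title` selector, and the use of Chrome are all assumptions for illustration, and the selector in particular should be checked against the live page.

```python
def clean_titles(raw_titles):
    """Strip whitespace and drop empty strings from scraped title text."""
    return [t.strip() for t in raw_titles if t and t.strip()]


def scrape_video_titles(videos_url, title_selector="a#video-title", timeout=10):
    """Open a channel's videos page and return the visible video titles.

    `title_selector` is an assumed CSS selector; YouTube's markup changes
    over time, so verify it before relying on it.
    """
    # Imported here so clean_titles() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        driver.get(videos_url)
        # Wait until at least one matching element has been rendered.
        WebDriverWait(driver, timeout).until(
            lambda d: d.find_elements(By.CSS_SELECTOR, title_selector)
        )
        elements = driver.find_elements(By.CSS_SELECTOR, title_selector)
        return clean_titles(el.text for el in elements)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Called with the URL of a channel's videos page, `scrape_video_titles` returns only the titles rendered on first load. A fuller scraper would likely need to scroll the page to load more videos.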

All data was saved to multiple CSV files to aid further analysis in a Google Colab notebook. Please see the Project Links section for more details.
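As a rough illustration of the CSV step, the sketch below writes scraped rows to a file and reads them back using only the standard library. The helper names and the column names in the usage note are hypothetical; the project's real file layout is not described here.

```python
import csv


def save_rows(path, rows, fieldnames):
    """Write a list of dicts to a CSV file with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def load_rows(path):
    """Read a CSV file back into a list of dicts (all values are strings)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

For example, `save_rows("videos.csv", rows, ["title", "views"])` would produce one file per table of results. Values come back as strings, so numeric columns need converting before analysis.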

Sample of Data Collected

The average number of videos per channel was around 200. In total, data from 2,000 videos was scraped.

Word Cloud of Word Frequency in Video Titles
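A word cloud is drawn from per-word counts, so the underlying step can be sketched with a simple frequency count. The stop-word list, the tokenizing pattern, and the example titles below are made up for illustration and are not taken from the scraped data.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def word_frequencies(titles, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase, then keep runs of letters, digits, and apostrophes.
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


# Made-up titles, only to show the call.
example_titles = [
    "Phone Review: Is It Worth It?",
    "Unboxing the New Phone",
    "How to Set Up a Phone",
]
print(word_frequencies(example_titles).most_common(3))
```

A word-cloud library can then size each word by its count.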

Takeaways

Video comment counts have very little correlation with any other data obtained in this project.
The following seem to be highly correlated:
- Channel Views and Subscribers
- Interactions and Video Views
Video titles fall into 5 topic groups.

K-means and PCA were used to create clusters for video titles
- iPhone (kmeans 0)
- Samsung (kmeans 1)
- Reviews (kmeans 2)
- Unboxing (kmeans 3)
- How-to (kmeans 4)
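The clustering above can be sketched with scikit-learn as follows. How the titles were turned into numeric features is not stated, so TF-IDF is an assumption here, as are the function name and parameters. Only the use of k-means, PCA, and five clusters comes from the text.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_titles(titles, n_clusters=5, random_state=0):
    """Return (cluster label per title, 2-D PCA coordinates per title)."""
    # Assumption: titles are encoded as TF-IDF vectors with English stop words removed.
    features = TfidfVectorizer(stop_words="english").fit_transform(titles)

    # K-means assigns each title to one of n_clusters groups.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = kmeans.fit_predict(features)

    # PCA needs a dense array; it is used here only to project to 2-D for plotting.
    pca = PCA(n_components=2, random_state=random_state)
    coords = pca.fit_transform(features.toarray())
    return labels, coords
```

The labels are arbitrary integers. Names such as "Reviews" or "Unboxing" would be assigned afterwards by inspecting the titles in each cluster.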
70% of the most viewed videos are about phones.
Join Date (the date a YouTube channel was created) does not seem to have any relationship to number of subscribers or overall cha
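The correlation findings in these takeaways do not say which coefficient was used. Assuming a Pearson correlation, the calculation for any two numeric columns can be sketched in plain Python. The function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one. In a notebook, pandas `DataFrame.corr()` computes the same coefficient for every pair of numeric columns at once.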

Project Links

"Data Analysis of Youtube Tech Channels"

Owner

David Rusho
