Automatically detect changes made to the official Telegram sites.

Overview

🕷 Telegram Web Crawler

This project automatically detects changes made to the official Telegram sites. This helps anticipate upcoming updates and other news (new vacancies, API updates, etc.).

Name | Commits | Status
Site updates tracker | Commits | Fetch new content of tracked links to files
Site links tracker | Commits | Generate or update list of tracked links
  • passing – new changes
  • failing – no changes

Subscribe to the alerts channel to stay updated. A copy of the Telegram websites is stored here.

GitHub pretty diff

How it works

  1. Link crawling runs as often as possible. It starts from the home page of the site, detects relative and absolute sub-links, and recursively repeats the operation for each of them. The result is a list of unique links used later for content comparison. Links can also be added by hand to help the script find hidden pages (pages that nothing links to). Exceptions are handled by a system of rules for the link crawler.

  2. Content crawling also runs as often as possible and uses the list of links collected in step 1. It goes through the list, fetches the content of each link, and builds a tree of subfolders and files. All dynamic content is removed from the files.

  3. GitHub Actions runs everything, so no own servers are needed. You can fork this repository and run your own tracker. Workflows launch the scripts and commit the changes. All file changes are tracked by Git and displayed nicely on GitHub. A workflow run should succeed only if there are changes on the Telegram websites; otherwise, it should fail. If the run succeeds, notifications can be sent to a Telegram channel, and so on.
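
The link discovery in step 1 could look roughly like the sketch below. It is not the project's actual crawler: the names `LinkParser`, `extract_links`, and `crawl` are made up for illustration, the domain filter on `telegram.org` is an assumption, and an explicit stack is used in place of real recursion. The `fetch` argument is any callable that returns the HTML of a page or `None`, so the sketch can be tried without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)


def extract_links(page_url, html, domain='telegram.org'):
    """Return absolute, fragment-free links that stay on the tracked domain."""
    parser = LinkParser()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop "#fragment".
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        host = parsed.hostname or ''
        on_domain = host == domain or host.endswith('.' + domain)
        if parsed.scheme in ('http', 'https') and on_domain:
            links.add(absolute)
    return links


def crawl(start_url, fetch):
    """Visit every page reachable from start_url and return the unique URLs."""
    seen = set()
    stack = [start_url]
    while stack:
        url = stack.pop()
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)  # hypothetical fetcher: HTML string or None
        if html:
            stack.extend(extract_links(url, html) - seen)
    return seen
```

Passing `fetch` in as a parameter keeps the traversal logic separate from HTTP handling, which also makes it easy to test against a dictionary of canned pages.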

FAQ

Q: How often is "as often as possible"?

A: TL;DR: the content update action runs every ~10 minutes. More info:

Q: Why are there two separate crawl scripts instead of one?

A: Because the original idea was to update the tracked links once an hour, so it was convenient to use separate scripts and workflows. After the Telegram 7.7 update, I realised that finding new blog posts that slowly is a bad idea.

Q: Why does the script for sending alerts have a while loop?

A: Because the GitHub API doesn't return information about a commit immediately after a push to the repository. Therefore, the script waits for the information to appear.
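
The waiting described above can be sketched as a small polling helper. This is an illustration only, not the code from the alert script: `wait_for` and its parameters are hypothetical, and the sketch bounds the number of attempts instead of looping forever. `fetch` stands for whatever call asks the GitHub API for the commit and returns `None` while the data is not available yet.

```python
import time


def wait_for(fetch, attempts=10, delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns something other than None.

    Raises TimeoutError if every attempt returns None.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Pause between attempts, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay)
    raise TimeoutError(f'no data after {attempts} attempts')
```

Capping the attempts means a commit that never shows up ends the workflow run with an error instead of leaving it hanging.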

Q: Why are you using a GitHub Personal Access Token in the actions/checkout workflow step?

A: To be able to trigger other workflows via the on push trigger. More info:

Q: Why are you using a GitHub PAT in make_and_send_alert.py?

A: To increase the GitHub API rate limits.

TODO list

  • store the history of content using hashes;
  • store hashes of images, SVGs, and videos.

Example of link crawler rules configuration

CRAWL_RULES = {
    # every rule is a regex
    # an empty string matches any URL
    # allow rules have higher priority than deny rules
    'translations.telegram.org': {
        'allow': {
            r'^[^/]*$',  # root
            r'org/[^/]*/$',  # 1 lvl sub
            r'/en/[a-z_]+/$'  # 1 lvl after /en/
        },
        'deny': {
            '',  # all
        }
    },
    'bugs.telegram.org': {
        'deny': {
            '',    # deny all sub domain
        },
    },
}
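
One way such rules could be evaluated is sketched below. It follows the comments in the configuration (every rule is a regex, an empty string matches anything, allow beats deny), but the function `should_crawl` is hypothetical and is not taken from the project. It also assumes that URLs are stored without a scheme, as in the hidden URLs list, and that a host without an entry in `CRAWL_RULES` is crawled freely.

```python
import re

CRAWL_RULES = {
    'translations.telegram.org': {
        'allow': {
            r'^[^/]*$',  # root
            r'org/[^/]*/$',  # 1 lvl sub
            r'/en/[a-z_]+/$',  # 1 lvl after /en/
        },
        'deny': {
            '',  # all
        },
    },
    'bugs.telegram.org': {
        'deny': {
            '',  # deny the whole subdomain
        },
    },
}


def should_crawl(url, rules=CRAWL_RULES):
    """Decide whether a scheme-less URL such as 'telegram.org/tos' may be crawled."""
    host = url.split('/', 1)[0]
    host_rules = rules.get(host)
    if host_rules is None:
        return True  # assumption: hosts without rules are not restricted
    # Allow rules have higher priority, so they are checked first.
    if any(re.search(rule, url) for rule in host_rules.get('allow', ())):
        return True
    if any(re.search(rule, url) for rule in host_rules.get('deny', ())):
        return False
    return True
```

With these rules, `translations.telegram.org/en/android/` is accepted by the third allow rule, while `translations.telegram.org/en/android/login/` matches no allow rule and falls through to the catch-all deny.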

Current hidden URLs list

HIDDEN_URLS = {
    # 'corefork.telegram.org', # disabled

    'telegram.org/privacy/gmailbot',
    'telegram.org/tos',
    'telegram.org/tour',
    'telegram.org/evolution',

    'desktop.telegram.org/changelog',
}
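
A minimal sketch of how hand-added links might be merged with the home page to form the crawler's starting set. The helper `build_start_urls` and the choice of `https` as the scheme are assumptions made for illustration, not part of the project.

```python
HIDDEN_URLS = {
    'telegram.org/privacy/gmailbot',
    'telegram.org/tos',
    'telegram.org/tour',
    'telegram.org/evolution',
    'desktop.telegram.org/changelog',
}


def build_start_urls(home, hidden_urls, scheme='https'):
    """Return a sorted, de-duplicated list of absolute URLs to start crawling from."""
    # Sets remove duplicates, e.g. when the home page is also listed as hidden.
    urls = {home} | set(hidden_urls)
    return sorted(f'{scheme}://{url}' for url in urls)
```

Sorting the result keeps the generated list stable between runs, so a diff of the tracked links only shows real additions and removals.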

License

Licensed under the MIT License.

Owner
Il'ya
Telegram: https://t.me/MarshalX