Generate a repository with mirror links for the DriveDroid app

Last update: Nov 19, 2022

Overview

DriveDroid Repository Generator

Generate a repository for DriveDroid, the app that lets you boot a PC from ISO files stored on your Android phone.

See also the official scraper, written in JavaScript.

Try Already Built Repo

Add one of the following links to the image repositories in the DriveDroid app:

https://dd.hexed.pw

https://raw.githubusercontent.com/flameshikari/ddrg/master/repo/repo.json

Requirements
Usage
How to Make a Scraper
Misc
Roadmap
Credits
License

Requirements

Python 3.6+ with packages included in requirements.txt.

I recommend creating a venv and installing the packages there.

Usage

python ./src/main.py [-i dir] [-o dir] [-g]

-i dir: directory with the distro scrapers (default: ./src/distros).

-o dir: directory where the built repo is saved (default: ./build).

-g: generate a webpage that presents the contents of repo.json.

-h: show the help message.
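
As an illustration of the options above, here is a minimal argparse sketch. The flag names and defaults come from the usage text; the attribute names (input_dir, output_dir, generate) and the help strings are assumptions, and the real main.py may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (-h is added by argparse)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Attribute names below are hypothetical; only the flags are documented.
    parser.add_argument("-i", dest="input_dir", metavar="dir",
                        default="./src/distros",
                        help="directory with distro scrapers")
    parser.add_argument("-o", dest="output_dir", metavar="dir",
                        default="./build",
                        help="directory where the built repo is saved")
    parser.add_argument("-g", dest="generate", action="store_true",
                        help="generate a webpage for repo.json")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-o", "./out", "-g"])
print(args.input_dir, args.output_dir, args.generate)
```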

How to Make a Scraper

Create a folder in ./src/distros with the following structure:

distro_name
├── info.toml
├── logo.png
└── scraper.py

If distro_name starts with an underscore (e.g. _disabled), the folder is skipped.

Let's look at each file.
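
The underscore rule can be sketched as follows. This is not the project's own discovery code; the function name enabled_distros is hypothetical, and it only assumes that each distro lives in its own sub-folder.

```python
from pathlib import Path


def enabled_distros(distros_dir):
    """Return sorted names of sub-folders that do not start with an underscore."""
    return sorted(
        entry.name
        for entry in Path(distros_dir).iterdir()
        # Plain files are ignored; folders such as "_disabled" are skipped.
        if entry.is_dir() and not entry.name.startswith("_")
    )
```

Calling enabled_distros("./src/distros") on a tree containing arch, debian and _disabled would return only arch and debian.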

`info.toml`

info.toml contains the distro name and a link to the official website. Example info.toml for Arch Linux:

name = "Arch Linux" # name of distro
url  = "https://example.com" # official site

If info.toml is missing or a value is not provided, fallback values are used. For Arch Linux (folder name arch), the fallback values are:

name = "arch" # distro folder name as value, also used in url
url  = "https://distrowatch.com/table.php?distribution=arch"
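
A rough sketch of that fallback logic, assuming the parsed info.toml is available as a plain dict. The helper name with_fallbacks is hypothetical; only the fallback values themselves are taken from the example above.

```python
def with_fallbacks(folder_name, info=None):
    """Fill in missing or empty name/url values from the distro folder name."""
    info = info or {}
    return {
        # The folder name doubles as the display name when none is given.
        "name": info.get("name") or folder_name,
        # The folder name is also used in the DistroWatch URL.
        "url": info.get("url")
        or f"https://distrowatch.com/table.php?distribution={folder_name}",
    }


print(with_fallbacks("arch", {"name": "Arch Linux"}))
```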

`logo.png`

Should be 128x128px with a transparent background.

If logo.png is missing, a fallback logo is used.

`scraper.py`

A scraper can be written however you like, as long as it returns the expected values.

It must return a list of tuples; each tuple contains iso_url, iso_arch, iso_size and iso_version, in that order.

The Arch Linux scraper returns the following values:

[
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.05.01/archlinux-2021.05.01-x86_64.iso',
    'x86_64',
    792014848,
    '2021.05.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.06.01/archlinux-2021.06.01-x86_64.iso',
    'x86_64',
    811937792,
    '2021.06.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.07.01/archlinux-2021.07.01-x86_64.iso',
    'x86_64',
    817180672,
    '2021.07.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/archboot/2020.07/archlinux-2020.07-1-archboot-network.iso',
    'x86_64',
    516947968,
    '2020.07'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/archboot/2020.07/archlinux-2020.07-1-archboot.iso',
    'x86_64',
    1280491520,
    '2020.07'
  )
]
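
When writing a new scraper, a small shape check can catch mistakes early. The sketch below is an assumption-based helper (check_entries is not part of the project); the expected types, three strings and an integer size, are inferred from the sample output above.

```python
def check_entries(entries):
    """Raise ValueError unless entries is a list of (url, arch, size, version) tuples."""
    if not isinstance(entries, list):
        raise ValueError("scraper must return a list")
    for index, entry in enumerate(entries):
        if not isinstance(entry, tuple) or len(entry) != 4:
            raise ValueError(f"entry {index} must be a 4-item tuple")
        iso_url, iso_arch, iso_size, iso_version = entry
        # URL, architecture and version are strings in the sample output.
        for value in (iso_url, iso_arch, iso_version):
            if not isinstance(value, str):
                raise ValueError(f"entry {index} has a non-string field")
        # Size is a byte count; bool is excluded because it subclasses int.
        if isinstance(iso_size, bool) or not isinstance(iso_size, int) or iso_size < 0:
            raise ValueError(f"entry {index} has an invalid size")
    return entries
```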

A scraper starts with from public import *, which brings the following into its namespace:

bs (short for BeautifulSoup)
json
re
requests

It also provides these functions:

get_afh_url(iso_url): returns a download link for a file hosted on AndroidFileHost
iso_url must look like this: https://androidfilehost.com/?fid=8889791610682936459
get_iso_arch(iso_url): returns the processor architecture of the image at iso_url
get_iso_size(iso_url): returns the size in bytes of the file at iso_url
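
To give a feel for what an architecture helper could do, here is a purely hypothetical stand-in that guesses from the file name. It is not the implementation of get_iso_arch in public; the patterns, the returned labels other than x86_64, and the x86_64 default are all assumptions.

```python
import re

# Checked in order; patterns and labels are illustrative assumptions.
_ARCH_PATTERNS = [
    (re.compile(r"x86[_-]64|amd64", re.IGNORECASE), "x86_64"),
    (re.compile(r"aarch64|arm64", re.IGNORECASE), "arm64"),
    (re.compile(r"i[3-6]86", re.IGNORECASE), "i386"),
]


def guess_arch(iso_url, default="x86_64"):
    """Guess an architecture label from the last path segment of iso_url."""
    filename = iso_url.rsplit("/", 1)[-1]
    for pattern, label in _ARCH_PATTERNS:
        if pattern.search(filename):
            return label
    # No hint in the file name: fall back to the assumed default.
    return default
```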

Arch Linux scraper.py example:

from public import *  # noqa


def init():

    array = []
    base_urls = [
        "https://mirror.yandex.ru/archlinux/iso/latest",
        "https://mirror.yandex.ru/archlinux/iso/archboot/latest"
    ]

    for base_url in base_urls:

        html = bs(requests.get(base_url).text, "html.parser")

        for filename in html.find_all("a", {"href": re.compile(r"^.*\.iso$")}):

            iso_url = f"{base_url}/{filename['href']}"
            iso_arch = get_iso_arch(iso_url)
            iso_size = get_iso_size(iso_url)
            # dots are escaped so that "2020.07-1" yields "2020.07"
            iso_version = re.search(r"-(\d+\.\d+(\.\d+)?)", iso_url).group(1)

            array.append((iso_url, iso_arch, iso_size, iso_version))

    return array
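
The version extraction can be tried on its own, without any network access. The sketch below uses a pattern with the dots escaped so they match only a literal "."; the helper name extract_version is hypothetical, and returning None when nothing matches is an added safety choice rather than documented behaviour.

```python
import re

# Escaped dots match a literal "." rather than any character.
VERSION_RE = re.compile(r"-(\d+\.\d+(?:\.\d+)?)")


def extract_version(iso_url):
    """Return the first dotted version that follows a hyphen, or None."""
    match = VERSION_RE.search(iso_url)
    return match.group(1) if match else None


print(extract_version(
    "https://mirror.yandex.ru/archlinux/iso/latest/archlinux-2021.07.01-x86_64.iso"
))
print(extract_version(
    "https://mirror.yandex.ru/archlinux/iso/archboot/latest/archlinux-2020.07-1-archboot.iso"
))
```

This prints 2021.07.01 and 2020.07, matching the versions in the sample output earlier in this section.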

Misc

Here is an nginx snippet for when you self-host the repository together with the website and want DriveDroid to reach repo.json by hostname alone. Place it in the server section of your config:

location = / {
  if ($http_user_agent ~* 'okhttp') {
    rewrite ^/(.*)$ /repo.json break;
  }
}

Roadmap

Option to generate a webpage
Add a mechanism to retry scraping if a network error occurs
Option to select mirrors (mainly uses mirrors based in Russia)
Package this project perhaps
Probably make the code better

Credits

afh-dl by kade-robertson
Yandex.Disk direct links by DokPub

License

MIT License

Owner

Evgeny
