Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

Overview

open(LARGE)

Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

Motivation

Oftentimes, especially when working with data-heavy applications, large files can proliferate in a repository. Version controlling them is an obvious next step, however, GitHub's git LFS implementation doesn't support the deletion of large files, making it easy for them to eat-up the LFS quota and explode the size of your repos.

Solution

pip install open-large

Simple example

from open_large import LargeFile

LargeFile.configure_credentials({
    "aws_region_name": "your_region_like_eu-west-2",
    "aws_access_key_id": "YOUR_ACCESS_KEY_ID",
    "aws_secret_access_key": "YOUR_VERY_SECRET_ACCESS_KEY",
    "large_files_bucket_name": "create_a_bucket_and_put_its_name_here",
})

# Creates a new version and deletes the older version leaving the 3 most recently used intact
with LargeFile("test.txt", "w", keep_last_n=3) as f:
    for i in range(100000):
        f.write('test\n')

# By default the latest version is returned
# but an optional `version` keyword argument can be provided as well
with LargeFile("test.txt", "r") as f:
    print(f.readlines()[0])

Automatically creates a file, writes to it, uploads it to S3, and then queries the most recent version of it. In this case, the latest version is already in the local cache, no download is required.

More details

LargeFile behaves like an opened file (in the background it is a temp file after all). Binary reading and writing is supported along with the different keywords open() accepts.

The local cache can be configured with these properties:

LargeFile.cache_path = Path('.cache')
LargeFile.max_cache_size = "30 GB"

I only need a path

In case you only need a path to the "remote" file, this pattern can be applied:

path_to_model = LargeFile("folder-of-my-bert-model", version=31).get()

This will first download the file/folder into your local cache folder. Then, it returns a Path object to the local version. Which can be turned into a string with str(path_to_model).

The same approach works for uploads:

LargeFile("folder-of-my-bert-model").push('path_to_local/folder_or_file')

This way, both regular files and folders can be handled. The uploaded file is called folder-of-my-bert-model, the local name is ignored.

Lastly, all version of the remote object can be deleted by calling LargeFile("my-file").delete(). It will still reside in your local cache afterwards, its deletion will happen next time your local cache has to be pruned.

Command-line example

The package can be used as a module from the command-line to give you more flexibility.

Setup

Create an .ini file (or use ~/.aws/credentials). It may look like this:

[DEFAULT]
aws_region_name = your_region_like_eu-west-2
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_VERY_SECRET_ACCESS_KEY
large_files_bucket_name = my_large_files

Just like in example secrets.

Print the expected options

python3 -m open_large --help

Upload some files

python3 -m open_large --secrets secrets.ini --push my_first_file.json folder/my_second_file my_folder

Only the filename is used as the S3 name, the rest of the path is ignored.

Download some files to the local cache

This can be useful when building a Docker image for example. This way, the files can already reside inside the container and need not be downloaded later.

python3 -m open_large --secrets ~/.aws/credentials --cache my_first_file.json:3 my_second_file my_folder:0

Versions may be specified by using :-s.

Delete remote files

python3 -m open_large --secrets ~/.aws/credentials --delete my_first_file.json
Download Apple Music Cover Artwork in the best Quality by providing an Apple Music Link. It downloads the jpg, png and webp version since they often differ from another.

amogus.py - Version 0.0.5 amogus - Apple Music Hi-Res Artwork Fetcher this is my first real python tool so sorry if its bad amogus is a Python script

reaper 46 Jan 09, 2023
Python script designed to search and fetch direct download links from nxbrew.com

SwitchGamesDownloader Only for windows nxbrew.com is a website, accessible only using a proxy, where the majority of games for the Nintendo Switch are

Backend 91 Dec 28, 2022
Spotify Playlist Downloader With Python

Spotify Playlist Downloader This will let you download Spotify playlists for free without Premium. It gets all the songs from the API and downloads th

Yasho 16 Sep 28, 2022
🔥 A Bot To Telegram For Download High Qulity Videos & Songs From Youtube

🔥 A Bot To Telegram For Download High Qulity Videos & Songs From Youtube 🎗 Fast And Free Bot No Need To Pay ✅ By SL-Alpha-X-Team ⚡

Official Alpha-X-Team Account 7 Aug 31, 2022
Search & download music from a certain streaming service

Search & download music from a certain streaming service

mat 2 Mar 11, 2022
Automatically download multiple papers by keywords in CVPR

Automatically download multiple papers by keywords in CVPR

46 Jun 08, 2022
A cross platform front-end GUI of the popular youtube-dl written in wxPython.

youtube-dlG A cross platform front-end GUI of the popular youtube-dl media downloader written in wxPython. Supported sites Screenshots Requirements Py

8.7k Dec 31, 2022
Get the latest updates around you as they happen

Adherent We all are different, experience various things happening around us but we stick together. We are all a part of a greater community. As human

Shreyas Daniel 1 Nov 10, 2022
a simple ehentai downloader with jpg 2 pdf

Simple_Ehentai_DownLoader a simple ehentai downloader with jpg 2 pdf 中文介绍 Environment python3.8 How to use before you start,there are some tips. the q

Hibian 6 Dec 11, 2022
1Fichier Download Manager.

1fichier-dl 1Fichier Download Manager. Features ⭐ Manage your downloads ⭐ Bypass time limits Credits All icons, including the app icon, were provided

manuGMG 470 Oct 08, 2022
Download candlestick data fast & easy for analysis

crypto-candlesticks 📈 The goal behind this project is to facilitate downloading cryptocurrency candlestick data fast & simple. Currently only the Bit

Pedro Torres 31 Dec 11, 2022
The PornHub Downloader is a powerfull script used to download and manage both videos and pictures

The PornHub Downloader is a powerfull script used to download and manage both videos and pictures

16 Aug 31, 2022
GTK4 + Python tutorial with code examples

Taiko's GTK4 Python tutorial Wanna make apps for Linux but not sure how to start with GTK? This guide will hopefully help! The intent is to show you h

190 Jan 08, 2023
Twayback: Downloading deleted Tweets from the Wayback Machine, made easy

Finding and downloading deleted Tweets takes a lot of time. Thankfully, with this tool, it becomes a piece of cake! 🎂

126 Dec 27, 2022
Vinetrimmer-DRM-TOOL - Widevine DRM downloader and decrypter for AMZN|NF|STAN And all

🍃 ✂️ Vinetrimmer Widevine DRM downloader and decrypter. Thanks to wvleaks for t

Vlad Tănăsescu 20 Jan 13, 2022
YouTube to MP3 or 4, you get to choose...

UTubeToMP YouTube to MP3 or 4, you get to choose... If you don't wanna git clone andor dont wanna install python. Here: Repl.it Instructions: Pretty s

1 Jan 29, 2022
Download from HBO-MAX-BLIM-TV-Paramount

#HBO MAX- BlimTV -Paramount plus 4K Downloader Tool To download 4K HDR DV SDR from HBO MAX- BlimTV -Paramount plus Hello Fellow Developers/ ! Hi! M

4 Dec 25, 2021
Download and save Bing wallpapers and set as background for GNOME desktop

Save Bing wallpapers and set as background for GNOME desktop This script downloads the Bing wallpaper and sets it in the background of your gnome desk

manikamran 2 Nov 06, 2021
A python module to download ISO Standards

ISO Standards Downloader A python module to download ISO Standards from https://standards.iso.org/iso-iec/ Report Bug · Request Feature Table of conte

Daniel 1 Dec 29, 2021
Script that allows to download portable installers of different versions Adobe software for macOS

What is this and for what This is a script that allows you to download portable installers of programs from Adobe for macOS with different versions. T

715 Jan 06, 2023