Python function to query SQLite files stored on S3

Overview

sqlite-s3-query CircleCI Test Coverage

Python function to query a SQLite file stored on S3. It uses multiple HTTP range requests per query to avoid downloading the entire file, and so is suitable for large databases.

All the HTTP requests for a query request the same version of the database object in S3, so queries should complete succesfully even if the database is replaced concurrently by another S3 client. Versioning must be enabled on the S3 bucket.

Operations that write to the database are not supported.

Inspired by phiresky's sql.js-httpvfs, and dacort's Stack Overflow answer.

Installation

sqlite-s3-query depends on APSW, which is not available on PyPI, but can be installed directly from GitHub.

pip install sqlite_s3_query
pip install https://github.com/rogerbinns/apsw/releases/download/3.36.0-r1/apsw-3.36.0-r1.zip --global-option=fetch --global-option=--version --global-option=3.36.0 --global-option=--all --global-option=build --global-option=--enable-all-extensions

Usage

from sqlite_s3_query import sqlite_s3_query

results_iter = sqlite_s3_query(
    'SELECT * FROM my_table WHERE my_column = ?', params=('my-value',),
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
)

for row in results_iter:
    print(row)

If in your project you use multiple queries to the same file, functools.partial can be used to make an interface with less duplication.

from functools import partial
from sqlite_s3_query import sqlite_s3_query

query_my_db = partial(sqlite_s3_query,
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
)

for row in query_my_db('SELECT * FROM my_table WHERE my_col = ?', params=('my-value',)):
    print(row)

for row in query_my_db('SELECT * FROM my_table_2 WHERE my_col = ?', params=('my-value',)):
    print(row)

The AWS region and the credentials are taken from environment variables, but this can be changed using the get_credentials parameter. Below shows the default implementation of this that can be overriden.

import os
from functools import partial
from sqlite_s3_query import sqlite_s3_query

query_my_db = partial(sqlite_s3_query
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
    get_credentials=lambda: (
        os.environ['AWS_DEFAULT_REGION'],
        os.environ['AWS_ACCESS_KEY_ID'],
        os.environ['AWS_SECRET_ACCESS_KEY'],
        os.environ.get('AWS_SESSION_TOKEN'),  # Only needed for temporary credentials
    ),
)

for row in query_my_db('SELECT * FROM my_table_2 WHERE my_col = ?', params=('my-value',)):
    print(row)
Owner
Michal Charemza
Michal Charemza
Postgres full text search options (tsearch, trigram) examples

postgres-full-text-search Postgres full text search options (tsearch, trigram) examples. Create DB CREATE DATABASE ftdb; To feed db with an example

Jarosław Orzeł 97 Dec 30, 2022
Migrate data from SQL to NoSQL easily

Migrate data from SQL to NoSQL easily Installation 💯 pip install sql2nosql --upgrade Dependencies 📢 For the package to work, it first needs "clients

Facundo Padilla 43 Mar 26, 2022
ChaozzDBPy - A python implementation based on the original ChaozzDB from Chaozznl with some new features

ChaozzDBPy About ChaozzDBPy is a python implementation based on the original Cha

Igor Iglesias 1 May 25, 2022
HTTP graph database built in Python 3

KiwiDB HTTP graph database built in Python 3. Reference Format References are strings in the format: { JanCraft 1 Dec 17, 2021

Лабораторные работы по Postgresql за 5 семестр

Практикум по Postgresql ERD для заданий 2.x: ERD для заданий 3.x: Их делал вот тут Ниже есть 2 инструкции — по установке postgresql на manjaro и по пе

Danila 10 Oct 31, 2022
A Painless Simple Way To Create Schema and Do Database Operations Quickly In Python

PainlessDB - Taking Your Pain away to the moon 🚀 Contribute · Community · Documentation 🎫 Introduction : PainlessDB is a Python-based free and open-

Aiden Ellis 3 Jul 15, 2022
Simple json type database for python3

What it is? Simple json type database for python3! What about speed? The speed is great! All data is stored in RAM until saved. How to install? pip in

3 Feb 11, 2022
A NoSQL database made in python.

CookieDB A NoSQL database made in python.

cookie 1 Nov 30, 2022
Tiny local JSON database for Python.

Pylowdb Simple to use local JSON database 🦉 # This is pure python, not specific to pylowdb ;) db.data['posts'] = ({ 'id': 1, 'title': 'pylowdb is awe

Hussein Sarea 3 Jan 26, 2022
This is a simple graph database in SQLite, inspired by

This is a simple graph database in SQLite, inspired by "SQLite as a document database".

Denis Papathanasiou 1.2k Jan 03, 2023
ClutterDB - Extremely simple JSON database made for infrequent changes which behaves like a dict

extremely simple JSON database made for infrequent changes which behaves like a dict this was made for ClutterBot

Clutter Development 1 Jan 12, 2022
securedb is a fast and lightweight Python framework to easily interact with JSON-based encrypted databases.

securedb securedb is a Python framework that lets you work with encrypted JSON databases. Features: newkey() to generate an encryption key write(key,

Filippo Romani 2 Nov 23, 2022
Simple embedded in memory json database

dbj dbj is a simple embedded in memory json database. It is easy to use, fast and has a simple query language. The code is fully documented, tested an

Pedro Gonring 25 Aug 12, 2022
AWS Tags As A Database is a Python library using AWS Tags as a Key-Value database.

AWS Tags As A Database is a Python library using AWS Tags as a Key-Value database. This database is completely free* 💸

Oren Leung 42 Nov 25, 2022
Elara DB is an easy to use, lightweight NoSQL database that can also be used as a fast in-memory cache.

Elara DB is an easy to use, lightweight NoSQL database written for python that can also be used as a fast in-memory cache for JSON-serializable data. Includes various methods and features to manipula

Saurabh Pujari 101 Jan 04, 2023
Manage your sqlite database very easy (like django) ...

Manage your sqlite database very easy (like django) ...

aWolver 1 Feb 09, 2022
pickleDB is an open source key-value store using Python's json module.

pickleDB pickleDB is lightweight, fast, and simple database based on the json module. And it's BSD licensed! pickleDB is Fun import pickledb

Harrison Erd 738 Jan 04, 2023
TinyDB is a lightweight document oriented database optimized for your happiness :)

Quick Links Example Code Supported Python Versions Documentation Changelog Extensions Contributing Introduction TinyDB is a lightweight document orien

Markus Siemens 5.6k Dec 30, 2022
Tools for analyzing Git history using SQLite

git-history Tools for analyzing Git history using SQLite Installation Install this tool using pip: $ pip install git-history Usage This tool can be r

Simon Willison 128 Jan 02, 2023
LaikaDB, banco de dados para projetos simples.

LaikaDB LaikaDB é um banco de dados noSQL para uso local e simples, onde você pode realizar gravações e leituras de forma eficiente e simples. Todos o

Jaedson Silva 0 Jun 24, 2022