NumPy String-Indexed is a NumPy extension that allows arrays to be indexed using descriptive string labels

Overview

NumPy String-Indexed

PyPI Version Python Versions

NumPy String-Indexed is a NumPy extension that allows arrays to be indexed using descriptive string labels, rather than conventional zero-indexing. When a friendly matrix object is initialized, labels are assigned to each array index and each dimension, and they stick to the array after NumPy-style operations such as transposing, concatenating, and aggregating. This prevents Python programmers from having to keep track mentally of what each axis and each index represents, instead making each reference to the array in code naturally self-documenting.

NumPy String-Indexed is especially useful for applications like machine learning, scientific computing, and data science, where there is heavy use of multidimensional arrays.

The friendly matrix object is implemented as a lightweight wrapper around a NumPy ndarray. It's easy to add to a new or existing project to make it easier to maintain code, and has negligible memory and performance overhead compared to the size of array (O(x + y + z) vs. O(xyz)).

Basic functionality

It's recommended to import NumPy String-Indexed idiomatically as fm:

import friendly_matrix as fm

Labels are provided during object construction and can optionally be used in place of numerical indices for slicing and indexing.

The example below shows how to construct a friendly matrix containing an image with three color channels:

image = fm.ndarray(
	numpy_ndarray_image,  # np.ndarray with shape (3, 100, 100)
	dim_names=['color_channel', 'top_to_bottom', 'left_to_right'],
	color_channel=['R', 'G', 'B'])

The matrix can then be sliced like this:

# friendly matrix with shape (100, 100)
r_channel = image(color_channel='R')

# an integer
g_top_left_pixel_value = image('G', 0, 0)

# friendly matrix with shape (2, 100, 50)
br_channel_left_half = image(
	color_channel=('B', 'R'),
	left_to_right=range(image.dim_length('left_to_right') // 2))

Documentation

Full documentation can be found here. Below is a brief overview of Friendly Matrix functionality.

Matrix operations

Friendly matrix objects can be operated on just like NumPy ndarrays with minimal overhead. The package contains separate implementations of most of the relevant NumPy ndarray operations, taking advantage of labels. For example:

side_by_side = fm.concatenate((image1, image2), axis='left_to_right')

An optimized alternative is to perform label-less operations, by adding "_A" (for "array") to the operation name:

side_by_side_arr = fm.concatenate_A((image1, image2), axis='left_to_right')

If it becomes important to optimize within a particular scope, it's recommended to shed labels before operating:

for image in huge_list:
	image_processor(image.A)

Computing matrices

A friendly matrix is an ideal structure for storing and retrieving the results of computations over multiple variables. The compute_ndarray() function executes computations over all values of the input arrays and stores them in a new Friendly Matrix ndarray instance in a single step:

'''Collect samples from a variety of normal distributions'''

import numpy as np

n_samples_list = [1, 10, 100, 1000]
mean_list = list(range(-21, 21))
var_list = [1E1, 1E0, 1E-1, 1E-2, 1E-3]

results = fm.compute_ndarray(
	['# Samples', 'Mean', 'Variance']
	n_samples_list,
	mean_list,
	var_list,
	normal_sampling_function,
	dtype=np.float32)

# friendly matrices can be sliced using dicts
print(results({
	'# Samples': 100,
	'Mean': 0,
	'Variance': 1,
}))

Formatting matrices

The formatted() function displays a friendly matrix as a nested list. This is useful for displaying the labels and values of smaller matrices or slice results:

mean_0_results = results({
	'# Samples': (1, 1000),
	'Mean': 0,
	'Variance': (10, 1, 0.1),
})
formatted = fm.formatted(
	mean_0_results,
	formatter=lambda n: round(n, 1))

print(formatted)

'''
Example output:

# Samples = 1:
	Variance = 10:
		2.2
	Variance = 1:
		-0.9
	Variance = 0.1:
		0.1
# Samples = 1000:
	Variance = 10:
		-0.2
	Variance = 1:
		-0.0
	Variance = 0.1:
		0.0
'''

Installation

pip install numpy-string-indexed

NumPy String-Indexed is listed in PyPI and can be installed with pip.

Prerequisites: NumPy String-Indexed 0.0.1 requires Python 3 and a compatible installation of the NumPy Python package.

Discussion and support

NumPy String-Indexed is available under the MIT License.

Owner
Aitan Grossman
Aitan Grossman
A machine learning model for analyzing text for user sentiment and determine whether its a positive, neutral, or negative review.

Sentiment Analysis on Yelp's Dataset Author: Roberto Sanchez, Talent Path: D1 Group Docker Deployment: Deployment of this application can be found her

Roberto Sanchez 0 Aug 04, 2021
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr

44 Jan 06, 2023
Train and use generative text models in a few lines of code.

blather Train and use generative text models in a few lines of code. To see blather in action check out the colab notebook! Installation Use the packa

Dan Carroll 16 Nov 07, 2022
基于GRU网络的句子判断程序/A program based on GRU network for judging sentences

SentencesJudger SentencesJudger 是一个基于GRU神经网络的句子判断程序,基本的功能是判断文章中的某一句话是否为一个优美的句子。 English 如何使用SentencesJudger 确认Python运行环境 安装pyTorch与LTP python3 -m pip

8 Mar 24, 2022
Minimal GUI for accessing the Watson Text to Speech service.

Description Minimal graphical application for accessing the Watson Text to Speech service. Requirements Python 3 plus all dependencies listed in requi

Moritz Maxeiner 1 Oct 22, 2021
CoNLL-English NER Task (NER in English)

CoNLL-English NER Task en | ch Motivation Course Project review the pytorch framework and sequence-labeling task practice using the transformers of Hu

Kevin 2 Jan 14, 2022
Crie tokens de autenticação íntegros e seguros com UToken.

UToken - Tokens seguros. UToken (ou Unhandleable Token) é uma bilioteca criada para ser utilizada na geração de tokens seguros e íntegros, ou seja, nã

Jaedson Silva 0 Nov 29, 2022
Sequence Modeling with Structured State Spaces

Structured State Spaces for Sequence Modeling This repository provides implementations and experiments for the following papers. S4 Efficiently Modeli

HazyResearch 902 Jan 06, 2023
Code for the paper "Language Models are Unsupervised Multitask Learners"

Status: Archive (code is provided as-is, no updates expected) gpt-2 Code and models from the paper "Language Models are Unsupervised Multitask Learner

OpenAI 16.1k Jan 08, 2023
Large-scale Knowledge Graph Construction with Prompting

Large-scale Knowledge Graph Construction with Prompting across tasks (predictive and generative), and modalities (language, image, vision + language, etc.)

ZJUNLP 161 Dec 28, 2022
Application to help find best train itinerary, uses speech to text, has a spam filter to segregate invalid inputs, NLP and Pathfinding algos.

T-IAI-901-MSC2022 - GROUP 18 Gestion de projet Notre travail a été organisé et réparti dans un Trello. https://trello.com/b/X3s2fpPJ/ia-projet Install

1 Feb 05, 2022
novel deep learning research works with PaddlePaddle

Research 发布基于飞桨的前沿研究工作,包括CV、NLP、KG、STDM等领域的顶会论文和比赛冠军模型。 目录 计算机视觉(Computer Vision) 自然语言处理(Natrual Language Processing) 知识图谱(Knowledge Graph) 时空数据挖掘(Spa

1.5k Jan 03, 2023
Diaformer: Automatic Diagnosis via Symptoms Sequence Generation

Diaformer Diaformer: Automatic Diagnosis via Symptoms Sequence Generation (AAAI 2022) Diaformer is an efficient model for automatic diagnosis via symp

Junying Chen 20 Dec 13, 2022
A multi-voice TTS system trained with an emphasis on quality

TorToiSe Tortoise is a text-to-speech program built with the following priorities: Strong multi-voice capabilities. Highly realistic prosody and inton

James Betker 2.1k Jan 01, 2023
A very simple framework for state-of-the-art Natural Language Processing (NLP)

A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. IMPORTANT: (30.08.2020) We moved our models

flair 12.3k Dec 31, 2022
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
Fake Shakespearean Text Generator

Fake Shakespearean Text Generator This project contains an impelementation of stateful Char-RNN model to generate fake shakespearean texts. Files and

Recep YILDIRIM 1 Feb 15, 2022
DziriBERT: a Pre-trained Language Model for the Algerian Dialect

DziriBERT is the first Transformer-based Language Model that has been pre-trained specifically for the Algerian Dialect.

117 Jan 07, 2023
Russian words synonyms and antonyms

ru_synonyms Russian words synonyms and antonyms. Install pip install git+https://github.com/ahmados/rusynonyms.git Usage from ru_synonyms import Anto

sumekenov 7 Dec 14, 2022
Conditional probing: measuring usable information beyond a baseline

Conditional probing: measuring usable information beyond a baseline

John Hewitt 20 Dec 15, 2022