VectorAscent: Generate vector graphics from a textual description

Example

"a painting of an evergreen tree"

python text_to_painting.py --prompt "a painting of an evergreen tree" --num_iter 2500 --use_blob --subdir vit_rn50_useblob

We rely on CLIP for its aligned text and image encoders, and on diffvg, a differentiable vector graphics rasterizer. Differentiable rendering lets us produce raster images from vector paths, but the renderer on its own has no notion of a textual description. We therefore use CLIP to score the similarity between a rasterized graphic and a textual caption. Using gradient ascent, we then optimize for a vector graphic whose rasterization has high similarity with a user-provided caption, backpropagating through CLIP and diffvg to the vector graphics parameters. This project is partially inspired by Deep Daze, a caption-guided raster graphics generator.
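The optimization loop described above can be illustrated with a dependency-free toy. This is only a sketch under loud assumptions, not the code in text_to_painting.py: a one-parameter Gaussian blob on a 1-D canvas stands in for diffvg, the raw pixels stand in for a CLIP image embedding, a fixed target vector stands in for the CLIP text embedding, and the gradient is written out by hand instead of coming from autograd. All names (`render`, `cosine_and_grad`, `ascend`) are hypothetical.

```python
import math

N_PIXELS = 64   # size of the toy 1-D canvas
SIGMA = 0.15    # fixed blob width
XS = [i / (N_PIXELS - 1) for i in range(N_PIXELS)]  # pixel coordinates in [0, 1]


def render(center):
    """Toy stand-in for a differentiable rasterizer: one Gaussian blob."""
    return [math.exp(-((x - center) ** 2) / (2 * SIGMA ** 2)) for x in XS]


def cosine_and_grad(pixels, target):
    """Cosine similarity of pixels and target, plus its gradient w.r.t. pixels."""
    dot = sum(p * t for p, t in zip(pixels, target))
    norm_p = math.sqrt(sum(p * p for p in pixels))
    norm_t = math.sqrt(sum(t * t for t in target))
    score = dot / (norm_p * norm_t)
    grad = [t / (norm_p * norm_t) - score * p / norm_p ** 2
            for p, t in zip(pixels, target)]
    return score, grad


def ascend(target, center=0.3, lr=0.01, num_iter=300):
    """Gradient ascent on the similarity score w.r.t. the vector parameter."""
    for _ in range(num_iter):
        pixels = render(center)
        _, grad_pixels = cosine_and_grad(pixels, target)
        # Chain rule through the renderer:
        # d pixel_i / d center = pixel_i * (x_i - center) / SIGMA^2
        grad_center = sum(g * p * (x - center) / SIGMA ** 2
                          for g, p, x in zip(grad_pixels, pixels, XS))
        center += lr * grad_center  # ascent: step in the direction of the gradient
    final_score, _ = cosine_and_grad(render(center), target)
    return center, final_score


# Assumption: the "caption embedding" is simply the rendering of a blob at 0.7.
target_embedding = render(0.7)
center, score = ascend(target_embedding)
print(f"optimized center: {center:.3f}, similarity: {score:.4f}")
```

In the actual pipeline, the single `center` parameter would correspond to the path parameters of the vector graphic, and the hand-derived gradient would be replaced by backpropagation through CLIP and diffvg.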

Quick start

Requirements:

torch
torchvision
matplotlib
numpy
scikit-image
clip
diffvg

Install our dependencies and CLIP.

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
pip install ftfy regex tqdm numpy matplotlib scikit-image
pip install git+https://github.com/openai/CLIP.git

Then install diffvg by following the installation instructions in the diffvg repository.

Owner

Ajay Jain
