FairyTailor: Multimodal Generative Framework for Storytelling

Last update: Dec 30, 2022

Overview

Human-in-the-loop visual story co-creation.

Users create a cohesive children's story by weaving generated text and retrieved images together with their own input. In this co-creation setup, writers contribute their creative thinking, while generative models keep their workflow moving. FairyTailor adds another modality (images) and modifies the text generation process to help produce a coherent and creative story.

Set-up (development)

After cloning the repository:

Client (Vue 2.6)

Install and check that the client compiles:

cd client
npm i
npm run build

Backend (FastAPI)

Install and activate the environment (conda provided):

conda env create -f environment.yml
conda activate MultiModalStory

Install the project in editable mode from the current directory, then install CLIP:

pip install -e .
pip install git+https://github.com/openai/CLIP.git

After installation run:

python -m spacy download en_core_web_sm

In a Python terminal:

import nltk
nltk.download('wordnet')
nltk.download('sentiwordnet')
nltk.download('averaged_perceptron_tagger')

Large Data Management (dvc)

Our large data files are stored on IBM Cloud Object Storage. To pull them, use the special read-only .dvc/config file:

dvc pull -f

This pulls:

backend/outputs (five preset stories)
backend/story_generator/downloaded (transformers)
client/public/unsplash25k (styled images)
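
To confirm that the pull populated the three locations listed above, a small check like the following may help. It is a minimal sketch that uses only the Python standard library; the function name missing_data_dirs is made up for illustration and is not part of the project.

```python
from pathlib import Path

# Paths listed above, relative to the repository root.
EXPECTED_DATA_DIRS = (
    "backend/outputs",
    "backend/story_generator/downloaded",
    "client/public/unsplash25k",
)


def missing_data_dirs(repo_root="."):
    """Return the expected data directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DATA_DIRS:
        path = root / rel
        # A directory that exists but holds no entries was probably not pulled.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All data directories are present.")
```

Run it from the repository root. An empty result only shows that the directories exist and are non-empty, not that every file was downloaded.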

Running the framework during development

Client:

cd client
npm run devw

Backend (with server auto reload):

uvicorn backend.server:app --reload --reload-dir backend

Open localhost:8000 (the uvicorn server) in your web browser.
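
If the page does not load, it can help to check from Python whether anything is answering on that port. The sketch below uses only the standard library; the function name backend_is_up and the default timeout are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def backend_is_up(url="http://localhost:8000/", timeout=2.0):
    """Return True if an HTTP server answers at url, False if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 404).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False
```

A True result only means that some HTTP server responded at that address. It does not confirm that the response came from the FairyTailor backend.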

Modification ideas:

New Hugging Face transformer

Place the transformer in backend/story_generator/downloaded directory.
Update the current model path by changing the constant FINETUNED_GPT2_PATH in backend/story_generator/constants.py.
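
Before pointing FINETUNED_GPT2_PATH at a new directory, a quick sanity check on its contents can catch a wrong path early. The sketch below assumes the model was saved with Hugging Face's save_pretrained as a single, unsharded checkpoint. The helper name looks_like_model_dir is hypothetical and not part of the project.

```python
from pathlib import Path

# File names written by Hugging Face `save_pretrained` for unsharded checkpoints.
WEIGHT_FILES = (
    "pytorch_model.bin",
    "model.safetensors",
    "tf_model.h5",
    "flax_model.msgpack",
)


def looks_like_model_dir(path):
    """Return True if path holds a config.json and at least one known weights file."""
    model_dir = Path(path)
    has_config = (model_dir / "config.json").is_file()
    has_weights = any((model_dir / name).is_file() for name in WEIGHT_FILES)
    return has_config and has_weights
```

Sharded checkpoints use different file names, so a False result here does not necessarily mean the directory is unusable.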

New images folder

Replace the folder client/public/unsplash25k/sketch_images1024 with yours.
Update the current path by changing the constant IMAGE_PATH in client/src/components/Constants.js.

API functionalities

Add functions to the backend endpoint at backend/server/main.py.
Update client/src/js/api/mainApi.js to call the backend endpoint from the client.
Update the corresponding user components in client/src/components.

Owner

Eden Bens
