Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020

Related tags

Testingshow-edit-tell
Overview

Show, Edit and Tell: A Framework for Editing Image Captions | arXiv

This contains the source code for Show, Edit and Tell: A Framework for Editing Image Captions, to appear at CVPR 2020

Requirements

  • Python 3.6 or 3.7
  • PyTorch 1.2

For evaluation, you also need:

Argument Parser is currently not supported. We will add support to it soon.

Pretrained Models

You can download the pretrained models from here. Place them in eval folder.

Download and Prepare Features

In this work, we use 36 fixed bottom-up features. If you wish to use the adaptive features (10-100), please refer to adaptive_features folder in this repository and follow the instructions.

First, download the fixed features from here and unzip the file. Place the unzipped folder in bottom-up_features folder.

Next type this command:

python bottom-up_features/tsv.py

This command will create the following files:

  • An HDF5 file containing the bottom up image features for train and val splits, 36 per image for each split, in an (I, 36, 2048) tensor where I is the number of images in the split.
  • PKL files that contain training and validation image IDs mapping to index in HDF5 dataset created above.

Download/Prepare Caption Data

You can either download all the related caption data files from here or create them yourself. The folder contains the following:

  • WORDMAP_coco: maps the words to indices
  • CAPUTIL: stores the information about the existing captions in a dictionary organized as follows: {"COCO_image_name": {"caption": "existing caption to be edited", "encoded_previous_caption": an encoded list of the words, "previous_caption_length": a list contaning the length of the caption, "image_ids": the COCO image id}
  • CAPTIONS the encoded ground-truth captions (a list with number_images x 5 lists. Example: we have 113,287 training images in Karpathy Split, thereofre there is 566,435 lists for the training split)
  • CAPLENS: the length of the ground-truth captions (a list with number_images x 5 vallues)
  • NAMES: the COCO image name in the same order as the CAPTIONS
  • GENOME_DETS: the splits and image ids for loading the images in accordance to the features file created above

If you'd like to create the caption data yourself, download Karpathy's Split training, validation, and test splits. This zip file contains the captions. Place the file in caption data folder. You should also have the pkl files created from the 'Download Features' section: train36_imgid2idx.pkl and val36_imgid2idx.pkl.

Next, run:

python preprocess_caps.py

This will dump all the files to the folder caption data.

Next, download the existing captios to be edited, and organize them in a list containing dictionaries with each dictionary in the following format: {"image_id": COCO_image_id, "caption": "caption to be edited", "file_name": "split\\COCO_image_name"}. For example: {"image_id": 522418, "caption": "a woman cutting a cake with a knife", "file_name": "val2014\\COCO_val2014_000000522418.jpg"}. In our work, we use the captions produced by AoANet.

Next, run:

python preprocess_existing_caps.py

This will dump all the existing caption files to the folder caption data.

Prepare/Download Sequence-Level Training Data

Download the RL-data for sequence-level training used for computing metric scores from here.

Alternitavely, you may prepare the data yourself:

Run the following command:

python preprocess_rl.py

This will dump two files in the data folder used for computing metric scores.

Training and Validation

XE training stage:

For training DCNet, run:

python dcnet.py

For optimizing DCNet with MSE, run:

python dcnet_with_mse.py

For training editnet:

python editnet.py
Cider-D Optimization stage:

For training DCNet, run:

python dcnet_rl.py

For training editnet:

python editnet_rl.py

Evaluation

Refer to eval folder for instructions. All the generated captions and scores from our model can be found in the outputs folder.

BLEU-1 BLEU-4 CIDEr SPICE
Cross-Entropy Loss 77.9 38.0 1.200 21.2
CIDEr Optimization 80.6 39.2 1.289 22.6

Citation

@InProceedings{Sammani_2020_CVPR,
author = {Sammani, Fawaz and Melas-Kyriazi, Luke},
title = {Show, Edit and Tell: A Framework for Editing Image Captions},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

References

Our code is mainly based on self-critical and show attend and tell. We thank both authors.

Owner
Fawaz Sammani
The human brain is a miracle every human has, and mathematically modelling that brain is an overwhelming matter! I like teaching machines vision-language
Fawaz Sammani
Python tools for penetration testing

pyTools_PT python tools for penetration testing Please don't use these tool for illegal purposes. These tools is meant for penetration testing for leg

Gourab 1 Dec 01, 2021
Getting the most out of your hobby servo

ServoProject by Adam Bäckström Getting the most out of your hobby servo Theory The control system of a regular hobby servo looks something like this:

209 Dec 20, 2022
Run ISP speed tests and save results

SpeedMon Automatically run periodic internet speed tests and save results to a variety of storage backends. Supported Backends InfluxDB v1 InfluxDB v2

Matthew Carey 9 May 08, 2022
HTTP traffic mocking and testing made easy in Python

pook Versatile, expressive and hackable utility library for HTTP traffic mocking and expectations made easy in Python. Heavily inspired by gock. To ge

Tom 305 Dec 23, 2022
pytest plugin providing a function to check if pytest is running.

pytest-is-running pytest plugin providing a function to check if pytest is running. Installation Install with: python -m pip install pytest-is-running

Adam Johnson 21 Nov 01, 2022
Sixpack is a language-agnostic a/b-testing framework

Sixpack Sixpack is a framework to enable A/B testing across multiple programming languages. It does this by exposing a simple API for client libraries

1.7k Dec 24, 2022
Command line driven CI frontend and development task automation tool.

tox automation project Command line driven CI frontend and development task automation tool At its core tox provides a convenient way to run arbitrary

tox development team 3.1k Jan 04, 2023
hyppo is an open-source software package for multivariate hypothesis testing.

hyppo (HYPothesis Testing in PythOn, pronounced "Hippo") is an open-source software package for multivariate hypothesis testing.

neurodata 137 Dec 18, 2022
Scalable user load testing tool written in Python

Locust Locust is an easy to use, scriptable and scalable performance testing tool. You define the behaviour of your users in regular Python code, inst

Locust.io 20.4k Jan 04, 2023
pywinauto is a set of python modules to automate the Microsoft Windows GUI

pywinauto is a set of python modules to automate the Microsoft Windows GUI. At its simplest it allows you to send mouse and keyboard actions to windows dialogs and controls, but it has support for mo

3.8k Jan 06, 2023
An Instagram bot that can mass text users, receive and read a text, and store it somewhere with user details.

Instagram Bot 🤖 July 14, 2021 Overview 👍 A multifunctionality automated instagram bot that can mass text users, receive and read a message and store

Abhilash Datta 14 Dec 06, 2022
Python dilinin Selenium kütüphanesini kullanarak; Amazon, LinkedIn ve ÇiçekSepeti üzerinde test işlemleri yaptığımız bir case study reposudur.

Python dilinin Selenium kütüphanesini kullanarak; Amazon, LinkedIn ve ÇiçekSepeti üzerinde test işlemleri yaptığımız bir case study reposudur. LinkedI

Furkan Gulsen 8 Nov 01, 2022
Travel through time in your tests.

time-machine Travel through time in your tests. A quick example: import datetime as dt

Adam Johnson 373 Dec 27, 2022
The definitive testing tool for Python. Born under the banner of Behavior Driven Development (BDD).

mamba: the definitive test runner for Python mamba is the definitive test runner for Python. Born under the banner of behavior-driven development. Ins

Néstor Salceda 502 Dec 30, 2022
Minimal example of how to use pytest with automated 'devops' style automated test runs

Pytest python example with automated testing This is a minimal viable example of pytest with an automated run of tests for every push/merge into the m

Karma Computing 2 Jan 02, 2022
Tools for test driven data-wrangling and data validation.

datatest: Test driven data-wrangling and data validation Datatest helps to speed up and formalize data-wrangling and data validation tasks. It impleme

269 Dec 16, 2022
A simple serverless create api test repository. Please Ignore.

serverless-create-api-test A simple serverless create api test repository. Please Ignore. Things to remember: Setup workflow Change Name in workflow e

Sarvesh Bhatnagar 1 Jan 18, 2022
Load Testing ML Microservices for Robustness and Scalability

The demo is aimed at getting started with load testing a microservice before taking it to production. We use FastAPI microservice (to predict weather) and Locust to load test the service (locally or

Emmanuel Raj 13 Jul 05, 2022
Ab testing - The using AB test to test of difference of conversion rate

Facebook recently introduced a new type of offer that is an alternative to the current type of bidding called maximum bidding he introduced average bidding.

5 Nov 21, 2022
Python package to easily work with selenium and manage tabs effectively.

Simple Selenium The aim of this package is to quickly get started with working with selenium for simple browser automation tasks. Installation Install

Vishal Kumar Mishra 1 Oct 27, 2021