Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Last update: Jan 03, 2023

Overview

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

This is the PyTorch companion code for the paper:

Amaia Salvador, Erhan Gundogdu, Loris Bazzani, and Michael Donoser. Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning. CVPR 2021

If you find this code useful in your research, please consider citing using the following BibTeX entry:

@inproceedings{salvador2021revamping,
    title={Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning},
    author={Salvador, Amaia and Gundogdu, Erhan and Bazzani, Loris and Donoser, Michael},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}

Cloning

This repository uses git-lfs to store model checkpoint files. Make sure to install it before cloning by following the official git-lfs installation instructions.

Once installed, model checkpoint files will be automatically downloaded when cloning the repository with:

git clone git@github.com:amzn/image-to-recipe-transformers.git

These files can optionally be ignored by using git lfs install --skip-smudge before cloning the repository, and can be downloaded at any time using git lfs pull.
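If the repository was cloned with --skip-smudge, the checkpoint files on disk are small text pointers rather than the real archives. The sketch below is one way to check which files still need a git lfs pull. It relies on the fact that a Git LFS pointer file starts with the line "version https://git-lfs.github.com/spec/v1". The function names is_lfs_pointer and find_lfs_pointers are hypothetical helpers and are not part of this repository.

```python
from pathlib import Path

# First line of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like a Git LFS pointer rather than real content."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """List files under `root` that are still LFS pointers (not yet pulled)."""
    return sorted(
        p for p in Path(root).rglob("*") if p.is_file() and is_lfs_pointer(p)
    )
```

For example, find_lfs_pointers("checkpoints") returning a non-empty list would suggest that git lfs pull has not been run yet.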

Installation

Create the conda environment: conda env create -f environment.yml
Activate it: conda activate im2recipetransformers

Data preparation

Download and uncompress the Recipe1M dataset. The directory DATASET_PATH should contain the following:

layer1.json
layer2.json
train/
val/
test/

The directories train/, val/, and test/ must contain the image files for each split after uncompressing.
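Before running the preprocessing step, it can help to confirm that DATASET_PATH matches the layout listed above. The following is a minimal sketch that only checks for the five entries named in this README. The function name missing_dataset_entries is a hypothetical helper, not part of the repository, and it does not inspect the contents of the JSON files or image folders.

```python
from pathlib import Path

# Entries this README lists as the expected contents of DATASET_PATH.
EXPECTED_FILES = ("layer1.json", "layer2.json")
EXPECTED_DIRS = ("train", "val", "test")


def missing_dataset_entries(dataset_path):
    """Return the names of expected files/directories absent from dataset_path."""
    root = Path(dataset_path)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means all five entries were found. Otherwise the returned names show what still needs to be downloaded or uncompressed.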

Make splits and create vocabulary by running:

python preprocessing.py --root DATASET_PATH

This process will create auxiliary files under DATASET_PATH/traindata, which will be used for training.

Training

Launch training with:

python train.py --model_name model --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

TensorBoard logging can be enabled with --tensorboard. Then, from the checkpoints directory, run:

tensorboard --logdir "./" --port PORT

Run python train.py --help for the full list of available arguments.

Evaluation

Extract features from the trained model for the test set samples of Recipe1M:

python test.py --model_name model --eval_split test --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

Compute MedR and recall metrics for the extracted feature set:

python eval.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000
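For readers unfamiliar with these metrics, the sketch below illustrates how median rank (MedR) and recall@K are commonly computed for cross-modal retrieval. It is not the repository's eval.py and does not reproduce its options such as --medr_N. It assumes two row-aligned NumPy arrays in which queries[i] and gallery[i] are embeddings of the same recipe, and it assumes cosine similarity as the ranking score. The function name retrieval_metrics is hypothetical.

```python
import numpy as np


def retrieval_metrics(queries, gallery, ks=(1, 5, 10)):
    """Median rank and recall@K, assuming queries[i] matches gallery[i]."""
    # L2-normalize rows so the dot product equals cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = q @ g.T  # shape: (n_queries, n_gallery)

    # Rank of the true match: 1 + number of gallery items scoring strictly higher.
    true_scores = np.diag(sims)[:, None]
    ranks = 1 + (sims > true_scores).sum(axis=1)

    medr = float(np.median(ranks))
    recalls = {k: float((ranks <= k).mean()) for k in ks}
    return medr, recalls
```

Calling retrieval_metrics(image_embeddings, recipe_embeddings) would give image-to-recipe numbers, and swapping the arguments would give the recipe-to-image direction. Lower MedR and higher recall@K indicate better retrieval.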

Pretrained models

We provide pretrained model weights under the checkpoints directory. Make sure you run git lfs pull to download the model files.
Extract the zip files. Each model is provided as a folder named MODEL_NAME containing two files: args.pkl and model-best.ckpt.
Extract features for the test set samples of Recipe1M using one of the pretrained models by running:

python test.py --model_name MODEL_NAME --eval_split test --root DATASET_PATH --save_dir ../checkpoints

A file with extracted features will be saved under ../checkpoints/MODEL_NAME.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.


Owner

Amazon
