Official implementation for: Blended Diffusion for Text-driven Editing of Natural Images.

Last update: Dec 30, 2022

Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami, Dani Lischinski, Ohad Fried

Abstract: Natural language offers a highly intuitive interface for image editing. In this paper, we introduce the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with an ROI mask. We achieve our goal by leveraging and combining a pretrained language-image model (CLIP), to steer the edit towards a user-provided text prompt, with a denoising diffusion probabilistic model (DDPM) to generate natural-looking results. To seamlessly fuse the edited region with the unchanged parts of the image, we spatially blend noised versions of the input image with the local text-guided diffusion latent at a progression of noise levels. In addition, we show that adding augmentations to the diffusion process mitigates adversarial results. We compare against several baselines and related methods, both qualitatively and quantitatively, and show that our method outperforms these solutions in terms of overall realism, ability to preserve the background and matching the text. Finally, we show several text-driven editing applications, including adding a new object to an image, removing/replacing/altering existing objects, background replacement, and image extrapolation.
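The blending step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code: the linear noise schedule, the function names, and the `guided_denoise` callback (a stand-in for one CLIP-guided DDPM denoising step) are all assumptions made for illustration. At each step the guided latent is kept inside the mask, while the region outside the mask is replaced by the input image noised to the matching level.

```python
import numpy as np


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule (assumed schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def noise_to_level(x0, alpha_bar_t, rng):
    """Forward-diffuse the clean image x0 to the noise level given by alpha_bar_t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps


def blend_step(x_fg, x0, mask, alpha_bar_t, rng):
    """Keep the guided latent inside the mask and the noised input outside it."""
    x_bg = noise_to_level(x0, alpha_bar_t, rng)
    return mask * x_fg + (1.0 - mask) * x_bg


def blended_edit(x0, mask, guided_denoise, num_steps=50, seed=0):
    """Run the reverse process, blending with the noised input after every step.

    guided_denoise(x, t) is a hypothetical callback that performs one
    text-guided denoising step and returns the latent at the next, lower
    noise level.
    """
    rng = np.random.default_rng(seed)
    alpha_bar = linear_alpha_bar(num_steps)
    x = rng.standard_normal(x0.shape)  # start from pure Gaussian noise
    for t in range(num_steps - 1, -1, -1):
        x_fg = guided_denoise(x, t)
        # After step t the latent sits at level t - 1; the last step is noise-free.
        level = alpha_bar[t - 1] if t > 0 else 1.0
        x = blend_step(x_fg, x0, mask, level, rng)
    return x
```

Because the final blend uses the noise-free input, pixels outside the mask come back exactly as they were in the original image, which is the background-preservation property the abstract refers to.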

Applications

Multiple synthesis results for the same prompt

Synthesis results for different prompts

Altering part of an existing object

Background replacement

Scribble-guided editing

Text-guided extrapolation

Composing several applications

Code availability

Full code will be released soon.
