An Edit Friendly DDPM Noise Space: Inversion and Manipulations (DDPM Inversion)

[Project Page]

[Github]

[arXiv](2023/04/14 version v2)

Abstract

Proposes an inversion method that extracts an edit-friendly latent noise space for DDPM.

The DDPM noise space

Diffusion Forward process:

It can also be expressed in a simpler closed form, directly in terms of x₀.
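For reference, a sketch of both forms in the conventional DDPM notation. The symbols β_t (noise schedule), ᾱ_t (cumulative product), n_t and ε_t (standard Gaussian noise) are assumed here and may differ from the notation the post intended:

```latex
\begin{align}
x_t &= \sqrt{1-\beta_t}\, x_{t-1} + \sqrt{\beta_t}\, n_t, \qquad t = 1, \dots, T \\
x_t &= \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\, \epsilon_t,
\qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s)
\end{align}
```

The first line is the step-by-step Markov chain; the second follows by unrolling it, so that x_t depends only on x₀ and a single Gaussian sample.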

Backward process:
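A sketch of the reverse step in the same conventional notation, where μ̂_t(x_t) is assumed to denote the mean predicted from the denoising network and σ_t the noise scale at step t:

```latex
\begin{equation}
x_{t-1} = \hat{\mu}_t(x_t) + \sigma_t z_t, \qquad t = T, \dots, 1
\end{equation}
```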

Here z_t is random Gaussian noise injected at every step. The DDPM paper uses this stochastic term in practice for flexibility.

Edit friendly inversion

The native DDPM noise space is not edit-friendly to begin with, so editing by simply replacing the noise maps does not work.

To "imprint" the structure of a given image x₀ onto the noise maps, the following auxiliary sequence is constructed.
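Written out as a sketch, assuming the conventional ᾱ_t notation and using ε̃_t as an assumed symbol for the per-step noise:

```latex
\begin{equation}
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\, \tilde{\epsilon}_t,
\qquad \tilde{\epsilon}_t \sim \mathcal{N}(0, I) \ \text{independently for each } t = 1, \dots, T
\end{equation}
```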

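This looks identical to the forward process, but unlike the actual forward process, which is a Markov chain, x₁, ..., x_T are built by sampling a separate noise at every step t. The noises are statistically independent, so consecutive x_t and x_{t-1} lie farther apart and the resulting noise maps have a higher variance.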
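The claim that independently sampled steps lie farther apart can be checked numerically. The following is a minimal sketch, assuming a linear β schedule and a random 8x8 array in place of an image; the function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def make_schedule(T=100, beta_start=1e-4, beta_end=0.02):
    # Linear beta schedule (an assumption; any valid schedule would do).
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def markov_chain(x0, betas, rng):
    # Actual forward process: each x_t is built from x_{t-1}.
    xs = [x0]
    for beta in betas:
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(1.0 - beta) * xs[-1] + np.sqrt(beta) * noise)
    return np.stack(xs)

def independent_sequence(x0, alpha_bar, rng):
    # Auxiliary sequence: each x_t is built from x_0 with its own noise sample.
    xs = [x0]
    for ab in alpha_bar:
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise)
    return np.stack(xs)

def mean_step_distance(xs):
    # Average Euclidean distance between consecutive elements of the sequence.
    diffs = (xs[1:] - xs[:-1]).reshape(len(xs) - 1, -1)
    return float(np.mean(np.linalg.norm(diffs, axis=1)))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))  # toy stand-in for an image
betas, alpha_bar = make_schedule()
d_markov = mean_step_distance(markov_chain(x0, betas, rng))
d_indep = mean_step_distance(independent_sequence(x0, alpha_bar, rng))
print(f"Markov chain: {d_markov:.3f}, independent sampling: {d_indep:.3f}")
```

In the Markov chain only a small amount of fresh noise is added per step, while in the auxiliary sequence the entire noise component is resampled, which is why the second distance comes out larger.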

Once x_t and x_{t-1} are known, z_t can be recovered by solving the reverse step for the noise term.
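As a sketch, assuming the reverse step has the form x_{t-1} = μ̂_t(x_t) + σ_t z_t with μ̂_t the predicted mean and σ_t the noise scale, isolating z_t gives:

```latex
\begin{equation}
z_t = \frac{x_{t-1} - \hat{\mu}_t(x_t)}{\sigma_t}, \qquad t = T, \dots, 1
\end{equation}
```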

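Characteristics of this inversion method:

Because it compensates for the accumulation of numerical error, the input image can be reconstructed up to machine precision.
It can be applied to any kind of diffusion process.
The randomness of the noise sampled at each t yields diverse inversions of the same image.
This diversity cannot be obtained with DDIM inversion.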
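The properties listed above can be illustrated with a small end-to-end sketch. It rests on several assumptions: `toy_eps_model` is a hypothetical stand-in for a trained noise predictor, the reverse step uses the original DDPM mean with σ_t = √β_t (which may differ from the paper's parameterization), and recomputing x_{t-1} after each extraction is one plausible way to compensate for error accumulation.

```python
import numpy as np

def toy_eps_model(x_t, t):
    # Hypothetical stand-in for a trained noise predictor eps_theta(x_t, t).
    return 0.1 * np.sin(x_t + 0.01 * t)

def mu_hat(x_t, t, betas, alpha_bar):
    # DDPM mean of the reverse step; t is 1-indexed.
    beta_t = betas[t - 1]
    eps = toy_eps_model(x_t, t)
    return (x_t - beta_t / np.sqrt(1.0 - alpha_bar[t - 1]) * eps) / np.sqrt(1.0 - beta_t)

def edit_friendly_inversion(x0, betas, alpha_bar, rng):
    T = len(betas)
    # Auxiliary sequence: every x_t gets its own independent noise sample.
    xs = [x0]
    for t in range(1, T + 1):
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(alpha_bar[t - 1]) * x0 + np.sqrt(1.0 - alpha_bar[t - 1]) * noise)
    # Extract z_t so that the reverse step maps x_t onto x_{t-1}.
    zs = {}
    for t in range(T, 0, -1):
        sigma_t = np.sqrt(betas[t - 1])  # assumed choice: sigma_t^2 = beta_t
        mu = mu_hat(xs[t], t, betas, alpha_bar)
        zs[t] = (xs[t - 1] - mu) / sigma_t
        # Recompute x_{t-1} from z_t so later steps see the same values as sampling does.
        xs[t - 1] = mu + sigma_t * zs[t]
    return xs[T], zs

def reconstruct(x_T, zs, betas, alpha_bar):
    # Run the reverse process with the extracted noise maps instead of random ones.
    x = x_T
    for t in range(len(betas), 0, -1):
        x = mu_hat(x, t, betas, alpha_bar) + np.sqrt(betas[t - 1]) * zs[t]
    return x

T = 50
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)
x0 = np.random.default_rng(0).standard_normal((8, 8))  # toy stand-in for an image

x_T_a, zs_a = edit_friendly_inversion(x0, betas, alpha_bar, np.random.default_rng(1))
x_T_b, zs_b = edit_friendly_inversion(x0, betas, alpha_bar, np.random.default_rng(2))
x_rec = reconstruct(x_T_a, zs_a, betas, alpha_bar)
print("max reconstruction error:", np.max(np.abs(x_rec - x0)))
```

Two runs with different seeds give different noise maps for the same x₀, and each set reproduces x₀ when fed back through the reverse process. Nothing in the extraction step depends on the specific form of `mu_hat`, which is consistent with the claim that the method applies to any diffusion process.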

Properties of the edit-friendly noise space

Figure: blue shows the DDPM inference steps and red shows the noise perturbations. The noise perturbations on the right are visibly much larger.

Image shifting

Result of shifting z_t at every step t by d pixels:
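A minimal sketch of the shifting operation itself, assuming the noise maps are stored as arrays whose last axis is the horizontal one and that the shift is circular (how borders are handled is not stated here, so `np.roll` is an assumption). The helper name `shift_noise_maps` is hypothetical.

```python
import numpy as np

def shift_noise_maps(zs, d, axis=-1):
    """Circularly shift every noise map z_t by d pixels along one spatial axis."""
    return [np.roll(z, shift=d, axis=axis) for z in zs]

# Toy example: two 3x4 "noise maps" shifted one pixel to the right.
zs = [np.arange(12.0).reshape(3, 4), np.ones((3, 4))]
shifted = shift_noise_maps(zs, d=1)
print(shifted[0])
```

The shifted maps would then be passed to the same reverse process in place of the original z_t.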

Color manipulations

Text-Guided Image Editing

Experiments
