Diffusion for World Modeling: Visual Details Matter in Atari (DIAMOND)

Abstract

Introduces DIAMOND (DIffusion As a Model Of eNvironment Dreams), an RL agent trained inside a diffusion world model.

[arXiv](2024/05/20 version v1)

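The RL agent is trained in a world generated by a diffusion model rather than in the real environment.

The diffusion model (DM) generates the next frame conditioned on the previous frames and the agent's actions.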
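The conditioning described above can be sketched as an autoregressive rollout: each new frame is sampled by starting from noise and refining it toward a prediction that depends on recent frames and actions. This is only a toy illustration under stated assumptions. `toy_denoiser`, `sample_next_frame`, `rollout`, the linear noise schedule, the Euler-style update, and the step count are all hypothetical choices made for this example, not details taken from the paper.

```python
import random

def toy_denoiser(noisy, past_frames, past_actions, sigma):
    """Hypothetical stand-in for a learned denoiser: predicts a clean next frame.

    The toy rule is "last frame shifted by the last action". A real model
    would be a neural network that also uses `noisy` and `sigma`.
    """
    last_frame, last_action = past_frames[-1], past_actions[-1]
    return [pixel + last_action for pixel in last_frame]

def sample_next_frame(past_frames, past_actions, num_steps=3, rng=None):
    """Sample one frame by iterative denoising (assumed Euler-style updates)."""
    rng = rng or random.Random(0)
    # Assumed linear noise schedule from 1.0 down to 0.0.
    sigmas = [1.0 * (1 - i / num_steps) for i in range(num_steps + 1)]
    # Start from pure noise with the same size as a frame.
    x = [rng.gauss(0.0, sigmas[0]) for _ in past_frames[-1]]
    for sigma, next_sigma in zip(sigmas[:-1], sigmas[1:]):
        denoised = toy_denoiser(x, past_frames, past_actions, sigma)
        # Direction pointing from the denoised estimate to the current sample.
        direction = [(xi - di) / sigma for xi, di in zip(x, denoised)]
        # Move the sample to the next, lower noise level.
        x = [xi + (next_sigma - sigma) * gi for xi, gi in zip(x, direction)]
    return x

def rollout(initial_frames, actions, context_len=2):
    """Generate one frame per action, feeding generated frames back as context."""
    frames = list(initial_frames)
    taken = []
    for action in actions:
        taken.append(action)
        frames.append(sample_next_frame(frames[-context_len:], taken[-context_len:]))
    return frames[len(initial_frames):]

imagined = rollout([[0.0, 0.0]], actions=[1, 2])
```

Because generated frames are appended to the context, errors in one frame carry over into later frames, which is one reason the visual quality of each generated frame matters for the agent trained on top of it.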

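Algorithm:

First, collect data in the real environment with the policy π_ϕ

→ Update the diffusion model, which serves as the world model

→ Update the model R (LSTM), which predicts reward and termination

→ Update π_ϕ and V_ϕ as the actor-critic model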
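The four steps above can be sketched as a single training loop. This is a structural sketch only: `Component`, `collect_in_real_env`, `train`, the toy environment, and the random placeholder policy are hypothetical names introduced for illustration, and each `update` call merely counts invocations where a real implementation would take gradient steps.

```python
import random

class Component:
    """Hypothetical placeholder for a trainable module; it only counts updates."""
    def __init__(self, name):
        self.name = name
        self.num_updates = 0

    def update(self, batch):
        # A real implementation would run an optimization step on `batch` here.
        self.num_updates += 1

def collect_in_real_env(policy_fn, num_steps, rng):
    """Toy stand-in for real-environment interaction.

    Returns a list of (observation, action, reward, done) tuples.
    """
    transitions = []
    obs = 0
    for _ in range(num_steps):
        action = policy_fn(obs, rng)
        reward = 1.0 if action == 1 else 0.0
        done = rng.random() < 0.1
        transitions.append((obs, action, reward, done))
        obs = 0 if done else obs + 1
    return transitions

def train(num_epochs=3, steps_per_epoch=5, seed=0):
    rng = random.Random(seed)
    replay = []
    diffusion_model = Component("diffusion world model")
    reward_end_model = Component("reward/termination model R")
    actor_critic = Component("actor-critic (pi, V)")

    def policy_fn(obs, rng):
        # Placeholder policy: picks one of two actions uniformly at random.
        return rng.choice([0, 1])

    for _ in range(num_epochs):
        # Step 1: collect data in the real environment with the current policy.
        replay.extend(collect_in_real_env(policy_fn, steps_per_epoch, rng))
        # Step 2: update the diffusion world model on the collected data.
        diffusion_model.update(replay)
        # Step 3: update the reward/termination model R.
        reward_end_model.update(replay)
        # Step 4: update the actor-critic. In the setup described above this
        # happens on trajectories generated by the world model; the stub
        # simply receives the replay data.
        actor_critic.update(replay)
    return replay, diffusion_model, reward_end_model, actor_critic

replay, dm, r_model, ac = train()
```

Keeping the three modules as separate objects mirrors the separation in the steps above: the world model and R learn from real data, while the policy and value function are the only parts that depend on imagined experience.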
