하위 모듈들의 순서가 제대로 나와있지 않아서 직접 찾아봄.
conv_in
down_blocks:
(CrossAttnDownBlock2D:
ResnetBlock2D
Transformer2DModel
ResnetBlock2D
Transformer2DModel
Downsample2D
) x 3
DownBlock2D:
(ResnetBlock2D) x 2
mid_blocks:
ResnetBlock2D
Transformer2DModel
ResnetBlock2D
up_blocks:
UpBlock2D:
(ResnetBlock2D) x 3
Upsample2D
(CrossAttnUpBlock2D:
(ResnetBlock2D
Transformer2DModel) x 3
Upsample2D) x 2
CrossAttnUpBlock2D:
(ResnetBlock2D
Transformer2DModel) x 3
out
Stable Diffusion XL-v1.0
conv_in
down_blocks:
DownBlock2D:
(ResnetBlock2D) x 2
Downsample2D
CrossAttnDownBlock2D:
(ResnetBlock2D
Transformer2DModel (BasicTransformerBlock x 2) ) x 2
Downsample2D
CrossAttnDownBlock2D:
(ResnetBlock2D
Transformer2DModel (BasicTransformerBlock x 10) ) x 2
mid_blocks:
ResnetBlock2D
Transformer2DModel
ResnetBlock2D
up_blocks:
CrossAttnUpBlock2D:
(ResnetBlock2D
Transformer2DModel (BasicTransformerBlock x 10) ) x 3
Upsample2D
CrossAttnUpBlock2D:
(ResnetBlock2D
Transformer2DModel (BasicTransformerBlock x 2) ) x 3
Upsample2D
UpBlock2D:
(ResnetBlock2D) x 3
out
Original Stable diffusion과의 차이점:
- 최고 해상도의 transformer block 제거
- 최저 해상도(8x8) 제거
- SD에서는 모든 Transformer2DModel에 각각 1개의 BasicTransformerBlock을 사용했지만, SDXL에서는 구성이 달라졌다.
'Deep Learning > Diffusion' 카테고리의 다른 글
DiffStyler 써보기 (0) | 2023.01.17 |
---|---|
Paint by Example 써보기 (1) | 2023.01.13 |
DAAM 써보기 (1) | 2023.01.13 |