
A Look at the Stable Diffusion and SDXL U-Net Architectures

First, Stable Diffusion.

 

The order of the submodules is not clearly documented anywhere, so I inspected it directly; a rough sketch of how this can be done follows, and then the layout I found.
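
Below is a minimal inspection sketch, assuming the diffusers library and the example model id runwayml/stable-diffusion-v1-5. Note that printing the model only shows the ModuleList containers (resnets, attentions, ...) inside each block, so the interleaved execution order still has to be read from each block class's forward() in the diffusers source.

from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# Top-level submodules: conv_in, time_embedding, down_blocks, mid_block, up_blocks, conv_out, ...
for name, module in unet.named_children():
    print(name, type(module).__name__)

# Class of each down block and the containers it holds.
for i, block in enumerate(unet.down_blocks):
    print(f"down_blocks[{i}] = {type(block).__name__}")
    for child_name, child in block.named_children():
        print(f"  {child_name}: {[type(m).__name__ for m in child]}")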

 

 

conv_in

down_blocks:
    (CrossAttnDownBlock2D:
        ResnetBlock2D
        Transformer2DModel
        ResnetBlock2D
        Transformer2DModel
        Downsample2D
    ) x 3
    DownBlock2D:
        (ResnetBlock2D) x 2

mid_block:
    ResnetBlock2D
    Transformer2DModel
    ResnetBlock2D

up_blocks:
    UpBlock2D:
        (ResnetBlock2D) x 3
        Upsample2D
    (CrossAttnUpBlock2D:
        (ResnetBlock2D
        Transformer2DModel) x 3
        Upsample2D
    ) x 2
    CrossAttnUpBlock2D:
        (ResnetBlock2D
        Transformer2DModel) x 3

out
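
Each Transformer2DModel above wraps a single BasicTransformerBlock in SD 1.x; a quick way to confirm this, reusing the unet loaded in the sketch above:

# Count the BasicTransformerBlocks inside every Transformer2DModel.
# Matching by class name avoids depending on the exact diffusers import path.
for name, module in unet.named_modules():
    if type(module).__name__ == "Transformer2DModel":
        print(f"{name}: {len(module.transformer_blocks)} BasicTransformerBlock(s)")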


Next, Stable Diffusion XL v1.0.

 

conv_in

down_blocks:
    DownBlock2D:
        (ResnetBlock2D) x 2
        Downsample2D
    CrossAttnDownBlock2D:
        (ResnetBlock2D
        Transformer2DModel (BasicTransformerBlock x 2)) x 2
        Downsample2D
    CrossAttnDownBlock2D:
        (ResnetBlock2D
        Transformer2DModel (BasicTransformerBlock x 10)) x 2

mid_block:
    ResnetBlock2D
    Transformer2DModel (BasicTransformerBlock x 10)
    ResnetBlock2D

up_blocks:
    CrossAttnUpBlock2D:
        (ResnetBlock2D
        Transformer2DModel (BasicTransformerBlock x 10)) x 3
        Upsample2D
    CrossAttnUpBlock2D:
        (ResnetBlock2D
        Transformer2DModel (BasicTransformerBlock x 2)) x 3
        Upsample2D
    UpBlock2D:
        (ResnetBlock2D) x 3

out

 

Differences from the original Stable Diffusion:

  • The transformer blocks at the highest-resolution level are removed (the first down block is a plain DownBlock2D).
  • The lowest-resolution (8x8) level is removed.
  • In SD, every Transformer2DModel used a single BasicTransformerBlock, but in SDXL the composition changes: 2 per Transformer2DModel at the middle level and 10 at the lowest level (see the config comparison sketch below).
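
These differences can also be read straight from the released configs; below is a minimal sketch, assuming diffusers' UNet2DConditionModel.load_config helper and the example model ids runwayml/stable-diffusion-v1-5 and stabilityai/stable-diffusion-xl-base-1.0.

from diffusers import UNet2DConditionModel

sd_cfg = UNet2DConditionModel.load_config(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
sdxl_cfg = UNet2DConditionModel.load_config(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
)

# Keys that capture the structural differences. A key missing from an older config
# falls back to the class default (e.g. transformer_layers_per_block -> 1 for SD 1.x).
for key in ("down_block_types", "up_block_types",
            "block_out_channels", "transformer_layers_per_block"):
    print(key)
    print("  SD   :", sd_cfg.get(key))
    print("  SDXL :", sdxl_cfg.get(key))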

 
