Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

 

Abstract

Proposes and evaluates Samba, a hybrid architecture that combines Mamba and Sliding Window Attention layer by layer.

 

[Github]

[arXiv] (2024/06/11, version v1)

 

 

Methodology

Mamba

Sliding Window Attention
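
As a rough illustration of the idea behind sliding window attention, here is a minimal single-head sketch in plain Python. It is not the Samba implementation: the function names, the list-of-lists tensor layout, and the `window` parameter are assumptions made for this note. Each query position attends only to itself and the `window - 1` positions immediately before it, so the cost per token stays bounded regardless of sequence length.

```python
import math

def sliding_window_causal_mask(seq_len, window):
    """mask[i][j] is True when query i may attend to key j.

    Hypothetical helper: causal (j <= i) and limited to the last
    `window` positions (i - j < window).
    """
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]

def sliding_window_attention(q, k, v, window):
    """Single-head scaled dot-product attention under the mask above.

    q, k, v are lists of equal-length float vectors, one per position.
    """
    seq_len, dim = len(q), len(q[0])
    mask = sliding_window_causal_mask(seq_len, window)
    out = []
    for i in range(seq_len):
        # Keys visible to position i (always includes i itself).
        allowed = [j for j in range(seq_len) if mask[i][j]]
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(dim)
                  for j in allowed]
        # Numerically stable softmax over the visible keys only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out.append([sum(e / total * v[j][c] for e, j in zip(exps, allowed))
                    for c in range(len(v[0]))])
    return out
```

With `window=1` every position sees only itself, so the output equals `v`; larger windows trade more context for more compute per token.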

 

 

 

Experiments and Results

Language Modeling on Textbook Quality Data

 

Perplexity

 

Processing Speed

 

Long-Context Understanding