Abstract
The paper proposes and evaluates Samba, a hybrid architecture that combines Mamba with Sliding Window Attention in a layer-wise fashion.
[Github]
[arXiv](2024/06/11 version v1)
Methodology
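As a rough illustration of the sliding-window component named in the abstract, the sketch below builds a causal attention mask in which each position sees only itself and a fixed number of preceding tokens. This is a generic sliding window attention mask, not code from the Samba repository; the function name `sliding_window_causal_mask` and the parameters `seq_len` and `window` are hypothetical names chosen for this example.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window attention mask (illustrative sketch).

    mask[i][j] is True when query position i may attend to key position j,
    i.e. when j is not in the future (j <= i) and lies within the last
    `window` positions ending at i (i - j < window).
    """
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask as rows of 1/0 for a short sequence.
    for row in sliding_window_causal_mask(seq_len=4, window=2):
        print(" ".join("1" if allowed else "0" for allowed in row))
```

Because each row has at most `window` allowed positions, the attention cost per token stays bounded by the window size rather than growing with the full sequence length.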
Experiments and Results
Language Modeling on Textbook Quality Data
Perplexity
Processing Speed
Long-Context Understanding