Zyphra Proposes a Third Path for Long Context: Neither Linear Attention nor Full Self-Attention
A new sequence mixing layer uses online clustering to approach self-attention quality while keeping memory constant.
Oliver Senti · 14 hours ago · 6 min read