Attention is how information is shared between tokens in a sequence. A causal mask on the attention scores ensures that the model only uses information from the past to predict the future.
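As a minimal sketch of what that means in code (single head, no batch dimension, tensor shapes chosen for clarity rather than taken from any particular model):

```python
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    # q, k, v: (seq_len, d) -- one head, no batch, for readability
    seq_len, d = q.shape
    scores = q @ k.T / d**0.5                    # (seq_len, seq_len)
    # Causal mask: position i may only attend to positions j <= i
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)          # each row sums to 1
    return weights @ v                           # (seq_len, d)
```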

We see that longer contexts do help even beyond the sliding window, but once the sequence length grows too large, the model stops making use of the full context.
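With a sliding window, each token attends only to the previous W tokens at a given layer; because layers stack, information can still be relayed further back than W, which is one reading of why context beyond the window helps at all. A small sketch of the mask itself (the window size here is an arbitrary example):

```python
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    # True marks positions to mask out. Token i attends to tokens
    # j in [i - window + 1, i]: causal, but capped at `window` tokens back.
    i = torch.arange(seq_len).unsqueeze(1)   # (seq_len, 1)
    j = torch.arange(seq_len).unsqueeze(0)   # (1, seq_len)
    return (j > i) | (j <= i - window)

print(sliding_window_causal_mask(5, window=2).int())
# tensor([[0, 1, 1, 1, 1],
#         [0, 0, 1, 1, 1],
#         [1, 0, 0, 1, 1],
#         [1, 1, 0, 0, 1],
#         [1, 1, 1, 0, 0]])
```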

When generating a sequence, we have to predict tokens one by one. The prompt, however, is known in advance, so we can pre-fill the (k, v) cache with it in a single parallel pass.
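A rough sketch of that split, using a toy single-head cache (hypothetical names; real models keep one cache per layer and head, and the random tensors below stand in for projected keys and values):

```python
import torch

class KVCache:
    def __init__(self):
        self.k = None  # (cached_len, d)
        self.v = None

    def append(self, k_new, v_new):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=0)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=0)

cache = KVCache()

# Prefill: the whole prompt is known, so its keys/values are computed
# in one parallel pass over all prompt positions.
prompt_k, prompt_v = torch.randn(10, 64), torch.randn(10, 64)
cache.append(prompt_k, prompt_v)

# Decode: each new token computes only its own (k, v) and attends
# against everything already cached -- the prompt is never recomputed.
new_k, new_v = torch.randn(1, 64), torch.randn(1, 64)
cache.append(new_k, new_v)
print(cache.k.shape)  # torch.Size([11, 64])
```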

Reminder: how attention works
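The usual refresher is the scaled dot-product formula:

```latex
\mathrm{Attention}(Q, K, V) = \operatorname{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V
```

where each row of the softmax gives one token's weights over the tokens it is allowed to attend to.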

  • noneabove1182@sh.itjust.works · 1 year ago
    This is great and comes with a very interesting model!

    I wonder if they cleverly slide the window in any way or if it’s just a naive slide, could probably be pretty smart if you discard tokens that have minimal attention on them anyways to focus on important text

    For now, this is awesome!
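The eviction idea floated in the comment above — score cached tokens by how much attention they actually receive and drop the least-used ones — could look roughly like this. This is purely a sketch of the idea, not what any particular model ships; similar "heavy hitter" cache-eviction schemes do appear in the literature:

```python
import torch

def evict_low_attention(cache_k, cache_v, attn_weights, budget):
    # attn_weights: (num_queries, cached_len) softmax weights from recent steps.
    # Score each cached token by the total attention mass it received,
    # then keep only the `budget` highest-scoring entries (hypothetical policy).
    scores = attn_weights.sum(dim=0)             # (cached_len,)
    keep = torch.topk(scores, k=min(budget, scores.numel())).indices.sort().values
    return cache_k[keep], cache_v[keep]

k, v = torch.randn(8, 64), torch.randn(8, 64)
w = torch.rand(3, 8)          # attention from the last 3 queries
k2, v2 = evict_low_attention(k, v, w, budget=4)
print(k2.shape)               # torch.Size([4, 64])
```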