this post was submitted on 22 Nov 2023


https://arxiv.org/abs/2311.10770

"UltraFastBERT", apparently a BERT variant that uses only 0.3% of its neurons during inference, performs on par with similar BERT models.

I hope that's going to be available for all kinds of models in the near future!
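
For anyone wondering where the 0.3% comes from: as far as I can tell from the abstract, each feedforward layer is replaced by a balanced binary tree of 4095 neurons, and a single inference only walks one root-to-leaf path of 12 of them. Below is a rough sketch of that idea, not the authors' code. The function name `fff_forward`, the heap-style node layout, the GELU activation, and the tiny input width are all my own assumptions for illustration.

```python
import math
import random


def gelu(v):
    # Exact GELU via the error function (activation choice is an assumption).
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def fff_forward(x, w_in, w_out, depth):
    """Conditional forward pass over a balanced binary tree of neurons.

    Nodes are stored in heap order: node i has children 2*i + 1 and 2*i + 2.
    Only the depth + 1 neurons on one root-to-leaf path are evaluated.
    Returns the output vector and the number of neurons visited.
    """
    out = [0.0] * len(w_out[0])
    node = 0
    visited = 0
    for _ in range(depth + 1):
        # Pre-activation of the current neuron.
        pre = sum(w * xi for w, xi in zip(w_in[node], x))
        act = gelu(pre)
        # Accumulate this neuron's contribution to the layer output.
        for j, w in enumerate(w_out[node]):
            out[j] += act * w
        visited += 1
        # The sign of the pre-activation picks the child: right if positive.
        node = 2 * node + 1 + (1 if pre > 0 else 0)
    return out, visited


# Hypothetical toy setup: depth 11 gives 2**12 - 1 = 4095 neurons.
depth = 11
n_nodes = 2 ** (depth + 1) - 1
dim = 4  # toy input/output width, far smaller than a real BERT layer
rng = random.Random(0)
w_in = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_nodes)]
w_out = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_nodes)]
x = [rng.gauss(0, 1) for _ in range(dim)]

y, visited = fff_forward(x, w_in, w_out, depth)
print(n_nodes, visited, f"{visited / n_nodes:.2%}")  # 4095 12 0.29%
```

So the work per layer grows with the depth of the tree rather than the number of neurons, which is why the fraction of neurons used is so small. Whether that turns into a real wall-clock speedup depends on having efficient kernels for this kind of conditional execution.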

[–] Acceptable_Can5509@alien.top 1 points 11 months ago (1 children)
[–] lakolda@alien.top 1 points 11 months ago

GPT-4 Turbo only speeds things up by 3x…