I say:
- It has a performance hit, but it remains to be seen if going with a much larger model can compensate for that.
- The model needs to be trained from scratch; apparently you cannot finetune an existing model for this...
Community to discuss Llama, the family of large language models created by Meta AI.
I say:
" we provide high-level CPU code achieving 78x speedup over the optimized baseline feedforward implementation"
Big if true: we wouldn't need to buy 3090 cards anymore just to get sufficient memory, buying more RAM would suffice.
Huge, if true.
you might want to read here: https://www.reddit.com/r/MachineLearning/comments/1815a05/r_exponentially_faster_language_modelling/
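For context on where that claimed speedup comes from: as I understand the paper, the fast feedforward layer replaces a dense feedforward block with a binary tree of "node" neurons, so each input is routed down the tree and only a logarithmic number of neurons (plus one leaf transform) is ever evaluated. This is a rough sketch of the idea, not the authors' code; all names and shapes here are my own assumptions:

```python
import numpy as np

def fff_forward(x, node_w, node_b, leaf_w, leaf_b, depth):
    """Sketch of a fast feedforward pass: route x down a binary tree
    of node neurons, then apply only the selected leaf's weights.
    Evaluates `depth` node neurons instead of all 2**depth leaves."""
    node = 0
    for _ in range(depth):
        # the sign of the node neuron's activation picks the child
        go_right = (x @ node_w[node] + node_b[node]) > 0
        node = 2 * node + (2 if go_right else 1)
    leaf = node - (2 ** depth - 1)  # index among the leaves
    return x @ leaf_w[leaf] + leaf_b[leaf]

# usage with random weights (hypothetical sizes, just for illustration)
rng = np.random.default_rng(0)
d, d_out, depth = 16, 16, 3          # 7 internal nodes, 8 leaves
node_w = rng.normal(size=(2**depth - 1, d))
node_b = rng.normal(size=2**depth - 1)
leaf_w = rng.normal(size=(2**depth, d, d_out))
leaf_b = rng.normal(size=(2**depth, d_out))
x = rng.normal(size=d)
y = fff_forward(x, node_w, node_b, leaf_w, leaf_b, depth)
```

For a tree of depth 3 this touches 3 node neurons and 1 leaf out of 8, which is where the exponential saving comes from; the catch (as noted above) is that the hard left/right routing has to be learned from scratch, since a pretrained dense layer has no such tree structure.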