this post was submitted on 30 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
This technique is actually really useful for batch processing.
E.g. if you run 100 generations and reuse each layer while it is loaded, the total time is far shorter than running all 100 generations serially, since each layer only has to be loaded once instead of once per generation.
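A minimal sketch of the idea (all names and shapes here are illustrative, not a real inference API): instead of reloading every layer for each generation, loop over layers on the outside and push the whole batch through each layer while its weights are resident, so the expensive load happens once per layer rather than once per layer per generation.

```python
# Hypothetical sketch of amortizing layer-load cost over a batch.
# `load_layer`, the layer count, and the shapes are all made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
NUM_LAYERS, BATCH, DIM = 4, 100, 8  # 100 "generations" in one batch

def load_layer(i):
    # Stand-in for the expensive step: pulling one layer's weights
    # from disk (or host RAM) into the compute device.
    return rng.standard_normal((DIM, DIM)) * 0.1

# Serial order would be: for each of the 100 generations, load all layers
# -> NUM_LAYERS * BATCH loads (400 here).
# Batched order: load each layer once, apply it to the whole batch
# -> NUM_LAYERS loads (4 here).
h = rng.standard_normal((BATCH, DIM))  # batch of inputs
loads = 0
for i in range(NUM_LAYERS):
    w = load_layer(i)   # loaded once per layer...
    loads += 1
    h = np.tanh(h @ w)  # ...then reused for all 100 inputs at once
print(loads)  # -> 4 layer loads instead of 400
```

The compute per generation is unchanged; the win comes purely from not repeating the load, which dominates when weights stream from disk.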