this post was submitted on 28 Nov 2023
        
      
      LocalLLaMA
I'll reply to myself!
It's not just about GPU expense. You need a small team of ML data scientists. You need access to (or a way to scrape/generate) a mind-bogglingly broad dataset. You need to clean, normalize, and prepare the dataset. All of this takes a huge amount of expertise, time and money. I wouldn't be at all surprised if the auxiliary costs surpassed the GPU rental cost.
So the main answer to your question "Why is no one releasing 70b models?" is: it's really, really, really expensive. Other parts of the answer are: lack of expertise, difficulty of generating a good dataset, and probably a hundred things I haven't thought of.
But mainly it just comes down to cost. I bet you wouldn't see any change from $5,000,000 if you wanted to train your own new 70B base model from scratch.
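For anyone who wants to sanity-check the GPU part of that number, here's a rough back-of-envelope sketch. It uses the common rule of thumb that training compute is about 6 × parameters × tokens FLOPs. Everything else is my assumption, not a quote from anyone: 2 trillion training tokens, an A100-class GPU at 312 TFLOPS peak BF16, 40% real-world utilization, and $2 per GPU-hour rental. The function name `gpu_rental_cost` is just made up for this example. Swap in your own numbers.

```python
def gpu_rental_cost(params, tokens, peak_flops_per_gpu, utilization, usd_per_gpu_hour):
    """Rough GPU-hours and rental cost for one pretraining run.

    Uses the ~6 * params * tokens approximation for total training FLOPs.
    Ignores failed runs, experiments, storage, networking, and staff.
    """
    total_flops = 6 * params * tokens
    # Sustained throughput per GPU is only a fraction of the peak figure.
    sustained_flops_per_gpu = peak_flops_per_gpu * utilization
    gpu_hours = total_flops / sustained_flops_per_gpu / 3600
    return gpu_hours, gpu_hours * usd_per_gpu_hour


# All inputs below are illustrative assumptions, not measured values.
gpu_hours, cost = gpu_rental_cost(
    params=70e9,                 # 70B-parameter model
    tokens=2e12,                 # assumed 2T training tokens
    peak_flops_per_gpu=312e12,   # assumed A100-class peak BF16 throughput
    utilization=0.4,             # assumed 40% of peak actually achieved
    usd_per_gpu_hour=2.0,        # assumed rental price
)
print(f"{gpu_hours:,.0f} GPU-hours, about ${cost:,.0f}")
```

With those assumptions it comes out to roughly 1.9 million GPU-hours and about $3.7 million for a single clean run, before you count any of the people, data work, or failed experiments mentioned above.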