Hey everyone, I work as a scientist at Microsoft Research and I spend a lot of my time training LLMs. I am super excited about the progress of open-source LLMs, and I am happy to share some thoughts and ideas on how to scale them to GPT-4 level.

You can find my video lecture on this here:

https://youtu.be/gWJj-6udLWU?si=AqiJ-PpTQMBJAAm3

I plan on sharing a lot more ideas and code tutorials on building foundation models, instruction fine-tuning, and alignment.

[–] klenen@alien.top 1 points 11 months ago
[–] xadiant@alien.top 1 points 11 months ago

Really cool, I will check the video out. Since we have found an actually qualified person, let me ask a few layman questions. I hope you have time to answer them!

First, sampling methods. Most of them look simple, but we still don't really know how to tune them. Do you think novel sampling methods, or specific combinations of existing ones, could improve output quality by a lot?
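To make the "combinations" part concrete, here is a minimal PyTorch sketch (my own illustration, not from the lecture) of how temperature, top-k, and top-p are typically chained when sampling the next token; the default values are arbitrary placeholders:

```python
import torch

def sample_next_token(logits, temperature=0.8, top_k=50, top_p=0.95):
    """Chain temperature scaling, top-k, and top-p (nucleus) filtering,
    then sample one token id. `logits` is a 1-D tensor over the vocab."""
    # Temperature: <1 sharpens the distribution, >1 flattens it.
    logits = logits / temperature

    # Top-k: drop everything below the k-th highest logit.
    if top_k > 0:
        kth_best = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth_best, float("-inf"))

    # Top-p: keep the smallest prefix of tokens (sorted by probability)
    # whose cumulative probability reaches top_p; drop the rest.
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    probs = torch.softmax(sorted_logits, dim=-1)
    # Remove tokens whose preceding cumulative mass already covers top_p.
    remove = probs.cumsum(dim=-1) - probs >= top_p
    sorted_logits = sorted_logits.masked_fill(remove, float("-inf"))
    logits = torch.full_like(logits, float("-inf")).scatter(0, sorted_idx, sorted_logits)

    return torch.multinomial(torch.softmax(logits, dim=-1), num_samples=1)
```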

For instance, beam search. Does beam search provide a linear improvement in quality as you increase the beam width, or not?
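One cheap way to probe this yourself is to sweep the beam width with the Hugging Face transformers generate API and compare the outputs; this is just my sketch, and "gpt2" is only a placeholder model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" is just a placeholder; any causal LM on the Hub would do.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Scaling open-source LLMs requires", return_tensors="pt")

# Sweep the beam width and eyeball the outputs; in practice the gains
# tend to flatten out (or even regress) rather than grow linearly.
for num_beams in (1, 2, 4, 8):
    out = model.generate(
        **inputs,
        do_sample=False,   # pure beam search, no sampling
        num_beams=num_beams,
        max_new_tokens=40,
    )
    print(f"beams={num_beams}: {tokenizer.decode(out[0], skip_special_tokens=True)}")
```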

Do you think the ideal values for temperature, top_k, and top_p are context-dependent, model-dependent, or both?
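To make the "context-dependent" half of that question concrete, here is a hypothetical sketch: the preset names and numbers are entirely made up, but they show how one might bind different sampling settings to different tasks for the same model, assuming a transformers-style generate API:

```python
# Purely illustrative presets (made-up numbers): the same model might want
# different sampling settings depending on the task or context.
SAMPLING_PRESETS = {
    "code":     {"temperature": 0.2, "top_k": 20,  "top_p": 0.95},  # near-deterministic
    "qa":       {"temperature": 0.6, "top_k": 40,  "top_p": 0.90},
    "creative": {"temperature": 1.0, "top_k": 100, "top_p": 0.95},  # more diverse
}

def generate_for_task(model, tokenizer, prompt, task="qa"):
    """Generate with a task-specific sampling preset."""
    inputs = tokenizer(prompt, return_tensors="pt")
    return model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=64,
        **SAMPLING_PRESETS[task],
    )
```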