Honestly, I mostly just follow Hugging Face's blog and articles. I know there are the latest fancy attention improvements, alternatives to RLHF, GPU optimizations, and so on, but I'm not going to implement those myself. If it's not in Hugging Face's ecosystem, I most likely wouldn't use it in my daily work or production code anyway.
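For what it's worth, a lot of those improvements do eventually surface in the ecosystem as a one-line flag rather than something you implement yourself (RLHF alternatives like DPO, for instance, ship in the `trl` library). A minimal sketch of the attention case, assuming a recent `transformers` version where the `attn_implementation` argument exists, plus `flash-attn` and `accelerate` installed; the model name is just a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; any causal LM with FlashAttention-2 support works here.
model_name = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Newer attention kernels are selected via a keyword argument instead of
# custom code (older transformers releases used a different flag for this).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    attn_implementation="flash_attention_2",  # needs the flash-attn package and a supported GPU
    torch_dtype="auto",
    device_map="auto",  # needs the accelerate package
)
```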
Hugging Face is for sure a godsend, even though I'm still at a semi-loss with their API. It has changed so much, and there is so much more of it now, that it has become a little confusing. Nothing a little work can't fix! But that raises a question for me: how do these people manage to get every new model out so fast?
Yeah, reading all their latest releases already takes me a lot of time, so I mostly stop there. They also don't have much documentation for their latest stuff, so it takes a bit of time to figure things out. I think their packages will settle into a more stable state after a year or two, once the NLP hype cools down a bit.