this post was submitted on 09 Nov 2023

Machine Learning

I'm a data engineer who somehow ended up as a software developer. Many of my friends are now working with the OpenAI API to add generative capabilities to their products, but they lack A LOT of context when it comes to how LLMs actually work.

This is why I started writing popular-science-style articles that unpack AI concepts for software developers working on real-world applications. It started kind of slow; honestly, I wrote a bit too "brainy" for them, but now I've found a voice that resonates with this audience much better, and I want to ramp up my writing cadence.

I would love to hear your thoughts on what concepts I should write about next.
What gets you excited, and what do you find hard to explain to someone with a different background?

[–] rejectedlesbian@alien.top 1 points 10 months ago (4 children)

Optimizers. OMG, no one has really touched optimizers in ages.
We basically figured it's Adam/SGD and there hasn't really been any improvement on them.

I tried finding an improvement to it myself for a few months but failed miserably.
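For readers who haven't seen the two update rules mentioned here side by side, here's a minimal NumPy sketch of SGD and Adam. The function names are just for illustration; the hyperparameter defaults follow the commonly used Adam values.

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    # Plain SGD: take a fixed-size step against the gradient.
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam keeps running averages of the gradient (m) and of its
    # elementwise square (v).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    # Bias-correct the zero-initialized averages (t is the step count,
    # starting at 1).
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # Per-parameter step size: large where recent gradients were small.
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

The key difference is that Adam rescales each parameter's step by the recent gradient magnitude, which is a big part of why it's so hard to beat as a robust default.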

[–] satireplusplus@alien.top 1 points 10 months ago

Because it's super hard to build something that works better than Adam across many tasks. There's probably no shortage of people trying to come up with something better.

[–] koolaidman123@alien.top 1 points 10 months ago

Nothing beats AdamW + compute. Plus, with the current data-centric approach, everything kind of converges at scale.
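The AdamW tweak is small: weight decay is "decoupled", applied directly to the weights each step rather than folded into the gradient. A sketch reusing Adam-style moment updates (names and defaults are illustrative):

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999,
               eps=1e-8, wd=0.01):
    # Same moment updates as Adam...
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # ...but the weight decay is decoupled: it shrinks the weights
    # directly instead of being added to the gradient, so it isn't
    # rescaled by the adaptive sqrt(v_hat) term.
    w = w - lr * wd * w
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

With L2 regularization folded into the gradient, Adam's adaptive scaling distorts the effective decay per parameter; decoupling keeps it uniform, which is the whole point of AdamW.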

[–] charlesGodman@alien.top 1 points 10 months ago

There has been LOADS of research on deep learning optimisation in recent years. TL;DR, though: nothing beats Adam.

[–] currentscurrents@alien.top 1 points 10 months ago

Learned optimizers look promising - training a neural network to train neural networks.

Unfortunately they're hard to train and nobody has gotten them to really work yet. The two main approaches are meta-training or reinforcement learning, but meta-training is very expensive and RL has all the usual pitfalls of RL.
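To make the meta-training idea concrete, here's a deliberately tiny illustration. The "learned optimizer" is just a single learnable scalar step size, and the meta-gradient is estimated by finite differences instead of backpropagating through the unroll; real learned optimizers replace the scalar with a small neural network and are far harder to train, as noted above. All names (`inner_loss`, `unroll`, `meta_train`) are made up for this sketch.

```python
import numpy as np

def inner_loss(w):
    # Toy inner task the optimizer must solve: minimize ||w||^2.
    return float((w ** 2).sum())

def unroll(theta, w0, steps=10):
    # Run the "learned" update rule for a few inner steps and sum the
    # losses along the trajectory -- the usual meta-objective.
    w, total = w0.copy(), 0.0
    for _ in range(steps):
        grad = 2 * w            # gradient of ||w||^2
        w = w - theta * grad    # learned rule: one learnable step size
        total += inner_loss(w)
    return total

def meta_train(theta=0.01, meta_lr=1e-4, meta_steps=200, eps=1e-4):
    # Meta-training loop: nudge theta to shrink the unrolled loss on
    # freshly sampled tasks. Finite differences stand in for backprop
    # through the unroll, which is what makes the real thing expensive.
    rng = np.random.default_rng(0)
    for _ in range(meta_steps):
        w0 = rng.normal(size=3)
        g = (unroll(theta + eps, w0) - unroll(theta - eps, w0)) / (2 * eps)
        theta -= meta_lr * g
    return theta
```

Even in this toy, the two pain points show up: the meta-gradient requires evaluating whole inner trajectories, and its scale depends strongly on the unroll length, which is exactly what makes meta-training expensive and unstable at realistic scale.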