lakolda

joined 1 year ago
[–] lakolda@alien.top 1 points 11 months ago

Ever heard of the term “might”?

[–] lakolda@alien.top 1 points 11 months ago (2 children)

Some of the content also seems to allude to what Q* might be…

[–] lakolda@alien.top 1 points 11 months ago

GPT-4 Turbo only speeds things up by 3x…

[–] lakolda@alien.top 1 points 11 months ago

This isn’t compared against the 13B version of LLaVA. I’d be curious to see that.

[–] lakolda@alien.top 1 points 1 year ago (1 children)

In-context learning allows the model to learn new skills, to a limited degree. See the sketch below.
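
For illustration, a minimal sketch of what in-context learning looks like in practice: the “skill” (here word reversal, an arbitrary toy task I made up) is conveyed entirely through examples in the prompt, with no weight updates. The prompt format and examples are invented for demonstration, not taken from any particular model.

```python
# Minimal sketch of in-context learning: the task is taught purely
# through examples placed in the prompt; the model's weights never change.
# The examples and task below are hypothetical, chosen for illustration.

few_shot_examples = [
    ("cat", "tac"),
    ("house", "esuoh"),
    ("model", "ledom"),
]

def build_prompt(query: str) -> str:
    """Assemble a few-shot prompt that demonstrates the skill by example."""
    lines = ["Reverse each word:"]
    for word, reversed_word in few_shot_examples:
        lines.append(f"Input: {word} -> Output: {reversed_word}")
    lines.append(f"Input: {query} -> Output:")
    return "\n".join(lines)

# This string would be sent to any instruction-following LLM;
# the model has to infer the pattern from the examples alone.
print(build_prompt("token"))
```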

[–] lakolda@alien.top 1 points 1 year ago (2 children)

GPT-3.5 Turbo apparently has 20 billion parameters, significantly fewer than the previous best Phind models. Given how bad GPT-3.5 is, I think it’s more likely that Phind just fine-tuned some other base model on GPT-3.5 outputs.

[–] lakolda@alien.top 1 points 1 year ago

The original LLMZip paper mainly focused on text compression. A later work (I forget the name) used an LLM trained on byte tokens, which allowed it to compress not just text but any file format. I think it may have been Google who published that particular paper… Very impressive, though.
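
For illustration, a minimal sketch of why a byte-level predictive model can compress any file format: an entropy coder such as arithmetic coding spends roughly -log2 P(byte) bits per byte, so a better predictor yields a shorter file. This is not LLMZip’s actual code; a simple adaptive order-0 byte model stands in for the LLM here.

```python
import math

def ideal_compressed_bits(data: bytes) -> float:
    """Sum of -log2 P(byte | model so far): the length an entropy coder
    (e.g. arithmetic coding) could approach under this predictive model.
    A stronger model (like an LLM over bytes) would assign higher
    probabilities and thus produce a shorter code."""
    counts = [1] * 256          # Laplace-smoothed adaptive byte counts
    total = 256
    bits = 0.0
    for b in data:
        bits += -math.log2(counts[b] / total)  # code length of this byte
        counts[b] += 1                          # update model after coding
        total += 1
    return bits

payload = b"any file format works, since the model sees raw bytes" * 20
print(f"~{ideal_compressed_bits(payload) / 8:.0f} bytes vs {len(payload)} raw")
```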

[–] lakolda@alien.top 1 points 1 year ago (2 children)

LLMZip achieves SOTA compression by a large margin.