maizeq

joined 11 months ago
[–] maizeq@alien.top 1 points 10 months ago

Much more likely: the story is apocryphal, or at least highly exaggerated.

[–] maizeq@alien.top 1 points 10 months ago (2 children)

What is their differentiating factor, or are they planning on being another one of maybe a hundred or so companies copy-pasting the same basic architecture and the same basic training data?

I think the proliferation of smaller LLMs is wonderful, but none have really made a dent in the capabilities of the best closed-source models (mostly OpenAI's), which is largely due to model size. Even beyond model size, there seems to be no real innovation happening in architecture, design, or UX between Falcon, Mistral, Llama, Yi, etc.

LLMs seem like a black hole in VC space, gambling at the level of billionaires. It reminds me of the talk given by Warren Buffett years back on how difficult it is to predict winners even when you know a technology is inevitable:

"There were two thousand auto companies: the most important invention, probably, of the first half of the twentieth century. It had an enormous impact on people’s lives. If you had seen at the time of the first cars how this country would develop in connection with autos, you would have said, ‘This is the place I must be.’ But of the two thousand companies, as of a few years ago, only three car companies survived. And, at one time or another, all three were selling for less than book value, which is the amount of money that had been put into the companies and left there. So autos had an enormous impact on America, but in the opposite direction on investors."

And also with respect to airline companies:

"Now the other great invention of the first half of the century was the airplane. In this period from 1919 to 1939, there were about two hundred companies. Imagine if you could have seen the future of the airline industry back there at Kitty Hawk. You would have seen a world undreamed of. But assume you had the insight, and you saw all of these people wishing to fly and to visit their relatives or run away from their relatives or whatever you do in an airplane, and you decided this was the place to be.

As of a couple of years ago, there had been zero money made from the aggregate of all stock investments in the airline industry in history."

Taken from The Snowball by Alice Schroeder

[–] maizeq@alien.top 1 points 10 months ago

The decrease in quality in the new ChatGPT 4.0 is actually making me Google more once again.

[–] maizeq@alien.top 1 points 10 months ago

Banal and unhelpful response.

[–] maizeq@alien.top 1 points 10 months ago

Their use of monolingual and multilingual to describe the same dataset is unusual.

I get that they're probably trying to say "monolingual at the document-level", but the back and forth is quite confusing.

E.g.

"We introduce MADLAD-400, a manually audited, general domain 3T token monolingual dataset"

"We use both supervised parallel data with a machine translation objective and the monolingual MADLAD-400 dataset"

"Through MADLAD-400, we introduce a highly multilingual, general web-domain, document-level text dataset"

Unless I am missing something obvious, these are either typos or poor wording decisions.

[–] maizeq@alien.top 1 points 10 months ago

This is not actually the first diffusion-based LLM. See SUNDAE.