It's overfitting.
Overfitting, by definition, happens when your generalization error goes up.
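A minimal sketch of what that looks like in practice (the loss curves below are made up for illustration): the training error keeps falling while the validation error, our proxy for generalization error, turns around and rises.

```python
# Toy illustration: overfitting = validation (generalization) error starts rising
# even though training error keeps falling. The loss values below are made up.
train_loss = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18, 0.12, 0.08]
val_loss   = [1.05, 0.80, 0.62, 0.55, 0.53, 0.56, 0.61, 0.68]

best = min(range(len(val_loss)), key=val_loss.__getitem__)  # best validation epoch
for epoch in range(best + 1, len(val_loss)):
    if val_loss[epoch] > val_loss[best] and train_loss[epoch] < train_loss[best]:
        print(f"epoch {epoch}: training error still falling, generalization error going up")
```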
It's a new conjecture, all right. But it's clearly false.
Consider n=4. Then p=5, q=3, k=1. But 5+1 and 3+1 are not primes.
All you number theorists out there, I think your jobs are safe for the time being.
"I myself have posted here constantly."

But the point you were trying to prove was that the discussions were "constant". How does picking your own threads spanning 2 months support it at all?

The OP didn't say that the discussions were completely gone. Yes, there are some, but pretty thin and usually glib. I don't count "Wow! This is exciting. I'll have to take a look at this awesome new paper!" as discussion. A bot harvesting upvotes could post this.
Fortnightly. Finally got a chance to use this word :-) 4 links spanning 2 months.
But even in these picks, take a look at the first one, for example. 10 comments. Only one of them suggests that the commentator looked at the paper itself.
According to the scaling laws, the loss/error is approximated as
w0 + w1 * pow(num_params, -w2) + w3 * pow(num_tokens, -w4)
Bill wrote before that he'd been meeting with the OpenAI team since 2016, so he's probably pretty knowledgeable about these things. He might be referring to the fact that, after a while, you see sharply diminishing returns from increasing num_params: in the limit, the corresponding term disappears, but the others do not.
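A quick toy version of that formula (the constants below are arbitrary placeholders, not fitted values from any paper), just to show that the num_params term vanishes in the limit while the num_tokens term and the constant remain:

```python
# Toy version of the scaling-law approximation quoted above.
# w0..w4 are arbitrary placeholders, not fitted constants from any paper.
def approx_loss(num_params, num_tokens, w0=1.7, w1=400.0, w2=0.3, w3=400.0, w4=0.3):
    return w0 + w1 * num_params ** -w2 + w3 * num_tokens ** -w4

for n in (1e9, 1e11, 1e13, 1e15):
    # Data size held fixed: only the parameter term keeps shrinking, so the
    # loss flattens out at roughly w0 + w3 * num_tokens ** -w4.
    print(f"{n:.0e} params -> approx loss {approx_loss(n, num_tokens=1e12):.3f}")
```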
"a messed-up experiment or a poorly written/plainly incorrect paper that slips through the review system could be your end"
Is that true? If your paper is totally wrong, publish a retraction, do not include the paper in your "list of publications", and move on.
Technical discussion seems to be dead in r/MachineLearning, but I'll ask anyway: Isn't it strange that in Figure 3 of the first paper, layer 1 has a blurry diagonal, while the rest of them are sharp? I would have expected the opposite: the lowest layer to be very local, and higher layers to be more global.
The claimed 117.83x speedup might be somewhat misleading.
If you compare the best implementation of FFF on CUDA to the best implementation of FF on CUDA, then the speed-up they got is 3.15x:
See page 5, "Further comparisons": "On GPU, the PyTorch BMM implementation of FFF delivers a 3.15x speedup over the fastest (Native fused) implementation of FF."
The 40x that u/lexected mentioned seems to apply only when comparing to an apparently much slower FF version.
It's a pretty cool paper regardless, as far as I can tell from skimming it. But it could benefit from stating more clearly what has been achieved.
"has 4095 neurons but selectively uses only 12 (0.03%) for inference"

There's an extra 0 in there: 12/4095 is about 0.3%, not 0.03%.
So the implication here is that the CEO knew about the breakthrough, but hid it from the board?
MSFT did experience a 20% climb over the last month. Maybe it was due to this news leaking out?
I think DistilBERT needs to be in Table 2, since it's their most direct competitor: it trades off accuracy for speed, and requires extra training effort, like their approach.
Still, if they are about 20x faster than DistilBERT using cuBLAS, that's pretty amazing.
Can't OpenAI simply check the output for sharing long substrings with the training data (perhaps probabilistically)?
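One way that could work, sketched below (this is just an illustration of the idea, not anything OpenAI is known to do; the 20-word window and the exact in-memory index are arbitrary choices, and a Bloom filter would make the check probabilistic and much cheaper in memory): fingerprint every fixed-length word window of the training data, then flag generated text whose windows collide with the index.

```python
# Sketch: flag model output that shares long word n-grams with the training data.
# WINDOW is an arbitrary choice; a real system would tune it and likely use a
# Bloom filter (probabilistic, fixed memory) instead of an exact Python set.
import hashlib

WINDOW = 20  # flag any run of 20 consecutive words seen verbatim in training data

def windows(text, n=WINDOW):
    words = text.split()
    for i in range(len(words) - n + 1):
        yield " ".join(words[i:i + n])

def fingerprint(chunk):
    return hashlib.sha1(chunk.encode("utf-8")).digest()[:8]  # 8-byte hash per window

def build_index(training_texts):
    return {fingerprint(w) for doc in training_texts for w in windows(doc)}

def overlaps(output, index):
    return [w for w in windows(output) if fingerprint(w) in index]

# Usage: index = build_index(corpus); print(overlaps(model_output, index))
```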