this post was submitted on 13 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

I see a fair amount of laughing -- by no means an overabundance, but enough to "trigger" me -- at some of the "obvious" research that gets posted here.

One example from a week or two ago that's been rattling around in my head was someone saying in reply to a paper (paraphrased):

That's just RAG with extra steps.

Exactly. But what were those extra steps attempting? Did they make RAG better?

Yes. Great, let's continue pulling the thread.

No. Ok, let's let others know that pulling this thread in this direction has been tried, and they should take a different approach; maybe it can be pulled in a different direction.

We are at the cusp of a shift in our technical and research cultures. Let's not shame the people sharing their work with the community.

[–] maxjprime@alien.top 1 points 10 months ago (2 children)

I agree...but as a new "tinkerer" in this space, entirely unfamiliar with how academic research works, I've always been curious: why is this stuff always published in proprietary PDF format on some third-party research site or journal? Why don't more people just post their findings on their own websites in an open standard like HTML?

[–] RonLazer@alien.top 1 points 10 months ago

Because real research is supposed to be peer reviewed, and journals offer peer review by panels of experts. arXiv was supposed to circumvent that by allowing review by an open group of peers, but the cycle for new research is so short nowadays that it basically means "review by Twitter."

[–] stannenb@alien.top 1 points 10 months ago

Scientific publishing is done the way it is for a number of reasons; important ones have already been noted by other commenters.

But one aspect worth dwelling on is peer review. A peer-reviewed journal article has been the gold standard of scientific research for quite some time. This involves submitting the article to the journal and letting the journal anonymously recruit other experts to critique the paper. The goal here is to strengthen the research and screen for mistakes (and fraud, though peer review assumes honesty on the part of the researchers).

This process has been challenged recently because, in the end, it doesn't really work to create a uniform gold standard of research. Large fields of research can't actually replicate the results of peer-reviewed studies. Large, systematic frauds go undetected. The process is agonizingly slow in cases of emergency like COVID. And "idiosyncratic" reviewers can make the process worse. (See: Reviewer 2.)

On top of that mess is money. You often pay to have your article published, your institution's library has to pay for a subscription, and your work is locked behind a paywall. This is particularly galling when the government has paid for the research in the first place. But the big publishers retain (too much) control.

The last problem is discoverability. Publish a blog post and who notices? Publish in a leading, reputable journal and you're guaranteed eyes on your work.

None of this is particularly good for either researchers or science, and it's interesting to watch academics experiment with alternate ways of codifying their discoveries. The insistence on "open access" - no paywalls - is one way. Preprint servers like arXiv let researchers meaningfully distribute papers intended for peer review before the review happens. And fields like LLMs are moving so fast that no disciplined publishing process could keep up, which is helping to further disrupt the system.