this post was submitted on 13 Sep 2023
58 points (100.0% liked)

Technology

Avram Piltch is the editor in chief of Tom's Hardware, and he's written a thoroughly researched article breaking down the promises and failures of LLM AIs.

[–] lily33@lemm.ee 27 points 2 years ago* (last edited 2 years ago) (20 children)

They have the right to ingest data, not because they're “just learning like a human would,” but because I, a human, have the right to grab all the data that's available on the public internet and process it however I want, including by training statistical models. The only thing I don't have the right to do is distribute it (or works that resemble it too closely).

If you can actually show me people who are extracting books from LLMs and reading them that way, then I'd agree that's piracy. But that would be such a terrible reading experience, even if it worked, that I can't see it actually happening.

[–] RickRussell_CA@beehaw.org 23 points 2 years ago* (last edited 2 years ago) (16 children)

Two things:

  1. Many of these LLMs -- perhaps all of them -- have been trained on datasets that include books that were absolutely NOT released into the public domain.

  2. Ethically, we would ask any author who parrots the work of others to cite their original references. That rarely happens with AI language models, and when they do provide citations, they often get them wrong.

[–] RandoCalrandian@kbin.social 4 points 2 years ago (6 children)

Is there a meaningful difference between reproducing the work and giving a summary? Because I'll absolutely be using AI to filter all the editorial garbage out of the news: a filter set up and trained by me to surface what is meaningful to me, stripped of all advertising, sponsorships, and detectable bias.

[–] Tarte@kbin.social 5 points 2 years ago* (last edited 2 years ago)

I have yet to find an LLM that can summarize a text without errors. I already mentioned this in another post a few days back, but Google's new search preview is driving me mad with all the hidden factual errors. They make me click through, only to realize that the LLM told me what I wanted to find, not what is actually there (wrong names, wrong dates, etc.).

I greatly prefer the old excerpt summaries over the new imaginary ones (they're currently A/B testing both).
