ell1e

joined 1 week ago
[–] ell1e@leminal.space 5 points 1 week ago* (last edited 1 week ago)

sadly, data that is too centralized and easily available will always be abused at some point. the recent US developments are showing this nicely.

[–] ell1e@leminal.space 13 points 1 week ago (2 children)

you deserve a trophy 🏆 🥰

[–] ell1e@leminal.space 9 points 1 week ago* (last edited 1 week ago) (2 children)

not that the recent governments care, they want to centralize most of the data of citizens now with pretty poor protections in a lot of cases. sads

[–] ell1e@leminal.space 5 points 1 week ago (2 children)

i may or may not be german as well 🫣

[–] ell1e@leminal.space 7 points 1 week ago (2 children)

interestingly, most commenters here don't seem to be on .world 🤔

[–] ell1e@leminal.space 29 points 1 week ago* (last edited 1 week ago) (5 children)

surprise germans 🫨

[–] ell1e@leminal.space 1 points 1 week ago* (last edited 1 week ago) (2 children)

But the article later does back it up: "Although Cloudflare singled out Google, other search engines that view AI search features as part of their search products also use the same bots for training as they do for search indexing."

In any case, I'm okay with admitting neither you nor I can look inside Google to see what they're doing. But the claims are out there, I didn't make them up, whether they're true or not. Thank you for the certainly interesting Google crawler info link.

[–] ell1e@leminal.space 2 points 1 week ago* (last edited 1 week ago) (4 children)

You look up what Googlebot does. No AI.

The page seems written to suggest that, but it doesn't explicitly say the other bots can't feed into some other sort of AI training. It would be in Google's interest to mislead the users here.

Edit: I found a quote where it says Googlebot does both in one: "Google-Extended doesn't have a separate HTTP request user agent string. Crawling is done with existing Google user agent [...]" and I guess Cloudflare doesn't trust Google to abide by the access controls. That seems sensible to me. Edit 2: What exactly the CEO believes was perhaps rightfully disputed below, it was just my guess.

[–] ell1e@leminal.space 2 points 1 week ago* (last edited 1 week ago) (6 children)

Nothing on this page seems to contradict the article. But if I simply missed the part that does, I'd be happy to learn.

[–] ell1e@leminal.space 3 points 1 week ago* (last edited 1 week ago) (8 children)

So what's the quote from the documentation that backs up your claim? The line "perform other product specific crawls" seems extremely vague by design.

[–] ell1e@leminal.space 9 points 1 week ago* (last edited 1 week ago) (10 children)