this post was submitted on 27 Apr 2026
149 points (96.3% liked)

[–] e8d79@discuss.tchncs.de 9 points 11 hours ago (2 children)

So how would I create such an "Open Source" model? They don't share the data used to create them, do they? Let's not even get started on how much computing power I would need to train one of those things. These self-hosted models solve nothing except some data privacy issues. Sure, you no longer send all your code to a shady AI company, but you are still 100% dependent on them sharing their models.

[–] The_Decryptor@aussie.zone 6 points 10 hours ago* (last edited 10 hours ago) (1 children)

So how would I create such an “Open Source” model? They don’t share the data used to create them do they?

No, and going by the OSI definition of "open source AI" they don't have to. The definition acknowledges that the training material is often copyrighted and can't be shared.

It's a strange definition of "open source", one where you're not actually allowed to see the source.

[–] Eyekaytee@aussie.zone 3 points 9 hours ago* (last edited 8 hours ago)

The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, training data and methods, is openly accessible and fully documented.

https://ethz.ch/en/news-and-events/eth-news/news/2025/09/press-release-apertus-a-fully-open-transparent-multilingual-language-model.html

There is also a move toward synthetic data and human-trained data, so we will have to see where training data goes, copyright-wise, in the future.