this post was submitted on 23 May 2024
Selfhosted
While this will get you a self-hosted LLM, it is not possible to feed data to it like this. As far as I know, there are two possibilities:
- Take an existing model and fine-tune it on the literature data. The success of this will depend on how much "a lot" means when it comes to the literature.
- Create a model yourself using only your literature data.
Both approaches will require some programming knowledge and an understanding of how an LLM works. Additionally, the unstructured literature data will have to be prepared as some kind of structured data that can be used to train or fine-tune the model.
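To give a rough idea of what that preparation step could look like, here is a minimal sketch that splits raw text into overlapping chunks and writes them out as JSON Lines. Everything in it is an assumption for illustration: the function names, the chunk size and overlap, and the "source"/"text" field names are made up, and the format a real training or fine-tuning tool expects may well be different.

```python
import json
import re


def chunk_text(text, max_words=200, overlap=20):
    """Split raw text into overlapping word-window chunks.

    max_words and overlap are illustrative defaults, not recommendations.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    # Collapse whitespace runs (line breaks, tabs) left over from extraction.
    words = re.sub(r"\s+", " ", text).strip().split(" ")
    if words == [""]:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def to_jsonl(chunks, source):
    """Serialize chunks as JSON Lines, one record per chunk.

    The "source" and "text" keys are hypothetical field names.
    """
    return "\n".join(
        json.dumps({"source": source, "text": chunk}) for chunk in chunks
    )
```

You would call `chunk_text` on the plain text extracted from each document and write the result of `to_jsonl` to a file. Getting clean plain text out of PDFs in the first place is a separate problem that this sketch does not cover.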
I'm just a CS student, so not an expert in this regard ;)
Thx for this comment.
My main drive for self hosting is to escape data harvesting and arbitrary query limits, and to say, "I did this." I fully expect it to be painful and not very fulfilling...