[–] Capital-Alps5626@alien.top 1 points 11 months ago

Money will be an object, but I did this recently on consumer hardware: Dark zero motherboard, 14900K, 192 GB RAM, 4090, Crucial T700 SSD.


First, I'd like to describe the two possible hardware setups I have for this problem.

Hardware Setup 1: 14900K + RTX 3060 (8GB VRAM) + 192GB RAM

Hardware Setup 2: 12600K + RTX 4090 (24GB VRAM) + 64GB RAM

The performance requirement for this task is modest: it's a batch process, so it doesn't have to run in real time.

The problem at hand is using LLMs to fact-check or "categorize" snippets of text. What the customer says they want is: "summarize this snippet of text and tell me what it is about." If anyone knows which kind of model does that well on a setup like the ones described above, I'll happily take that as the answer.

However, my technical judgement tells me they really want a "hot dog or not hot dog" machine (Silicon Valley reference).

90% of the questions they want to ask of a snippet of text are along the following lines:

"Tell me 'truth' if this text that I pasted above is talking about a middle aged woman with arthritis? If it's talking about a man or an older woman with arthritis then tell me false. If it is not talking about a human being with arthritis, tell me n/a"

The ideal classification: a middle-aged human female (and we're happy to define "middle-aged" in the context) returns true; a human male, or any other mammal, returns false; anything that isn't about a human with arthritis returns n/a. So it's a little more like hot dog / not hot dog / not food.
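To make that concrete, here is a minimal sketch of how I imagine prompting a local model for this three-way label, assuming an OpenAI-compatible local server (llama.cpp's server, text-generation-webui, vLLM, etc.); the URL, port, and model name below are placeholders, not my actual setup:

```python
import requests

PROMPT_TEMPLATE = """You are a strict classifier. Read the text below and answer
with exactly one word: true, false, or n/a.

- true  if the text is about a middle-aged woman with arthritis.
- false if it is about a man, or a woman outside that age range, with arthritis.
- n/a   if it is not about a human being with arthritis at all.

Text:
{snippet}

Answer:"""

def classify(snippet: str) -> str:
    # Any OpenAI-compatible local server works here (llama.cpp server,
    # text-generation-webui, vLLM, ...). URL and model name are placeholders.
    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "local-model",
            "messages": [{"role": "user",
                          "content": PROMPT_TEMPLATE.format(snippet=snippet)}],
            "temperature": 0,  # deterministic: we want a label, not prose
            "max_tokens": 5,   # the answer is a single word
        },
        timeout=120,
    )
    answer = resp.json()["choices"][0]["message"]["content"].strip().lower()
    # Anything the model says that isn't one of the three labels counts as n/a.
    return answer if answer in {"true", "false", "n/a"} else "n/a"
```

Temperature 0 and a tiny max_tokens budget keep the model from rambling; something like llama.cpp's grammar-constrained sampling would be even stricter.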

What would be a good model for this? The context is typically two A4 pages of small-font text.

Today we're using Azure OpenAI and it works very well, but there is a desire to first run a local "hot dog or not" pass so that we don't send random snippets of text to Azure OpenAI.

Think of this as a first line of defense. If it works well, the local LLM setup will also be used for psychiatric and sexual topics, which are prohibited on Azure OpenAI.
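As a sketch of what that first line of defense would look like in code, reusing classify() from above; summarize_with_azure() here is a hypothetical stand-in for our existing Azure OpenAI call, not a real function:

```python
def process_batch(snippets: list[str]) -> list[dict]:
    # Batch gate: only snippets the local model labels "true" ever
    # leave the machine and reach Azure OpenAI.
    results = []
    for snippet in snippets:
        label = classify(snippet)  # local LLM call from the sketch above
        summary = None
        if label == "true":
            summary = summarize_with_azure(snippet)  # hypothetical Azure helper
        results.append({"snippet": snippet, "label": label, "summary": summary})
    return results
```

Since it's a batch job, throughput matters more than latency, so a plain loop like this could just run overnight.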

https://www.reddit.com/r/LocalLLaMA/comments/17yxoxv/local_llm_for_hot_dog_or_not_hot_dog_kind_of_fact/

Would you say the advice in this post applies to my case? I think I'm in the same camp: I don't want to go through hundreds of fine-tuned models; I just want to talk to the model in the way you've described.

Then why do people fine-tune for instruction? Perhaps the real question is: how do you fine-tune a model for instruction following? Is there a document or a set of steps to follow?
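From what I've gathered so far, "instruction fine-tuning" usually means supervised fine-tuning on instruction/response pairs. Below is a minimal sketch with Hugging Face's trl library; the base model, the toy dataset, and the hyperparameters are placeholders, and the exact SFTTrainer arguments shift between trl versions, so treat this as an outline rather than a recipe:

```python
# pip install transformers datasets trl
from datasets import Dataset
from trl import SFTTrainer

# Toy instruction/response pairs rendered into one text field per example.
examples = [
    {"text": "### Instruction:\nIs this text about a middle-aged woman with "
             "arthritis? Answer true, false, or n/a.\n\n### Input:\n<snippet>"
             "\n\n### Response:\ntrue"},
    # ... hundreds more labelled examples ...
]
dataset = Dataset.from_list(examples)

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # placeholder base model
    train_dataset=dataset,
    dataset_text_field="text",   # which column holds the rendered prompt
    max_seq_length=2048,         # two A4 pages of text fit comfortably
)
trainer.train()
trainer.save_model("./arthritis-classifier")
```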