Machine Learning

[D] interview question: deploying LLM (alien.top)

submitted 11 months ago by No_Oilve_6577@alien.top to c/machinelearning@academy.garden


I had an interview question about LLMs: how exactly do you deploy an LLM, and what are your considerations in terms of speed, resources, load imbalance, and all that stuff?


[–] pm_me_your_pay_slips@alien.top 1 points 11 months ago

If you're asking that question here, you may not be qualified for the job.

[–] Slightlycritical1@alien.top 1 points 11 months ago

I think you’re looking at the problem wrong by focusing just on the LLM aspect of it. If you’re deploying any type of application, then it will depend on the demand you’re expecting from users and the use cases the application will serve. The failure rate probably matters a lot more for a medical application than for a low-budget game service.

[–] milkteaoppa@alien.top 1 points 11 months ago (1 children)

Call the API using requests.post(..)
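For anyone who wants that spelled out, here is a minimal sketch of what the call might look like with a timeout and a few retries. The endpoint URL, the request payload, and the "text" field in the response are all made up for illustration; a real provider will have its own schema. The `post` parameter is only there so the function can be exercised with a stub instead of a live server.

```python
import time

# Hypothetical endpoint: substitute your provider's real URL.
API_URL = "https://llm.example.com/v1/generate"


def query_llm(prompt, post=None, retries=3, timeout=30.0, backoff=1.0):
    """POST a prompt to a hosted LLM endpoint, retrying on failure."""
    if post is None:
        import requests  # third-party; imported lazily so a stub can be used instead
        post = requests.post

    payload = {"prompt": prompt, "max_tokens": 128}  # assumed request schema
    last_error = None
    for attempt in range(retries):
        try:
            response = post(API_URL, json=payload, timeout=timeout)
            response.raise_for_status()  # turn 4xx/5xx responses into exceptions
            return response.json()["text"]  # assumed response schema
        except Exception as exc:  # deliberately broad for a sketch
            last_error = exc
            if attempt < retries - 1:
                time.sleep(backoff * 2 ** attempt)  # exponential backoff between attempts
    raise RuntimeError(f"LLM request failed after {retries} attempts") from last_error
```

The timeout and retry count are where the "speed" and "failure rate" trade-offs from the question start to show up, even when someone else hosts the model.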

[–] HPLaserJetM140we@alien.top 1 points 11 months ago

Troglodyte