this post was submitted on 23 Nov 2023

Machine Learning

I am looking for an easy way to turn my local Jupyter notebooks into a deployable model without having to write a lot of code or configuration. Thank you.

[–] qalis@alien.top 1 points 11 months ago (2 children)

You shouldn't do that, for multiple reasons (I can elaborate if needed). No matter what you train, your model ends up as a binary file, basically a set of weights. Once you write it to disk, typically with built-in serialization (e.g. pickle for scikit-learn or the .pth format for PyTorch), there are lots of frameworks you can use to deploy it.
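
For concreteness, the serialization step is only a couple of lines; in this minimal sketch, `sk_model` and `model` stand in for whatever you trained:

```python
import pickle

import torch

# scikit-learn: pickle the fitted estimator (joblib.dump is a common alternative)
with open("model.pkl", "wb") as f:
    pickle.dump(sk_model, f)          # sk_model: a fitted scikit-learn estimator (assumed)

# PyTorch: save the state dict (just the weights) rather than the whole module
torch.save(model.state_dict(), "model.pth")   # model: a trained torch.nn.Module (assumed)

# Loading later mirrors the save calls
with open("model.pkl", "rb") as f:
    sk_model = pickle.load(f)
model.load_state_dict(torch.load("model.pth"))
```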

The easiest to use and the most generic one is BentoML, which packages your code into a Docker image and automatically exposes it through REST and gRPC endpoints. It has a lot of integrations and is probably the most popular option. There are also more specialized solutions, e.g. TorchServe.
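
As a rough sketch of what that looks like (this assumes the BentoML 1.x `Service`/runner API and a model previously saved to the local store as "my_model" with `bentoml.sklearn.save_model`; exact imports and decorators differ between BentoML versions):

```python
# service.py -- a minimal BentoML service around a previously saved scikit-learn model
import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

# Load the saved model from the local BentoML store and wrap it in a runner
runner = bentoml.sklearn.get("my_model:latest").to_runner()
svc = bentoml.Service("my_service", runners=[runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(features: np.ndarray) -> np.ndarray:
    # The runner handles batching and worker management behind the REST/gRPC endpoint
    return runner.predict.run(features)
```

From there, `bentoml serve` handles local serving, and `bentoml build` followed by `bentoml containerize` produces the Docker image.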

However, if you care about inference speed, you should also compile or optimize your model for the target architecture and runtime before packaging it behind the API, e.g. with ONNX, Apache TVM, Treelite or NVIDIA TensorRT.
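
To make the ONNX route concrete, a minimal sketch of exporting a PyTorch model and running it with ONNX Runtime could look like this (the input name, feature dimension and `model` variable are illustrative assumptions):

```python
import numpy as np
import onnxruntime as ort
import torch

# Export the trained module to ONNX; the dummy input fixes the expected input shape
dummy_input = torch.randn(1, 16)            # assumed feature dimension
torch.onnx.export(
    model,                                   # trained torch.nn.Module (assumed to exist)
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},    # allow variable batch size
)

# Run inference with ONNX Runtime (CPU here; CUDA/TensorRT execution providers also exist)
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
features = np.random.rand(4, 16).astype(np.float32)
outputs = session.run(None, {"input": features})
print(outputs[0].shape)
```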

[–] ThisIsBartRick@alien.top 1 points 11 months ago (1 children)

This is the right answer. The question screams novice, and his comment "You are entitled to your own opinion" when faced with advice shows that he's not willing to learn.

[–] Medium_Ad_3555@alien.top 1 points 11 months ago

You can label it as a wish in your hologram of reality; I'm not seeking a pro badge or someone's approval.

[–] Medium_Ad_3555@alien.top 1 points 11 months ago

I agree that ONNX would be the right solution if you need to serve 100M inference requests. However, my code is not for that case; it will most likely serve only around 100K requests and will then either be thrown away or completely re-engineered for the next iteration of requirements. Additionally, it's not just about the binary model file: there is pre-processing involved, data needs to be pulled from an internal API, the inference needs to be run, and finally the results need to be post-processed.

I know how to convert it to FastAPI, but I was curious whether there is any solution that lets me parameterize and serve an inference cell's code with low effort.
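
For reference, the FastAPI version of that pipeline stays short; in the sketch below the internal API URL, the pre/post-processing helpers and the pickled model path are all hypothetical stand-ins for the real notebook code:

```python
import pickle

import httpx
import numpy as np
from fastapi import FastAPI

app = FastAPI()

with open("model.pkl", "rb") as f:                       # path is illustrative
    model = pickle.load(f)

INTERNAL_API = "https://internal.example.com/records"    # placeholder URL


def preprocess(record: dict) -> np.ndarray:
    # stand-in for the notebook's feature-engineering cell
    return np.array([record["feature_a"], record["feature_b"]], dtype=np.float32).reshape(1, -1)


def postprocess(prediction: np.ndarray) -> dict:
    return {"label": int(prediction[0])}


@app.get("/predict/{record_id}")
async def predict(record_id: str) -> dict:
    # pull raw data from the internal API, then run the same steps the notebook ran
    async with httpx.AsyncClient() as client:
        record = (await client.get(f"{INTERNAL_API}/{record_id}")).json()
    features = preprocess(record)
    prediction = model.predict(features)
    return postprocess(prediction)
```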