Medium_Ad_3555

joined 11 months ago
[–] Medium_Ad_3555@alien.top 1 points 11 months ago

You can label it as a wish in your hologram of reality; I'm not seeking a pro badge or someone's approval.

[–] Medium_Ad_3555@alien.top 1 points 11 months ago

I agree that ONNX would be the right solution if you needed to serve 100M inference requests. However, my code is not for that case; most likely it will serve only about 100K requests and will then be either thrown away or completely re-engineered for the next iteration of requirements. Additionally, it's not just about the binary model file: there is pre-processing involved, data needs to be pulled from an internal API, inference has to run, and finally the results need to be post-processed.

I know how to convert it to FastAPI, but I was curious whether there is any solution where I can parameterize and serve the code in an inference cell with low effort.
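
For reference, here is roughly what that FastAPI conversion looks like as a minimal sketch; the model file name, the internal API URL, and the feature names below are placeholders, not anything from my actual notebook:

```python
# Minimal sketch of wrapping a notebook inference cell in FastAPI.
# load path, INTERNAL_API_URL, and the feature keys are hypothetical
# stand-ins for whatever the notebook cell actually does.
import pickle

import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the pickled model once at import time instead of per request.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

INTERNAL_API_URL = "https://internal.example.com/features"  # placeholder


class PredictRequest(BaseModel):
    record_id: str  # parameter the notebook cell previously hard-coded


@app.post("/predict")
async def predict(req: PredictRequest):
    # 1. Pull raw data from the internal API (placeholder endpoint).
    async with httpx.AsyncClient() as client:
        resp = await client.get(INTERNAL_API_URL, params={"id": req.record_id})
        raw = resp.json()

    # 2. Pre-process the way the notebook cell did (stand-in logic).
    features = [[raw["feature_a"], raw["feature_b"]]]

    # 3. Run inference.
    prediction = model.predict(features)

    # 4. Post-process and return.
    return {"record_id": req.record_id, "prediction": prediction.tolist()}
```

Even this small sketch shows the overhead I was hoping to avoid: a request schema, startup loading, and an HTTP client just to reproduce what a single notebook cell already does.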

[–] Medium_Ad_3555@alien.top 1 points 11 months ago (3 children)

You are entitled to your own opinion, and to write your ML inference code in plain C.

 

I am looking for an easy way to turn my local Jupyter notebooks into deployable model services without having to write a lot of code or configuration. Thank you.