Honestly.. for $2K or so a pop for 4090s.. I'd have bought the M3 Max MacBook Pro with 128GB RAM. A video the other day showed it could load and work with a 70B LLM just fine, while a single 4090 could not. Dual 4090s with 48GB of VRAM probably can, but will likely be slower.. and having a top-of-the-line laptop that doubles as a pretty beefy AI box for about the same price seems like better money spent.
Well that is interesting. What libraries are they using in Go, do you know? Or are they building their own from scratch? I would imagine there would be some movement to translate Python to Go for some situations. But there are a couple of examples within this thread that show some good use of Go with OpenAI (and a local llama as well).
I am thinking that I could stand up a model locally that exposes the OpenAI API, write some code in Go that calls that API on the local model.. and then swap to the ChatGPT APIs later, or to our own larger model if we decide to run one in the cloud. Something like the sketch below.
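Here's a minimal sketch of that pattern, assuming a local server (llama.cpp's server, Ollama, etc.) that exposes an OpenAI-compatible /v1/chat/completions route. The base URL, model name, and API key are placeholders I made up for illustration; pointing baseURL at https://api.openai.com with a real key should be the only change needed to hit ChatGPT instead:

```go
// Minimal sketch: call an OpenAI-compatible chat completions endpoint from Go.
// Assumes a local server with an OpenAI-style API at baseURL; all values below
// are placeholders for illustration.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

type chatResponse struct {
	Choices []struct {
		Message chatMessage `json:"message"`
	} `json:"choices"`
}

func main() {
	baseURL := "http://localhost:8080" // hypothetical local llama.cpp/Ollama endpoint
	apiKey := "not-needed-locally"     // a real key is required when pointing at api.openai.com

	body, err := json.Marshal(chatRequest{
		Model: "llama-2-70b-chat", // whatever model name the local server reports
		Messages: []chatMessage{
			{Role: "user", Content: "Summarize why Go is nice for backend services."},
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	req, err := http.NewRequest("POST", baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Print the first choice's reply text.
	var out chatResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	if len(out.Choices) > 0 {
		fmt.Println(out.Choices[0].Message.Content)
	}
}
```

There are also community Go clients that wrap this same API (sashabaranov/go-openai is a popular one), so you don't have to hand-roll the HTTP/JSON if you'd rather not.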
WOW.. fantastic. This is the first (only?) response I've seen of the dozens here that says any of this. As a long-time back-end dev, switching to Go for that was amazing. I was FAR faster and more productive than in Java, Python or NodeJS, especially once I built my own set of frameworks that I can quickly copy/paste (or import). I love how fast/easy it is to stand up a Docker wrapper as well. Very tiny images with very fast runtimes.
My point in this post partly stemmed from the need to install so much runtime to use Python, and the speed/memory/etc. concerns of running Python apps. If it is just basic glue code like some have said, then it's not a big deal. But since I know Go and love the language, I want to use it over anything else if I can.
Now that would be great. I was in the process of pricing out 32- and 64-core Threadrippers.. thinking one would work well for my next desktop and could also run AI stuff locally very nicely, until all this "must use GPU" stuff hit me. So it would be fantastic to be able to take advantage of CPU cores not otherwise in use for things like this.
So most of the Python AI coding folks.. aren't writing CUDA/high-end math/algo style code.. they're just using the libraries much like any other SDK?
Yah.. the thing is.. do I have to learn very fancy, advanced Python to do this, or can I use simpler Python that, as you said, makes use of the more optimized libraries? I am wondering how much time it's going to take to figure out Python well enough to be of use, and hence was thinking Go and Rust might work well since I know those well enough.
If it's just calling APIs, even against a locally running model, I can do that in Go just fine. If it's writing advanced AI code in Python, then I'd likely have to spend months or longer learning the language well enough for that sort of work, in which case I am not sure I am up to the task. I am terrible with math/algos, so I'm not sure how much of all that I am going to have to somehow "master" to be a decent-to-good AI engineer.
I think the question is.. am I a developer using AI (via APIs or CLI tools) in some capacity, or am I writing the AI code that is used for training models, etc.? I don't know which path to go down, but I'd lean towards using models and API calls over having to become a deep AI expert.
Is there a library you're using? Or are you just using Go with a simple query and calling an API like OpenAI's?
I agree with you here. I suspect, from most answers I've read/seen here and in other places, that Python was chosen way back when for its dynamic and creative capabilities, and that for most cases it was plenty fast/good enough and just stuck. It's also "easier" for the less code-capable folks to learn and dabble with because it's dynamic.. it's easier to grasp assigning any value to a variable than specifically declaring a variable's type and being stuck using it for just that type.
But I would think.. and thus my reason for asking this.. that today, nearing 2024, with all our experience in languages, threading, ever-higher CPU core counts, etc., we'd want a faster native/binary runtime rather than an interpreted, single-threaded, resource-heavy language like Python or NodeJS, to really speed things up and take advantage of today's hardware.
But some have said that the underlying "guts" are compiled C/C++, and Python (or any language) more or less just calls those bits, hence the "glue" code. In those cases, I can agree that it may not matter as much.
I was thinking (and am still trying to learn a ton more about) the training aspect.. if that is done on the GPU, then the faster the code running on it, the less time training takes, which would make it possible to train more specialized models much faster. What do I know, though. I started my AI journey literally a week ago and am trying to cram what I imagine has taken years of development into my old brain.
I've no clue what that means.. I'll take your word for it. :) Just started on this AI journey, and so far I'm trying to learn how to run a model locally and figure out how to maximize what little hardware I have, but I'm interested in the whole shebang.. how you gather data, what sort of data, what it looks like, how you format it (is that inference?) so it can be ingested during the training step. What code is used for training? What does training do, and how does training result in a model? And what does running the model do.. is the model a binary (code) that you just pass input to and get output from? So much to learn.
I recently saw something about RAG. Man.. so many damn acronyms.. my brain is exploding already with too much. :D
I hear you.. that's what I am trying to understand. I guess going back to when AI dev started, maybe Go wasn't around much (too early), and Rust as well. But I questioned why the AI folks would choose a dynamic, slow-runtime language for things like training, which seems to be the massive bulk of the CPU/GPU work, over much faster native binary languages like Go/Rust, or even C. But you said something, which maybe is what I was missing, and others have said it too: Python is more or less "glue code" over underlying C (native) binary libraries. If that is the case, then I get it. I assumed the crazy long training times and expensive GPUs were due in part to Python's much slower runtime.. and that using Go/Rust/C would cut those training times by quite a bit. But I am guessing from all the responses that the Python code just pushes the bulk of the work onto the GPU using native binary libs.. so the code written in Python does not have to be super fast at runtime. Thus, you pick up the "creative" side of Python and benefit from it in ways that might be harder to do in Go or Rust.
But some have replied they are using Rust for inference, data prep, etc. I'll have to learn what inference is.. not sure what that part is, nor do I fully understand what data prep entails. Is it just turning gobs of data in various formats into a specific structure (I gather from some reading, a vector database) whose layout the training part understands? So you're basically gathering data (scraping the web, reading CSV files, GitHub, etc.) and putting it into a very specific key/value (or similar) structure that the training bit then uses to train with? Something like the rough sketch below, maybe.
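For what it's worth, the key/vector structure described above is closer to the retrieval (RAG) side of things than to the tokenized datasets used for training, but as a concrete picture of it, here is a rough Go sketch: chunk some text, ask an embeddings endpoint for a vector per chunk, and keep a chunk-to-vector map. The local endpoint, model name, and chunks are all assumptions for illustration, again assuming an OpenAI-compatible local server like the one in the earlier sketch:

```go
// Rough sketch of the "key/vector structure" idea: send text chunks to an
// OpenAI-compatible /v1/embeddings endpoint and keep a chunk -> vector map.
// Endpoint, model name, and chunks are placeholders for illustration.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type embeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embeddingResponse struct {
	Data []struct {
		Index     int       `json:"index"`
		Embedding []float64 `json:"embedding"`
	} `json:"data"`
}

func main() {
	baseURL := "http://localhost:8080" // hypothetical local server with an OpenAI-style API
	chunks := []string{
		"Go is a statically typed, compiled language.",
		"RAG retrieves relevant chunks and feeds them to the model as context.",
	}

	body, err := json.Marshal(embeddingRequest{Model: "nomic-embed-text", Input: chunks})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := http.Post(baseURL+"/v1/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var out embeddingResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}

	// The "vector database" in miniature: text chunk -> embedding vector.
	store := make(map[string][]float64, len(chunks))
	for _, d := range out.Data {
		store[chunks[d.Index]] = d.Embedding
	}
	fmt.Printf("stored %d chunks; first vector has %d dimensions\n",
		len(store), len(store[chunks[0]]))
}
```

A real vector database adds similarity search on top of this, but the stored shape is basically that map: a piece of text (or its ID) keyed to a list of floats.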
I am really looking forward to Zig maturing. I find the memory model stuff a bit odd, but the overall language looks easy enough for most things, and everything I've read so far.. and this is only a 0.11 release.. says it's as fast as, if not faster than, C code in most cases. I don't know how well it would compare to higher-level languages like Go, Java, etc. for back-end web stuff.. but I'd imagine with some time and a few good frameworks, similar to Go's Chi, JWTAuth and Casbin libraries, it would be very good.