Dry-Vermicelli-682

joined 1 year ago
[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago

I am really looking forward to Zig maturing. I find the memory model stuff a bit odd, but the overall language looks easy enough for most things, and everything I've read so far, even at this 0.11 release, says it's as fast as, if not faster than, C code in most cases. I don't know how well it would compare to higher level languages like Go, Java, etc for back end web stuff.. but I'd imagine with some time and a few good frameworks, similar to Go's Chi, JWTAuth and Casbin libraries, it would be very good.

 

I am doubtful there is one, but curious whether any model would be good at answering various legal questions.. mostly around marriage, work, stocks, corporate law, maybe finances, etc. Mostly for fun, but ideally something with fairly up to date details on current laws.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago

Honestly.. at $2K or so a pop for 4090s.. I'd have bought the M3 Max MacBook Pro with 128GB RAM instead. A video the other day showed it could load and work with a 70B LLM just fine, while a single 4090 could not. Dual 4090s with 48GB of VRAM probably can, but will likely be slower.. and having a top of the line laptop that doubles as a pretty beefy AI box for about the same price seems like better money spent.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago

Well that is interesting. Do you know what libraries they are using in Go? Or are they building their own from scratch? I would imagine there would be some movement to translate Python to Go for some situations. And there are a couple of examples within this thread that show some good use of Go with OpenAI (and a local llama as well).

I am thinking I could stand up a model locally that exposes an OpenAI-compatible API, and then write some code in Go that calls that API.. it could then swap over to the real ChatGPT APIs, or to our own larger model in the cloud, if we decide to go that route.
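
Something like this is what I have in mind.. a rough sketch using the sashabaranov/go-openai client (the local port, path and model name are just my guesses for an ollama-style setup):

```go
package main

import (
	"context"
	"fmt"
	"log"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// Point the client at a local OpenAI-compatible server; swapping to
	// the real ChatGPT API (or a cloud-hosted model) should just be a
	// config change to BaseURL and the API key.
	cfg := openai.DefaultConfig("unused-locally") // real key once pointed at OpenAI
	cfg.BaseURL = "http://localhost:11434/v1"     // assumed local endpoint

	client := openai.NewClientWithConfig(cfg)
	resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
		Model: "deepseek-coder", // whatever model the local server has loaded
		Messages: []openai.ChatCompletionMessage{
			{Role: openai.ChatMessageRoleUser, Content: "show me a hello world app in Go"},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```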

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago

WOW.. fantastic. This is the first (only?) response I've seen of the dozens here that says any of this. As a long time back end dev, switching to Go for that was amazing. I was FAR faster and more productive than in Java, Python or NodeJS, especially once I built my own set of frameworks that I can quickly copy/paste (or import). I love how fast/easy it is to stand up a Docker wrapper as well. Very tiny images with very fast runtimes.

My point in this post partially stemmed from the need to install so much runtime to use Python, and the speed/memory/etc overhead of running Python apps. If it is just basic glue code, like some have said, then it's not a big deal. But since I know Go and love the language, I want to use it over anything else if I can.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago (1 children)

Now that would be great. I was in the process of pricing out 32 and 64 core Threadrippers.. thinking one would work well for my next desktop and could also run AI stuff locally very nicely, until all this "must use GPU" stuff hit me. So it would be fantastic to be able to take advantage of CPU cores that aren't otherwise in use for things like this.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago (3 children)

So most of the Python AI coding folks.. aren't writing CUDA/high end math/algo style code.. they're just using the libraries like any other SDK?

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago (1 children)

Yah.. the thing is.. do I have to learn very fancy advanced Python to do this, or can I use simpler Python that just makes use of, as you said, the more optimized libraries? I am wondering how much time it's going to take to figure out Python well enough to be of use, and hence was thinking Go and Rust might work well since I know those well enough.

If it's just calling APIs, even to a locally running model, I can do that in Go just fine. If it's writing advanced AI code in Python, then I'd likely have to spend months or longer learning the language to use it well enough for that sort of work. In which case I am not sure I am up to the task. I am terrible with math/algos, so I'm not sure how much of all that I'd have to somehow "master" to be a decent to good AI engineer.

I think the real question is.. am I a developer using AI (via APIs or CLI tools) in some capacity, or am I writing the AI code itself, used for training and so on? I don't know which path to take, but I'd lean towards using models and API calls over having to become a deep AI expert.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago (1 children)

Is there a library you're using? Or are you just using Go with a simple query, calling an API like OpenAI's?

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago (1 children)

I agree with you here. From most answers I've read here and elsewhere, I suspect Python was chosen way back when for its dynamic and creative capabilities, was plenty fast/good enough for most cases, and just stuck. It's also "easier" for the not so code capable folks to learn and dabble with because it's dynamic.. it's easier to grasp assigning any value to a variable than declaring a specific type and being stuck using the variable for just that type.

But I would think.. and thus my reason for asking this.. that today, nearing 2024, with all our experience in languages, threading, ever growing CPU core counts, etc., we'd want a faster native binary runtime, rather than an interpreted, single threaded, resource heavy language like Python or NodeJS, to really take advantage of today's hardware.

But some have said that the underlying "guts" are C/C++ binaries, and Python (or any language) more or less just calls those bits, hence the "glue" code. In that case, I can agree it may not matter as much.
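
If it really is just glue, I suppose Go can play the same trick via cgo.. a toy sketch, where the actual math happens in the native C library and Go only dispatches to it:

```go
package main

/*
#cgo LDFLAGS: -lm
#include <math.h>
*/
import "C"

import "fmt"

func main() {
	// The computation runs in the native C math library; Go is just the
	// glue, the same way Python defers heavy lifting to C/C++ kernels
	// in libraries like NumPy or PyTorch.
	x := C.sqrt(C.double(2.0))
	fmt.Println(float64(x)) // 1.4142135623730951
}
```

Obviously sqrt is trivial, but the dispatch pattern is the same.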

I was thinking (and am still trying to learn a ton more about) the training aspect.. if that is done on the GPU, and the code that runs on it runs as fast as it can, it would reduce time to train, making it possible to train much more specialized models much faster. What do I know though. I started my AI journey literally a week ago and am trying to cram what I imagine took years of development into my old brain.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago

I've no clue what that means.. I'll take your word for it. :) Just started on this AI journey, and so far I'm trying to learn how to run a model locally and figure out how to maximize what little hardware I have, but I'm interested in the whole shebang.. how you gather data, what sort of data, what it looks like, how you format it (is that inference?) so it can be ingested during the training step. What code is used for training.. what does training do, how does training result in a model, and what does running the model do.. is the model a binary (code) that you just pass input to and get output from? So much to learn.

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago

I recently saw something about RAG. Man.. so many damn acronyms.. brain is exploding already with too much. :D

[–] Dry-Vermicelli-682@alien.top 1 points 11 months ago (1 children)

I hear you.. that's what I am trying to understand. I guess going back to when AI dev started, maybe Go wasn't around much (too early), and Rust as well. But I questioned why the AI blokes would choose a dynamic, slow runtime language for things like training, which seems to be the massive bulk of the CPU/GPU work, over much faster native binary languages like Go/Rust, or even C. But you said something, which maybe is what I was missing, and others have said it too: Python is more or less "glue code" over underlying C (native) binary libraries. If that is the case, then I get it.

I had assumed the crazy long training times and expensive GPUs were due in part to Python's much slower runtime, and that using Go/Rust/C would cut the training times by quite a bit. But I am guessing from all the responses that the Python code just pushes the bulk of the work onto the GPU using native binary libs.. and thus the code written in Python does not have to be super fast at runtime. So you pick up the "creative" side of Python and benefit from it in ways that might be harder in Go or Rust.

But some have replied they are using Rust for inference, data prep, etc.. I'll have to learn what inference is.. not sure what that part is, nor do I fully understand what data prep entails. Is it just turning gobs of data in various formats into a specific structure (I gather from some reading, a vector database) whose layout the training step understands? So you're basically gathering data (scraping the web, reading CSV files, GitHub, etc) and putting it into a very specific key/value (or similar) structure that the training bit then trains against?
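
If data prep is roughly that, I can already picture it in Go.. a toy sketch that reads raw CSV rows and emits prompt/completion pairs as JSONL (the two-column input and the field names are just my guess at a common fine-tuning format):

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"log"
	"os"
)

// Pair is one training example; many fine-tuning pipelines expect
// one JSON object per line (JSONL) in roughly this shape.
type Pair struct {
	Prompt     string `json:"prompt"`
	Completion string `json:"completion"`
}

func main() {
	f, err := os.Open("raw.csv") // assumed input: question,answer per row
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	rows, err := csv.NewReader(f).ReadAll()
	if err != nil {
		log.Fatal(err)
	}

	enc := json.NewEncoder(os.Stdout) // Encode emits one JSON object per line
	for _, row := range rows {
		if len(row) < 2 {
			continue // skip malformed rows
		}
		if err := enc.Encode(Pair{Prompt: row[0], Completion: row[1]}); err != nil {
			log.Fatal(err)
		}
	}
}
```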

 

I know the typical answer is "no, because all the libs are in Python".. but I am kind of baffled why more porting isn't going on, especially to Go, given that Go, like Python, is stupid easy to learn, yet much faster at runtime. Truly not trying to start a flame war or anything. I am just a bigger fan of Go than Python, and was thinking that coming in to 2024, especially with all the huge money in AI now, we'd see a LOT more movement towards Go's much faster runtime while keeping code that is largely as easy, if not easier, to write/maintain. Not sure about Rust.. it may run a little faster than Go, but the language is much more difficult to learn/use. It has been growing in popularity though, so I was curious if it is a potential option.

There are some Go libs I've found, but the few I have seen are 3, 4 or more years old. I was hoping there would be things like PyTorch and the like converted to Go.

I was even curious, with the power of GPT4 or DeepSeek Coder or similar, how hard would it be to run conversions from Python libraries to Go? Is anyone working on that, or is it pretty much impossible?

 

Hey all,

So I am trying to run some of the various models, both to learn more and for some specific research purposes. I have my trusty 16 core Threadripper (Gen 1) with 64GB RAM, an SSD, and an AMD 6700XT GPU.

I installed Ubuntu Server.. no GUI/desktop, to hopefully leave maximum hardware for the AI stuff. It runs Docker on boot and auto starts Portainer for me. I access that via the web from another machine, and have deployed a couple of containers: the ollama container and the ollama-webui container.

Those work. I am able to load a model and run it. But they are insanely slow. My Windows machine with an 8 core 5800 CPU and 32GB RAM (but a 6900XT GPU), using LM Studio, is able to load the same model and respond much faster (though still kind of slow).

I understand now, after some responses/digging, that GPU is obviously much faster than CPU. I would have hoped a 16 core CPU with 64GB RAM would still offer decent performance on the DeepSeek Coder 30B model, or the latest Meta CodeLlama model (30B). But they both take 4+ minutes to even start responding to a simple "show me a hello world app in .." prompt, and they take forever to output too.. like 2 or 3 characters per second.
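
(One rule of thumb I've since run into, if I have it right: CPU token generation is mostly memory-bandwidth bound, so tokens/sec is roughly memory bandwidth divided by model size in bytes. A 30B model quantized to 4 bits is roughly 18-20GB, and a Gen 1 Threadripper's quad-channel DDR4 tops out around 80GB/s in theory, less in practice.. so low single-digit tokens per second is about what the math predicts, no matter how many of the 16 cores sit idle.)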

So first, I would have thought it would run much faster on a 16 core machine with 64GB RAM. But also.. is it not using my 6700XT GPU with 12GB VRAM? Is there some way I need to configure Docker for the ollama container to give it more RAM, more CPUs, and access to the GPU?
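
One thing I've since found and still need to try: the ollama docs show a ROCm variant of the image for AMD GPUs, started with the kernel's GPU devices passed through to the container, something like:

```
docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```

I've also seen mention that RDNA2 cards like my 6700XT may need HSA_OVERRIDE_GFX_VERSION=10.3.0 set in the container's environment before ROCm will use them, but I haven't verified that on my card.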

OR is there a better option to run on Ubuntu Server that mimics the OpenAI API so that the web GUI works with it? Or perhaps a better overall solution that would load/run models much faster while actually utilizing the hardware?

Thank you.
