this post was submitted on 30 Oct 2023

LocalLLaMA

Community for discussing Llama, the family of large language models created by Meta AI.


Looking at mlc-llm, vllm, nomic, etc., they all seem focused on inference with a Vulkan backend, and over the past few months all of them have said multi-GPU support is on their roadmaps or being worked on. But every time one of them announces multi-GPU support, it turns out they just incorporated llama.cpp's CUDA and HIP backends rather than implementing it on Vulkan. Are there any projects that actually do multi-GPU with Vulkan, and is there some technical reason it doesn't work? I only ask because Vulkan is available on multiple platforms with default installs, which would surely make things easier for end users.
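
For what it's worth, as far as I can tell the Vulkan API itself isn't the obstacle: `vkEnumeratePhysicalDevices` lists every GPU in the machine, and a backend could create one logical device per entry (there is even VK_KHR_device_group, core since Vulkan 1.1). Here's a minimal C sketch of just that enumeration step, to show the API surface; the real work a project would still have to do itself is sharding the weights and synchronizing activations across devices, since Vulkan leaves all of that to the application.

```c
#include <stdio.h>
#include <vulkan/vulkan.h>

int main(void) {
    // Minimal instance creation; no extensions are needed just to enumerate GPUs.
    VkApplicationInfo app = {
        .sType = VK_STRUCTURE_TYPE_APPLICATION_INFO,
        .apiVersion = VK_API_VERSION_1_1,
    };
    VkInstanceCreateInfo ci = {
        .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
        .pApplicationInfo = &app,
    };
    VkInstance instance;
    if (vkCreateInstance(&ci, NULL, &instance) != VK_SUCCESS) {
        fprintf(stderr, "failed to create Vulkan instance\n");
        return 1;
    }

    // Vulkan exposes each GPU as a separate VkPhysicalDevice. A multi-GPU
    // backend would create one logical device (and its queues) per entry
    // here, then shard the model's weights/KV cache across them itself.
    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, NULL);
    VkPhysicalDevice gpus[16];
    if (count > 16) count = 16;
    vkEnumeratePhysicalDevices(instance, &count, gpus);

    for (uint32_t i = 0; i < count; i++) {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(gpus[i], &props);
        printf("GPU %u: %s\n", i, props.deviceName);
    }

    vkDestroyInstance(instance, NULL);
    return 0;
}
```

On a machine with the Vulkan loader installed this should build with something like `gcc main.c -lvulkan`, and it prints one line per GPU it can see.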

no comments (yet)