You are right. But if you have a Chinese customer, for example, different problems can come up, like the ones with NVIDIA and GPUs. Independence is key for a lot of players.
Rutabaga-Agitated
joined 10 months ago
Yeah, that might fit in the US, but not in Europe. Dependencies can lead to problems, especially when there might be a conflict. I would not want to run important infrastructure that depends only on US services.
Does anyone have hints on how to use exllamav2 with an extended context length when using GPTQ weights?
We built a 4x RTX 4090 setup using a mining rig. That is 96 GB of VRAM for roughly 10k... it does not get cheaper than that. Best compute per cost right now, I think.
https://preview.redd.it/nfq4olntq54c1.png?width=1812&format=pjpg&auto=webp&s=a5308bb5eec778072f8d6a394b5243ca33c7fd87
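For anyone who wants to check the numbers, here is a rough sketch of the arithmetic. It assumes 24 GB per RTX 4090 and takes the "10k" at face value; the currency is not stated, and the variable names are just for illustration.

```python
# Back-of-the-envelope VRAM and cost-per-GB figures for the rig described above.
GPUS = 4
VRAM_PER_GPU_GB = 24   # assumption: standard RTX 4090 memory size
TOTAL_COST = 10_000    # "roughly 10k" from the post; currency unknown

total_vram_gb = GPUS * VRAM_PER_GPU_GB
cost_per_gb = TOTAL_COST / total_vram_gb

print(f"{total_vram_gb} GB total, ~{cost_per_gb:.0f} per GB of VRAM")
```

Swap in your own prices to compare against other cards; this only covers VRAM capacity per cost, not throughput or power draw.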