Acceptable_Can5509

joined 1 year ago
[–] Acceptable_Can5509@alien.top 1 points 11 months ago (1 children)

Basically GPT-4 Turbo.

[–] Acceptable_Can5509@alien.top 1 points 1 year ago (1 children)

Can you share the Colab so others can look at how it was done?

Probably heavily quantized and running a smaller GPT-3-class model.
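
For context, here's a minimal sketch of what "heavily quantized" usually means in practice: 4-bit loading through bitsandbytes in Hugging Face transformers. The model name and settings are illustrative only, not a claim about whatever service the thread is discussing.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config: weights stored in 4 bits,
# compute done in float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# Illustrative model ID; any causal LM on the Hub works the same way
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",  # place weights on the available GPU
)
```

Quantizing to 4 bits cuts VRAM to roughly a quarter of float16, which is how providers squeeze large models onto cheaper hardware.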

[–] Acceptable_Can5509@alien.top 1 points 1 year ago (2 children)

Wait, whose money is it? Can't you just rent as well?


I'm running Llama-2 7B using Google Colab on a 40 GB A100, but it's using 26.8 GB of VRAM. Is that normal? I tried the 13B version, but the system ran out of memory. Yes, I know the quantized versions are almost as good, but I specifically need unquantized.

https://colab.research.google.com/drive/10KL87N1ZQxSgPmS9eZxPKTXnobUR_pYT?usp=sharing
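
In case it helps anyone hitting the same thing: a minimal sketch (not the linked notebook) of loading Llama-2 7B in half precision with transformers. With no dtype passed, from_pretrained loads weights in float32 at ~4 bytes per parameter, which is roughly the 26.8 GB observed; float16 is still unquantized, just lower precision, and halves that to ~13-14 GB, leaving room for 13B on a 40 GB A100. The model ID and prompt below are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumes access to the gated HF repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # same weights, half the memory of float32
    device_map="auto",          # put the model on the A100
)

# Quick generation test
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```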