this post was submitted on 30 Oct 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

founded 1 year ago

I'm running Llama-2 7B on Google Colab with a 40 GB A100, but it's using 26.8 GB of VRAM. Is that normal? I also tried the 13B version, but the system ran out of memory. I know quantized versions are almost as good, but I specifically need unquantized.

https://colab.research.google.com/drive/10KL87N1ZQxSgPmS9eZxPKTXnobUR_pYT?usp=sharing
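For context on whether 26.8 GB is expected: by default, `transformers` loads weights in float32 (4 bytes per parameter), so the observed usage is roughly the weight footprint alone. A back-of-the-envelope sketch (the exact parameter count below is Hugging Face's reported figure for Llama-2-7B and is an assumption here):

```python
# Rough VRAM estimate for model weights alone.
# Activations and the KV cache are NOT included, so real usage is higher.
def weight_gib(n_params: int, bytes_per_param: int) -> float:
    """Size of the weights in GiB for a given precision."""
    return n_params * bytes_per_param / 1024**3

llama2_7b = 6_738_415_616  # reported param count for Llama-2-7B (assumption)

print(f"fp32 weights: {weight_gib(llama2_7b, 4):.1f} GiB")  # ~25.1 GiB
print(f"fp16 weights: {weight_gib(llama2_7b, 2):.1f} GiB")  # ~12.6 GiB
```

So ~25 GiB of weights in float32 plus activation overhead lines up with the 26.8 GB observed. Passing `torch_dtype=torch.float16` (or `torch.bfloat16`) to `from_pretrained` halves the weight footprint without quantizing, which should also make the 13B model (~26 GiB in fp16) fit on a 40 GB A100.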

no comments (yet)