Happy_Chicken9835

joined 10 months ago

I deployed Llama 2 (GGUF, CPU-only) as an Amazon ECS Fargate service

I just pushed my Docker image to ECR and fired up the container
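For anyone curious what that image might look like: a minimal sketch of a Dockerfile for serving a GGUF model on CPU, assuming llama.cpp's prebuilt server image and a hypothetical model filename (the actual setup here may differ):

```dockerfile
# Sketch: serve a GGUF model on CPU with llama.cpp's server image (assumption,
# not necessarily what the poster used). Model filename is hypothetical.
FROM ghcr.io/ggerganov/llama.cpp:server

# Bake the quantized model into the image so Fargate needs no extra volume
COPY llama-2-7b-chat.Q4_K_M.gguf /models/model.gguf

EXPOSE 8080

# Args passed to the server entrypoint: model path, bind address, port
CMD ["-m", "/models/model.gguf", "--host", "0.0.0.0", "--port", "8080"]
```

From there it's the usual ECR flow: `docker build`, `docker tag` with your ECR repo URI, `aws ecr get-login-password | docker login`, `docker push`, then point the Fargate task definition at the pushed image.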

[–] Happy_Chicken9835@alien.top 1 point 10 months ago

TheBloke has a few quantized variants

A GGUF 7B: https://huggingface.co/TheBloke/Orca-2-7B-GGUF