this post was submitted on 17 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.

[–] ab2377@alien.top 1 points 1 year ago

llama.cpp mostly, just on the console with main.exe. I wrote a simple Python file to talk to the llama.cpp server, which also works great. LM Studio is good and I have it installed, but I don't use it: I have an 8 GB VRAM laptop GPU at the office and a 6 GB VRAM laptop GPU at home, so I make myself stick to the console to save memory wherever I can. My experience with text-generation-webui has not been great; it takes far too long to update, and sometimes it gets the torch installation right and sometimes torch is installed without CUDA. I really don't want to waste my time on that. I like to install everything manually and just want some really lightweight web UI for the server hosted with llama.cpp.
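A "simple python file to talk to the llama.cpp server" might look like the sketch below. It targets the `/completion` endpoint of llama.cpp's built-in HTTP server; the host, port, and field values here are assumptions about a default local setup, not the commenter's actual script.

```python
# Minimal sketch of a client for llama.cpp's built-in HTTP server.
# Assumes the server was started locally on the default port 8080;
# adjust the URL for your own setup.
import json
import urllib.request

def build_payload(prompt: str, n_predict: int = 128) -> bytes:
    # The /completion endpoint accepts a JSON body with the prompt
    # and the number of tokens to predict.
    return json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")

def complete(prompt: str, url: str = "http://127.0.0.1:8080/completion") -> str:
    # POST the prompt and return the generated text from the "content" field.
    req = urllib.request.Request(
        url,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

With a server running (`./server -m model.gguf`), `complete("Hello")` returns the model's continuation as a plain string, which keeps the whole client dependency-free.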

[–] Love_Cat2023@alien.top 1 points 1 year ago

The text-generation-webui API with a Next.js frontend; it's more customizable.

[–] Flashy_Squirrel4745@alien.top 1 points 1 year ago

Text-generation-webui for general chatting, and vLLM for processing large amounts of data with an LLM.

On an RTX 3090, vLLM is 10~20x faster than llama.cpp for 13B AWQ models.
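The bulk-processing workflow described here can be sketched with vLLM's offline inference API, which batches all prompts internally (continuous batching) rather than generating one stream at a time. The model name, prompt template, and sampling values below are illustrative assumptions, not the commenter's actual setup.

```python
# Rough sketch of batch inference with vLLM on an AWQ-quantized 13B model.
# Model name and prompt template are hypothetical placeholders.
def make_prompts(records):
    # Wrap each raw record in a simple instruction template (assumed format).
    return [f"Summarize the following record:\n{r}\nSummary:" for r in records]

def batch_generate(records, model="TheBloke/Llama-2-13B-AWQ"):
    # Imported lazily so the sketch can be read without vLLM/GPU available.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model, quantization="awq")
    params = SamplingParams(temperature=0.0, max_tokens=128)
    # vLLM schedules and batches all prompts in one call; this internal
    # batching is where the throughput gain over single-stream
    # llama.cpp decoding comes from.
    outputs = llm.generate(make_prompts(records), params)
    return [out.outputs[0].text for out in outputs]
```

Submitting thousands of prompts in a single `generate` call lets the engine keep the GPU saturated, which is why the speedup only shows up for bulk workloads, not interactive chat.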
