
intuition: two texts are similar if concatenating one to the other barely increases the gzip-compressed size

no training, no tuning, no params: this is the entire algorithm

https://aclanthology.org/2023.findings-acl.426/
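
A minimal sketch of the idea, not the paper's exact code: it assumes normalized compression distance (NCD) as the similarity measure and a plain k-nearest-neighbour vote on top of it. The names `clen`, `ncd`, and `classify`, and the toy training texts, are made up for illustration.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: small when appending b to a
    adds little to the compressed size."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(query: str, train: list[tuple[str, str]], k: int = 1) -> str:
    """Return the majority label among the k training texts closest to query."""
    ranked = sorted(train, key=lambda pair: ncd(query, pair[0]))
    top_labels = [label for _, label in ranked[:k]]
    # most_common keeps first-seen order on ties, so the nearest neighbour wins
    return Counter(top_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data, purely illustrative
    train = [
        ("the striker scored a late goal to win the football match", "sports"),
        ("the central bank raised interest rates to curb inflation", "finance"),
    ]
    print(classify("the striker scored a late goal to win the football match today", train))
```

On very short strings the fixed gzip header overhead dominates the compressed size, so distances get noisy; longer texts give more stable rankings.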

 

As the title suggests, I have a few LLMs and want to see how they perform on different hardware (CPU-only instances and GPUs: T4, V100, A100). Ideally, the goal is to get an idea of the performance and overall price (VM hourly rate / efficiency).

Currently I've written a script to calculate ms per token, RAM usage (memory profiler), and total time taken.
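
For illustration, a rough sketch of that kind of measurement. It assumes a hypothetical `generate(prompt)` callable that returns the generated tokens as a sequence; `benchmark` and `usd_per_million_tokens` are made-up names, and the hourly rate in the example is a placeholder, not a real price. Memory tracking is left out to keep it short.

```python
import time
from typing import Callable, Sequence


def benchmark(generate: Callable[[str], Sequence], prompts: list[str], warmup: int = 1) -> dict:
    """Time generate() over prompts and report per-token latency and throughput."""
    # Untimed warm-up calls so one-off costs (model load, caches) are excluded
    for prompt in prompts[:warmup]:
        generate(prompt)

    total_tokens = 0
    start = time.perf_counter()
    for prompt in prompts:
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start

    if total_tokens == 0:
        raise ValueError("generate() produced no tokens")

    return {
        "total_seconds": elapsed,
        "total_tokens": total_tokens,
        "ms_per_token": 1000.0 * elapsed / total_tokens,
        "tokens_per_second": total_tokens / elapsed,
    }


def usd_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Cost of generating one million tokens at a given VM hourly rate."""
    tokens_per_hour = tokens_per_second * 3600.0
    return hourly_rate_usd / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Stand-in for a real model call: returns 10 dummy tokens after a short sleep
    def fake_generate(prompt: str) -> list[int]:
        time.sleep(0.01)
        return list(range(10))

    stats = benchmark(fake_generate, ["hello", "world"])
    print(stats)
    # 0.50 USD/hour is a placeholder rate
    print(usd_per_million_tokens(0.50, stats["tokens_per_second"]))
```

Running the same prompts on each instance type and comparing cost per million tokens gives a single number that combines hourly rate and throughput.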

I wanted to check whether there are better methods or tools. Thanks!