this post was submitted on 22 Nov 2023
LocalLLaMA
I dipped my toes in while comparing different ways of running Whisper on Android, and learned that Google doesn't intend developers to use NNAPI directly. Instead, you're meant to use a framework like TensorFlow Lite or PyTorch Mobile, which detects what the hardware supports and provides delegates that it may or may not use, depending on which path is most efficient. A developer needs to convert/"optimize" a model so that it doesn't use any unsupported operations, but there are also size considerations: the TPU and other accelerators probably don't have that much memory just yet.
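To make the delegate idea a bit more concrete, here's a toy sketch in plain Python. It is not how TensorFlow Lite or PyTorch Mobile actually do it: the op names, the `ACCELERATOR_OPS` set, and the `partition`/`plan` helpers are all made up for illustration. It just shows the general shape of the decision: ops the accelerator supports go to the delegate, everything else falls back to the CPU, and a model that doesn't fit in accelerator memory stays on the CPU entirely.

```python
from dataclasses import dataclass

# Hypothetical set of ops an accelerator supports (illustrative only).
ACCELERATOR_OPS = {"CONV_2D", "FULLY_CONNECTED", "SOFTMAX", "ADD"}


@dataclass
class Segment:
    target: str  # "delegate" or "cpu"
    ops: list


def partition(ops, supported=ACCELERATOR_OPS):
    """Group consecutive ops by whether the accelerator supports them."""
    segments = []
    for op in ops:
        target = "delegate" if op in supported else "cpu"
        if segments and segments[-1].target == target:
            # Same target as the previous op: extend the current segment.
            segments[-1].ops.append(op)
        else:
            segments.append(Segment(target, [op]))
    return segments


def plan(ops, model_bytes, accel_memory_bytes, supported=ACCELERATOR_OPS):
    """Run everything on the CPU if the model is too big for the accelerator."""
    if model_bytes > accel_memory_bytes:
        return [Segment("cpu", list(ops))]
    return partition(ops, supported)


if __name__ == "__main__":
    # "GELU" stands in for an op the accelerator can't run.
    for seg in plan(["CONV_2D", "ADD", "GELU", "SOFTMAX"], 40_000_000, 64_000_000):
        print(seg.target, seg.ops)
```

In this toy version, one unsupported op in the middle splits the model into three segments. As far as I understand it, every hop between accelerator and CPU has a cost on real hardware, which is why converting the model to avoid unsupported ops matters.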