this post was submitted on 16 Nov 2023

Machine Learning

I'm looking for suggestions for a transformer model that I can fine-tune for a text classification task. Due to hardware constraints the model has to be fairly small. Something in the order of a 50 MB weight file.

top 5 comments
[–] kwnaidoo@alien.top 1 points 1 year ago

While not a transformer, what about Gaussian naive Bayes? It's not the best classifier around, but for some tasks it's good enough. I used it to build a small search-term classifier that maps e-commerce search terms to a category or tag.
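A minimal sketch of that idea with scikit-learn (the categories and terms are made-up examples, not the commenter's actual data). Note that `GaussianNB` needs dense input, so the sparse TF-IDF matrix is densified first:

```python
# Hedged sketch: tiny search-term classifier with Gaussian naive Bayes.
# Terms and labels are illustrative, not from the original comment.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

terms = ["red running shoes", "wireless headphones", "mens trail sneakers",
         "bluetooth earbuds", "leather hiking boots", "noise cancelling headset"]
labels = ["footwear", "audio", "footwear", "audio", "footwear", "audio"]

# GaussianNB requires dense arrays, so densify the sparse TF-IDF output.
model = make_pipeline(
    TfidfVectorizer(),
    FunctionTransformer(lambda x: x.toarray(), accept_sparse=True),
    GaussianNB(),
)
model.fit(terms, labels)
print(model.predict(["trail running shoes"]))
```

For text, `MultinomialNB` is usually the more common naive Bayes variant, but either stays tiny on disk compared with any transformer.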

[–] heisenbork4@alien.top 1 points 1 year ago

You could take a look here: https://sparsezoo.neuralmagic.com/?modelSet=natural_language_processing&size=10727959%2C64684665&sort=Size%3Aasc

The smallest model there is 10 MB.

I think it's still the case that it has to run on x86, though I think there was talk of an ARM runtime.

Also, depending on your exact needs, there might be licensing issues. Still probably worth a look, though.

[–] NoIdeaAbaout@alien.top 1 points 1 year ago

That's very small for a transformer; as a rule of thumb, 50 MB works out to roughly 25M parameters at 16-bit precision. Not sure there are many that small.
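The rule of thumb is just parameter count times bytes per parameter; a quick back-of-envelope check:

```python
# Back-of-envelope: weight-file size = parameter count x bytes per parameter.
def weight_file_mb(n_params: int, bytes_per_param: int) -> float:
    return n_params * bytes_per_param / (1024 ** 2)

print(f"{weight_file_mb(25_000_000, 2):.1f} MB")  # ~47.7 MB at fp16
print(f"{weight_file_mb(25_000_000, 4):.1f} MB")  # ~95.4 MB at fp32
```

So a 50 MB budget means roughly 25M parameters at fp16, or only about 12M at fp32.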

you can try this:

https://arxiv.org/pdf/2006.03236.pdf

[–] pythonpeasant@alien.top 1 points 1 year ago

Could you please provide some more information about your constraints? If space is an issue, you might be better off with a more memory-friendly model, like an LSTM. Some variants even give you per-token attention.
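To illustrate how much smaller that can be, here's a hedged PyTorch sketch of a compact LSTM classifier (all names and sizes are illustrative, not a specific recommended architecture):

```python
# Hedged sketch: a compact LSTM text classifier; sizes are illustrative.
import torch
import torch.nn as nn

class SmallLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, n_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)      # (batch, seq, embed_dim)
        _, (h_n, _) = self.lstm(x)     # final hidden state
        return self.head(h_n[-1])      # (batch, n_classes)

model = SmallLSTMClassifier()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params * 4 / 1024**2:.1f} MB at fp32")  # a few MB, well under 50 MB
```

Most of the weight budget here is the embedding table, so trimming the vocabulary shrinks the file fastest.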

There's a really interesting SparkFun video, which I'll look around for, showing a question-answering model using some sort of BERT(?) running on a Raspberry Pi Zero-type chip with 25-50 MB of flash memory.