NoIdeaAbaout

joined 1 year ago
[–] NoIdeaAbaout@alien.top 1 points 11 months ago

There are far fewer alternatives; you could use Keras in R, but it's actually a hassle. There are a few alternatives for topic modelling in R:

[–] NoIdeaAbaout@alien.top 1 points 11 months ago

If I were to start again, I would focus more on NLP; many of the things I studied are now obsolete. My only suggestion is to choose the field you like most and go in depth, especially in the application field. For example, I work on AI in biological applications; beyond knowledge of the algorithms, domain expertise is key.

[–] NoIdeaAbaout@alien.top 1 points 1 year ago

That's very small for a transformer; as a rule of thumb, that means around 25M parameters. I'm not sure there are similar ones.

You can try this:

https://arxiv.org/pdf/2006.03236.pdf
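
For intuition on that rule of thumb, here's a back-of-the-envelope parameter count using the common ~12·d_model² per-layer estimate (attention plus feed-forward with d_ff = 4·d_model). All config numbers below are hypothetical, just to show the arithmetic:

```python
# Rough parameter-count sketch for a small transformer encoder.
# Per layer: ~4*d^2 (attention projections) + ~8*d^2 (FFN) = ~12*d^2.
# Plus the token-embedding matrix: vocab_size * d_model.
d_model, n_layers, vocab_size = 512, 6, 30000  # hypothetical config

per_layer = 12 * d_model ** 2          # ~3.1M per layer
embeddings = vocab_size * d_model      # ~15.4M
total = n_layers * per_layer + embeddings

print(f"{total / 1e6:.1f}M parameters")  # ~34.2M, i.e. the tens-of-millions scale
```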

[–] NoIdeaAbaout@alien.top 1 points 1 year ago

You can also use this; it's a very simple library:

https://maartengr.github.io/BERTopic/index.html
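
A minimal sketch of how BERTopic is typically used, with the 20 Newsgroups dataset as stand-in data (the library embeds the documents, clusters them, and extracts topic keywords in one `fit_transform` call):

```python
from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic

# Any list of strings works; 20 Newsgroups is just a convenient example corpus.
docs = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes")).data

topic_model = BERTopic(language="english")
topics, probs = topic_model.fit_transform(docs)  # one topic id per document

print(topic_model.get_topic_info().head())  # topic sizes and representative words
print(topic_model.get_topic(0))             # top keywords of topic 0 with scores
```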

[–] NoIdeaAbaout@alien.top 1 points 1 year ago (1 children)

I think you can try a similar approach for another task; in my view, the approach can be generalized to different tasks.

[–] NoIdeaAbaout@alien.top 1 points 1 year ago (3 children)

Have you seen this article by Google?

https://arxiv.org/abs/2305.02301

https://blog.research.google/2023/09/distilling-step-by-step-outperforming.html

They claim they were able to distill PaLM into T5 for reasoning tasks (a roughly 2,000x difference in size), and the distilled T5 outperformed PaLM.

The code is here:

https://github.com/google-research/distilling-step-by-step
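
For intuition, here is a minimal sketch of the paper's core idea: train the small model on two tasks at once, predicting the label and generating the LLM-produced rationale, with the two losses mixed by a weight λ. The prefixes, example data, and hyperparameter below are illustrative; see the repo above for the actual implementation:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One toy training example; the rationale would come from the large LLM
# (e.g., PaLM) via chain-of-thought prompting.
question = "A coin is heads up. John flips the coin. Is it still heads up?"
label = "no"
rationale = "Flipping reverses the coin, so it is now tails up."

# Task prefixes tell the model which output is expected (illustrative strings).
label_in = tokenizer("[label] " + question, return_tensors="pt")
label_tgt = tokenizer(label, return_tensors="pt").input_ids
rat_in = tokenizer("[rationale] " + question, return_tensors="pt")
rat_tgt = tokenizer(rationale, return_tensors="pt").input_ids

lam = 0.5  # rationale-loss weight, a hyperparameter
label_loss = model(**label_in, labels=label_tgt).loss
rationale_loss = model(**rat_in, labels=rat_tgt).loss

loss = label_loss + lam * rationale_loss  # multi-task objective
loss.backward()
```

The point of the rationale task is that it gives the small model extra supervision about *why* a label is right, which is what lets it punch above its parameter count; at inference time you only use the label task.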