this post was submitted on 29 Oct 2023

Machine Learning

I come from computer vision tasks using convnets that are relatively small in size and parameter count, yet perform quite well (e.g. the ResNet family, YOLO, etc.).

Now I am getting into NLP, and transformer-based architectures tend to be huge, so I have trouble fitting them in memory.

What infrastructure do you use to train these models (GPT-2, BERT, or even bigger ones)? Cloud computing, HPC, etc.?
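Before picking infrastructure, it can help to estimate how much memory training will actually need. A common rule of thumb for fp32 training with Adam is roughly four copies of the parameters (weights, gradients, and Adam's two moment buffers), before counting activations. The sketch below is my own back-of-envelope estimate, not a precise profiler; the model sizes are the commonly cited parameter counts.

```python
def training_memory_gb(n_params, bytes_per_param=4, copies=4):
    """Rough fp32 Adam training footprint: weights + grads + two moments.

    Ignores activations, which often dominate at large batch/sequence sizes.
    """
    return n_params * bytes_per_param * copies / 1e9

# Commonly cited parameter counts (approximate)
for name, n in [("BERT-base", 110e6), ("GPT-2 small", 124e6), ("GPT-2 XL", 1.5e9)]:
    print(f"{name}: ~{training_memory_gb(n):.1f} GB before activations")
```

By this estimate, GPT-2 XL needs ~24 GB just for parameters and optimizer state, which already exceeds many consumer GPUs and is why people reach for TPUs, multi-GPU nodes, or tricks like mixed precision and gradient checkpointing.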

top 2 comments
[–] KingsmanVince@alien.top 1 points 1 year ago (1 children)

I have used Google TPU for BLOOM and GPT-2 models.

[–] arena_one@alien.top 1 points 1 year ago

At your current job? What kind of role/company are you at? Most of the places I've seen just want to use the OpenAI API, sadly...