I see these as the equivalent of selling picks and shovels in a gold rush. The upside is that you don't need to bet on any particular vertical or application, which is always hard with novel technologies. The downside is that infra is usually not where most of the value creation or capture happens.
Yes, semantic indexing and vector databases are now part of the AI infra known as Retrieval Augmented Generation (RAG), which links knowledge sources to LLMs for information retrieval (LLMs are not good at searching). To learn more about implementing RAG in a GenAI context, check out LLMWare, which provides an integrated RAG platform so you can quickly level up in AI infra: https://github.com/llmware-ai/llmware
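For anyone who wants to see the shape of the idea, here is a rough sketch of the retrieve-then-prompt flow. It assumes a toy word-overlap score in place of real embeddings, and the names `retrieve`, `build_prompt`, and `DOCS` are made up for illustration. It is not how LLMWare or any particular library does it.

```python
import re

def tokenize(text):
    # Lowercase word set; a stand-in for a real embedding (assumption).
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    # Rank documents by how many query words they share, best first.
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=2):
    # Put the retrieved passages in front of the question.
    # The resulting string is what would be sent to an LLM.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge source.
DOCS = [
    "Vector databases store embeddings and support nearest-neighbor search.",
    "Gold rushes made shovel sellers rich.",
    "Retrieval augmented generation passes retrieved text to an LLM as context.",
]

if __name__ == "__main__":
    print(build_prompt("What is retrieval augmented generation?", DOCS, k=1))
```

In a real setup the overlap score would be swapped for embedding similarity, and the prompt would go to whatever model you are using.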
No, it is not considered AI infra. Embedding databases consume AI infra to create the embeddings, but a vector database can exist without any AI component. AI infra is what generates the output, not what consumes it. If it were otherwise, every piece of software that uses an LLM component would be considered AI infra.
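To make the "no AI component" point concrete, here is a minimal sketch of a vector store that only does storage and cosine-similarity lookup. The class name `TinyVectorStore` and the hand-written vectors are assumptions for illustration; a real vector database adds indexing, persistence, and approximate search on top of this.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Hypothetical in-memory store: keys mapped to vectors, brute-force search."""

    def __init__(self):
        self.items = {}

    def add(self, key, vector):
        self.items[key] = list(vector)

    def nearest(self, query, k=1):
        # Rank every stored key by similarity to the query, best first.
        ranked = sorted(
            self.items,
            key=lambda key: cosine(query, self.items[key]),
            reverse=True,
        )
        return ranked[:k]

# Vectors typed in by hand; no model was involved in producing them.
store = TinyVectorStore()
store.add("red", [1.0, 0.0, 0.0])
store.add("green", [0.0, 1.0, 0.0])
store.add("orange", [1.0, 0.5, 0.0])

if __name__ == "__main__":
    print(store.nearest([0.9, 0.1, 0.0], k=2))
```

Where the vectors come from is a separate question: an embedding model is one source, but the store itself does not care.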
Tip: go to OpenAI’s hiring page and look up their infrastructure engineer requirements :)
Yes!