overview for thedabking123

Is there an interest in resurrecting technical discussions of the latest research? [D] in c/machinelearning@academy.garden

[–] thedabking123@alien.top 1 points 2 years ago

So do engineering innovations make sense for discussions?

I would love for someone to do extensive experimentation on best practices for RAG, including the implications for cost, latency, and performance (precision).

It's not the very latest per se, but knowing the differences between approaches and the tricks to reduce wall-clock time would be amazing.
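
To make the comparison concrete, here is a minimal sketch of how precision and latency could be measured side by side for a retriever. Everything in it is an assumption for illustration: `precision_at_k`, `evaluate`, and `toy_retriever` are hypothetical names, and the labeled queries would have to come from your own relevance judgments.

```python
import time
from statistics import mean


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved chunk IDs that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved_ids[:k]
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant_ids)
    return hits / k


def evaluate(retriever, labeled_queries, k=5):
    """Run a retriever over (query, relevant_ids) pairs.

    Returns mean precision@k and mean wall-clock latency in seconds.
    """
    precisions, latencies = [], []
    for query, relevant_ids in labeled_queries:
        start = time.perf_counter()
        retrieved_ids = retriever(query)
        latencies.append(time.perf_counter() - start)
        precisions.append(precision_at_k(retrieved_ids, set(relevant_ids), k))
    return {
        "precision_at_k": mean(precisions),
        "mean_latency_s": mean(latencies),
    }


def toy_retriever(query):
    """Hypothetical stand-in: a real retriever would query a vector store."""
    return ["c1", "c2", "c3", "c4", "c5"]


if __name__ == "__main__":
    print(evaluate(toy_retriever, [("what is RAG?", ["c1", "c9"])], k=5))
```

Running the same `evaluate` call against two retriever variants gives a like-for-like view of the precision versus wall-time trade-off; cost would need a separate counter (e.g. tokens per query), which this sketch does not cover.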

667 of OpenAI's 770 employees have threaten to quit. Microsoft says they all have jobs at Microsoft if they want them. in c/localllama@poweruser.forum

[–] thedabking123@alien.top 1 points 2 years ago

And they'd lose out on every single thing they built in-house.

Their MLOps and data-scraping infra? Gone.
Their results and experiments (like millions of iterations)? Gone.
They would take 1-2 yrs to catch back up and would lose the lead.

Don't assume Google, Inflection, Anthropic etc. are more than 1 yr behind.

667 of OpenAI's 770 employees have threaten to quit. Microsoft says they all have jobs at Microsoft if they want them. in c/localllama@poweruser.forum

[–] thedabking123@alien.top 1 points 2 years ago (3 children)

I wonder how real this threat is.

Sam Altman is not the only guy who knows how to monetize things. Also, these employees are sitting on 10M+ bonuses that they will NEVER get from Microsoft. At most they will get 250K-500K TC.

How are people here observing their experiments and production models? (alien.top)

submitted 2 years ago by thedabking123@alien.top to c/localllama@poweruser.forum

I'm currently working on some RAG-based tooling for some non-profits and am having difficulty with the following. What are people using?

Tracking model performance across experiments and productized pipelines, covering:
1. Changes in test or fine-tuning datasets
2. Changes in chunking strategy
3. Changes in RAG tooling (e.g. RAG Fusion or RA-DIT)
4. Changes in underlying models and/or fine-tuning strategies

Tracking pipeline performance (e.g. speed, throughput, latency) as we change the items laid out above
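
For reference, this is roughly the shape of record I'd want any tool to capture. It's a minimal stdlib-only sketch, not a recommendation of a product: `RunConfig`, `config_id`, `log_run`, `load_runs`, and `changed_fields` are all hypothetical names, and the field values shown are made up for illustration.

```python
import hashlib
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunConfig:
    """The knobs that change between runs (hypothetical field names)."""
    dataset_version: str
    chunking: str
    rag_variant: str
    model: str


def config_id(config):
    """Stable short hash of a config, so identical setups share an ID."""
    payload = json.dumps(asdict(config), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def log_run(path, config, metrics):
    """Append one run (config plus metrics) as a JSON line and return it."""
    record = {
        "config_id": config_id(config),
        "logged_at": time.time(),
        "config": asdict(config),
        "metrics": metrics,
    }
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def load_runs(path):
    """Read every logged run back from the JSON-lines file."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def changed_fields(run_a, run_b):
    """Names of config fields that differ between two logged runs."""
    return sorted(
        key for key in run_a["config"]
        if run_a["config"][key] != run_b["config"].get(key)
    )


if __name__ == "__main__":
    base = RunConfig("v1", "fixed-512", "plain", "model-a")
    variant = RunConfig("v1", "sentence", "plain", "model-a")
    a = log_run("runs.jsonl", base, {"precision_at_5": 0.0, "latency_s": 0.0})
    b = log_run("runs.jsonl", variant, {"precision_at_5": 0.0, "latency_s": 0.0})
    print(changed_fields(a, b))
```

Hashing the config means a metric shift can be traced back to exactly which of the four kinds of change caused it, and pipeline metrics such as latency or throughput can live in the same `metrics` dict.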

What products do you use and how do you choose them?

[D] What is the future for ML researchers and startups? in c/machinelearning@academy.garden

[–] thedabking123@alien.top 0 points 2 years ago (2 children)

Nationalized infrastructure built by megacorp contractors is my predicted future for AI research in the 2030s.

I don't doubt there are multi-trillion parameter multi-modal models being dreamed up right now by the US DoD and OpenAI to run psyops online, detect and recruit agents, push against foreign propaganda, etc.

And that's okay. I'd rather stable organizations held the reins than some l33Tcode bro (and in the US's case, the org is controlled by a democratically elected person).

[Discussion]About to begin my PhD in Multi-Modality AI, any suggestions? in c/machinelearning@academy.garden

[–] thedabking123@alien.top 1 points 2 years ago

As a guy in his mid-30s doing master's-level courses at Stanford in prep for a possible part-time master's, this advice to stick with the basics for risk mitigation is much needed.

[D] What role does data quality plays in the LLM scaling laws? in c/machinelearning@academy.garden

[–] thedabking123@alien.top 1 points 2 years ago

Measuring and improving quality of NLP datasets in a comprehensive way is probably the main migraine there.

You can measure and improve quality along many dimensions that practitioners disagree on (accuracy, completeness, consistency, timeliness, validity, and uniqueness are common ways to slice data quality), and there's no single consistent measure for some of those either.
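
To illustrate how much definitional choice hides in even the "easy" dimensions, here is a minimal sketch of two of them. The definitions are assumptions picked for the example, not standard ones: completeness is taken as the share of records with every required field filled, and uniqueness as the share of distinct texts after lowercasing and collapsing whitespace. The function and field names are hypothetical.

```python
def _is_filled(value):
    """A field counts as filled if it is not None and not blank."""
    return value is not None and str(value).strip() != ""


def completeness(records, required_fields):
    """Share of records where every required field is filled."""
    if not records:
        return 0.0
    complete = sum(
        1 for record in records
        if all(_is_filled(record.get(field)) for field in required_fields)
    )
    return complete / len(records)


def uniqueness(records, text_field="text"):
    """Share of distinct texts after lowercasing and collapsing whitespace."""
    if not records:
        return 0.0
    normalized = {
        " ".join(str(record.get(text_field, "")).lower().split())
        for record in records
    }
    return len(normalized) / len(records)


if __name__ == "__main__":
    sample = [
        {"text": "Hello  world", "label": "a"},
        {"text": "hello world", "label": ""},
        {"text": "Other", "label": "b"},
    ]
    print(completeness(sample, ["text", "label"]))
    print(uniqueness(sample))
```

Swap the normalization for near-duplicate detection, or decide that a blank label is acceptable, and both numbers change, which is exactly the disagreement problem.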