The drop in questions makes sense, but the interesting metric would be whether the quality of remaining questions has gone up or down. If LLMs are absorbing all the "how do I center a div" and "null pointer exception" questions, what is left should theoretically be harder, more nuanced questions that AI cannot easily answer.
The flip side is that SO answers are now part of the training data that makes LLMs useful. If people stop contributing answers, the models eventually become stale. It is a bit of a tragedy of the commons.
Memory efficiency in K8s is really about getting your requests and limits right. A few things helped us here.

The biggest waste we found was pods requesting 1Gi but using only ~200Mi on average. The scheduler reserves the full request whether or not it is used, so multiply that gap by 100 pods and you are stranding roughly 80Gi of schedulable cluster capacity.
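As a concrete sketch of the fix, here is what right-sizing a container's memory resources might look like (the app name and numbers are illustrative, not from our actual workloads); one reasonable approach is to set the request near observed steady-state usage and keep a higher limit as headroom for spikes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app   # hypothetical workload
spec:
  containers:
    - name: app
      image: example-app:latest
      resources:
        requests:
          memory: "256Mi"   # sized near observed average (~200Mi) plus a margin,
                            # instead of the original 1Gi blanket request
        limits:
          memory: "512Mi"   # hard cap; exceeding this gets the container OOM-killed
```

The request is what the scheduler bin-packs against, so lowering it is what actually frees capacity; the limit only caps runaway usage. Checking real consumption with `kubectl top pods` (requires metrics-server) before picking numbers is worth the effort.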