Hell, just the server logs (timestamps, IP addresses and exact URLs) would be unbelievably valuable.
People say that, but the actual data would be so vast, and so little of it usable, that it gets diluted into largely garbage data. It's only when you have a particular focus, and the ability to filter to that focus, that the data becomes very valuable.
Even banks and card processors, who have direct, legal, and completely open access to data as critical as where every one of their customers spends money, struggle to do more than harvest aggregated usage patterns. The idea that data a couple more orders of magnitude larger, and notably more generalized, will be easily processed and harvested ends up being pretty silly.
In 10 years, we've made such slow progress on conquering that "small technical hurdle" that it's hard to take the argument seriously.
Generative AI data ingestion techniques are the first round of technology that comes close to being able to target the data volume/complexity we'd see in it, and those ingestion techniques are still:
And the techniques that pull data from them don't end up saying more than what you could have gotten from a directed observation. You need to know what you're looking for to get it, or you'd need to code particular ingestion techniques to be able to extract the patterns you wanted to scan for.
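To make the "you need to know what you're looking for" point concrete, here's a rough sketch in Python. Everything in it is an assumption for illustration: the record layout (timestamp, IP, URL), the sample values, and the function names `aggregate` and `directed` are all made up and don't reflect any real log format.

```python
from collections import Counter

# Hypothetical log records: (timestamp, client IP, URL). All values are invented.
LOGS = [
    ("2024-01-01T10:00:00", "203.0.113.5", "https://example.com/news"),
    ("2024-01-01T10:00:02", "198.51.100.7", "https://example.org/login"),
    ("2024-01-01T10:00:05", "203.0.113.5", "https://example.org/account"),
    ("2024-01-01T10:00:09", "192.0.2.44", "https://example.com/news"),
]


def aggregate(logs):
    """Undirected view: hit counts per URL, with no particular focus."""
    return Counter(url for _, _, url in logs)


def directed(logs, target_ip):
    """Directed view: one client's timeline, possible only once a target is chosen."""
    return [(ts, url) for ts, ip, url in logs if ip == target_ip]
```

Without a target, `aggregate(LOGS)` is roughly the kind of usage pattern you can harvest. `directed(LOGS, "203.0.113.5")` is the valuable part, but it only exists once someone has already decided which IP to look at.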
So, the end result is still the same: Your concern is over a directed attempt to wiretap you, and if that is your concern, then there are a bunch of other places you need to be concerned with.
Also, if your primary concern is the number of people/agencies that might be trying to wiretap you, then I'd probably agree that Cloudflare is not for you. Maybe some sort of Tor connection via an array of cellular antennae?