There's some good recent papers on how to tackle this. My favourite paper on the topic is probably here: Robust fine-tuning of zero-shot models - arXiv https://arxiv.org/pdf/2109.01903
But tl;dr: a weighted average of fine-tuned weights and original weights through a manually chosen weight tend to greatly mitigate this problem. I was surprised the paper didn't get more attention when it came out, but oh well.
There's some good recent papers on how to tackle this. My favourite paper on the topic is probably here: Robust fine-tuning of zero-shot models - arXiv https://arxiv.org/pdf/2109.01903
But tl;dr: a weighted average of fine-tuned weights and original weights through a manually chosen weight tend to greatly mitigate this problem. I was surprised the paper didn't get more attention when it came out, but oh well.