I have exactly the same problem.
I got as far as using fdupes to identify duplicates and delete the extras, but it was slow.
Thinking about some of the other comments: if you first use a tool to replace duplicates with hardlinks, you could then traverse the entire tree and delete any file that has more than one hardlink. The two phases could be done piecemeal, and both are cancelable and restartable.
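
For what it's worth, the second phase might look roughly like the Python sketch below. It is untested on a large tree and rests on some assumptions: the hardlinking tool has already run, every hardlink in the tree is a duplicate you want gone (nothing is hardlinked on purpose or linked from outside the tree), and the function name and dry_run flag are my own invention, not part of any existing tool.

```python
import os
import stat


def prune_extra_hardlinks(root, dry_run=True):
    """Walk `root` and drop every extra hardlink, keeping one name per inode.

    Hypothetical helper: returns the paths that were (or, in a dry run,
    would be) removed.
    """
    remaining = {}  # (device, inode) -> links still left for that inode
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so symlinks are not followed
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files are considered
            key = (st.st_dev, st.st_ino)
            links = remaining.setdefault(key, st.st_nlink)
            if links > 1:
                # Another name still points at the same data, so this
                # name can go without losing the content.
                removed.append(path)
                remaining[key] = links - 1
                if not dry_run:
                    os.unlink(path)
    return removed
```

Calling prune_extra_hardlinks("/some/tree") only reports what it would delete; pass dry_run=False to actually unlink. Because the link count is re-read from disk on every run, interrupting it and starting over should be safe: whatever was already removed simply no longer shows up as an extra link.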
For backup or for file-level deduplication?
If the latter, how?