Detecting Duplicate Code
by Dion Almaer
Tom Copeland has written a nice piece on the open source CPD (copy/paste detector). Having tools like this can really help out, and seeing the amount of copy/paste in the JDK source itself is scary.
However, what do I *really* want these tools to find?
I want them to go above and beyond "this code exists in Foo.java and Bar.java". I want them to tell me "This piece of functionality has been duplicated". In large projects, many core utilities get rewritten by different people.
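To make the "this code exists in Foo.java and Bar.java" level concrete, here is a minimal Python sketch of purely textual detection. It is not how CPD itself is implemented; the function names, the three-line window, and the two sample snippets are all assumptions made up for illustration.

```python
from collections import defaultdict


def normalize(line):
    """Collapse whitespace so formatting differences do not hide a copy."""
    return " ".join(line.split())


def find_duplicates(sources, window=3):
    """Return every run of `window` non-blank lines seen in more than one place.

    `sources` maps a file name to that file's text. The result maps each
    duplicated run (a tuple of normalized lines) to a list of
    (file name, starting line number) pairs.
    """
    seen = defaultdict(list)
    for name, text in sources.items():
        numbered = [(i, normalize(l)) for i, l in enumerate(text.splitlines(), 1)]
        numbered = [(i, l) for i, l in numbered if l]  # drop blank lines
        for start in range(len(numbered) - window + 1):
            chunk = tuple(l for _, l in numbered[start:start + window])
            seen[chunk].append((name, numbered[start][0]))
    return {chunk: places for chunk, places in seen.items() if len(places) > 1}


# Hypothetical file contents, invented for this example.
foo = """
int total = 0;
for (int i = 0; i < n; i++) {
    total += values[i];
}
return total;
"""

bar = """
log("summing");
int total = 0;
for (int i = 0; i < n; i++) {
        total += values[i];
}
"""

dups = find_duplicates({"Foo.java": foo, "Bar.java": bar})
for chunk, places in dups.items():
    print(places, "->", " | ".join(chunk))
```

A check like this only sees text. If the second author had summed the array with a while loop or a library call, nothing would be reported, and that is exactly the gap between "this code was copied" and "this functionality was duplicated".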
For example, when working on a system that interfaced with a COBOL application on some of IBM's big iron, we found that a function that cleared the screen had been rewritten more times than you would believe. Over the years, as new employees came in, they just wrote their own functions to work with.
It would be great if you wrote some code for an app and a program told you, "*ahem*, I know what you are writing here, but just use the functionality that Bob wrote five years ago, located here", or "Interesting, how about you refactor code X instead of reinventing the wheel there, mate".
Now that will be cool!
Please, not through a UI agent!
"It appears that you're writing a method..."
Behavioral, not Technical?
I'm doubtful that a computer can make the intuitive leap to see that two bits of code are refactorable down to one bit. Would some of the Extreme Programming practices work better than an automated process?
Tools for Duplicate code detection
There is a product available from a company called Semantic Designs which can detect functional duplication of code (rather than just copy and paste). It's called CloneDR.
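As an illustration of going one step beyond literal copy and paste, here is a small Python sketch that compares the syntax trees of two functions after replacing every identifier with a placeholder. This is an assumption-laden toy, not a description of how CloneDR works internally; the class, the `shape` helper, and the sample functions are all hypothetical.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Replace function, argument, and variable names with placeholders."""

    def __init__(self):
        self.names = {}

    def _placeholder(self, name):
        # The same original name always gets the same placeholder,
        # numbered in order of first appearance.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = "f"
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node


def shape(source):
    """Return a string describing the code's structure, ignoring its names."""
    tree = RenameIdentifiers().visit(ast.parse(source))
    return ast.dump(tree)


# Hypothetical functions, invented for this example.
bob = """
def clear_screen(rows):
    for r in range(rows):
        print()
"""

new_hire = """
def wipe_display(line_count):
    for i in range(line_count):
        print()
"""

rewrite = """
def clear_screen(rows):
    n = 0
    while n < rows:
        print()
        n += 1
"""

print(shape(bob) == shape(new_hire))  # same structure, different names
print(shape(bob) == shape(rewrite))   # same behaviour, different structure
```

The renamed copy is caught, but the while-loop version is not, even though it does the same job. The sketch is also crude in the other direction: because built-in names such as `print` and `range` are replaced too, two functions that call different built-ins in the same pattern would look identical.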