OSCON 4.5: Extreme Perl Makeover
by Geoff Broadwell
A Peter points out: TIMTOWTDI, but some of those ways stink. Peter quickly ran through a series of steps to do first when you are forced to take over code that uses some of those putrid ways.
First off, if the whitespace and formatting suck, hit it with perltidy. If the code is deeply cluttered with useless comments and docs, kill the crap, keep anything that answers "why?" and perhaps subroutine calling conventions, and move the rest out to another file as pure POD. (It seems to me it might be reasonable to just move the POD to the end of the file, but that seems to be a matter of taste.)
With this easy stuff out of the way, it's time to take a good look at the code itself. First, check if the code uses hashes and regular expressions at all; it's surprisingly common to find code written by people who simply didn't know about one or both of these. Failing to find either one should be a tip that the code was written by a very inexperienced Perl coder, and you are likely going to get big wins through refactoring brute force code into native Perlisms.
Some of the refactorings will be easy -- converting long blocks of prints into a single heredoc, for example. Sometimes it's worth it to go the extra mile and switch to using a templating toolkit (an already existing one, please) if the code would end up as littered with heredocs as it currently is with print calls.
Many instances of awful code were written with total ignorance of scoping. Even if the variables are declared at all, they may be declared in a huge block at the top of the file, with thousands of lines below that with nary a declaration. You'll need to start narrowing variable scope religiously, moving declarations as late as possible. The only exceptions are config variables, which should be kept at top of file, or preferably moved out to a separate file entirely. Variable rescoping can be an arduous process, but it's well worth it, especially as it is a prerequisite to the next step.
Much putrid code either contains very long blocks of code, or is even completely monolithic, with no subroutines at all. This will require heavy refactoring, but doing so will likely vastly increase the clarity of the code. It also gives you a good excuse to add to the test library (or create one if it's missing). Peter recommended the use of Devel::Refactor's extract_subroutine, which works particularly well with the EPIC plugin for the Eclipse IDE.
The next thing to tackle is suspicious code, much of which can be caught with strict and warnings. Peter pointed out (as had several other speakers) that these are not silver bullets; they merely catch a certain percentage of easily-identified errors. You still need to do some heavy reviews after the code is strict and warnings clean. Peter recommended starting the process by putting use strict; no strict; at the top of the code, and slowly moving the no strict; down, fixing bugs and assorted nastiness as you go. Then lather, rinse, repeat with warnings.
Finally, look for calls to external programs. In Peter's experience, this is almost always an indication that there is a CPAN module that should be used instead.
By the time you've gone through all these steps, the code may not be as beautiful as it would have been if rewritten from scratch. However, it will certainly be much more maintainable than it was, you will hopefully have amassed a sizeable test suite while doing all that refactoring, and if you did a good job, all of the special knowledge burned into the code from years of use will not be lost as it would be in a fresh rewrite.
Go forth and conquer!
What was the worst mess you ever had to maintain?
Where to POD
Actually I did recommend that the POD stay in the same file, moved to the end; it was the other stuff, like higher level design notes, and especially package configuration information (author, change history, etc) that I suggested belonged elsewhere. Apologies for not making it clear.
Where to POD
Sorry about that. It's easily possible I just missed the detail there while taking notes -- that's one of the risks of doing so. :-)