Best Software Practices

Today, I’d like to share a few excerpts from the excellent paper “Best Practices for Scientific Computing” by Wilson et al.

“Programmers are most productive when they work in small steps with frequent feedback and course correction rather than trying to plan months or years of work in advance. While the details vary from team to team, these developers typically work in steps that are sized to be about an hour long, and these steps are often grouped in iterations that last roughly one week. This accommodates the cognitive constraints discussed in the first section, and acknowledges the reality that real-world requirements are constantly changing. The goal is to produce working (if incomplete) code after each iteration. While these practices have been around for decades, they gained prominence starting in the late 1990s under the banner of agile development.”

Takeaway: Break tasks into small, incremental chunks, ideally about an hour of work each.

“Today’s computers and software are so complex that even experts find it hard to predict which parts of any particular program will be performance bottlenecks. The most productive way to make code fast is therefore to make it work correctly, determine whether it’s actually worth speeding it up, and—in those cases where it is—to use a profiler to identify bottlenecks.”

Takeaway: Correctness comes first. Optimization, only if worthwhile, second.
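
To make the "use a profiler" step concrete, here is a minimal sketch using Python's built-in cProfile and pstats modules. The functions run_analysis, slow_sum_of_squares, cheap_setup, and profile_report are hypothetical stand-ins I made up for illustration; they do not come from the paper.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop; stands in for a suspected bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup(n):
    """Trivial work that should barely register in the profile."""
    return list(range(n))


def run_analysis(n=200_000):
    """Hypothetical top-level routine that calls both helpers."""
    cheap_setup(10)
    return slow_sum_of_squares(n)


def profile_report(func, *args, top=5):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_report(run_analysis)
    print(report)
```

The report lists functions by cumulative time, so you can see where the program actually spends its time before deciding whether any of it is worth optimizing.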

“Research has confirmed that most programmers write roughly the same number of lines of code per unit time regardless of the language they use. Since faster, lower level, languages require more lines of code to accomplish the same task, scientists are most productive when they write code in the highest-level language possible, and shift to low-level languages like C and Fortran only when they are sure the performance boost is needed. (Using higher-level languages also helps program comprehensibility, since such languages have, in a sense, “pre-chunked” the facts that programmers need to have in short-term memory.) Taking this approach allows more code to be written (and tested) in the same amount of time. Even when it is known before coding begins that a low-level language will ultimately be necessary, rapid prototyping in a high-level language helps programmers make and evaluate design decisions quickly. Programmers can also use a high-level prototype as a test oracle for a high-performance low-level reimplementation, i.e., compare the output of the optimized (and usually more complex) program against the output from its unoptimized (but usually simpler) predecessor in order to check its correctness.”

Takeaway: To maximize productivity, code in the highest-level language possible. Code in a low-level language only when necessary.
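
The "test oracle" idea from the excerpt can be sketched roughly as follows. In the paper's setting the oracle would be a high-level prototype and the code under test a low-level reimplementation; for simplicity I assume both versions are written in Python here, and the sliding-window sum problem and all function names are my own illustrative choices.

```python
import random


def window_sums_reference(values, k):
    """Simple, easy-to-verify version: re-sum every window (O(n*k))."""
    return [sum(values[i:i + k]) for i in range(len(values) - k + 1)]


def window_sums_fast(values, k):
    """Optimized version: slide the window, updating a running total (O(n))."""
    if k <= 0 or k > len(values):
        return []
    total = sum(values[:k])
    sums = [total]
    for i in range(k, len(values)):
        # Add the element entering the window, drop the one leaving it.
        total += values[i] - values[i - k]
        sums.append(total)
    return sums


def check_against_oracle(trials=200, seed=0):
    """Compare the fast version with the reference on random integer inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.randint(-100, 100) for _ in range(rng.randint(1, 50))]
        k = rng.randint(1, len(values))
        expected = window_sums_reference(values, k)
        actual = window_sums_fast(values, k)
        assert actual == expected, (values, k, actual, expected)
    return trials


if __name__ == "__main__":
    print(f"{check_against_oracle()} random cases agree with the reference")
```

I used integer inputs so the two outputs can be compared exactly. With floating-point data, a tolerance-based comparison such as math.isclose would be the safer choice.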

I am a big advocate of this paper and its findings. It likely distills years of accrued software development wisdom. Even the table of contents reads like a “greatest hits” list of software practices:

“Write Programs for People, Not Computers”
“Let the Computer Do the Work”
“Make Incremental Changes”
“Don’t Repeat Yourself (or Others)”
“Plan for Mistakes”
“Optimize Software Only after It Works Correctly”
“Document Design and Purpose, Not Mechanics”

It’s worth a read!