- Don't repeat yourself
- Unless you have to
- Add tests
Don't repeat yourself (DRY) is a widely upheld principle, and in general I agree with it. The reasoning is that each piece of logic should be expressed in exactly one place, so that if it has to change, it only has to change in one place.
Unfortunately, life is not so easy. There are many reasons why repeating yourself can be a necessary evil.
Consider the following example.

    // hasMobileProduct reports whether any product is a mobile product.
    func hasMobileProduct(products []string) bool {
        for _, p := range products {
            if p == "ANDROID" || p == "IPHONE" {
                return true
            }
        }
        return false
    }

    func hasPhotosProduct(products []string) bool {
        // Same loop as hasMobileProduct, compared against the photos product names.
    }

You might be tempted to reduce the duplication by factoring out the common logic into a helper function:

    // contains reports whether any element of products appears in targets.
    func contains(products, targets []string) bool {
        for _, p := range products {
            for _, t := range targets {
                if p == t {
                    return true
                }
            }
        }
        return false
    }

    func hasMobileProduct(products []string) bool {
        return contains(products, []string{"ANDROID", "IPHONE"})
    }

One of the most valuable things I've learned at Google is that sometimes these refactorings actually make the code less comprehensible and less maintainable. There's really no harm in repeating very simple logic that's not likely to change.
It's easy to go overboard throwing static functions / methods into utility classes so that they can be used in multiple libraries.
This can be problematic because:
- it takes time to extract and maintain the shared code
- it can cause other libraries to pull in more dependencies than they would otherwise need if they duplicated the function
The Go standard library applies this idea judiciously. In his presentation Go at Google (http://talks.golang.org/2012/splash.slide#28), Rob Pike explains it succinctly:
Through the design of the standard library, great effort spent on controlling dependencies. It can be better to copy a little code than to pull in a big library for one function. (A test in the system build complains if new core dependencies arise.) Dependency hygiene trumps code reuse. Example: The (low-level) net package has own itoa to avoid dependency on the big formatted I/O package.
I love that phrase - "Dependency hygiene trumps code reuse".
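To make the itoa example concrete, here is a minimal sketch of what "copying a little code" can look like in Go. It is not the net package's actual implementation. The function below is an illustrative stand-in that converts an int to decimal text without importing fmt or strconv.

```go
package main

// itoa returns the decimal representation of n.
// Illustrative sketch only: a small local copy that avoids importing
// a formatting package just for integer-to-string conversion.
func itoa(n int) string {
	if n == 0 {
		return "0"
	}
	neg := n < 0
	// Work in unsigned space so the most negative int is handled too.
	var u uint
	if neg {
		u = uint(-n)
	} else {
		u = uint(n)
	}
	// 20 bytes is enough for a sign plus the 19 digits of a 64-bit int.
	var buf [20]byte
	i := len(buf)
	for u > 0 {
		i--
		buf[i] = byte('0' + u%10)
		u /= 10
	}
	if neg {
		i--
		buf[i] = '-'
	}
	return string(buf[i:])
}
```

The cost is a couple dozen lines of duplicated logic. The benefit is that the package's import graph stays free of a formatting dependency it would otherwise use for a single call.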
Enumerations are frequently duplicated, usually with a comment along the lines of "this must be kept in sync with xyz". These comments are usually useless: people don't read them, and people are fallible. If duplication is impossible to avoid, you must put automated tests in place to catch the copies drifting apart.
For example, given the same enum duplicated in packages A and B, add a test in package A, B, or a third package C that pulls in both enums and ensures the two are identical (or that one is a strict subset of the other, or whatever property you expect).
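A check like that could be sketched as follows. Everything here is hypothetical: statusA and statusB stand in for the same enum duplicated in two packages, each represented as a name-to-value map, and enumsMatch is an illustrative helper rather than an existing library function.

```go
package main

import "fmt"

// statusA and statusB are hypothetical stand-ins for one enum that has
// been duplicated in two packages (name -> numeric value).
var statusA = map[string]int{"ACTIVE": 0, "SUSPENDED": 1, "DELETED": 2}
var statusB = map[string]int{"ACTIVE": 0, "SUSPENDED": 1, "DELETED": 2}

// enumsMatch returns nil when a and b define exactly the same names
// with the same values, and a descriptive error otherwise.
func enumsMatch(a, b map[string]int) error {
	if len(a) != len(b) {
		return fmt.Errorf("size mismatch: %d vs %d entries", len(a), len(b))
	}
	for name, va := range a {
		vb, ok := b[name]
		if !ok {
			return fmt.Errorf("%q is missing from the second enum", name)
		}
		if va != vb {
			return fmt.Errorf("%q has value %d in one enum and %d in the other", name, va, vb)
		}
	}
	return nil
}
```

In a real code base the call would live in a test file, failing the build with the returned error whenever someone edits one copy and forgets the other. Relaxing the length check would turn this into a subset check instead of an equality check.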
It's easy to imagine that other libraries are going to need some utility functions, so you split them out into a utils package. This complicates the design for a need that is only speculative.
This is not to say you should copy and paste at will. The point is that, like many things in life, there are shades of gray; DRY is not an absolute rule.
While some of these examples are specific to Google, I hope they are applicable to a wider audience.
Comments are a form of duplication too: they restate what the code does and can drift out of sync with it.