2008-04-30 17:25 UTC Software ICs: Reuse should not always mean inheritance or configuration
Inheritance and configuration options have a cost in terms of increased complexity. In some cases that cost can be avoided by maintaining multiple versions of a component and adding new features to new branches instead of continuing to work on a single code base, in the same way that integrated circuits often exist in a wide range of similar, static models with the same basic functionality. Better merging support in modern version control systems makes this model increasingly viable for software.
One thing I've thought a lot about over the years is why software reuse is so hard. A big problem is that designing reusable software when you don't know where it might be reused is hard. Over the years, a number of people have brought up integrated circuits as a model for software reuse. This morning I tried to find one of the old articles I read about it, but was unfortunately unable to track it down. Either way, this is not an original idea. Simple ICs have a number of properties that affect how they are used:
- When they're "complete" they're often never changed other than possibly to fix problems. The design may evolve, but the next "version" tends to be given a new designation and is often treated as a separate product.
- There's often a myriad of different versions with smaller or larger differences - many products exist in variations rather than being configurable. Configurability often adds complexity. In hardware, complexity has a very visible impact.
- Apart from very large, complex general purpose processors, most ICs tend to have very high cohesion, because they have to in order to make financial sense.
- They are "black boxes" in that you can't (or won't) change them, but the details of how you interface with them and how they will respond are well documented and well understood.
Distributed version control and reuse
One thing that struck me this morning is that one of the big things distributed version control systems promise is to ease the burden of merging, and that this is a major stepping stone towards a simpler model of reuse.
First of all, let me say that I am not against configurable components. I strongly believe in making classes and libraries generic and reusable in themselves - specifically by ensuring low coupling and high cohesion. However, sometimes making a component highly flexible comes at the cost of reducing cohesion: the component tries to please everyone at the same time, either by exposing interfaces that require massively increased complexity in order to avoid exposing internal implementation details, or by "surrendering" and exposing the guts of the component for everyone to hook into. Both alternatives are bad.
The "software IC" idea taken to its ultimate conclusion is this: Develop strongly cohesive components that export generic interfaces to ensure loose coupling, and "freeze" those components - refuse to add any more features, change the interfaces or adapt them. Limit changes to internals that don't change the observed behavior other than fixing bugs and improving performance characteristics.
It's both incredibly powerful and, at first glance, incredibly limiting. Powerful because it means that when you learn a specific "model" of a component, you have every reason to believe it won't break on you. Imagine linking to the same specific version of a library and never upgrading other than selectively for bug fixes. Incredibly limiting because software people have a feature fetish. We crave adding functionality, and go all "ohh, shiny" whenever we see something cool has been added. And that's fair, at least when it actually is helpful. I don't want to stop that.
I want to take a much more conscious approach to the fact that when DingbatShell goes from version 1.x to 2.x it's a different model - a different product - than the previous version. Upgrading, even if the API seems to stay mostly backwards compatible, requires new rounds of testing and careful review.
Software ICs aren't new - they're called branches and versions
This is the crux of the matter. You've been able to do this "forever" - and some have done so. But very few take the conscious approach that this applies to the whole stack, including third party libraries and any build tools that have any kind of effect on the final product. Even fewer extend this to creating a multitude of branches - a new branch for every major "niche" the component is meant to work in, or every major axis of configurability. A key reason is that in the age of version control systems that were abysmally bad at merging changes, you really didn't want to have to merge a bug fix across 42 different versions of a component.
I'm not sure we're quite there yet, but that's almost what I'm proposing. A vital point is that such changes should be exceedingly rare, exactly because you freeze features regularly, branch off new components, and continue new feature development while leaving the branches frozen. Only for critical bug fixes would you be faced with a potentially massive merge job. But if the components remain small and simple, that merge job might not be so bad.
This is of course where the new breed of distributed version control systems comes in. Because they're distributed, better merging has been vital. A system like Git is heavily focused on a workflow that for many users involves frequent multi-way merges of a very high degree of complexity. We're finally getting tools that are specifically geared towards managing large numbers of branches.
What are the benefits?
Whenever there's a high cost to providing configurability, either in increased complexity or in reduced performance due to complex abstractions, you have a point where it's worth considering a new "model". You can:
- Simplify the API - configuration options that are needed for only one or the other choice along an axis of configuration (say using a database vs a set of files as the data source, if the nature of the component is such that it's always either or) can be left out entirely. A good test for whether splitting a component into branches is a good thing is to look at how large a part of the API you can prune away or how many arguments you can remove from methods.
- Improve performance by hardwiring logic that might otherwise go through multiple levels of indirection.
- Massively simplify testing, because the number of permutations of configurations may drop significantly (look for m*n effects, where configuration happens along more than one "axis" and where many combinations may not make sense, but will still need to work - if branching the component makes it possible to test against a single axis at a time, it may be a big win).
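To make the API and testing points a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the component (`ConfigurableStore`), its two frozen "models" (`FileStore`, `DatabaseStore`), the argument names and the two axes of configuration (data source and record format) are assumptions, not taken from any real library. It only shows how one might count the arguments that can be pruned and the configurations each code base has to be tested against.

```python
import inspect
from itertools import product

# Hypothetical axes of configuration for one component.
SOURCES = ("files", "database")   # where the data lives
FORMATS = ("json", "csv", "xml")  # how records are encoded


class ConfigurableStore:
    """Single code base that has to serve every combination."""

    def __init__(self, source, fmt, path=None, dsn=None):
        # Arguments that only matter for one choice of `source`
        # still have to be accepted and validated here.
        if source == "files" and path is None:
            raise ValueError("path is required when source='files'")
        if source == "database" and dsn is None:
            raise ValueError("dsn is required when source='database'")
        self.source, self.fmt, self.path, self.dsn = source, fmt, path, dsn


class FileStore:
    """Frozen model for the files niche: no source switch, no dsn."""

    def __init__(self, fmt, path):
        self.fmt, self.path = fmt, path


class DatabaseStore:
    """Frozen model for the database niche: no source switch, no path."""

    def __init__(self, fmt, dsn):
        self.fmt, self.dsn = fmt, dsn


def constructor_arity(cls):
    """Number of constructor arguments, not counting self."""
    return len(inspect.signature(cls).parameters)


def cases_for_single_code_base():
    """Every source/format combination has to work in the one component."""
    return list(product(SOURCES, FORMATS))


def cases_for_one_model():
    """A frozen model only varies along the remaining axis."""
    return list(FORMATS)


if __name__ == "__main__":
    print(constructor_arity(ConfigurableStore))  # 4
    print(constructor_arity(FileStore))          # 2
    print(len(cases_for_single_code_base()))     # 6
    print(len(cases_for_one_model()))            # 3
```

In this toy case the total number of cases across both models is not smaller than for the single code base. What changes is that each frozen model has a small, single-axis test matrix and a shorter constructor, and a combination that makes no sense never has to get a model at all.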