Tag: opinion

2009-05-31 01:44 UTC Google Wave as infrastructure


It's new but not new

All of the things in the Wave demo are possible without Wave. The interesting thing about Wave is not so much the application, but the infrastructure, the protocol and the underlying concepts. Many limited collaboration apps have offered various subsets of the Wave functionality, for example, and the superficial functionality can be built using existing technology, without fancy new protocols and clients etc.

But that's missing the point. Wave is interesting because the infrastructure makes it extremely open-ended. 

A lot of people have been comparing Wave to e-mail. That's missing the point too. Wave has the potential to be as important as the Web. I'm serious. Wave is taking the web and making it interactive, embeddable, recordable, and shareable on a whole different level than what we are used to. And like most great "revolutionary" ideas it doesn't actually add all that much new: 

When the web arrived, hypertext systems were already well known (and most of them were far more advanced than early web browsers). What the web gave us was two simple innovations: a simple system for letting anyone create content, and an addressing scheme and protocol that allowed the content to be distributed worldwide. But that too existed - sort of - before the web, in the form of Gopher. The web was a very tiny evolutionary step in many ways, but the steps were "just the right ones": the web was more chaotic, unstructured and open-ended than Gopher (which was more of a distributed catalog of files), and came with a number of ideas that spurred people on to extend it in all kinds of weird and wonderful ways.

Wave could be similar. We have IM. We have e-mail. We have document sharing over the web. We have the web. All of the functionality of Wave can be achieved with existing technology. You can chat with your friends. You can share content. You can run shared whiteboard apps that you could interface apps to.

But it's not all seamlessly integrated. That is the one deceptively small evolutionary step that Wave provides, that could very well be a big game changer. Especially since Wave does it in a very simple way.


It's built on XMPP

That means:

  • It leverages a huge amount of existing infrastructure
  • It's federated by default (see below).
  • A lot of the underlying protocol is well understood by a lot of people

It's federated

Jabber / Google Talk / XMPP is powerful because anyone can set up their own server, and Google has promised the same will be the case for Google Wave - it only makes sense anyway, since XMPP is built from the ground up to be a federated protocol, with support for server-to-server connections etc.

Federation allows you to set up your own server and so act as gatekeeper to information that is critical to you. That makes Google Wave as palatable for intranet use as for external use. In fact, barring issues with how the clients are built, there's no reason why an intranet user could not start a public wave (on a public server) for an open discussion and then break away a private sub-wave on the intranet wave server - all in the same session - to discuss with co-workers.

This makes Wave a potentially pretty amazing collaboration tool for situations where you have multiple parties with information of both public and private nature who want to share different subsets with different parties. Systems like Basecamp allow this today (you can add other companies and users from those companies to your Basecamp setup, and control their access to your information) but it is fairly static and limited to the specific feature-set of Basecamp, while in Wave it is a feature of the infrastructure and completely orthogonal to the functionality the various users employ.

In fact, nothing stops users from granting access to their own internal, Wave-enabled apps on a user-by-user basis as part of collaboration (or as a general service).


It's extensible

Imagine a booking agent (at any type of business: restaurants, airlines etc.) that answers live questions. Lots of companies do that now. But instead of just talking to you, the agent shares a booking form with you, and helps you fill it in, live, while answering questions.

Or someone at your bank walks you through mortgage deals, shows you a mortgage calculator, fills in details to illustrate the deals they are presenting, and shows you graphs of your repayment schedule in real time.

Because the Wave protocol allows both sides to push relatively arbitrary content, and update it in real time, you can use it to run presentations; to edit documents together; to fill in forms together that are then updated live by automated remote services.

Consider it a sort of "remote terminal" you can run applications in. Only it's shared. And graphical. And has built in playback.


It's persistent

The Wave server is responsible for maintaining a persistent store of the waves. Depending on your wave server there may be different policies for how long they persist, but due to the openness of the protocol, there's plenty of opportunity for archiving waves in ways that provide a record the way e-mail does. Providing search and retrieval functionality for local waves is possible, and the architecture allows either searching at a "point-in-time" or building search functionality that would let you search in past states of your waves.

I.e. your client or your local wave server could either snapshot waves at specific times, or maintain a complete record of the wave operations and then let you find that mention of doughnuts in the kitchen that some greedy co-worker deleted seconds later.
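To make the "complete record of operations" option concrete, here is a minimal sketch in Ruby. It is not the Wave protocol: the `WaveLog` class, the `Operation` struct and the insert/delete operation format are all assumptions made up for illustration. It only shows why keeping the operations, rather than just the latest state, lets you search in past states.

```ruby
# Hypothetical sketch: none of these names come from the Wave protocol.
Operation = Struct.new(:timestamp, :author, :kind, :position, :text)

class WaveLog
  def initialize
    @operations = []
  end

  # Record an operation; nothing is ever removed from the log.
  # Operations are assumed to arrive in timestamp order.
  def record(timestamp, author, kind, position, text)
    @operations << Operation.new(timestamp, author, kind, position, text)
    self
  end

  # Rebuild the document as it looked at a given time by replaying
  # every operation up to and including that timestamp.
  def state_at(time)
    @operations.select { |op| op.timestamp <= time }.reduce("") do |doc, op|
      apply(doc, op)
    end
  end

  # Search every past state, not just the current one.
  # Returns the timestamps at which the term was visible.
  def search_history(term)
    @operations.map(&:timestamp).uniq.select { |t| state_at(t).include?(term) }
  end

  private

  def apply(doc, op)
    case op.kind
    when :insert
      doc.dup.insert(op.position, op.text)
    when :delete
      doc.dup.tap { |d| d.slice!(op.position, op.text.length) }
    else
      raise ArgumentError, "unknown operation kind: #{op.kind}"
    end
  end
end

log = WaveLog.new
log.record(1, "alice", :insert, 0, "Meeting at 3pm.")
log.record(2, "bob",   :insert, 15, " Doughnuts in the kitchen!")
log.record(3, "bob",   :delete, 15, " Doughnuts in the kitchen!")

puts log.state_at(3)                          # prints "Meeting at 3pm."
puts log.search_history("Doughnuts").inspect  # prints [2]
```

A snapshot-only store would keep just the result of `state_at` for chosen times, which is cheaper but loses anything added and removed between two snapshots.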


It can be "gatewayed"

One of the more exciting things the open architecture allows is two-way gateways for all sorts of content. Nothing is preventing you from building wave support into your new spreadsheet app, for example, so you can click a button to share the spreadsheet, or a graph from it, or whatever, with other people and have them help you edit it and comment on it. Or you can share your word processor document in a phone conference and people can add their own comments. Or you could export your blog entries, and have people use Wave to add or read comments from your web page. Or Wikipedia could export every page as a wave so those obnoxiously formatted "talk" pages could instead consist of people adding comments "inline" to the actual text. Or you could turn it into an IRC client. Or Facebook could turn the walls into waves and let people read/post to their wall via Wave (though given Facebook's past behavior in banning people who post too many updates, perhaps not).

Google itself has a suite of apps that could be exported as waves: Docs/Spreadsheets, Gmail, Picasa, Calendar etc. - imagine starting a wave, pulling in a document, adding your pics from Picasa, dragging in the calendar and creating an event, and turning it all into a nicely formatted event invitation e-mail that is gatewayed out via normal e-mail to your friends and family (or made available as a wave to those who use it), complete with nice pictures from the place you're inviting them to, the price list from Docs, a calendar invite etc.

It's encrypted by default

Encryption by default makes it a lot easier to sell as a collaboration tool internally in businesses too, or even in many cases as a replacement for e-mail or other channels that are insecure by default and take conscious action to secure.

It's client independent

While Google obviously has a head start, and while extensions may (or may not?) be client dependent, nothing stops other parties from building Wave clients that add new capabilities. The underlying protocol is really simple - it uses XMPP for federation and adds a thin layer on top for maintaining shared XML documents and serializing multi-party updates to those documents to all participants. How the client (or the server) interprets a document is up to the client (or the server).

Some ways to use this: "Load" snapshots of waves into your word processor when you've finished collaboratively editing them, and finish cleaning them up (your word processor could be a wave client); open a wave in Finder / Explorer and drag images from the wave somewhere else (or open the wave in Photoshop and pick an image to edit). All of course assuming the various apps get wave support.




2008-05-22 18:13 UTC Reducing coupling through unit tests

After my previous post on the subject of coupling and cohesion, a lot of the feedback I've gotten has been from people who want examples of lowering coupling, or want to know how they can see if their code is loosely coupled.

The easiest way I know of doing that also has other benefits:

Whether you prefer Behavior Driven Development or Test Driven Development, or you call it something completely different, writing unit tests provides you with one of two essential things:

  • If you write the tests up front, you will instinctively be guided to writing loosely coupled code
  • If you add tests "after the fact", you will clearly see whether your code is loosely coupled.

How unit testing helps reduce and/or measure coupling

The Wikipedia article on coupling provides this definition:

Low coupling refers to a relationship in which one module interacts with another module through a stable interface and does not need to be concerned with the other module's internal implementation.

The thing is, when you write unit tests, you are putting the unit being tested into a harness. You need it to interact with your test code, and still carry out its functions.

If you write the tests upfront, you have two alternatives: Either you create a hugely complex harness with lots of mock objects, or you write your code in a way that makes it easy to call in isolation.

That is another way of formulating low coupling: A module exhibits low coupling if it is easy to call in isolation.

Unit testing under any name is a good test of the ability to call your code in isolation

Testing after the fact gives you a measure of how well you have done: If you have written code with low coupling, it should be easy to unit test. If you have written code with high degrees of coupling, you're likely to be in for a world of pain as you try to shoehorn the code into your test harness.
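As a rough illustration of "easy to call in isolation", compare the two classes below. Both are hypothetical and invented for this example. The first reaches out to the system clock and to standard output, so a test has to control both; the second takes what it needs as arguments and returns its result, so the test is a couple of plain calls.

```ruby
# Hypothetical example; the class and method names are invented for illustration.

# Tightly coupled: to test this you need to control the clock and capture $stdout.
class CoupledGreeter
  def greet(name)
    hour = Time.now.hour
    puts(hour < 12 ? "Good morning, #{name}" : "Good afternoon, #{name}")
  end
end

# Loosely coupled: the hour comes in as an argument and the greeting is returned.
class Greeter
  def greeting_for(name, hour)
    hour < 12 ? "Good morning, #{name}" : "Good afternoon, #{name}"
  end
end

# The whole "harness" is two plain calls, with no mocks and no setup.
greeter = Greeter.new
raise "morning failed" unless greeter.greeting_for("Ada", 9) == "Good morning, Ada"
raise "afternoon failed" unless greeter.greeting_for("Ada", 15) == "Good afternoon, Ada"
puts "Greeter tests passed"
```

The checks use bare `raise ... unless` only to keep the sketch free of dependencies; the same two assertions would fit in any test framework.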

If you haven't bought into TDD/BDD for other reasons, consider it for this reason. Try it. The structure of your code will change if it wasn't previously loosely coupled, as you'll quickly tire of writing horrendously complicated tests.

Some indicators of the level of coupling

  • Can you test the code with trivial unit tests? (I.e. no or few mock objects, no huge amounts of setup or teardown code.) If you can, your code is likely loosely coupled.
  • Do you find yourself struggling to test modules independently of each other? Almost certainly high coupling.
  • Does the code have lots of side effects in low-level code? This is a warning sign - it very often leads to code that is harder to test, for example because it might mutate state on lots of unrelated objects. It's not always avoidable, though, and if well managed, it doesn't need to lead to high coupling - the trick is to isolate the calls that cause the side effects from the external environment through the use of the adapter pattern or similar.
  • Is the code "controlled from the top"? In other words, does the code pass results indicating actions up the call chain instead of having side effects? Code that consistently does this tends to be more loosely coupled (and easier to unit test) than code that has side effects, as long as the actions passed up are as generic as possible (in other words, it doesn't help if the action passed back is, for example, an object that when called will invoke a specific method on an object of a specific class - in that case you're just delaying execution of a side effect rather than decoupling anything).
  • Is the code highly cohesive? That is, does each module carry a single, reasonably simple responsibility, and is all the code with the same responsibility combined in a single module? If code implementing a single feature of your application is littered all over the place, or if your methods and classes try to do many different things, you almost invariably end up with a lot of coupling between them, so code with low cohesion is a big red flag alerting you to the likelihood of high coupling as well.
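The "controlled from the top" indicator can be sketched like this. Everything here (`check_stock`, `run`, the `:notify` action and the stock threshold of 5) is a made-up example, not taken from any real codebase. The low-level code returns plain data describing what should happen, and only the top-level code decides how to carry it out.

```ruby
# Side-effecting version, shown only for contrast: the low-level code
# reaches out to a specific (hypothetical) mailer class.
#   def check_stock(item)
#     Mailer.send_alert("Low stock: #{item[:name]}") if item[:count] < 5
#   end

# "Controlled from the top": return generic data describing the action.
def check_stock(item)
  return [] if item[:count] >= 5
  [{ action: :notify, message: "Low stock: #{item[:name]}" }]
end

# The top level collects the actions and decides how to execute them.
# The notifier is anything that responds to #call.
def run(items, notifier)
  items.flat_map { |item| check_stock(item) }
       .select { |a| a[:action] == :notify }
       .each { |a| notifier.call(a[:message]) }
end

sent = []
run([{ name: "tea", count: 2 }, { name: "coffee", count: 40 }], ->(msg) { sent << msg })
puts sent.inspect  # prints ["Low stock: tea"]
```

Because `check_stock` only returns hashes, testing it needs no mock at all, and swapping the notifier for e-mail, logging or a test array touches only the caller.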

The last point highlights that looking just at coupling in isolation isn't all that helpful. But thankfully, if you strive to reduce coupling, one of the key things you end up doing a lot of the time is writing more cohesive code. If you're constantly focused on ensuring code can be tested in isolation, there are few ways around that without making each module cohesive.



2008-04-11 12:49 UTC Dealing with information overload

I've been thinking a lot about dealing with information overload recently. With the ever-increasing hype for sites like Twitter and FriendFeed, neither of which I use, and a steady stream of Facebook invitations, LinkedIn requests and invites to a continuously growing set of social networking sites and bizarre (to me) services that all add some form of social networking, I'm more and more fatigued.

I hardly keep up with even my feeds and my e-mail, never mind my IM accounts - messages from people keep building up for days before I answer them.

And the thing is, I'm not actually connecting to large numbers of people. I'm fairly anti-social and notoriously bad at keeping in touch with people I've worked with in the past etc. (it's nothing personal folks - I'm happy to hear from people, I'm just rarely initiating contact myself because I'm always deeply focused on something or other).

I need better tools to manage it.

FriendFeed aims to combine lots of streams of data into one huge river. The problem is that my challenge isn't to get a single view - it's to effectively manage what's there:

What should I read? What should I ignore? How do I find past information? What do I need to respond to? What should I keep and what should I discard completely?

In other words I need a personal search engine and my own personal recommendation or classification engine.

If I had time I'd build one. Building a small scale search engine for this data is trivial - there are tons of packages like Sphinx that are "good enough", so the problem is mostly ranking (and that is by no means trivial for data like this, where many items can be as short as a single line, and which doesn't have enough internal linking for something like PageRank to work). Building a classifier is also fairly trivial - but the problem is similar: small snippets of data, plus training. Again, that's not a trivial problem.

But even a badly flawed one would be vastly superior to nothing. Getting something like that to "product quality" is hard. Getting something that is more than marginally usable for personal use should be doable, and mostly limited by my continuous lack of time. So I guess I'll paraphrase Abbie Hoffman:

-Steal this idea

Pretty please?

I want someone to build a service that'll do this for me, damn it. Even if I have to install a local client to capture some of the data. Yes, it will have a ton of privacy and security implications, and no, I wouldn't trust just anyone with it.

Let's summarize:

  • Search my e-mail, feeds, pages I've liked (via StumbleUpon, DZone or others), my IM's, the streams from any social networking sites I'm on (any not covered by my RSS feeds), comments I've written on other blogs, and generally my whole "online footprint".
  • Give me a filtered view that clearly shows me what I a) need to respond to, b) need to read, c) am likely to find interesting, along with its source and relevant/related items (replies/comments from other people, other blogs referring to the same thing, etc.).
  • Oh, and if it could recommend new information sources and show me which ones I really shouldn't bother with (of the ones I'm already following) based on my pattern of usage, that'd be nice too.

Yeah, it's a tall order.



2008-03-28 19:18 UTC The OOXML circus is making ISO increasingly irrelevant

One of the Brazilian delegates to the BRM for OOXML has posted a lot of details of the charade.

How anyone can seriously think that the OOXML fast track process hasn't been full of undue influence and massive abuse of the ISO process, and quite possibly outright corruption is beyond me.

Groklaw is as usual on top of it, with a number of posts.

Most noteworthy in my mind, though, is this post, which describes the process in Poland and notes that the EU Commission is apparently investigating the process.

It's sad that ISO is prepared to let itself be used this way. How can anyone take a standards organization seriously when it can be manipulated for one vendor's purposes this easily?



2008-03-28 17:46 UTC Why coupling is always bad / Cohesion vs. coupling

In the discussion following my entry "Why Rails is total overkill and why I love Rack" several comments raised the issue of whether high coupling is always bad. My answer was that I believe it is, but at the same time it can be worth it sometimes.

It seems like a point that is worth further discussion. I'm not going to go into a terrible amount of detail, as I enjoy the discussion more than expounding on a subject that should be relatively uncontroversial.

What do I mean by coupling and cohesion

My earlier entry linked to the Wikipedia articles for these terms, because I was sure some people would misunderstand, and sure enough, some did. So let's go into some more detail:

Two components are loosely coupled when changes in one never or rarely necessitate a change in the other.

Changes that affect external interfaces will of course require changes, and so you can't completely safeguard against changes causing ripples. You can protect against it by narrowing the interface. This is why coupling and cohesion are so tightly related:

A component exhibits high cohesion when all its functions/methods are strongly related in terms of function.

The higher cohesion and lower coupling a system has, in general the more its components exhibit strong data hiding, narrow but general interfaces and a high degree of flexibility.

Why coupling is always bad

Surely increasing dependencies on implementation details of other components isn't a good thing?

The objections I've seen typically don't actually imply that coupling is good, though, but rather that coupling isn't always bad because it's necessary to achieve high cohesion.

Some evils are necessary, but that doesn't make them good. I will not try to argue that increasing coupling isn't sometimes worth it - see below.

Coupling is always bad because it prevents the replacement or changes of components independently of the whole. It's hard seeing a defense against this, and indeed hard to argue for it because it appears so self evident.

What are some of the consequences of high coupling?

  • Developers / maintenance programmers need to understand potentially the whole system to be able to safely modify a single component.
  • Changing requirements that affect the suitability of some component will potentially require wide ranging changes in order to accommodate a more suitable replacement component.
  • More thought needs to go into choices at the beginning of the lifetime of a software system in order to attempt to predict the long term requirements of the system, because changes are more expensive.

I can't think of a single benefit of high coupling in and of itself. If anyone thinks they can actually defend why high coupling might sometimes be good (as opposed to just occasionally a necessary evil), I'd love for you to post your comments to this post...

Cohesion vs. coupling, and why coupling is sometimes worth the cost

Cohesion is about making sure each component does one thing and does it well. The lines get blurry in a language like Ruby, where one "component" could be a library that reopens a class like Object and in effect extends every object in the system. The specifics don't really matter. What matters is whether the code is self-contained.

It's generally easier to reduce coupling in a highly cohesive system.

It is easier, because a highly cohesive system will group the related functionality together, so that the need to communicate across component boundaries (whether those "components" are classes, separate processes, or methods injected into reopened classes by a library) is reduced.

The key point is that related code often shares state. Sharing state across component boundaries increases dependencies. Increased dependencies increase coupling.

Cohesion and coupling are thus not at odds - high cohesion and low coupling are both good, and achieving one tends to make achieving the other easier, not harder. When some people think that high coupling is sometimes excusable, it is often because they confuse cohesion with consistency and ease of use.

I am sure there are many different ideas of what the appropriate tradeoff is. I put the bar pretty high (that is not to say that I don't sometimes violate my own ideals out of laziness, but then again I've been bitten by that several times too).

What can make increased coupling worth the cost

Sometimes a system is simply so large and complex that even if most of your components are highly cohesive you need to break the components into pieces, and possibly need to be able to plug other code into some of those pieces, to make the system maintainable.

In those cases, there may not be a choice. You may need to scale a system across server boundaries and have to break it into server specific components. Each processing step may need access to and knowledge of the full state to be able to continue processing no matter how you try to slice and dice the tasks.

Another case where increased coupling may be worth the cost is ease of use. A few days ago I wrote a post titled "URLs do not belong in the views". One of the approaches I was pondering was to put the routing/dispatch mechanism (the front controller) in charge of generating the URLs. At the same time I wanted to tie the URL generation to model instances, not to named routes as Rails for example does (Rails also supports generating routes from model class names, but that's also not what I wanted).

Part of the motivation is an observation that there are many ways to generate URLs from the model objects - my posts for example, have a "slug" used to generate SEO friendly URLs, but that isn't guaranteed to be stable, and certainly isn't until the post is published, so while that is the right URL for a published, public view of the post, it's not appropriate for the admin interface, where one of the operations is to change the slug - I want the admin URLs to stay static. In this case the appropriate URL to use requires knowledge of the contents of the model. It's perfectly appropriate for the view to request data from the model, but I don't want it to make assumptions about formatting of that data.

And wouldn't it be nice if the front controller could instantiate the proper model objects too?

The point isn't what Rails can or cannot do - in this case Rails certainly can do more than a lot of frameworks I've seen, and gets part way there. If you are willing to sacrifice low coupling, allowing the front controller to create a mapping is pretty straightforward, and it certainly would be trivial to make Rails support a model like that (if it doesn't already - I don't know).

Doing those things without causing a scenario where the front controller knows about the way specific models are built (i.e. how to instantiate objects with a specific ORM), or where the views depend on a specific API of a specific front controller implementation is more work.
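A minimal sketch of the slug example above, under stated assumptions: `Post`, `UrlGenerator` and both URL layouts are invented for illustration and are not how Rails or my own code does it. The idea is that the view asks one object for a URL and never formats model data itself, while the generator depends only on `id`, `slug` and `published`, not on how the model is stored.

```ruby
# Hypothetical sketch; Post, UrlGenerator and the URL layouts are invented here.
Post = Struct.new(:id, :slug, :published)

class UrlGenerator
  # Public URLs use the slug, but only once the post is published,
  # since the slug is not guaranteed to be stable before that.
  def public_url(post)
    raise ArgumentError, "post is not published" unless post.published
    "/posts/#{post.slug}"
  end

  # Admin URLs use the id, so they stay the same when the slug is edited.
  def admin_url(post)
    "/admin/posts/#{post.id}"
  end
end

urls = UrlGenerator.new
post = Post.new(42, "why-coupling-is-bad", true)
puts urls.public_url(post)  # prints /posts/why-coupling-is-bad
puts urls.admin_url(post)   # prints /admin/posts/42

post.slug = "why-coupling-is-always-bad"
puts urls.admin_url(post)   # still prints /admin/posts/42
```

This keeps the views decoupled from URL formatting, but the generator still knows which attributes a post has; removing that last dependency is where the extra work comes in.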

There are many cases where lower coupling means more work

If you, for your specific use, couldn't care less about the extra coupling because you know you'll never need to exchange a specific component (do you really know? Think long and hard about that), and the savings in terms of work start being significant, then lowering the bar and accepting higher coupling may be worth it. It's a tradeoff between the increased cost of replacing a component vs. the potentially lower cost of using the component in the first place.

My goal isn't to convince people to always strive for minimal coupling, but to make at least a few people at the very least think twice and make sure they really need to before they start adding extra dependencies to their code.

To relate this to my previous post, has Rails gotten the balance right? In my opinion it hasn't. That's not to say everything in Rails could be cut into independent reusable components without sacrificing usability.

Some thoughts on avoiding coupling

Rack is a good example. Read the Rack specification. Seriously. It's short.

There are two good things about it:

First of all it's easy to implement Rack again, or parts of it, if you really have to. If for whatever reason the current implementation doesn't meet your needs, it's easy to satisfy the requirements of the specification.

Secondly, it's even easier for other components to plug into the Rack infrastructure. Really, a minimal piece of Rack middleware doesn't need to do much more than this (it doesn't technically need even this, as long as it responds to #call, but doing it this way lets you chain them trivially using Rackup config files):

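class RackMiddlewareExample
  # Keep a reference to the next application in the chain.
  def initialize(app)
    @app = app
  end

  # Pass the request environment straight through to the next application.
  def call(env)
    @app.call(env)
  end
end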

Of course your middleware can (and likely will) access the environment provided to #call, but that interface is not doing much more than passing on data passed in with the request and some information about the server you're running in, just like the CGI environment. As long as you comply with the very simple Rack specification, you can build up complex behavior by layering a number of tiny classes that can be ripped out, reordered, replaced, rewritten etc. as you please.

It's an incredibly powerful model because of the low coupling. Of course, it's easy to break that by adding lots of data to the environment you pass on. It's not an automatic truth that Rack middleware components will not have high coupling, but it would kind of defeat the purpose.
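The layering can be sketched without even loading the Rack library, since the convention is just "respond to #call(env) and return status, headers and body". The two middleware classes and the one-key environment below are hypothetical; a real server passes a much fuller environment, and a real body is only guaranteed to respond to #each, whereas this sketch assumes it is an Array.

```ruby
# The innermost application: a lambda is enough, because it responds to #call.
app = ->(env) { [200, { "content-type" => "text/plain" }, ["Hello #{env['PATH_INFO']}"]] }

# Hypothetical middleware that upcases the response body.
# Assumes the body is an Array of Strings.
class Upcase
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map(&:upcase)]
  end
end

# Hypothetical middleware that adds a response header.
class AddHeader
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers.merge("x-example" => "1"), body]
  end
end

# Layers can be reordered or removed without touching the others.
stack = AddHeader.new(Upcase.new(app))
status, headers, body = stack.call({ "PATH_INFO" => "/wave" })

puts status        # prints 200
puts body.inspect  # prints ["HELLO /WAVE"]
```

Neither middleware knows what it wraps, which is the whole point: each one depends only on the #call convention.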

A few general rules to avoid high coupling:

  • Make your components as cohesive as possible. If they have more than one responsibility, try to break them in two. Identify what their responsibilities actually are.
  • Don't leak state when you don't have to. WHY are specific attributes exposed? Do they have to be? Do you need to tell the world which state an object is in, or is it enough to tell the world that the object is or is not in a specific state? The more you hide data, the harder it is to accidentally increase coupling.
  • Simplify your interfaces. Can you easily reimplement a class from scratch that satisfies the interfaces? What ARE the interfaces that other components are allowed to depend on? (AND note that interfaces can be complex even if the number of methods is low, if the data passed as arguments is complex.)
  • Pick interfaces that are already satisfied by components consumers of your interfaces might use. An example of this is again Rack, where the choice of using #call means that a Proc can be used to satisfy the interface requirements. It's a tiny thing in this case, but it does increase flexibility, and makes reimplementing or replacing components, or providing a facade or decorator around an existing component that much easier.
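The "don't leak state" rule from the list above might look like this in practice. Both classes are hypothetical examples. The first exposes its raw state, so every caller ends up depending on the exact set of state names; the second only answers the one question callers actually need answered.

```ruby
# Leaky: callers can read and compare the raw state symbol, so renaming
# or splitting a state breaks every caller that checks it.
class LeakyOrder
  attr_reader :state

  def initialize
    @state = :pending
  end
end

# Narrower: the state stays internal; callers ask a yes/no question.
class Order
  def initialize
    @state = :pending
  end

  def ship!
    @state = :shipped
    self
  end

  def shipped?
    @state == :shipped
  end
end

order = Order.new
puts order.shipped?  # prints false
order.ship!
puts order.shipped?  # prints true
```

With `Order`, the internal representation could change to a timestamp or a separate shipment object without any caller noticing.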

Why do you hate Rails?

Judging by some of the feedback I got, some people clearly think I hate Rails. I don't, which I hope my answers to comments etc. reflected. Rails has done a lot of really great things for Ruby and web development, and it deserves full credit for that.

I do stand by my assessment that I believe Rails is overkill, though. That doesn't mean none of the code in Rails is worthwhile - lots of it clearly is. But I do also strongly believe that Rails would be far better if it were more loosely coupled, making it easier both for alternative implementations of core components to be used, and for bits and pieces of Rails to be used on their own. The success of ActiveRecord is a testament to the value of being able to reuse chunks of code originating in Rails, and I'm sure there's lots of other code that would benefit a wider community.

A lot of my reluctance to use Rails boils down to the fact that I prefer to pick components that fit with what I want to do rather than adapting what I want to do to how it'd be easy to do it with a specific framework. I want the flexibility to throw out components when they don't suit me without affecting other parts of my applications.

For other people that's less of a concern, and so they are happy with Rails and want to keep using it, and that's of course their right. Choice is great. Some people are happy with PHP or even ASP too. If it works for them, then that's fine. Switching from something that works perfectly fine for you just to switch is rarely a good idea, and I'd never advocate it.




About me

E-mail: vidar@hokstad.com Skype: vhokstad
Twitter: vhokstad
View my LinkedIn profile.

I was born April 21st, 1975, in Oslo, Norway. Since 2000 I've been living in London, UK. I'm married and we just had our first child, Tristan Ikemefuna Hokstad.

I'm working for Aardvark Media as Director of Technology. I'm also currently on the board of SpatialQ, a startup in the GIS space, and an advisor to Skoach, a startup doing a time management app for people with ADD.
