Erik Johnson: May 2007

Thursday, May 31, 2007

How Do Bandwagons Fit into a Project Plan?

That thunder behind you is the REST bandwagon comin’ your way. Anne Thomas-Manes at the Burton Group has proclaimed that the future of SOA lies in REST. I’m not saying the Burton Group is itself just jumping on a bandwagon – but their declaration is significant enough to convince others to take a leap. The REST dust-up is in such an early phase that it’s far from clear whether the bandwagon is going to help me or hurt me. David thinks he might jump on, but let someone else do the driving. Nothing wrong with that strategy, but does it feel like history has come around a bit, eh David?

I think the REST bandwagon, the long-time Lords of the Web, and the WS-* camps are heading toward a colossal and rather fun collision. There will be friction between the stewards of HTTP and those who warp it to meet their RESTful needs. The REST camp likes to say HTTP is all you need – and I agree. But don’t be fooled into thinking that the foresight of the authors of RFC-2616 and RFC-2396 took all this into account. The RESTafarians probably won't want to see their momentum slowed by the standards trust. And then someone is going to say, “Hello! Is that an XML document you’re handing me? Care to sign it?” The WS-* folks are already figuring out how to chase that ambulance.

I’ve wasted a lot of cycles worrying that the Lords of the Web are going to tell me I’m perverting the Web’s majesty. For what it's worth, here is my advice. First, it’s OK to say you are doing REST even if you haven’t read the dissertation. Personally, I think the magic is in the URIs and the links embedded in the payload. But that’s just me. Second, use the HTTP methods however you see fit. I use GET and POST exclusively myself. I use the occasional URN and I don’t think it’s an evil thing. Some people use the HTTP media-type to indicate what the payload contains. But RFC-2616 “discourages” the use of media-type values that are unregistered (section 3.7). Just use media-type if it helps you. When my payloads are XML, I have a namespace (a URN, BTW) to indicate what it is. For HTML, I haven’t really settled on anything yet.
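To make the namespace idea concrete, here is a minimal sketch of a receiver that decides what an XML payload is by reading the namespace of its root element. The URN urn:example:sales-order and the function name payload_namespace are made up for illustration; they are not from any real system.

```python
import xml.etree.ElementTree as ET


def payload_namespace(xml_text):
    """Return the namespace of the root element, or None if it has none."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a qualified tag as '{namespace}localname'.
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


# Hypothetical payload: the URN is an assumption, not a registered name.
sample = '<order xmlns="urn:example:sales-order"><id>42</id></order>'
kind = payload_namespace(sample)
```

A receiver could branch on that value instead of (or in addition to) the Content-Type header.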

No matter how far ahead of the REST bandwagon you think you are, it *will* overtake you. I’m sure our marketing department will snap up REST as fast as we did “web services”. I blinked one day and the word “SOA” appeared on the collateral. But you don’t need a REST toolkit – use the bits you have now. More importantly, don’t let the toolkit vendors say “just make objects – we’ll hide the REST part for you”. Good musicians learn to improvise by first knowing their scales. If you know the fundamentals of data modeling, URIs, and HTTP you shouldn’t need WADL, WCF, O’Reilly, or the bandwagon to be successful.

Tuesday, May 29, 2007

Harry & David

Harry and David (not these guys) are blogging comments to each other about REST. I wasn't sure where to leave my own, so I guess it's here...

David wonders whether widely varying interpretations of REST are an interop killer. Now that is an interesting question. It’s arguable that if two systems have completely different notions of state transitions and URI constructs, interop will suffer and you’re back to writing glue – or pitching for standards. On the other hand, there are those who argue that RFC-2616 has everything you need to be “good enough”, which, conveniently for them, makes interoperability an app domain issue. I think the web wonks have a good point. But it’s hard to know if resource-orientation (sorry) is a way forward unless you first let go of interfaces. One thing that scares the crap out of me is that someone will wire URIs and BPEL together and declare “Mission Accomplished”.

In Perfect Land, you can look at a URI and know what it does. URIs aren’t overloaded by packing the query string. Behavior is consistent and payloads for a given URI have the same format no matter whether you GET, POST, or PUT. But the RFCs (rightly) do not attempt perfection – real-world experience shows you rarely POST exactly what you GET. Also, ambiguities in the specs used to be settled by the whims of browsers. But now those specs are being used to connect systems instead of browsers to servers. I’m not ready to look a trading partner in the eye and say “hmm, let’s see how Opera handles an HTTP 415”. I’ll instead go hide in a closet and do just that.

So, even with the flawed cast of characters you see a lot of whining about – HTTP, URI, XML, and even (gasp!) XML Schema – the pieces are there to build good systems that also make great constituents in anyone’s SOA. The specs, with one glaring exception, are easy to digest. My advice is to start by thinking about a URI strategy that people can follow intuitively. Don’t underestimate the value of a good URI set or the design skills it takes to build. I would also ignore worries about pissing off the Old-Hands of the Web by somehow not using the Web exactly as intended. Their crystal balls weren’t clearer than anyone else’s and these guys do put pants on the usual way (not that I’ve personally witnessed it).

And on Harry’s point about REST and CRUD, I agree that Tim was simply advising people not to limit their comprehension of REST to entities accessed via GET and PUT. REST resources are like views that may or may not be underpinned by an entity model. REST state transitions use URIs to label and invoke services which may or may not use an entity/CRUD programming model under the hood. I agree with Tim that the presence of those URIs as links is what actually defines REST – NOT simply that you’ve labeled data with URIs.

Wednesday, May 16, 2007

REST Protocols are the Service Layer

Business operations look a lot like protocol state machines. One of the areas I’ve been looking at (for several years) is how to better leverage state machine thinking into data-driven applications design. I read the REST dissertation, but didn’t see how Roy Fielding’s notion of state transitions applied to my projects. A couple of years ago, David and I looked at how server-side state transitions could drive a service layer (and more). We talked about writing up a paper, but I flaked out so David went ahead with a post.

So, I read Tim Ewald’s posts about REST (first post is here) with a lot of interest. Tim’s description of REST is centered on client protocol state transitions and, critically, ensuring responses to GET requests include the URIs (links) that transition between client states. Up until now, I ignored the idea of URIs for client states that are not mapped to system state transitions on the server. I also questioned the wisdom of including links at all in REST-style responses. Here’s why.

A classic blunder in applications development is designing the programming model (and its API) around a specific client interaction style. An acid test for an API is how well it works for different calling scenarios, especially those unknown at design time. On that second point, only time can tell you if you have a winner. SOA itself was invented largely to counter unsuitable APIs (and fill space on collateral). Service layers usefully loosen the coupling (you don’t need my binaries to call me) and serve as adapters between an application and the potentially numerous and radically different callers.

So, isn’t it a slippery slope to presumptuously inject URIs into messages for operations that might not apply to the client’s intention? In Tim’s example, the server returns a list of itineraries along with URIs that transition the client to one of 2 potential next client states: getDetails or Reserve. One of these URIs changes system state and the other doesn’t. Rightly, you use GET on one of the URIs and POST for the other (although the example doesn’t indicate which is which). The “getDetails” tag bothered me because in a non-trivial case there might be numerous operations available relative to the resource. Or, the resource might be a summarization (like a report). If the data has a thousand customer IDs and there are 10 or more links possible per customer – you get the point. Why offer up a bunch of links that the client won’t use and that potentially outweigh the actual data? Is it even possible to determine the complete set of links?
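For a feel of what that worry looks like, here is a rough sketch of a response entry carrying links for its next client states, plus a count of how quickly the links pile up. The URI layout and the GET/POST assignment are my assumptions (Tim's example doesn't say which is which); I've guessed that the read-only getDetails is the GET and reserve is the POST.

```python
# Assumed mapping of link relation to HTTP method; not taken from Tim's post.
LINK_METHODS = {"getDetails": "GET", "reserve": "POST"}


def itinerary_with_links(itinerary_id, base="/itineraries"):
    """Build one itinerary entry plus links to its possible next client states."""
    return {
        "id": itinerary_id,
        "links": [
            {"rel": rel, "method": method, "href": f"{base}/{itinerary_id}/{rel}"}
            for rel, method in LINK_METHODS.items()
        ],
    }


def link_overhead(entries):
    """Return (number of data items, number of links) for a list of entries."""
    return len(entries), sum(len(entry["links"]) for entry in entries)


# A thousand entries with only two links each already carry 2000 links.
items, links = link_overhead([itinerary_with_links(i) for i in range(1000)])
```

With ten links per entry instead of two, the same list would carry ten thousand links, which is the bloat the paragraph above is fretting about.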

What I failed to remember, until now, is that data and protocols are independent. There can be multiple protocols with the same data as the starting point – but not the same URI. The URI identifies what data the caller wants and under which protocol.

Data + protocol = resource -> URI
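One way to read that formula, as a minimal sketch: the same data reached under two protocols yields two different URIs. The protocol names (credit-review, order-entry) and the path layout are hypothetical, picked only to show the shape.

```python
def resource_uri(protocol, entity, key):
    """Compose a URI from the protocol name and the data it addresses."""
    return f"/{protocol}/{entity}/{key}"


# The same customer record addressed under two hypothetical protocols.
uris = {
    protocol: resource_uri(protocol, "customers", "1001")
    for protocol in ("credit-review", "order-entry")
}
```

The data is identical in both cases; only the protocol part of the URI differs, so each URI names a distinct resource.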

The protocol embodies the caller’s intentions and the server’s constraints. Protocols are easy to secure (URI = security descriptor) and they work like interfaces for data hiding. My mistake until now was in believing that:

Each entity (I work on ERP apps and we still think about entities) is a resource that has a protocol defined by a system state sequence.
The only useful links in resources are URIs for posting value changes or transitioning states.
Views of data representing many resources have no links because it’s difficult to generate a complete set and the caller’s intentions are unknown.

After working through Tim’s description, I’ve reached different conclusions about how to design a REST-style system:

I still like defining a state sequence and change constraints for entities in the application. It’s a good programming model for ERP because the server-side state transitions are handy places to hang workflow logic. Also, entity/life-cycle thinking aligns well with the business operations.
Systems have protocols that describe data interactions with reasonably specific purposes in mind. The views passed in and out of the system are resources.
A URI identifies a resource, a protocol, and where in that protocol the caller sits.
Resources contain links to other resources or invocation points specific to the protocol. This solves my issue with (potentially) large numbers of unhelpful links.
Protocols buffer the application from conversations – protocols are the new “service layer”.
Getting back to the first point, a “default” protocol can be derived from the server-side state sequence for an entity.
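The last point can be sketched roughly as follows: given a server-side state sequence for an entity, the "default" protocol offers only the links for transitions that are legal from the entity's current state. The sales-order life cycle, the state names, and the URI layout are all invented for illustration, and I've assumed every transition is invoked with POST because it changes system state.

```python
# Hypothetical life cycle for a sales order: state -> {transition: next state}.
ORDER_STATES = {
    "draft": {"submit": "submitted"},
    "submitted": {"approve": "approved", "reject": "draft"},
    "approved": {"ship": "shipped"},
    "shipped": {},
}


def default_links(entity_uri, state, transitions=ORDER_STATES):
    """List the links the default protocol would embed for the current state."""
    return [
        {"rel": name, "method": "POST", "href": f"{entity_uri}/{name}"}
        for name in transitions[state]
    ]


def apply_transition(state, name, transitions=ORDER_STATES):
    """Return the next state, or raise if the transition is not allowed here."""
    if name not in transitions[state]:
        raise ValueError(f"{name!r} is not allowed from state {state!r}")
    return transitions[state][name]
```

A submitted order would carry approve and reject links and nothing else, so the number of links stays tied to the protocol rather than to everything the system could possibly do.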

Protocols can carry links to places other than the initial server and resources can be more than just XML representations of entity data – think of the mash-up potential. The URI – via the protocol – tells the system what data to get, what format to use, what stylesheet to apply, and which links are useful to the caller. Adding new protocols to a system needs to be an easy thing to do.
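As a sketch of what "adding a protocol" might amount to, the table below treats each protocol as a small declarative record that the server consults when it resolves a URI. Everything here is an assumption made for illustration: the Protocol fields, the protocol names, the stylesheet file name, and the '/<protocol>/<entity>/<key>' URI layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Protocol:
    name: str
    data_format: str             # media type of the representation
    stylesheet: Optional[str]    # stylesheet to apply, if any
    links: Tuple[str, ...] = ()  # link relations useful to this caller


# Hypothetical registry; a new protocol is just one more entry.
PROTOCOLS = {
    "credit-review": Protocol(
        "credit-review", "application/xml", "credit.xsl", ("approveCredit", "holdOrders")
    ),
    "order-entry": Protocol("order-entry", "text/html", None, ("newOrder",)),
}


def resolve(uri):
    """Split '/<protocol>/<entity>/<key>' and look up what the protocol dictates."""
    protocol_name, entity, key = uri.strip("/").split("/")
    protocol = PROTOCOLS[protocol_name]
    return {
        "entity": entity,
        "key": key,
        "format": protocol.data_format,
        "stylesheet": protocol.stylesheet,
        "links": [f"{uri}/{rel}" for rel in protocol.links],
    }
```

Keeping the definition declarative is one way to make new protocols cheap to add, since no handler code changes when an entry is added.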

I thought I knew REST and had taken a decent crack at applying it to ERP applications. This re-think has me wondering what else I haven’t figured out yet. In the meantime, I’m looking back at our web services, metadata, and customer-driven scenarios to see what features protocol definitions might need. The goal is to find a way to make it easy for our developers and our customers to create protocols. I’ll write up what I find.

Erik Johnson