
LD.

Music, software, life… and stuff.


Ratpack’s execution model, part 1.

One aspect of Ratpack that is not well known is its execution model. It’s well known that Ratpack uses the non-blocking/asynchronous paradigm, but its execution model goes further than that. Asynchronous programming is full of traps (like any kind of programming), and Ratpack’s execution model avoids a certain class of them.

Let’s quickly cover what non-blocking means in practice for a Ratpack app.

Let’s imagine an app that uses JDBC to read from a database. JDBC calls are blocking. You ask for some data by calling a method, IO happens between the client library and the database, and then the result is returned to the caller. While the IO is happening, the executing thread is blocked and can’t be used for anything else. The alternative is asynchronous IO. With this approach, the IO is initiated in a way where the calling thread does not wait for it to complete, but instead continues on with other work and relies on the operating system to notify it in some way when the IO is done. The idea is that the thread can go on doing other useful work while the IO is “happening”. The implication is that you don’t need very many threads because they are always busy doing work, and this is the basis of claims that non-blocking systems are more performant than blocking ones. Blocking apps (by far the majority) employ large pools of threads for handling requests. Non-blocking apps employ surprisingly small thread pools (usually one or two threads per core), and the golden rule is that you can’t block (preferably not at all, but definitely not on these threads), which means doing things asynchronously.
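For concreteness, here’s a minimal sketch of the kind of blocking call being described, using Groovy’s JDBC wrapper; the connection details and query are hypothetical, and getValueFromDb() is the method the later examples call.

import groovy.sql.Sql

// A hypothetical blocking call: the executing thread is parked while the
// query round-trips to the database and cannot do anything else in the meantime.
String getValueFromDb() {
  def sql = Sql.newInstance("jdbc:h2:mem:example", "sa", "", "org.h2.Driver")
  try {
    sql.firstRow("select value from config").value
  } finally {
    sql.close()
  }
}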

Machines fundamentally perform tasks asynchronously. Synchronous programming is just an abstraction; a very prevalent and useful one. It allows programmers to reason about an operation as being continuous and deterministic. You give this up with the asynchronous/non-blocking paradigm, and therein the problem begins. What you gain in terms of “performance” you pay for in convenience. At its worst, multithreaded asynchronous programming is completely non deterministic and disjointed. Ratpack tries to bridge the performance/convenience gap via its execution model.

(all subsequent code in this post will be in Groovy, but could easily be Java)

The “execution model” is wrapped up in the ratpack.exec package, notably ExecControl, which defines the interface for performing asynchronous operations. Ratpack request handlers work with a Context, which implements ExecControl.

We’ll use the blocking(Callable) method as our asynchronous operation.

So, performing an asynchronous operation looks like this:

handlers {
  get {
    blocking {
      getValueFromDb() // returns string
    }.then {
      render it // it is the return of getValueFromDb() 
    }
  }
}

The blocking() method returns a Promise. The then() method effectively takes a callback that gets executed when the promised value becomes available. What’s being promised is the result of the function (given here as a Groovy closure). Because the function performs blocking IO, we have to execute it in this way. What actually happens is that the function is executed on a separate thread, from the blocking thread pool. When it completes, the value is handed back to the non-blocking thread pool and given to the “then” callback. This is asynchronous.
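As an aside, errors have a place in this model too. The sketch below assumes the promise returned by blocking() exposes an onError() method for attaching an error callback before then(); the error response shown is just illustrative, and without such a callback a failure in the blocking function ends up with Ratpack’s error handling rather than your then() callback.

handlers {
  get {
    blocking {
      getValueFromDb() // may throw, e.g. if the database is unreachable
    }.onError { Throwable e ->
      // runs instead of the then() callback if the blocking function threw
      response.status(500)
      render "database error: $e.message"
    }.then {
      render it
    }
  }
}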

This is pretty standard stuff from promise based asynchronous APIs. Promises aren’t really anything special, to the extent we’ve used them so far. This code would look fairly similar with raw callbacks.
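For comparison, here’s a rough sketch of what that might look like with a raw callback style; getValueFromDbAsync() is invented for illustration and simply invokes its callback on another thread once the result is ready.

// a hypothetical callback-taking wrapper around the blocking call
void getValueFromDbAsync(Closure callback) {
  Thread.start {
    callback(getValueFromDb())
  }
}

handlers {
  get {
    getValueFromDbAsync { value ->
      render value // runs later, on whatever thread delivered the result
    }
  }
}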

Ok, so let’s dig deeper.

handlers {
  get {
    print "1"
    blocking {
      print "2"
      getValueFromDb() // returns string
    }.then {
      print "3"
      render it // it is the return of getValueFromDb() 
    }
    print "4"
  }
}

What’s the cumulative output of the prints here? Based on what we know so far, and with most asynchronous APIs, you can’t know; it’s not deterministic. It’s probably going to be 1423, but it could also be 1243 or 1234. It’s going to depend on how long the blocking operation takes to start and finish, and how long it takes for the result to be handed back to the non-blocking thread pool.

What about this?

handlers {
  get {
    print "1"
    blocking {
      print "2"
      getValueFromDb() // returns string
    }.then {
      print "3"
      render it // it is the return of getValueFromDb() 
    }
    sleep 10000
    print "4"
  }
}

Very likely 1234, but again you can’t be sure. You don’t know how much time it’s going to take to start the blocking operation.

Ratpack’s execution model guarantees that the output will be 1423.

Let’s look at a more interesting example.

handlers {
  handler {
    next()
    throw new Exception("ex1")
  }
  get {
    blocking {
      getValueFromDb() // returns string
    }.then {
      render it // it is the return of getValueFromDb() 
    }
  }
}

What you need to know about Ratpack here is that next() passes control to the next handler, which is the get {} handler. After that handler is “done”, the next() call will return.

This particular example is a little contrived, but it does represent a real problem in asynchronous programming. A race is on.

With a lot of asynchronous APIs, initiating an asynchronous operation inherently creates a race because it splits the execution: execution will continue with the callback at some point, and in the meantime the call stack has to unwind. If we consider this in the context of a logical operation, such as servicing a request, we now have to reason about a non deterministic outcome. What should happen when the promised value arrives if the exception has already been thrown? How should the exception be handled if the promised value arrived and was written to the output before the exception was thrown? Such a logical inconsistency is not always this easy to spot, and chances are it makes itself known for the first time in production.

Ratpack’s execution model guarantees that the exception being thrown is the logical outcome. In fact, the background operation is never initiated.

The point here is that the given examples are deterministic in Ratpack, while they aren’t with most asynchronous APIs. Ratpack serializes the segments of a logical execution to avoid races.

The given examples so far represent a single logical execution, responding to a request. We can break it down into 3 pieces…

  1. Everything up to the asynchronous call
  2. After the asynchronous call (i.e. unwinding the call stack)
  3. The callback, initiated when the async operation completes

The problem we have been discussing is that #2 and #3 are racing because they can be executed concurrently.

Ratpack breaks this into 4 pieces:

  1. Everything up to the point where the asynchronous call would be initiated
  2. Everything after that point (i.e. unwinding the call stack)
  3. The asynchronous call itself
  4. The callback

The key difference here is that the asynchronous call is not initiated straight away. It’s deferred until the stack has unwound and there is no more code to execute. This avoids races.

To see how, let’s look at promise() (which is what blocking() builds on). The promise() method is for initiating arbitrary async operations.

handlers {
  get {
    // (1)
    promise { fulfiller ->
      // (3)
      someCallbackTakingAsyncApi {
        // (4)
        fulfiller.success(it) // it is the result of the async call
      } 
    }.then {
      // (5)
      render it
    }
    // (2)
  }
}

The number comments indicate the guaranteed sequence of events.

The promise() method takes a function that initiates an asynchronous operation. It integrates with the execution machinery and waits until the current execution segment is complete before executing that function. This effectively serializes the segments of an execution and avoids races.

There’s an assumption built in here. We assume that the initiation of the async operation is atomic, in that once it has been initiated nothing else interesting can happen until it completes; that is, the initiation itself cannot throw an exception or make any kind of state change. This is the same assumption that naive async APIs make about the call stack: that unwinding it will not produce errors or side effects. Ratpack reduces the risk and scope of the assumption by initiating the async operation with a new call stack of constrained scope. With the naive approach, there’s no real way to know what’s waiting to happen while the call stack unwinds, which is riskier.

Why does this matter? Determinism. You want as much of it as you can get. It reduces the mental burden of reasoning about what a system is doing and, more importantly, what it will do.

What are other ways of being deterministic?

  1. Synchronous/blocking thread-per-request (e.g. traditional Servlet API)
  2. Asynchronous, single threaded (e.g. Node.js)

The problem with #1 is performance (generally speaking). The problem with #2 is under-utilization of multi-core hardware, unless you introduce multiple processes, which brings its own problems.

I mentioned that synchronous programming allows reasoning about an operation as something that is continuous and deterministic. We’ve discussed how Ratpack’s execution model allows asynchronous programming to be deterministic, via the promise() method. We haven’t yet discussed how it supports continuousness, which refers to cumulative state, contextual error handling and resource management; that will be the subject of a follow-up post. Another future post may cover how this all relates to streaming data while supporting determinism and continuousness.

To round things out, let’s look at a more complex example. If you can reason that the net result of this code is that the string '1,2,3' is rendered to the response, and that the collection l does not need to be thread safe, then you’ve got it under control.

handlers {
  get {
    def l = []
    promise { f ->
      Thread.start {
        sleep 3000
        f.success("1")
      }
    }.then {
      l << it
    }
    promise { f ->
      Thread.start {
        sleep 2000
        f.success("2")
      }
    }.then {
      l << it
    }
    promise { f ->
      Thread.start {
        sleep 1000
        f.success("3")
      }
    }.then {
      l << it
      render l.join(",")
    }
  }
}

If you want to go implementation exploring, you can browse the code and tests.

Posted: Sep 13th, 2014 @ 10:11 pm

Tags: #software  #ratpack  


Ratpack update: more than a micro framework

It’s been a long time since I posted anything about Ratpack (or much else for that matter), so an update is well overdue.

We are steadily iterating on the codebase and documentation, but the inadequate growth in documentation doesn’t really capture what has happened over the past 12 and a bit months nor what Ratpack is shaping up to be.

The first thing to say is that Ratpack is not just me. The project has been fortunate enough to attract many contributions and some new core committers, particularly Rus Hart and David M. Carr. Thanks to everyone who has taken the time to contribute.

When I picked up Ratpack, its goal was to be a port to Groovy of the popular Ruby framework Sinatra. The situation is now very different, and none of the original code remains. My original intention was to do a small amount of work on the project for a friend, but I quickly became intrigued by the possibilities of creating a new tool for HTTP applications. 18 months later, here we are. The website and documentation intentionally make no reference to Sinatra, as Ratpack has really become something else.

First of all, Ratpack is not a Groovy framework. It’s purely implemented in Java 7, soon to be Java 8. It is however designed with Groovy in mind, and has particular (optional) support for Groovy by way of the ratpack-groovy library that makes some APIs more concise through use of Groovy’s closures. The ratpack-groovy library also brings in additional features such as Groovy based templates, building responses with MarkupBuilder and reading/writing JSON via Groovy’s built in JSON support.
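To give a feel for it, here’s a minimal sketch of a Ratpack.groovy script using the ratpack-groovy DSL; the exact DSL has shifted between releases, so treat this as indicative rather than canonical.

import static ratpack.groovy.Groovy.ratpack

ratpack {
  handlers {
    get {
      render "Hello from Ratpack"
    }
    get(":name") {
      // :name is a path token, bound from the request path
      render "Hello ${pathTokens.name}"
    }
  }
}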

One thing to note about the Groovy integration is that it’s cutting edge. Ratpack has been pushing the boundaries of Groovy’s support for static typing and type inference. What this means in practice is that Ratpack’s use of Groovy is 100% strongly typed. You can use @CompileStatic everywhere and still write amazingly concise code thanks to new features such as @DelegatesTo and improved type inference for Closure-to-SAM-type coercion. A good example of this can be seen in the code examples for the RxRatpack.observe() method. There are Java and Groovy examples given there, both 100% statically compiled and strongly typed. No need to declare any explicit types in the Groovy example, thanks to Groovy’s type inference. What this also means is that IDEs (particularly IntelliJ IDEA) fully understand all of your Groovy code and can provide amazing editing support on par with working with Java code. As the API is completely static, refactoring and usage analysis just work without any special Ratpack plugins for the IDE.
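As an illustration, here’s a hedged sketch of a handler written as a statically compiled Groovy class; the handler itself is invented for the example.

import groovy.transform.CompileStatic
import ratpack.handling.Context
import ratpack.handling.Handler

@CompileStatic
class GreetingHandler implements Handler {
  @Override
  void handle(Context context) {
    // everything here is type checked at compile time, so IDE navigation,
    // refactoring and usage analysis behave just as they do for Java code
    String greeting = "Hello from a statically compiled handler"
    context.render(greeting)
  }
}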

Which brings us to documentation. We’ve put a lot of work into our documentation infrastructure, and into the content. One of the goals of the project is to have high quality, always accurate, documentation. The only way to keep the code samples in the documentation accurate over time is to test them, which is what we do. Every single code snippet in the manual and Javadoc is statically compiled and executed on every code change. Over time we are adding more and more examples, particularly to the Javadoc. I personally find that looking at some code that uses a feature is usually far more useful than textual descriptions. We’ve got a long way to go with the documentation, but it is improving every release. On a project that only evolves through people contributing their spare time, adding to the documentation is always a challenge. If you’re interested in contributing to the docs (that would be great!), or want more detail about how it all works, you can check out our docs on the docs.

Ratpack initially started off in the Sinatra-inspired genre of “smallest possible hello world app”. While Ratpack remains very convenient for writing small apps (using either Groovy or Java), this is no longer the goal. Now, the goal is scalability in every sense of the word: scalable application complexity, and scalable performance.

A problem of any software development project is dealing with framework boundaries and constraints as your codebase inevitably gets more complex, i.e. scaling the complexity. All those productivity features that were so useful at the start of the project start to get in the way and force undesirable design choices. Ratpack tries to address this problem in the following ways:

  1. A small set of pervasive abstractions (e.g. parsers, handlers and renderers)
  2. Composition through functions (e.g. request processing is just traversal through a graph of handler functions)
  3. Adapt, don’t abstract (e.g. don’t abstract over JSON handling, adapt different libraries)

What this means in practice is that Ratpack somewhat bucks the opinionated, convention-driven approach that is popular in frameworks. You have a lot of freedom to wire an application together out of different libraries and techniques. This often means that you will have to do a little more wiring with Ratpack than with other, more opinionated tools, but it also means that you are in full control. This is critically important as your application grows. However, this doesn’t mean you are completely on your own. Ratpack’s integration with Google Guice and its Registry abstraction provide a way to stitch things together without a lot of busy work.

Scalable performance mostly comes down to Ratpack being non-blocking via Netty (though we also keep a close eye on, and measure, Ratpack’s internal efficiency). There are many articles on the web about the performance and scale benefits of non-blocking server applications so I won’t rehash that here. What I will say is that Ratpack is completely non-blocking and provides mechanisms that take some of the complexity and pain out of asynchronous programming. In particular, Ratpack integrates with RxJava, which supports building complex, asynchronous, non-blocking processing out of composable functions without callbacks. RxJava works very well with Ratpack.
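A hedged sketch of what that composition can look like, assuming the RxRatpack.observe() method mentioned earlier adapts a Ratpack promise to an RxJava Observable; getValueFromDb() and the transformation are invented for the example.

import static ratpack.rx.RxRatpack.observe

handlers {
  get {
    // adapt the promise for the blocking call to an rx.Observable
    def values = observe(blocking { getValueFromDb() })
    // compose transformations without nesting callbacks; render on success
    values.map { it.toUpperCase() }.subscribe { render it }
  }
}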

Another key area where Ratpack shines is testing. Ratpack makes it easy to unit test handlers and handler chains. It also makes it easy to integration/functional test your entire application (from within your IDE, with no special plugins), and just as importantly you can do the same kind of testing on arranged subsets of your application by creating mini embedded applications at test time (here and here). The embedded app support is also very useful for creating test-time doubles of HTTP services that your app depends on. You can very easily create a small Ratpack app that mimics the service your application calls out to. It’s also extremely easy to use Geb to browser test your app, and there is also integration with Groovy Remote Control for injecting code into the application under test, for setting up test data and simulating failure conditions. Much more documentation is needed on the test-time support, as it really is a very strong feature of Ratpack: simple and fast test execution, within the IDE, without the need for IDE plugins or command line tools.

If you haven’t seen them, it’s also worth having a look at our docs on Gradle build integration and deploying to Heroku.

There is still much more to say about Ratpack. I haven’t mentioned the integrations with Pac4j, Handlebars templates, CodaHale Metrics, Jackson, Hystrix and more. Hopefully you get the idea that Ratpack is about more than very small “Hello World” applications. Documentation for all of these features will arrive eventually.

With regards to the Roadmap, there is no ETA for 1.0. We will continue to release a new version on the first of each month for the foreseeable future. Once we are happy with the core API (and documentation) we will go 1.0 and freeze the API in terms of breaking changes. The more people use Ratpack and the more we get feedback, the quicker this is going to happen. Also, we are always looking for more contributors.

Hopefully this admittedly long rant about Ratpack has convinced you that it might be worth a look, or worth a second look if you haven’t checked it out in a while.

Posted: Jun 5th, 2014 @ 10:23 am

Tags: #software  #ratpack  #java  #groovy  #web  


Marcin Erdmann is now the Lead Developer of the Geb project

I’m happy to announce that I’m handing over the “lead” hat for the Geb project to my good friend Marcin Erdmann. Marcin has been a serious contributor to Geb (and other projects that I work on) for the past few years and has been doing an excellent job.

I’ll still be heavily involved in Geb development for now and into the future, but I’ll be proposing changes to Marcin instead of the other way around.

I created Geb in November 2009. I have learned a lot from the process about running projects, building communities and delivering tools to developers. I’ve also been fortunate enough to speak about Geb at conferences around the world. However, it’s now time to let someone else take the reins.

This means that yet again I will fail to take a personal project across the 1.0 milestone. Hopefully I can break this pattern with Ratpack.

I’ve got no doubt that Marcin will do a great job as Lead and that this is a good outcome for users of Geb.

Posted: May 15th, 2014 @ 6:01 pm


Jetty or Vert.x for Ratpack?

Update (2013-02-06): It turns out I had some bad assumptions about Vert.x. It’s not practical for embedding (for Ratpack’s purposes) and requires a full Vert.x runtime/platform environment. This is a deal breaker. So… this means it’s looking like Jetty… or maybe better still… Netty.

Last year, the effervescent Tim Berglund made me aware of Ratpack, which could have been accurately described as a Groovy version of Ruby’s Sinatra micro web framework. I became interested in it primarily as a way to write example apps for Geb examples/demos/classes, and ultimately the Geb website itself.

First things first, why not Grails? Grails is awesome; that’s not in question. It’s just more than is needed for these simple apps. What’s more, I needed it to fit happily into a Gradle-based toolchain (I’m still working on the grails-gradle-plugin) and I wanted something light. I’m also attracted to new shiny things.

Ratpack fit the bill of what I was looking for in concept. It was originally developed by a bunch of people about two years ago, then stalled when the lead (who I only know as bleedingwolf) formally announced he was no longer working on it. Sometime after this Tim picked it up and became the official maintainer (with bleedingwolf’s blessing). The official GitHub home of the project (after a few moves) is now github.com/ratpack/ratpack.

When I started looking at it, I wanted to make some changes (if you’ve ever worked with me this won’t surprise you). Given that the project had been dormant for a while, I started doing this (with Tim’s blessing) without any real regard for backwards compatibility. There were some fundamental issues that, given where the project was at, it just made more sense to sort out thoroughly. Two major functional changes were made:

  1. You no longer need to restart the app for changes to take effect (when changing the routing file, which is really the application)
  2. The project backed away from J2EE and just embeds Jetty (i.e. it no longer tries to produce a WAR file)

Along with this I worked on improving the Gradle plugin to reflect that the result is now a standalone app instead of a WAR. This meant basing the plugin on the Gradle Application plugin rather than the WAR plugin, which made it simple to build a standalone, self contained, portable app.

A little while ago I started to wonder about models for Ratpack other than J2EE, servlets and all that noise. Ratpack was already divorced from that stuff, in that you never saw it, but it was still based on Jetty and was fundamentally implemented as a servlet. I started looking at Vert.x. As an experiment, I took a branch and set about transplanting Ratpack on top of Vert.x. I’m happy with the outcome.

The question now is what to go forward with. I don’t think maintaining two versions going forward is the right thing to do, and I certainly don’t want to do that. Underpinning all of this is a certain amount of trust in the async IO argument: that this async model will ultimately lead to more performant, more scalable applications. I’m taking that at face value because people smarter than me have made the argument. On the other hand, there is no doubt (at least in my mind) that async-based programming is more difficult. That’s the price you pay. Ratpack on top of Vert.x can take away some of the danger of async programming and make it less error prone (and I’d say it already does).

What does Ratpack give you over raw Vert.x?

  1. Templating - fully async rendering, and statically compiled (optional) using Groovy 2.1 (and indy)
  2. Error handling (i.e. 500 pages) - this is no small thing in an async world
  3. Not found handling (i.e. 404 pages)
  4. Routing - Vert.x already has this, but Ratpack’s is integrated with 404 and 500 handling (and runtime reloadable)
  5. Static file serving - Goes beyond what Vert.x gives you and supports HTTP caching headers
  6. Session support
  7. Runtime reloading - for routes, and if you’re using the Gradle plugin for all of your application code (via SpringLoaded)
  8. Higher level abstractions - for example Request and Response
  9. Decent integration with a capable build tool (Gradle)

With the Jetty version, the goal is to hide Jetty and servlet stuff in general. For the Vert.x version, this would not be the case. The point is not to abstract over Vert.x and hide it; it’s to add some convenience for building small web apps while fully leveraging Vert.x’s runtime features (e.g. messaging, SockJS). I’m unsure how Vert.x modules would fit in at this point.

Here’s how I see the pros/cons of Jetty or Vert.x as the basis for Ratpack.

Vert.x

Pros
  1. Performance (though Jetty is still very fast by all accounts)
  2. Fewer dependencies
  3. Embedded message bus
  4. Embedded SockJS support, with message bus spanning to client
  5. Built in clustering (for eventing at least)
  6. No J2EE (abstractions not needed for this kind of thing)
Cons
  1. Async programming has challenges (debugging being one)
  2. Less trusted than Jetty
  3. No WAR deployment (this is appealing to some)
  4. No built in session support (I’ve added my own for Ratpack) and some of the other things that Jetty provides

Jetty

Pros
  1. Well known, trusted
  2. All of the HttpServletRequest convenience (header parsing etc.)
  3. Familiar, threaded model
  4. Can potentially use servlet filters and all that junk
  5. WAR packaging
Cons
  1. No message bus
  2. More work to get SockJS or any server/client messaging (by no means impossible though)
  3. Clustering becomes more complicated
  4. More dependencies (more weight)
  5. Not async (assuming that impacts performance in the general sense)

There are probably more, but that’s how I see it right now.

Left to my own devices, I’d keep going with the Vert.x version because I find it the most interesting to work on. However, I’d also prefer to work on something that’s useful to other people. That’s why I’d like your opinion on the matter. Should Ratpack build on Jetty (and J2EE generally) or Vert.x?

There’s precious little documentation available right now on either the Jetty or Vert.x version of Ratpack. I’d like to sort out this question before investing in such a thing. The best that is available right now is a (partial) port of the Groovy Web Console to Vert.x Ratpack on GitHub. There’s a readme there for playing with it. Of course, there’s always the Ratpack source (the master branch is Jetty, the vertx branch Vert.x).

There’s also another (very unstable, may disappear) app that I’m working on (to explore Shiro and Vert.x’s event bus) on GitHub. This is also a shake out of predominantly using Java instead of Groovy with Ratpack. Don’t take it too seriously.

Posted: Feb 1st, 2013 @ 11:56 pm

Tags: #software  #groovy  


Gr8Conf EU, now with more me

I’m excited to be presenting at GR8Conf EU this year. This will be my first time at GR8Conf and I’m looking forward to meeting some of the Groovy community that I’ve not had the opportunity to meet before.

I’ll be presenting on Geb, as both an introduction for the uninitiated and to discuss the new features that have appeared in the last few months. It will be an interesting change to present this material to a crowd already comfortable with Groovy; quite different to my upcoming session at SeleniumConf.

Of course, I’ll also be there representing Gradleware and helping Peter give a 3-hour Gradle Bootcamp that will get anybody up and running with Gradle. I’ll also be doing another (new) Gradle session on releasing software. This will be a tutorial-style presentation that will arm you with everything you need to build, test and release open source software to Maven Central with Gradle. There will also be a dive into the Spock and Geb builds to look at how they manage building and testing Grails plugins as part of their builds, and how they automate their release.

Hope to see you there!

Posted: Apr 3rd, 2012 @ 8:17 pm

Tags: #software  #groovy  #conference  

