Insights and discoveries
from deep in the weeds
Outsharked

Thursday, June 7, 2012

Async web gets and Promises in CsQuery

More recent versions of jQuery introduced a "deferred" object for managing callbacks using a concept called Promises. Though this is less relevant for CsQuery because your work won't be interactive for the most part, there is one important situation where you will have to manage asynchronous events: loading data from a web server.

Making a request to a web server can take a substantial amount of time, and if you are using CsQuery for a real-time application, you probably won't want to make your users wait for the request to finish.

For example, I use CsQuery to provide current status information in the "What's New" section of the ImageMapster web site. I do this by scraping GitHub and parsing out the relevant information. But I certainly do not want anyone to wait while the server makes a remote web request to GitHub (which could be slow or inaccessible). Instead, the code uses a static variable to keep track of when it last updated its information. If the data has become "stale", it initiates a new async request, and when that request completes, it updates the cached data.

So the HTTP request that actually triggered the update will be shown the old information, but there will be no lag. Any requests coming in after the request to GitHub has finished will of course use the new information. The code looks pretty much like this:

    private static DateTime LastUpdate;
    
    if (LastUpdate.AddHours(4) < DateTime.Now) {

        /// stale - start the update process. The actual code makes three 
        /// independent requests to obtain commit & version info

        var url = "https://github.com/jamietre/ImageMapster/commits/master";
        CQ.CreateFromUrlAsync(url)
           .Then(response => {
               LastUpdate = DateTime.Now;
               var gitHubDOM = response.Dom;
               ... 
               // use CsQuery to extract needed info from the response
           });
    }

    ...

    // render the page using the current data - code flow is never blocked even if an update
    // was requested
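The same serve-stale-then-refresh idea can be sketched without CsQuery at all. This is only an illustration under stated assumptions: `StaleCache`, its `fetch` delegate and the `refreshing` flag are hypothetical names, and it uses .NET tasks rather than CsQuery promises. Unlike the snippet above, it also guards against starting a second refresh while one is still in flight.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of CsQuery: returns the cached value
// immediately and refreshes it in the background once it is too old.
// A sketch only; it is not hardened for heavy concurrent use.
public class StaleCache<T> where T : class
{
    private readonly Func<Task<T>> fetch;   // stand-in for the remote request
    private readonly TimeSpan maxAge;
    private DateTime lastUpdate = DateTime.MinValue;
    private T current;
    private int refreshing;                 // 0 = idle, 1 = refresh in flight

    public StaleCache(T initial, TimeSpan maxAge, Func<Task<T>> fetch)
    {
        current = initial;
        this.maxAge = maxAge;
        this.fetch = fetch;
    }

    // Never waits for the fetch: the caller gets whatever is cached now.
    public T Get()
    {
        bool stale = lastUpdate + maxAge < DateTime.UtcNow;
        if (stale && Interlocked.CompareExchange(ref refreshing, 1, 0) == 0)
        {
            StartRefresh();
        }
        return current;
    }

    private void StartRefresh()
    {
        Task<T> pending;
        try { pending = fetch(); }
        catch { Interlocked.Exchange(ref refreshing, 0); return; }

        pending.ContinueWith(t =>
        {
            // On failure the old data is simply kept.
            if (t.Status == TaskStatus.RanToCompletion)
            {
                current = t.Result;
                lastUpdate = DateTime.UtcNow;
            }
            Interlocked.Exchange(ref refreshing, 0);
        });
    }
}
```

The request that finds the data stale still receives the old value; only calls made after the background fetch completes see the new one.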

Though C# 5 includes some language features that greatly improve asynchronous handling, such as `await`, I didn't want to "wait", and the promise API often used in JavaScript is actually extraordinarily elegant. Hence I decided to make a basic C# implementation to assist in using this method.

The `CreateFromUrlAsync` method can return an `IPromise` object. The basic promise interface (from CommonJS Promises/A) has only one method:

    then(success,failure,progress)

The basic use in JS is this:

    someAsyncAction().then(successDelegate,failureDelegate);

When the action is completed, "success" is called with an optional parameter from the caller; if it fails, "failure" is called.

I decided to skip progress for now; handling the two callbacks in C# requires a bit of overloading because function delegates can have different signatures. The CsQuery implementation can accept any delegate that has zero or one parameters and returns either void or a value. A promise can also be generically typed, with the generic type identifying the type of parameter that is passed to the callback functions. So the signature for `CreateFromUrlAsync` is this:

    IPromise<ICsqWebResponse> CreateFromUrlAsync(string url, ServerConfig options = null)

This makes it incredibly simple to write code with success & failure handlers inline. Because the returned promise is strongly typed, you don't have to cast the delegates: in the original example, the `response` parameter is implicitly typed as `ICsqWebResponse`. If I wanted to add a fail handler, I could do this:

    CQ.CreateFromUrlAsync(url)
        .Then(responseSuccess => {
            LastUpdate = DateTime.Now;
             ...
        }, responseFail => {
             // do something
        });
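To make the overloading point more concrete, here is one way the four delegate shapes (zero or one parameter, returning void or a value) could be funneled into a single internal form. This is a sketch of the general technique, not the CsQuery source: the `Callbacks` class and its `Normalize` method are hypothetical names invented for this example.

```csharp
using System;

// Hypothetical adapter, for illustration only: each overload wraps one
// delegate shape so that a promise would only need to store Func<T, object>.
public static class Callbacks
{
    // Zero parameters, returns void.
    public static Func<T, object> Normalize<T>(Action callback)
    {
        return arg => { callback(); return null; };
    }

    // One parameter, returns void.
    public static Func<T, object> Normalize<T>(Action<T> callback)
    {
        return arg => { callback(arg); return null; };
    }

    // Zero parameters, returns a value.
    public static Func<T, object> Normalize<T>(Func<object> callback)
    {
        return arg => callback();
    }

    // One parameter, returns a value: already in the internal form.
    public static Func<T, object> Normalize<T>(Func<T, object> callback)
    {
        return callback;
    }
}
```

With overloads like these the compiler picks the right wrapper from the shape of the lambda, so `Callbacks.Normalize<string>(s => { Console.WriteLine(s); })` and `Callbacks.Normalize<string>(s => s.Length)` both compile without a cast.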

CsQuery provides one other useful promise-related function, `When.All`. This lets you create a new promise that resolves when every one of a set of promises has resolved. This is especially useful for this situation, since it means you can initiate several independent web requests and have a promise that resolves only when all of them are complete. It works like this:

    var promise1 = CQ.CreateFromUrlAsync(url);
    var promise2 = CQ.CreateFromUrlAsync(url);

    CsQuery.When.All(promise1,promise2).Then(successDelegate, failDelegate);

You can also give it a timeout which will cause the promise to reject if it has not resolved by that time. This is valuable for ensuring that you get a resolution no matter what happens in the client promises:

    // Automatically reject after 5 seconds

    CsQuery.When.All(5000,promise1,promise2)
        .Then(successDelegate, failDelegate);

`When` is a static object that is used to create instances of promise-related functions. You can also use it to create your own deferred entities:

    var deferred = CsQuery.When.Deferred();
    
    // a "deferred" object implements IPromise, and also has methods to resolve or reject

    deferred.Then(successDelegate, failDelegate);
    deferred.Resolve();   // causes successDelegate to run

What's interesting about promises, too, is that they can be resolved *before* the appropriate delegates have been bound and everything still works:

    var deferred = CsQuery.When.Deferred();

    deferred.Resolve();
    deferred.Then(successDelegate, failDelegate);   // successDelegate runs immediately
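The reason both orders work is that a deferred remembers its outcome. A minimal sketch of that behavior follows; `MiniDeferred` is a hypothetical class written for this illustration, not the CsQuery implementation, and it is single-threaded and has no parameter passing or chaining.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal deferred, for illustration only.
public class MiniDeferred
{
    private enum State { Pending, Resolved, Rejected }

    private State state = State.Pending;
    private readonly List<Action> onSuccess = new List<Action>();
    private readonly List<Action> onFailure = new List<Action>();

    public void Then(Action success, Action failure = null)
    {
        switch (state)
        {
            // Already settled: run the matching callback right away.
            case State.Resolved: success?.Invoke(); break;
            case State.Rejected: failure?.Invoke(); break;
            // Still pending: remember the callbacks for later.
            default:
                if (success != null) onSuccess.Add(success);
                if (failure != null) onFailure.Add(failure);
                break;
        }
    }

    public void Resolve() { Settle(State.Resolved, onSuccess); }
    public void Reject() { Settle(State.Rejected, onFailure); }

    private void Settle(State outcome, List<Action> callbacks)
    {
        if (state != State.Pending) return;   // a deferred settles only once
        state = outcome;
        foreach (var callback in callbacks) callback();
        onSuccess.Clear();
        onFailure.Clear();
    }
}
```

Binding first and resolving later runs the stored callback inside `Resolve`; resolving first and binding later runs it inside `Then`. Either way the success delegate runs exactly once.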

I may completely revisit this once VS2012 is out; the `await` keyword cleans things up a little bit, and the `Task.WhenAll` feature does the same thing as `When.All` here. By the way, the basic API and operation for "when" was 100% inspired by Brian Cavalier's excellent when.js project, which I use extensively in JavaScript.
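For comparison, this is roughly what the all-or-timeout pattern might look like with the task-based API. `Task.WhenAll`, `Task.WhenAny` and `Task.Delay` are standard .NET 4.5 methods; the `TaskSketch` class and `WhenAllOrTimeout` method are hypothetical names for this sketch, and signaling the timeout with a `TimeoutException` is my own choice here.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class TaskSketch
{
    // Completes with every result, or throws TimeoutException if the tasks
    // have not all finished within timeoutMs milliseconds.
    public static async Task<T[]> WhenAllOrTimeout<T>(int timeoutMs, params Task<T>[] tasks)
    {
        Task<T[]> all = Task.WhenAll(tasks);

        // Whichever finishes first wins: the combined task or the timer.
        Task finished = await Task.WhenAny(all, Task.Delay(timeoutMs));
        if (finished != all)
        {
            throw new TimeoutException("Not all tasks completed in time.");
        }

        return await all;   // rethrows if any of the tasks failed
    }
}
```

Inside an async method this could be called as `var pages = await TaskSketch.WhenAllOrTimeout(5000, task1, task2);`, where the two tasks might come from something like `HttpClient.GetStringAsync`. Note that the timeout only stops the waiting; the underlying requests are not cancelled.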
