Understanding Javascript Promises

Apr 26, 2014

Learning promises is an interesting process. Like many tools, they have common pitfalls to get stuck on. What makes promises somewhat unique is that it tends to be obvious when you're doing it wrong. Why? Promises are designed to solve a clear and well-understood problem: nested callbacks. As you start to use them, however, your code doesn't look that different. Obviously there's something you don't grok, but what?

The example that I'm going to walk through is a login. This involves:

  1. Opening our connection
  2. Getting the user's row from the DB based on email
  3. Comparing the passwords
  4. Updating the DB with a `token`
  5. Releasing our connection
  6. Returning the user data and token

If you're familiar with callbacks, you should be able to visualize the flow of this code. (We're using the asynchronous bcrypt.compare method for our password check, so don't forget to visualize that nesting.)

Before we dive into code: I think much of the confusion around promises stems from not understanding how chaining works. Without this understanding, you're likely to end up with nested promises and resolvers, which ends up being worse than callbacks.

Getting past the confusion comes down to remembering what you already know: these two samples are different:

p = new Promise (resolve, reject) ->
   # do stuff here and resolve or reject
p.then (value) -> #do something
p.then (value) -> #do something
return p

# vs

new Promise (resolve, reject) ->
   # do stuff here and resolve or reject
.then (value) -> #do something
.then (value) -> #do something

In the first case, the two thens are attached to the same original promise. Furthermore, the original promise is returned to the caller. In the second case, the first then is still attached to the original promise, but the second one is attached to the return value of the first. Similarly, what gets returned is the last then's return value. Each call to then returns a new promise (often loosely called a thenable, which strictly means any object with a then method).
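To make the difference concrete, here is a minimal sketch in plain JavaScript. It assumes native promises rather than bluebird, and the function names `forked` and `chained` are made up for illustration. Both start from a promise resolved with 10; the only difference is how the handlers are attached and what is handed back to the caller.

```javascript
// Assumption: native promises, not bluebird; the names are illustrative only.

// Case 1: both handlers hang off the same promise, and that promise is returned.
function forked() {
  const p = Promise.resolve(10);
  p.then((value) => value * 2); // this handler's result is discarded
  p.then((value) => value + 5); // so is this one's
  return p;
}

// Case 2: each then is attached to the promise returned by the previous then,
// and the caller gets the promise at the end of the chain.
function chained() {
  return Promise.resolve(10)
    .then((value) => value * 2)
    .then((value) => value + 5);
}

forked().then((value) => console.log('forked:', value));   // forked: 10
chained().then((value) => console.log('chained:', value)); // chained: 25
```

In the forked version the handlers still run, but nobody ever sees what they return.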

What's confusing is that we forget what is and isn't asynchronous. The execution of the handlers themselves (the functions passed to then) happens at some point in the future. However, the chain itself is built synchronously. At first that might not be intuitive: we create the promise and think of everything else as happening afterwards, as if only the promise itself, as the entry point, exists and thus gets returned. But forget about synchronicity; this is just method chaining. In the above paragraph we talked about one then being attached to the return value of the previous one. This return value isn't what our callback returns (that will be executed in the future), but what then itself returns, which is a thenable.

The asynchronous part is that the return value of a callback gets passed into the callback of the following thenable. The thens are linked synchronously; the callbacks are linked asynchronously. What a promise gives you is a guarantee that the chain will be executed in order and that the last link, which the caller has, is going to get the final data, which is what the caller wants.
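A small sketch can show the "linked now, executed later" idea. It is written in plain JavaScript against native promises (an assumption; the rest of this post uses bluebird), and the `events` array exists only to record the order in which things happen.

```javascript
// Assumption: native promises; `events` only records the order of execution.
const events = [];

const lastLink = Promise.resolve(1)
  .then((value) => {
    events.push('first handler got ' + value);
    return value + 1;
  })
  .then((value) => {
    events.push('second handler got ' + value);
    return value + 1;
  });

// We reach this line before either handler has run: the chain is already
// fully linked, but nothing in it has executed yet.
events.push('chain built');

lastLink.then((value) => {
  events.push('caller got ' + value);
  console.log(events.join('\n'));
  // chain built
  // first handler got 1
  // second handler got 2
  // caller got 3
});
```

Note that `lastLink` is the promise returned by the second then, which is why the caller ends up with 3 rather than 1.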

On to the code. Our first step is opening the connection (I'm using bluebird as my promise library), but before that we need to promisify the pg library. This creates an *Async version of each function, giving us a promise-based flow instead of a callback flow. We do this against both the main pg library as well as all the instance methods of pg.Client:

Promise = require('bluebird')

pg = Promise.promisifyAll(require('pg'))
Promise.promisifyAll(pg.Client.prototype)

# also promisify the bcrypt library
bcrypt = Promise.promisifyAll(require('bcrypt'))
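For a rough picture of what promisification does to a single function, here is a hand-rolled sketch in plain JavaScript. It is not bluebird's implementation; `promisify` and `findUser` are hypothetical names, and the sketch only assumes the usual Node convention that the callback comes last and receives `(err, result)`.

```javascript
// A simplified stand-in for what a promisifier does; not bluebird's actual code.
function promisify(fn) {
  return function (...args) {
    return new Promise((resolve, reject) => {
      // Append a Node-style callback that settles the promise.
      fn.call(this, ...args, (err, result) => {
        if (err) reject(err);
        else resolve(result);
      });
    });
  };
}

// Hypothetical callback-style function standing in for a database lookup.
function findUser(email, callback) {
  setTimeout(() => {
    if (email === 'ann@example.com') callback(null, { id: 1, email });
    else callback(new Error('no such user'));
  }, 0);
}

const findUserAsync = promisify(findUser);

findUserAsync('ann@example.com')
  .then((user) => console.log('found user', user.id))
  .catch((err) => console.log('lookup failed:', err.message));
```

The callback version and the promise version do the same work; only the way the result is delivered changes.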

This means that we can now call connectAsync against pg and get a promise (rather than calling connect which expects a callback). Similarly, we can now call queryAsync against any instance of pg.Client. With this in place, we can open our connection and get the row:

@find_by_credentials: (email, password) ->
  pg.connectAsync(connection_string)
  .then (conn) -> conn.queryAsync("select * from users where email = $1", [email])

This says "create a connection and pass it to me so that I can find the row". The next step says "pass me the row so I can verify the user":

@find_by_credentials: (email, password) ->
  pg.connectAsync(connection_string)
  .then (conn) ->
    conn.queryAsync("select * from users where email = $1", [email])
  .then (result) ->
    if result.rows.length == 0
      false
    else
      data = result.rows[0]
      bcrypt.compareAsync(password, data.password)

Notice that our second handler isn't nested within the first, but rather is attached to it, which is why its input is the result of our query. In our next handler we'll run into a serious problem. Can you spot why the following code won't work?

@find_by_credentials: (email, password) ->
  pg.connectAsync(connection_string)
  .then (conn) ->
    conn.queryAsync("select * from users where email = $1", [email])
  .then (result) ->
    return false if result.rows.length == 0
    user = result.rows[0]
    bcrypt.compareAsync(password, user.password)
  .then (is_match) ->
    return null unless is_match
    user.token = crypto.createHash('sha256').update(Math.random().toString()).digest('hex')
    conn.queryAsync("update users set token = $1 where id = $2", [user.token, user.id])
  .then ->
    user

The last two handlers have a major flaw: neither user nor conn is in scope. There are a couple of ways to solve this. The one that I'll use is bind. Bluebird's bind function lets us define what this is within our chain. We can bind to anything. In our case, we'll bind to an empty hash and use it to attach state:

@find_by_credentials: (email, password) ->
  # Notice the extra call to bind here
  pg.connectAsync(connection_string).bind({})
  .then (conn) ->
    # this (or @ in coffeescript), is what we bound to. In this case, an empty
    # hash, which is all we need to pass some state around
    @.conn = conn
    conn.queryAsync("select * from users where email = $1", [email])
  .then (result) ->
    return false if result.rows.length == 0
    @.user = result.rows[0]
    bcrypt.compareAsync(password, @.user.password)
  .then (is_match) ->
    return null unless is_match
    @.user.token = crypto.createHash('sha256').update(Math.random().toString()).digest('hex')
    @.conn.queryAsync("update users set token = $1 where id = $2", [@.user.token, @.user.id])
  .then (updated) ->
    # updated is null when the credentials didn't match
    if updated then @.user else null
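bind is specific to bluebird. One of the other ways to solve the scoping problem, sketched here in plain JavaScript with native promises, is to declare the shared variables in the enclosing function so that every handler closes over them. Everything below (`fakeConnect`, `fakeCompare`, the in-memory `rows`) is a hypothetical stand-in for pg and bcrypt so that the sketch runs on its own.

```javascript
// Hypothetical stand-ins for pg and bcrypt, so the sketch is self-contained.
const rows = [{ id: 1, email: 'ann@example.com', password: 'secret', token: null }];

function fakeConnect() {
  const conn = {
    findByEmail: (email) => Promise.resolve(rows.filter((r) => r.email === email)),
    setToken: (id, token) => {
      rows.find((r) => r.id === id).token = token;
      return Promise.resolve(true);
    },
  };
  return Promise.resolve(conn);
}

// Plain comparison for illustration only; real code would use bcrypt.
const fakeCompare = (given, stored) => Promise.resolve(given === stored);

function findByCredentials(email, password) {
  // Shared state lives in the enclosing scope instead of on a bound `this`.
  let conn;
  let user;
  return fakeConnect()
    .then((c) => {
      conn = c;
      return conn.findByEmail(email);
    })
    .then((found) => {
      if (found.length === 0) return false;
      user = found[0];
      return fakeCompare(password, user.password);
    })
    .then((isMatch) => {
      if (!isMatch) return null;
      user.token = 'token-' + user.id; // placeholder token for the sketch
      return conn.setToken(user.id, user.token);
    })
    .then((updated) => (updated ? user : null));
}

findByCredentials('ann@example.com', 'secret')
  .then((u) => console.log(u ? u.token : 'login failed')); // token-1
```

The chain stays just as flat as the bind version; the trade-off is that the state is held in local variables rather than on an object that travels with the chain.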

There's one last thing we need to do. Despite what our above code claims, pg's connect function doesn't give us a single connection object. It gives us both a connection and a callback which should be used to release the connection back into the pool, so our promise resolves with an array of the two. We could easily program against the array. Alternatively, we can use spread instead of then, which flattens the array out into parameters:

@find_by_credentials: (email, password) ->
  pg.connectAsync(connection_string).bind({})
  .spread (conn, release) ->
    conn.queryAsync("select * from users where email = $1", [email])
    ...
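For comparison, native promises have no spread method, but the same unpacking can be done by destructuring the array in the handler's parameter list. A small sketch in plain JavaScript, where `fakeConnect` is a hypothetical function that resolves with a `[conn, release]` pair:

```javascript
// Hypothetical stand-in that resolves with [connection, releaseFunction].
function fakeConnect() {
  const conn = { name: 'conn-1' };
  const release = () => 'released ' + conn.name;
  return Promise.resolve([conn, release]);
}

// Programming against the array directly.
const viaIndex = fakeConnect().then((pair) => pair[1]());

// Destructuring the array into named parameters, much like spread does.
const viaDestructuring = fakeConnect().then(([conn, release]) => {
  console.log('using', conn.name);
  return release();
});

Promise.all([viaIndex, viaDestructuring])
  .then((results) => console.log(results)); // [ 'released conn-1', 'released conn-1' ]
```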

Next we'll want to call release when we're done with our code. Importantly, we want to do this whether or not an error happens. For this reason, we'll use finally:

@find_by_credentials: (email, password) ->
  pg.connectAsync(connection_string).bind({})
  .spread (conn, release) ->
    @.conn = conn
    # keep track of this
    @.release = release
    conn.queryAsync("select * from users where email = $1", [email])
  .then (result) ->
    return false if result.rows.length == 0
    @.user = result.rows[0]
    bcrypt.compareAsync(password, @.user.password)
  .then (is_match) ->
    return null unless is_match
    @.user.token = crypto.createHash('sha256').update(Math.random().toString()).digest('hex')
    @.conn.queryAsync("update users set token = $1 where id = $2", [@.user.token, @.user.id])
  .then (updated) ->
    # updated is null when the credentials didn't match
    if updated then @.user else null
  .finally -> @.release()

It's important to note the reason that finally is called last. It isn't some magic property of finally to execute itself at the end of the chain. When finally is called is merely a function of where we attach it. If we attached finally after our first then, our last handler would end up with an invalid connection. The reason for finally's name is simply that it's called in both the normal and error flow. (If you throw in one of the thens, the following thens won't execute, whereas finally will.)
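Both points can be checked with a short sketch. It uses plain JavaScript and the native `Promise.prototype.finally` (an assumption; the post itself uses bluebird's finally), and `run` and `releaseTooEarly` are illustrative names that only record the order of events.

```javascript
// Assumption: native promises with Promise.prototype.finally (ES2018).
function run(shouldFail) {
  const events = [];
  return Promise.resolve()
    .then(() => {
      events.push('query');
      if (shouldFail) throw new Error('boom');
    })
    .then(() => events.push('update'))     // skipped when the previous handler throws
    .finally(() => events.push('release')) // runs in both flows
    .catch(() => events.push('error handled'))
    .then(() => events);
}

// Same idea, but finally is attached right after the first then.
function releaseTooEarly() {
  const events = [];
  return Promise.resolve()
    .then(() => events.push('query'))
    .finally(() => events.push('release'))
    .then(() => events.push('update'))     // runs after the release
    .then(() => events);
}

const show = (events) => console.log(events.join(' > '));

run(false).then(show)                        // query > update > release
  .then(() => run(true)).then(show)          // query > release > error handled
  .then(() => releaseTooEarly()).then(show); // query > release > update
```

Nothing about finally pushes it to the end; in the last case it simply runs where it was attached.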

Hopefully this example helped. You can see that the flattening isn't artificial. It really comes down to understanding that the chain is created now but executed in the future, in sequence. Also, while we focused on this single type of flow (which I think is where most developers get stuck), there are other useful patterns. In particular, there are times where you'll want to attach multiple handlers to the same promise, rather than create a chain. What's important though is that you now know what both of these will output, and why:

Promise = require('bluebird')

p = new Promise (resolve) -> resolve(1)
p.then (value) -> value + 1
p.then (value) -> value + 1
p.then (value) -> console.log(value)
#vs
new Promise (resolve) -> resolve(1)
.then (value) -> value + 1
.then (value) -> value + 1
.then (value) -> console.log(value)

Update

As a last-ditch effort to make up for my failure in explaining this, I suggest you strip out all the noise and look at the bare code:

@find_by_credentials: (email, password) ->
    pg.conn(cs).then(function(){...}).then(function(){...}).then(function(){...})

What does this function return? It doesn't return the connection. It doesn't return the asynchronously executed functions. It returns whatever the return value of then() is (which is a thenable).

Now expand it just a little:

@find_by_credentials: (email, password) ->
    pg.conn(cs).then(function(CONN){return "A"}).then(function(A){return "B"}).then(function(B){...})

#vs
@find_by_credentials: (email, password) ->
    p = pg.conn(cs)
    p.then(function(CONN){return "A"})
    p.then(function(CONN){return "B"})
    p.then(function(CONN){...})

This should help you see how everything relates to each other.