Rethinkdbdash and client side backtraces posted on 15 August 2015

A bit of context

I released rethinkdbdash 2.1.3 (and 2.1.4) a bit earlier today. Besides adding a few missing syntaxes for 2.1 (r.union and condition.branch(...)), this release fixes an issue with the client-side backtraces, though to be honest, the client-side backtraces were not really working before.

If a RethinkDB query throws an error, the server will provide a backtrace and the driver will reconstruct the query and underline the broken part. For example if a table is missing you will see this error:

ReqlOpFailedError: Table `test.users` does not exist in:
r.db("test").table("users").filter(function(var_1) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^                         
    return var_1("age").gt(18)
})

The query is printed back and the broken parts are highlighted. These backtraces are, in my opinion, one of the little things that make RethinkDB enjoyable to use. However, a few queries cannot be sent to the server, and therefore no backtraces are available for them. Because RethinkDB stores JSON documents, some JavaScript values are forbidden, typically NaN and Infinity. When the driver found a forbidden value, it used to just throw an error saying NaN cannot be converted to JSON. This is painful for two reasons:

  • The presence of NaN is an operational error and the driver should just reject the query, not throw.
  • NaN can easily propagate, and you basically have to inspect every possible variable. The bigger your query, the harder it is to debug.

What's new

Since 2.1.3, rethinkdbdash builds backtraces for these errors and points exactly to where the problem is.

var r = require('rethinkdbdash')();

var ADULT_AGE = 18;
var ADULT_AGE_US = NaN; // oops

r.db('test').table('users').merge(function(user) {
  return r.branch(
    user('location').eq('US'),
    { canDrink: user('age').gt(ADULT_AGE_US) },
    { canDrink: user('age').gt(ADULT_AGE) }
  )
}).run().then(console.log).error(function(error) {
  console.log(error);
});

What will be printed is:

ReqlRuntimeError: Cannot convert `NaN` to JSON in:
r.db("test").table("users").merge(function(var_1) {
    return r.branch(var_1("location").eq("US"), {
        canDrink: var_1("age").gt(NaN)
                                  ^^^ 
    }, {
        canDrink: var_1("age").gt(18)
    })
})

Pretty cool, huh? You can pinpoint exactly where the NaN value is.

Feedback/suggestions? Ping me on Twitter via @neumino or shoot me an email at [email protected].

Thinky 2.1.1 - Updating only relations posted on 13 August 2015

I released thinky 2.1.1 a bit earlier today.

Thinky was built on the assumption that you always retrieve all the documents from the database before updating a relation. Typically, this was the expected workflow:

var thinky = require('thinky')();
var type = thinky.type;

var User = thinky.createModel("User", {
  id: type.string(),
  email: type.string().required(),
  name: type.string().required(),
  adult: type.boolean().default(false)
});

User.hasAndBelongsToMany(User, "friends", "id", "id");

User.get("3851d8b4-5358-43f2-ba23-f4d481358901")
    .getJoin({friends: true}).run().then(function(user) {
  // user is fully defined with its friends
  user.friends = [];
  return user.saveAll({friends: true});
}).then(function(user) {
  // user 3851d8b4-5358-43f2-ba23-f4d481358901 has no more friends!
});

Looking at the GitHub issue tracker, many developers had issues with this workflow. Trying to save a relation with pre-existing documents can throw errors (like #245), and the behavior was not clear to many users (#309). Thinky 2.1.1 takes a stab at these problems and introduces two new commands:

 - addRelation(field, joinedDocument)
 - removeRelation(field[, joinedDocument])

You can chain these commands on a query that returns one document and add/remove a relation.

Example: Add a new friend

User.hasAndBelongsToMany(User, "friends", "id", "id");

User.get("3851d8b4-5358-43f2-ba23-f4d481358901")
    .addRelation('friends', {id: '0e4a6f6f-cc0c-4aa5-951a-fcfc480dd05a'})
    .run()

Example: Remove a friend

User.get("3851d8b4-5358-43f2-ba23-f4d481358901")
    .removeRelation('friends', {id: '0e4a6f6f-cc0c-4aa5-951a-fcfc480dd05a'})
    .run()

Example: Remove all friends

User.get("3851d8b4-5358-43f2-ba23-f4d481358901")
    .removeRelation('friends')
    .run()

The second argument to addRelation needs to provide just enough information to create a relation.

  • In the case of a hasOne or hasMany relation, only the primary key is required.
  • In the case of a belongsTo or hasAndBelongsToMany relation, the primary key or the right key is required - one is enough!

The second argument of removeRelation is not always required. If it is provided, it only needs enough information to find the relation to delete.

  • For hasOne and belongsTo relations, the joinedDocument is not required and should not be provided.
  • For hasMany and hasAndBelongsToMany relations, the absence of the joinedDocument makes the command delete all the relations for the given document. If a document is provided, the primary key or the right key is required - again, one of them is enough.
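
The rules for both commands can be summarized as a small check. This is only a sketch of the rules as described here, not thinky's code: the function checkJoinedDocument and its parameters are hypothetical, and primaryKey and rightKey stand for the field names involved in the relation.

```javascript
// Hypothetical check mirroring the rules for the joinedDocument argument.
// Returns true when `doc` is acceptable for the given command and relation.
function checkJoinedDocument(command, relation, doc, primaryKey, rightKey) {
  var provided = doc !== undefined && doc !== null;
  var hasPrimary = provided && doc[primaryKey] !== undefined;
  var hasRight = provided && doc[rightKey] !== undefined;

  if (command === 'removeRelation') {
    if (relation === 'hasOne' || relation === 'belongsTo') {
      return !provided; // should not be provided
    }
    // hasMany / hasAndBelongsToMany: no document means "remove all"
    return !provided || hasPrimary || hasRight;
  }

  // addRelation
  if (relation === 'hasOne' || relation === 'hasMany') {
    return hasPrimary; // only the primary key is accepted
  }
  return hasPrimary || hasRight; // belongsTo / hasAndBelongsToMany
}
```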

This update should make it easier to manipulate relations without having to retrieve the documents from the database first. Beyond a simpler interface, it should also ease the load on the database a little.

Feedback/suggestions? Ping me on Twitter via @neumino or shoot me an email at [email protected].

Rethinkdbdash bug #103 posted on 29 June 2015

I just fixed a sneaky bug in rethinkdbdash and thought it would be worth quickly writing about it.

Here is the broken snippet:

// Opening a socket
self.connection = net.connect({
  host: self.host,
  port: self.port,
  family: family
});

self.connection.on('end', function(error) {
  // ...
});
self.connection.on('close', function(error) {
  // ...
});
self.connection.setNoDelay();
self.connection.on('connect', function() {
  // Do the handshake
  // var initBuffer = ...
  // var lengthBuffer = ...
  // var authBuffer ...
  // var protocolBuffer = ...
  self.connection.write(Buffer.concat([initBuffer, lengthBuffer, authBuffer, protocolBuffer]));
});
self.connection.once('error', function(error) {
  // ...
});
self.connection.once('end', function() {
  self.open = false;
});

self.connection.on('data', function(buffer) {
  // Handle handshake and responses
});

I have cleaned up the code a bit, but the main idea is to open a TCP connection, send a "handshake" and listen for data (the response to the handshake and the responses to the queries we will send).

This code, however, can throw with:

Process exit at 2015-06-27T15:15:18.897Z
events.js:141
      throw er; // Unhandled 'error' event
            ^
Error: write after end
    at writeAfterEnd (_stream_writable.js:158:12)
    at Socket.Writable.write (_stream_writable.js:203:5)
    at Socket.write (net.js:615:40)
    at /home/u/app/node_modules/rethinkdbdash/lib/connection.js:81:23
    at Object.tryCatch (/home/u/app/node_modules/rethinkdbdash/lib/helper.js:156:3)
    at Socket.<anonymous> (/home/u/app/node_modules/rethinkdbdash/lib/connection.js:80:12)
    at emitNone (events.js:72:20)
    at Socket.emit (events.js:166:7)
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1029:10)

The reason is that self.connection.write is not safe there.

The main problem is that the TCP connection can already be closed when you enter the callback for the connect event. In this case, we try to write on the socket, and this immediately emits an error. Because we bind the listener for error after the one for connect, we end up with an error that nothing catches, which eventually crashes the worker.

The solution is simply to bind your listener for error before trying to write anything on the socket. A better solution would probably be for node to emit the event at the next tick, as mentioned in the TODO.
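
Here is a minimal sketch of why the order matters. It uses a plain EventEmitter as a stand-in for the socket (makeFakeSocket is a hypothetical helper, not part of rethinkdbdash or net), with a write that fails by emitting error synchronously. An EventEmitter that emits error with no listener bound throws, which is the "Unhandled 'error' event" seen in the trace above.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical stand-in for a socket that is already closed:
// write() fails by emitting 'error' synchronously.
function makeFakeSocket() {
  var socket = new EventEmitter();
  socket.write = function() {
    socket.emit('error', new Error('write after end'));
  };
  return socket;
}

// Unsafe order: write first, bind the 'error' listener afterwards.
function writeThenListen() {
  var socket = makeFakeSocket();
  try {
    socket.write('handshake'); // no 'error' listener yet: emit() throws
  } catch (error) {
    return 'threw: ' + error.message;
  }
  socket.on('error', function() {});
  return 'ok';
}

// Safe order: bind the 'error' listener before writing anything.
function listenThenWrite() {
  var socket = makeFakeSocket();
  var caught = null;
  socket.on('error', function(error) {
    caught = error;
  });
  socket.write('handshake');
  return caught ? 'handled: ' + caught.message : 'ok';
}
```

In the sketch the unsafe version only survives because of the try/catch around write. In the real connection code the throw happens inside an event callback, so nothing is there to catch it.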

On building a REST API with thinky posted on 18 April 2015

ReQL is an incredibly powerful beast, and thinky wields most of its power by using the exact same API. While thinky can be used like classic ORMs such as mongoose, it would be a shame to miss some of its really nice features.

This article describes how to build the thinky part of a REST API for Express, and is hopefully the first of a series that will showcase what thinky can do. This article focuses only on thinky. If you want to learn how Express works, there is a plethora of tutorials on the Internet.

I. Create the model

var thinky = require('thinky')();
var type = thinky.type;
var r = thinky.r;

var User = thinky.createModel('User', {
  id: type.string(),
  name: type.string().required(),
  email: type.string().email().required(),
  createdAt: type.date().default(r.now())
});

The model has 4 fields:

  • id: a simple string.
  • name: a required string.
  • email: a required string that should be a valid email.
  • createdAt: the date at which the user was created.

The table will be automatically created under the hood. If you immediately fire queries while the table is not ready, the queries will be queued.

II. Insert a new user

function insert(request, response, next) {
  var user = new User(request.body);

  user.save().then(function(result) {
    // user === result
    response.send(JSON.stringify(result));
  }).error(function(error) {
    // Duplicate primary key, not valid document, network errors etc.
    response.status(500).send({error: error.message});
  });
}

There are a few things to note here:

  • The object request.body only needs to provide two fields: name and email.
  • The field id is the primary key and if undefined, will be automatically generated by RethinkDB when the document is inserted.
  • The field createdAt, when undefined, is automatically set to r.now() by thinky and will be replaced in the database by the time at which the query is executed.

III. Get one document by primary key

function get(request, response, next) {
  User.get(request.id).run().then(function(user) {
    response.send(JSON.stringify(user));
  }).error(function(error) {
    // Document not found, network errors etc.
    response.status(500).send({error: error.message});
  });
}

IV. Update a user given its primary key

function update(request, response, next) {
  User.get(request.id).update(request.body).run().then(function(user) {
    response.send(JSON.stringify(user));
  }).error(function(error) {
    // Document not found, not valid document, network errors etc.
    response.status(500).send({error: error.message});
  });
}

If you look at this snippet, a single query is executed, not two. ORMs usually require you to write something like the code below, which executes two queries.

// Works with thinky, but you do not have to run two queries.
function update(request, response, next) {
  User.get(request.id).run().then(function(user) {
    user.merge(request.body);
    return user.save();
  }).then(function(user) {
    response.send(JSON.stringify(user));
  }).error(function(error) {
    response.status(500).send({error: error.message});
  });
}

So what does thinky do in the first snippet?

  • It first validates all the fields passed in update.
  • It runs the update query in RethinkDB.
  • It validates the whole new document (returned by the update query).

Thinky validates the whole document again because it can also validate across multiple fields (like checking that a user is more than 21 if he lives in the US, and more than 18 otherwise).

In the most common case, you just validate the type of each field, so the third step will never fail. If it does, the document will be reverted (and only in this case are two queries executed).
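
The three steps and the revert can be sketched in plain JavaScript. This is not thinky's implementation: the store object stands in for the table, and validateFields, validateDocument and update are hypothetical names with made-up rules based on the age example above.

```javascript
// Step 1 stand-in: check only the fields being changed.
function validateFields(changes) {
  if ('age' in changes && typeof changes.age !== 'number') {
    throw new Error('age must be a number');
  }
}

// Step 3 stand-in: a rule that spans two fields (location and age).
function validateDocument(doc) {
  var minimum = doc.location === 'US' ? 21 : 18;
  if (doc.age <= minimum) {
    throw new Error('user must be more than ' + minimum);
  }
}

// `store` is a plain object standing in for the table.
function update(store, id, changes) {
  validateFields(changes);
  var previous = store[id];
  // Step 2 stand-in for the update query: merge and write once.
  var next = Object.assign({}, previous, changes);
  store[id] = next;
  try {
    validateDocument(next);
  } catch (error) {
    store[id] = previous; // revert: the only case that needs a second write
    throw error;
  }
  return next;
}
```

The cross-field rule cannot be checked in the first step because the changes alone may not contain both fields, which is why the whole document is validated after the write.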

Note: The user may be returned as undefined if the update is a no-op query. This is currently a regression with 2.0 (see rethinkdb/rethinkdb#4068 to track progress).

V. Delete a user given its primary key

function remove(request, response, next) {
  User.get(request.id).delete().execute().then(function(result) {
    response.send(JSON.stringify({status: "ok"}));
  }).error(function(error) {
    // Document not found, network error etc.
    response.status(500).send({error: error.message});
  });
}

We use execute here and not run because no document will be returned.

VI. Return all users

function all(request, response, next) {
  User.run().then(function(users) {
    response.send(JSON.stringify(users));
  }).error(function(error) {
    // Network errors etc.
    response.status(500).send({error: error.message});
  });
}

VII. Pagination

var perPage = 50;

function range(request, response, next) {
  var start = (request.start) ? request.start : r.minval;
  User.between(start, r.maxval).limit(perPage).run().then(function(users) {
    response.send(JSON.stringify(users));
  }).error(function(error) {
    response.status(500).send({error: error.message});
  });
}

Pagination here is done via primary key with between/limit, and not with skip/limit for performance reasons.
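
To illustrate the idea outside of the database, here is a minimal sketch on an in-memory array sorted by id. The functions lowerBound and pageFrom are hypothetical. The binary search plays the role of the primary key index: it jumps straight to the start key instead of walking over all the skipped documents, which is what skip has to do.

```javascript
// Index of the first user whose id is >= start, found by binary search.
// `users` must be sorted by id.
function lowerBound(users, start) {
  var low = 0;
  var high = users.length;
  while (low < high) {
    var middle = (low + high) >> 1;
    if (users[middle].id < start) {
      low = middle + 1;
    } else {
      high = middle;
    }
  }
  return low;
}

// Returns up to perPage users starting at the key `start` (inclusive),
// or from the beginning when `start` is undefined.
function pageFrom(users, start, perPage) {
  var first = start === undefined ? 0 : lowerBound(users, start);
  return users.slice(first, first + perPage);
}
```

Note that the start key is included in the page, which matches the default closed left bound of between. A client that passes the last id it has seen will get that document again as the first result.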

Need the number of users?

function count(request, response, next) {
  User.count().execute().then(function(count) {
    response.send(JSON.stringify(count));
  }).error(function(error) {
    response.status(500).send({error: error.message});
  });
}

That's it! Stay tuned for the next article!