Monday, 14 January 2013

Playing with NodeJS and MongoDB (Part 1)


As I say all the time, I love JavaScript for being dynamic and accessible. Today's story is about a slightly strange but technologically interesting co-mutation project :) Why do I call it that? Because it really is one. Nowadays almost anything has a JS binding, and with Node connecting them together is easy. Caveat: making the result nice is hard, though. That is what I call art.


I was on a dating site looking for matches and realized there are certain match patterns associated with different regions of the Earth. So I wondered how I could best see where my best matches are. The dating site offered a simple flat list: you can see some relevant results, but it has no real statistical or visual value. Why not collect them on a map, so you could see it like a heatmap? Sounds like an evil plan ]:D

What do we need? First we need to get the data: a list of ladies with their location and match score. Since the site has no public API, we need a crawler. That's not too hard: we can make a simple bookmarklet that you press whenever you want to add a list to your collection. We store the entries and then plot them on a map. We had better cache the geolocation lookups so the geolocation API won't be mad at us.

So our recipe list is: a bookmarklet, a NodeJS server listening to the incoming lady data, a basic NodeJS routing service - so we can use the same instance, MongoDB to save the information, Google Maps API to build the map and present the markers and finally a little geolocation support. And tea, that's very important. Green.

The bookmarklet looks like this:

      // Short aliases keep the final bookmarklet URL compact.
      var _ = 'getElementsByClassName';
      var $ = 'innerHTML';
      var u = document[_]('match_row');
      for (var i = 0; i < u.length; i++) {
        // Encode free-text values so they survive the query string.
        var l = encodeURIComponent(u[i][_]('location')[0][$]);
        var m = parseInt(u[i][_]('match')[0][_]('percentage')[0][$], 10);
        var n = encodeURIComponent(u[i][_]('username')[0][$]);
        var url = 'http://localhost:8888/save?location=' + l + '&match=' + m + '&username=' + n;
        console.log(url);
        // Fire-and-forget GET to the local Node server.
        var xmlHttp = new XMLHttpRequest();
        xmlHttp.open("GET", url, true);
        xmlHttp.send(null);
      }


I used some shortcuts so the code fits comfortably in a URL. It does one simple thing: it iterates through the match list, collects the information and sends it to our keen NodeJS endpoint. If you develop a similar thing you might want to save a sample HTML snippet and write your bookmarklet first as a regular function. Why? Because in the end it looks like this:

<a href="javascript:var _='getElementsByClassName';var $='innerHTML';var u=document[_]('match_row');for(var i=0;i<u.length;i++){var l=encodeURIComponent(u[i][_]('location')[0][$]);var m=parseInt(u[i][_]('match')[0][_]('percentage')[0][$],10);var n=encodeURIComponent(u[i][_]('username')[0][$]);var url='http://localhost:8888/save?location='+l+'&match='+m+'&username='+n;console.log(url);var xmlHttp=new XMLHttpRequest;xmlHttp.open('GET',url,true);xmlHttp.send(null)};void(0);">Ladycrawler</a>
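One way to follow the "write it as a function first" advice is to keep the URL-building step in a plain function that can be tried outside the browser. This is only a sketch: `buildSaveUrl` and its `base` and `row` parameters are made-up names, not part of the bookmarklet itself.

```javascript
// Hypothetical helper: builds the /save URL for one scraped match row.
// `row` is assumed to hold the raw strings read from the page.
function buildSaveUrl(base, row) {
  var match = parseInt(row.match, 10); // "87%" becomes 87
  return base + '/save' +
    '?location=' + encodeURIComponent(row.location) +
    '&match=' + (isNaN(match) ? '' : match) +
    '&username=' + encodeURIComponent(row.username);
}

// Example call with invented sample data.
console.log(buildSaveUrl('http://localhost:8888', {
  location: 'Budapest, Hungary',
  match: '87%',
  username: 'some_user'
}));
```

Once the function behaves on the saved sample, squeezing it into a single `javascript:` line is a mechanical step.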


Also, you may have noticed that we cross the line of the miraculous SOP (same-origin policy) by calling a different origin (our localhost). Don't worry: we only need the request to reach the server, not to read the response ;) Yay.
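If the bookmarklet ever had to read the reply (say, to show how many rows were saved), the server would need to opt in through a CORS response header. A minimal sketch of that idea, which the server in this post does not do; `sendReadableText` is a made-up helper name:

```javascript
// Hypothetical helper: sends a plain-text reply that pages on other
// origins are allowed to read.
function sendReadableText(response, status, body) {
  response.writeHead(status, {
    'Content-Type': 'text/plain',
    // Allows any origin; narrow this down for anything beyond a toy.
    'Access-Control-Allow-Origin': '*'
  });
  response.end(body + '\n');
}
```

For the fire-and-forget requests used here the header is unnecessary, which is why the rest of the post leaves it out.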

So now let's write the server. You've downloaded the NodeJS install package and set it up. Let's make our little servlet:

var http = require('http');
var url = require('url');
var server = http.createServer(onRequestReceived);

server.listen(8888, 'localhost');

function onRequestReceived(request, response) {

  var parsed_url = url.parse(request.url, true);

}


That's already something: at this point we have the URL parser and the HTTP server running. Let's write our dead simple router for the '/save' path. The switch goes inside onRequestReceived, right after the URL is parsed:

  switch (parsed_url.pathname) {
    case '/save':
      doDataSave(response, parsed_url);
      break;
    default:
      response.writeHead(200, {'Content-Type': 'text/plain'});
      response.end('NodeJS server is working' + "\n");
      break;
  }
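As more paths get added, the same dispatch can be written as a lookup table instead of a growing switch. The following is a sketch under assumptions: `routes` and `dispatch` are hypothetical names, and the '/save' handler is a stand-in that only replies.

```javascript
// Hypothetical route table: pathname -> handler(response, parsed_url).
var routes = {
  '/save': function (response, parsed_url) {
    response.writeHead(200, {'Content-Type': 'text/plain'});
    response.end('Request has been processed\n');
  }
};

// Looks up the handler for the requested path and falls back to the
// same status message the switch uses in its default branch.
function dispatch(response, parsed_url) {
  var known = Object.prototype.hasOwnProperty.call(routes, parsed_url.pathname);
  if (known) {
    routes[parsed_url.pathname](response, parsed_url);
    return;
  }
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('NodeJS server is working\n');
}
```

Adding a new path then means adding one entry to `routes`; for a single path the switch is just as good.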


And the call handler for that:

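function doDataSave(response, parsed_url) {
  // Called via Object.prototype because newer Node versions return a query
  // object without the usual Object methods.
  if (!Object.prototype.hasOwnProperty.call(parsed_url.query, 'username')) {
    // Always answer, otherwise the connection is left hanging.
    response.writeHead(400, {'Content-Type': 'text/plain'});
    response.end('Missing username' + "\n");
    return;
  }

  // save the data somehow

  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Request has been processed' + "\n");
}


You can see we don't do much: we answer with 400 if the username is missing and with 200 otherwise. Now we should set up the model backend. Let's install the MongoDB binaries and the MongoDB NodeJS connector.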

When you work with MongoDB in NodeJS you will see it's heavily event driven: every operation reports back through a callback. That makes it a little uncertain when a db instance is actually open or closed, so I decided to keep one connection alive and use that for all storage actions.

Let's make a separate file for db handlers and initiate the connection:

var mongodb = require('mongodb');
// Host and port can be overridden through environment variables.
var host = process.env['MONGO_NODE_DRIVER_HOST'] != null ? process.env['MONGO_NODE_DRIVER_HOST'] : 'localhost';
var port = process.env['MONGO_NODE_DRIVER_PORT'] != null ? parseInt(process.env['MONGO_NODE_DRIVER_PORT'], 10) : mongodb.Connection.DEFAULT_PORT;
var db_name = 'okcupid';
var collection_name = 'girl';
var db = new mongodb.Db(db_name, new mongodb.Server(host, port, {}), {native_parser: true});

// Opens the connection once and hands the collection to the callback.
exports.openCollection = function(callback) {
  db.open(function(err, db){
    onDBOpen(err, db, callback);
  });
};

function onDBOpen(err, db, callback) {
  if (err) { throw err; } // without a connection there is nothing to serve
  db.collection(collection_name, function(err, collection){
    onCollection(err, collection, callback);
  });
}

function onCollection(err, collection, callback) {
  if (err) { throw err; }
  callback(collection);
}


What it does is look for the Mongo server at the default host and port and create a proxy instance. What we have to do is get hold of this instance so the business logic can do its work with it. That's the openCollection call. In NodeJS you attach functions to the module's exports object, and whoever requires the file gets access to them, because this MongoDB handler is a module now.

Let's write the accessor in the main server file:

var okcMapStorage = require('./okcmap.storage');

var collection = null;

okcMapStorage.openCollection(function(_collection) {
  collection = _collection;
  console.log('Db is ready to use');
});


Our global (agile) collection variable is now ready to work :) Let's add the save action to our doDataSave function to save the person info:

  collection.findOne({username: parsed_url.query.username}, function(err, item) {
    if (!item) {
      collection.insert(parsed_url.query);
    }
  });


We check if the person is already in the database - if it isn't then we add it. Brilliant, we have our storage working just fine.
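The find-then-insert pair can also be collapsed into a single upsert, which additionally refreshes the stored location and score when a person shows up again. A sketch, assuming the same 1.x-era driver used above, whose collections expose `update(selector, document, options, callback)`; `saveMatch` is a hypothetical helper name, and newer driver releases offer `updateOne` for the same purpose.

```javascript
// Sketch: insert the person if new, otherwise overwrite the stored record.
// `collection` is assumed to be the object handed over by openCollection.
function saveMatch(collection, query, callback) {
  var doc = {
    username: query.username,
    location: query.location,
    match: parseInt(query.match, 10) // query values arrive as strings
  };
  collection.update(
    {username: doc.username}, // selector: one record per username
    doc,                      // replacement document
    {upsert: true, w: 1},     // create if missing, wait for acknowledgement
    callback
  );
}
```

Inside doDataSave the call would look like `saveMatch(collection, parsed_url.query, function (err) { ... })`, with the reply sent from the callback.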

If you want to check you can use the command line mongo client:

cd MY_MONGO_PATH/bin
./mongod
# Now the server is running. In a second terminal, from the same directory:
./mongo okcupid
db.girl.find().count();
# Total number.
db.girl.find();
# List.


In the second part we will create the Map that displays the items, the geolocation service and the associated NodeJS backend.

---

Do you have a similar mutant project? I'd love to hear about the concept. Please, share.

Peter

2 comments:

  1. Nice one Peter!

    Have you considered using upsert (http://docs.mongodb.org/manual/applications/create/#crud-create-update) instead of insert when saving the lady-data? In that case you will not loose track of them when they relocate. :P

  2. Oh man, that's brilliant! Thanks a lot Balint! Also it could filter out the high-maintenance (likes travelling) girls ;)
