Long Polling – tutorial, part 2

how to mimic “server push” with “client pull”?

As you may already have seen on Wikipedia, long polling simply means requests (in our case ajax requests) that can wait quite a long time for the server's response. The server doesn’t send a single bit of information until something interesting (for the request) appears.

There are two possible results:

  1. something appears (in our case, a new post in the database), so the server sends it, the request succeeds and does something with the received data
  2. nothing appears for some time, and the request times out.

Both paths end the same way: the request starts another one to wait for something to happen.
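
The cycle described above can be sketched without any server at all. This is only an illustration under stated assumptions: longPoll and fakeRequest are made-up names, fakeRequest stands in for the ajax call and answers immediately, and the rounds limit exists only so the demo stops (a real long poll keeps going).

```javascript
// Hypothetical helper: keeps asking until `rounds` requests have finished.
// `request(success, error)` stands in for the ajax call.
function longPoll(request, onData, rounds) {
	if (rounds <= 0) {
		return;
	}
	var next = function () {
		longPoll(request, onData, rounds - 1);
	};
	request(
		function (data) { // path 1: something appeared
			onData(data);
			next();
		},
		function () { // path 2: time-out, nothing to process
			next();
		}
	);
}

// Fake request for the demo: odd calls succeed, even calls "time out"
var calls = 0;
var received = [];
function fakeRequest(success, error) {
	calls += 1;
	if (calls % 2 === 1) {
		success("post " + calls);
	} else {
		error();
	}
}

longPoll(fakeRequest, function (data) { received.push(data); }, 4);
console.log(calls, received);
```

Whichever callback fires, the next request is started from the same place, which is the only thing that matters here.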

You probably already have some ideas how to solve it. If so, please try them (and if your solution differs from mine, I’d love to see it, so please post it in the comments).

There are some pitfalls, though. I’ll show most of them.

assumptions

The thing to remember at this point is that requests are going out all the time, so sending HTML code back and forth is not a good idea. XMLHttpRequests should expect JSON responses (stripped of whitespace if possible).
Another conclusion: if the server has nothing to send, it’s better for the xhr to time out than to receive a “nothing new” response, even though this is not an elegant solution (a time-out is still a kind of error).

We should decide how much information has to be sent to the server in order to get new messages in the thread. The minimum is the thread’s root id and the last message id (the biggest one). With that information it is possible to get all messages that are in the subtree rooted at that particular id and whose id is bigger than the last one. But in that case we need at least two queries: one to get the root's lft and rght values, and a second to get all its descendants (whose lft values lie between the root's lft and rght).
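
Here is a minimal sketch of that first approach, run against an in-memory array instead of the database. The sample rows and the helper name newPostsInThread are made up for illustration; only the lft/rght/id conditions follow the description above.

```javascript
// Hypothetical sample rows; lft/rght follow the nested set model
var rows = [
	{id: 1, parent_id: null, lft: 1, rght: 10},
	{id: 2, parent_id: 1, lft: 2, rght: 5},
	{id: 3, parent_id: 2, lft: 3, rght: 4},
	{id: 4, parent_id: 1, lft: 6, rght: 7},
	{id: 5, parent_id: null, lft: 11, rght: 14},
	{id: 6, parent_id: 5, lft: 12, rght: 13},
	{id: 7, parent_id: 1, lft: 8, rght: 9}
];

function newPostsInThread(rows, rootId, lastId) {
	// "query" one: fetch the root to learn its lft and rght
	var root = rows.filter(function (p) { return p.id === rootId; })[0];
	if (!root) {
		return [];
	}
	// "query" two: everything inside the root's range with a bigger id
	return rows.filter(function (p) {
		return p.lft > root.lft && p.lft < root.rght && p.id > lastId;
	});
}

var fresh = newPostsInThread(rows, 1, 4);
console.log(fresh.map(function (p) { return p.id; }));
```

With these rows only post 7 qualifies: posts 5 and 6 have bigger ids too, but they belong to another thread.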

The second approach is to send the ids of all messages in the thread and just get all messages whose parent_id is in that set.
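
For comparison, a sketch of the second approach on the same kind of in-memory data. The rows and the helper name newPostsByParents are hypothetical, and skipping the ids the client already has is my assumption (otherwise the known replies would come back every time).

```javascript
// Hypothetical sample rows: posts 2 and 3 are already displayed,
// posts 4 and 8 are new replies, post 6 belongs to another thread
var rows = [
	{id: 1, parent_id: null},
	{id: 2, parent_id: 1},
	{id: 3, parent_id: 2},
	{id: 4, parent_id: 1},
	{id: 6, parent_id: 5},
	{id: 8, parent_id: 3}
];

function newPostsByParents(rows, knownIds) {
	var known = {};
	knownIds.forEach(function (id) { known[id] = true; });
	return rows.filter(function (p) {
		// parent is already displayed, the post itself is not
		return known[p.parent_id] === true && known[p.id] !== true;
	});
}

var fresh = newPostsByParents(rows, [1, 2, 3]);
console.log(fresh.map(function (p) { return p.id; }));
```

Note that a reply to a post the client hasn't seen yet is only picked up on the following request, once its parent is in the set.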

My view is that one should use the cleaner solution (the first one), because we don’t have enough data to decide which one is faster. (When the application is in production you could do some research. Of course we could run some fancy simulations, but that is not the point of this tutorial.)

So create an action that takes two parameters (the tree root id and the last message id in the thread) and sends back new messages (if any) or does nothing (not even sending an empty response).

//controllers/posts_controller.php
	function get_new($rootId, $lastMessageId){
		$root = $this->Post->read(null, $rootId);
		
		$posts = array();
		
		while(empty($posts)){
			$posts = $this->Post->find(
				"all",
				array(
					"conditions"=>array(
						"and"=> array(
							"Post.lft > " => $root["Post"]["lft"],
							"Post.lft < " => $root["Post"]["rght"],
							"Post.id >" => $lastMessageId
						)
					)
				)
			);
			if(empty($posts)){
				sleep(2); // sleep for two seconds
			}
		}
		$this->set("posts", $posts);
		$this->render("get_new", false);
	}

// views/posts/get_new.ctp

You can try it out in your browser. But if you request /posts/get_new/1/8 when 8 is the very last message in this thread, it will choke your server ;) For testing, use parameters for which at least one message will be returned (in the example: /posts/get_new/1/7).

Server issue

This choke on the server is connected to sessions. Session data is saved, and the session implicitly closed, only when the script ends. With PHP's default file-based sessions the session stays locked until then, so other requests from the same browser have to wait. For some reason Apache cannot release the thread for this request while the session is still open (even if you hit cancel in your browser, which in most cases kills the request being processed). What you need is to add session_commit(); at the very beginning of actions that are long polled. (If you need to write to the session, do it as early as possible, before the while loop.)

//controllers/posts_controller.php
	function get_new($rootId, $lastMessageId){
		session_commit();
		$root = $this->Post->read(null, $rootId);
		//...

Now we have a nice action which gives us a JSON string with new messages. Let's load them into our thread.

First of all we’ll need the root id and the last post id stored in the Thread object. The root id is the first element’s parent_id; the last post id we’ll find while generating the thread tree.
Let’s add those properties to the Thread class:

//webroot/js/thread.js
(function(window, document, undefined){
	window.Thread = function(_posts) {
		this.posts = _posts;
		this.rootId = false; //<==
		this.lastId = false; //<==
		this.extendJQuery();
		return this;	
	};
	
})(window, document);

Store the root id and the last message id in the createThread function:

//webroot/js/thread.js
Thread.prototype.createThread = function(){
	if(this.posts.length <1){
		return this;
	}
	this.rootId = this.lastId = this.posts[0].Post.parent_id; //<==
	var _this = this; //need to remember this, because in each method this means actual object from collection we iterate through
	$(this.posts).each(function(k, v){
		if($("#"+v.Post.parent_id).find("ul").length == 0){
			if($("#"+v.Post.parent_id).find("div.reply").length == 0){
				_this.createReplyDiv().appendTo("#"+v.Post.parent_id);
			}
			$("<ul>", {"class": "posts children"}).appendTo("#"+v.Post.parent_id);
		}
		$("<li>", {
			id: v.Post.id,
			innerHTML: "<p>"+v.Post.message+"</p>", // <p> wrapper assumed
			"class": "post child"
		}).
		append(_this.createReplyDiv()).
		appendTo("#"+v.Post.parent_id+">ul");
		if(v.Post.id>_this.lastId){ //<==
			_this.lastId = v.Post.id;
		}
	});
	return this;
};

Now I want to get data from the get_new action, but my Thread object has no idea where to get it from. Let's allow passing the endpoint url to the constructor.

//webroot/js/thread.js
(function(window, document, undefined){
	window.Thread = function(_posts, _updateEndpoint) {//<==
		this.posts = _posts;
		this.rootId = false;
		this.lastId = false;
		this.updateEndpoint = _updateEndpoint; //<==
		this.extendJQuery();
		return this;	
	};
	
})(window, document);

Update the Thread object creation:

//views/layouts/default.ctp
		$(document).ready(function(){
			if(!window.posts) return false;
			window.thread = new Thread(
				posts, 
				"<?php echo $html->url(
							array(
							'controller'=>'posts', 
							'action'=>'get_new')
				);?>"
			).createThread();	
			console.log(thread);
		});

And create the update() method. It should get JSON with new posts, and in case of either success or time-out Thread.update() should be called again:

//webroot/js/thread.js
Thread.prototype.update = function(){
	if(!this.updateEndpoint || !this.rootId || !this.lastId){
		alert("Thread corrupted");
		return false;
	}
	var _this = this; // declared with var so it does not leak into the global scope
	$.ajax({
		url: this.updateEndpoint +"/"+ this.rootId+ "/" + this.lastId,
		dataType: 'json',
		timeout: 5000,
		success: function(){
			_this.update();
		},
		error: function(){
			_this.update();
		}
	});
	return true;
}

bug #1

I found a bug after adding a post with id=10 to the thread. It turned out that I was storing lastId as the biggest id, but in... dictionary order (check in JS: "10">"9" => false). Please patch the last if statement in Thread.createThread() like this:

//...
		if(parseInt(v.Post.id, 10)>_this.lastId){
			_this.lastId = parseInt(v.Post.id, 10);
		}
}
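
To see the difference in isolation, here is a small sketch you can paste into the JS console. The id list and the two helper names are made up for the demo; the ids are strings because that is how they arrive in the thread data in this case.

```javascript
var ids = ["8", "9", "10"]; // hypothetical ids, as strings

// Compares strings, so the order is the dictionary one
function maxAsStrings(list) {
	var max = list[0];
	list.forEach(function (v) {
		if (v > max) {
			max = v;
		}
	});
	return max;
}

// Converts to numbers first, so the order is numeric
function maxAsNumbers(list) {
	var max = parseInt(list[0], 10);
	list.forEach(function (v) {
		var n = parseInt(v, 10);
		if (n > max) {
			max = n;
		}
	});
	return max;
}

console.log(maxAsStrings(ids)); // "9"
console.log(maxAsNumbers(ids)); // 10
```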

Call the newly created update() method in Thread.createThread():

Thread.prototype.createThread = function(){
	var _this = this;
	if(this.posts.length <1){
		return this;
	}
	this.rootId = this.lastId = this.posts[0].Post.parent_id;
	$(this.posts).each(function(k, v){
		if($("#"+v.Post.parent_id).find("ul").length == 0){
			if($("#"+v.Post.parent_id).find("div.reply").length == 0){
				_this.createReplyDiv().appendTo("#"+v.Post.parent_id);
			}
			$("<ul>", {"class": "posts children"}).appendTo("#"+v.Post.parent_id);
		}
		$("<li>", {
			id: v.Post.id,
			innerHTML: "<p>"+v.Post.message+"</p>", // <p> wrapper assumed
			"class": "post child"
		}).
		append(_this.createReplyDiv()).
		appendTo("#"+v.Post.parent_id+">ul");
		if(parseInt(v.Post.id)>_this.lastId){
			_this.lastId = parseInt(v.Post.id);
		}
	});
	this.update(); //<==
	return this;
}

You can now check whether it keeps trying to get new data again and again. Try adding a new post in another browser and see what happens with the request...
As you can see, there's an OK response, but from now on every request gets the same response, and the thread isn't updated. We'll fix that now. First extract the code that creates post elements from the createThread method into its own method, so we can reuse it...

Thread.createThread should look like this:

Thread.prototype.createThread = function(){
	if(this.posts.length <1){
		return this;
	}
	this.rootId = this.lastId = this.posts[0].Post.parent_id;
	var _this = this; //need to remember this, because in each method this means actual object from collection we iterate through
	$(this.posts).each(function(k, v){
		_this.addPost(v.Post.id, v.Post.parent_id, v.Post.message); //<==
	});
	this.update();
	return this;
};

and our new method Thread.addPost:

Thread.prototype.addPost = function(id, parentId, message){
	if($("#"+parentId).find("ul").length == 0){
		if($("#"+parentId).find("div.reply").length == 0){
			this.createReplyDiv().appendTo("#"+parentId);
		}
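		$("<ul>", {"class": "posts children"}).appendTo("#"+parentId);
	}
	$("<li>", {
		id: id,
		innerHTML: "<p>"+message+"</p>", // <p> wrapper assumed
		"class": "post child"
	}).
	append(this.createReplyDiv()).
	appendTo("#"+parentId+">ul");
	// ids arrive as strings, so compare them as numbers (see bug #1)
	if(parseInt(id, 10)>this.lastId){
		this.lastId = parseInt(id, 10);
	}
};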

bug #2

I encountered an even more important bug: we caused a leak on our server (just wait long enough with get_new requests running), because after the xhr times out the script keeps running.
I added a simple counter to the get_new method:

//controllers/posts_controller.php
	function get_new($rootId, $lastMessageId){
		//...
		$counter = 2; //<==
		while(empty($posts)){
			$posts = $this->Post->find(
				"all",
				array(/*...*/)
			);
			if($counter>5){ //<==
				break; //<==
			}else{ //<==
				$counter += 2; //<==
			} //<==
			if(empty($posts)){
				sleep(2); // sleep for two seconds
			}
		}
		$this->set("posts", $posts);
		$this->render("get_new", '');
	}

Why is it incremented by two? Because each sleep lasts 2 seconds, so the counter roughly tracks how long the script has been waiting, and the loop gives up once the counter passes 5 (5 seconds is the timeout of the ajax request).
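
If you want to check the arithmetic, here is the same loop traced in JavaScript under the assumption that no new post ever shows up. Nothing really sleeps here, the seconds are only added up, and traceLoop is a made-up name.

```javascript
// Mirrors the counter logic of get_new for the "nothing new" case
function traceLoop(sleepSeconds, limit) {
	var counter = 2;
	var slept = 0;
	var queries = 0;
	while (true) {
		queries += 1; // the find() call, which returns nothing
		if (counter > limit) {
			break;
		}
		counter += 2;
		slept += sleepSeconds; // stands in for sleep(2)
	}
	return {queries: queries, slept: slept};
}

var result = traceLoop(2, 5);
console.log(result.queries, result.slept); // 3 4
```

So in the worst case the action runs three queries and sleeps about 4 seconds in total before it gives up, which is just under the 5 second ajax timeout.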

To finish, call the extracted method in the success callback:

//webroot/js/thread.js
Thread.prototype.update = function(){
	if(!this.updateEndpoint || !this.rootId || !this.lastId){
		alert("Thread corrupted");
		return false;
	}
	var _this = this; // declared with var so it does not leak into the global scope
	$.ajax({
		url: this.updateEndpoint +"/"+ this.rootId+ "/" + this.lastId,
		dataType: 'json',
		timeout: 5000,
		success: function(data){ //<==
			$(data).each(function(k,v){
				_this.addPost(v.Post.id, v.Post.parent_id, v.Post.message); //<==
			});
			_this.update();
		},
		error: function(){
			_this.update();
		}
	});
	return true;
}

That's all. Of course with long polling (and soon with web sockets) you can implement some more cool stuff. For example, you could show that somebody is typing a reply right now in a particular place.

There are some details to be polished, maybe a newly added post should attract attention (blink?), but that is beyond the scope of this tutorial.

I hope you enjoyed it, please let me know what you think about it.
