Announcing Orbited Live Help

August 23rd, 2007 by Jacob Rus

After a couple days working on it, Orbited Live Help, which I mentioned last month, now seems to mostly work. There are still a few features I want to add to it, and a couple bugs to work out, but you should be able to click that link, and use Live Help to join us in the #orbited IRC channel, where the Orbited developers can answer your questions. Or if you like, just stop by to say hello.

Orbited Live Help

The server-side code was left nearly untouched, from Mario’s WebIRC. But I rewrote all of the browser-side HTML, JavaScript (using jQuery), and CSS. The result is something that I think better serves its purpose as both a demonstration Orbited application (though it’s perhaps less impressive, it’s simpler to use), and as a live support chat (where extra features just get in the way). Also, it no longer has a garish color scheme: Sorry Mario, but CSS “red” should almost never be used in a design. ;)

The JavaScript code is much cleaner, even if some functionality was removed. As one example, the code to handle the orbited event where someone leaves the chat room went from this:

function userLeft(data) {
    chat_box = document.getElementById(data[1]+'text');
    div = document.createElement('div');
    div.className = "left";
    sender = document.createElement('span');
    sender.className = "sender";
    sender.innerHTML = data[0];
    message = document.createElement('span');
    message.className = "message";
    message.innerHTML = data[1];
    linker = document.createElement('span');
    linker.innerHTML = data[2];
    div.appendChild(sender);
    div.appendChild(linker);
    div.appendChild(message);
    chat_box.appendChild(div);
    if (currentChan == data[1]) {
        box = document.getElementById('box');
        box.scrollTop = box.scrollHeight;
    };
};

To this:

function userLeft(leaver, channel, message) {
  $("<div class='informative part'></div>")
    .html("<span class='user>" + leaver + "</span> " + message + CHANNEL)
    .appendTo("#chathistory");

  removeName(leaver);
};

Needless to say, I vastly prefer the latter for clarity.

The new chat now highlights lines where someone mentions you by name, and the next version will handle tab completion of nicknames. I’d like to prettily style the user list, and maybe add some icons to it, but there’s plenty of other things to do around here in the mean time.

[Update, 2007-08-29: Add image that actually shows some people in the channel]

Widespread Proxy Dependence

August 19th, 2007 by Michael Carter

When I was experimenting with Orbited early on, before it was an Open Source project, I became very, very annoyed by browser security. Specifically, for some god-forsaken reason it’s just this side of impossible to do cross domain scripting even when you are running both servers, and on the same parent domain. Sure, at this point it seems straightforward to me, but that’s only because I spent a year tackling the problem.

If you are interested in cross-domain scripting, read what some of the experts have done. Such as Abe Fettig. For the other 99.995% of you who don’t give a rat’s ass, I included a proxy in Orbited.

The proxy, as you may well already know, overcomes the problem of cross-domain scripting by putting Orbited in front of your web application such that the browser thinks orbited is the Comet server and the web app.

The proxy is a testing and development tool though. In a production environment it is a pretty bad idea to use the proxy. Not only will it make it much harder for you to scale your application and achieve true redundancy at all tiers, it will use at least twice the cpu as you should be using.

Even so, it turns out that 90% of developers only want to use the proxy. From what I’ve gleaned, most people want to create prototypes or in-house apps. I think that the idea that anyone could use Comet is so new an idea that developers of large web applications are staying away. It’s going to take a new startup (maybe meebo) getting huge before all the other sites out there start integrating Comet.

In the mean time, as the prototypes roll, I’m left holding the bag. I had absolutely no idea what I was getting into when I included the proxy.

Proxy v1

When I first started, I was using CherryPy as a backend for all of my test and demo apps. I shut off CherryPy’s multi-threading because I wasn’t interested in generating all sorts of thread synchronization bugs. As a result, I had to also disable keepalive. With a single thread and keepalive, one user would tie up the server.

Given that keepalive was off, it was very clear how to proxy incoming requests. Examine the url. If it looks like an Orbited url, then hold on to the connection. Otherwise, check the proxy rules and send it on its way. After I designated a connection as proxy, I never looked at it again.

Proxy v2

I wrote an app that used more than one thread, and I enabled keepalive. Whoops. Suddenly my orbited requests were sometimes going straight through to the webapp. Whats going on? Well, my proxy would look at the first request coming down the pipe and then never look again. So the second request, which was for Orbited, got proxied as well.

Some long hours later, and I had a new proxy. It would look at all the data as it flowed from the client to the server. At the end of each request, it turned the connection back over to Orbited. When the next request came down the pipe, Orbited ran its usual check to see if it was for Orbited or for the proxy.

Proxy v3

So along comes a django developer. He writes a simple version of the cherrychat app in django, sets up the proxy, and nothing happens. No requests ever make it back from django to the proxy. And get this — the error reported is: “<type ‘exceptions.TypeError’&rt;: exceptions must be classes, instances, or strings (deprecated), not type” with no useful line number.

Turns out pyevent 0.3 and python 2.5 on ubuntu 7.04 causes exceptions generated by Orbited to be reported incorrectly. So I patched pyevent 0.3 and tried again. After much grief, the problem became clear. Django uses a rarely seen feature of HTTP/1.1 known as Transfer-Encoding: chunked;. It allows Django (or any other web framework) to render templates incrementally and send the results as they are available. The reason this wouldn’t be possible without chunked transfer is that in HTTP/1.1 you need to specify the Content-Length header before any content. But Django can’t know how long the content will be until it’s rendered.

So I rewrote the proxy yet again, because the previous code base wasn’t very conducive to this sort of interaction. This time though, the rewrite took about two hours because I had the old version to reference. After I finished, my Django tester disappeared never to return, apparently discouraged by the previous lack of Django support. I tried to do my own Django thing but I got lost. And I couldn’t make it use HTTP/1.1, much less chunked transfer encoding. So someone please test this when you get a chance.

Proxy v3.1

After coming all this way, I stopped supporting a subset of HTTP/1.0. That is, If not Content-Length header is specified, the connection just ends when it ends. But I was assuming, as per HTTP/1.1 that it meant the length was 0. This wasn’t a very involved fix, but I’d like to thank Matthew (desmaj) for finding this bug.

So the whole Ordeal was definitely worth it — I know all about HTTP Protocol and how it effects comet, proxies, and comet proxies. The proxy is more stable, which is great. I think that it will continue to be the main method of deploying Orbited applications because it’s just so much easier. It makes the difference between a 10 minute tutorial and a 30 minute tutorial. Just because of the proxy which isn’t a viable real world tool, we probably have a 1000% higher retention rate of prospective Orbited users. It just goes to show you that ease of use is what makes something popular. No one cares about scalability until after they’ve chosen a framework.

New Logo

August 13th, 2007 by Jacob Rus

The original Orbited (then known as “Orbit”) logo, was tossed together by one of Michael’s friends in about 5 minutes. It has a certain charm in its complete simplicity, but I was never much of a fan, and decided that though I’m not much of a logo designer, once I joined the team, the logo had to go, and if someone else wouldn’t step up to make a new one, I’d have to do it myself.

See the original for yourself:

Original Logo

I doodled in Illustrator, trying to incorporate various types of “orbits” into the word “Orbited.” After a few hours, I had found a font that I liked well enough (Cronos), with a very round O that worked reasonably well as a planet or star around which to draw an orbit. After a couple of hours of fiddling, I had created a brush stroke swoosh tail with a splotch comet, orbiting the O (yes, I realize tails on comets point away from the Sun, and don’t trail behind them):

Early logo attempt.  The black Cronos font had insufficient contrast with the background.

I passed this around to a few friends, who pointed out that the black text was nearly unreadable on the dark blue background. But a couple of them, including Eben, also disliked the font and the brush stroke. He was nice enough to spend some time brainstorming with me. The two of us passed back and forth a number of ideas for adding depth to the orbit line.

One of my logo ideas, with the orbit not intersecting the text, and with a solid “O” as my sun.

A logo of Eben's, with a circular orbit .

A logo of mine trying out Futura as a typeface (which also conveniently has a completely circular capital "O")

Eben got the idea to use a gradient on the orbit to convey depth.

I liked the concept of using the line width to convey depth, as in the third of the above attempts, and the approach worked well with a clean modern font like Futura (which conveniently has a very nice round capital O), and so I finally ended up with something that looked like this:

Resulting oval orbit logo, using the Futura font.

But in passing this around to friends, a quite common response was that the new logo was too cold and impersonal. They generally preferred a re-colored (to avoid the black text) version of the brush stroke logo.

So, combining (in my opinion) the best of each, I stuck Futura into the brush stroke comet orbit, and jazzed the whole thing up with some warmer colors. The result can be seen on the main website, but here it is for good measure:

The final website logo, with white Futura Orbited orbited by an orange splotch comet with a brush-stroke tail.

Hopefully this one is good enough.

Announcing Orbited 0.1.4

August 8th, 2007 by Michael Carter
  • Rewrote the Proxy
    • Django didn’t use to work. Does it now? Calling all Django Developers
    • Added transfer-encoded: chunked
    • New keepalive Engine
    • Arbitrary keepalive timeout
    • Fixed missing transport bug (doesn’t crash now)
  • New User Group @ Google Groups

RailsChat Tutorial

August 1st, 2007 by Jacob Rus

CherryChat and the CherryChat tutorial have been on the website as long as I’ve known about Orbited, but finally they not alone, having now been joined by a RailsChat tutorial. Ruby on Rails is kind of a beast: it installs hundreds of (from my point of view) useless files and directories, and requires a developer to run a script to create a new section of a site. But I suppose it works well enough for what it’s designed for. After giving up on the discussion in the #rubyonrails channel, I asked some Ruby programmer friends how to save state, and they explained to me that adding $ before a variable name makes it global. Not long later, RailsChat was finished. The client-side code in RailsChat is pretty much copied directly from CherryChat. And the resulting apps behave pretty much identically.

It is however amusing to compare the directory structure of each version:

CherryChat:
Directory Structure

RailsChat:
Railschat Directory

And of course the money shot (a working chat app!):
RailsChat screenshot

Gaining Momentum

July 24th, 2007 by Michael Carter

I’ve contributed to Open Source projects in the past, but this is the first time I’ve stood behind a project. I must say, it’s a lot of work. But it’s worth it because we’re developing technology that’s on the cutting edge. We’re really pushing the limits of HTTP and making way for a new breed of web applications.

I’m finally starting to get all sorts of emails about the project. It’s gratifying to know that someone out there reads this stuff that I write, and even goes as far as to try out the tutorials.

In more quantitative terms,

  • I’ve been contacted by more than two dozen developers
  • There’s generally 3-4 people in the IRC channel at all times, and 1-2 conversations a day (not counting core developer conversations.)
  • There are three documentation contributors who aren’t me
  • Much wider testing against a diverse range of platforms
  • Tagged by 20 people on del.icio.us

I’ve spent so much time developing that I haven’t done enough evangelizing. I really ought to get our name out there to all the Web 2.0 developer sites. But We aren’t going to do that just yet.

We are at version 0.1.3, which sounds like a pretty low number. This is intentional because we want to scare away developers who might get a bad image of Orbited if it doesn’t work for them right away or they encounter a bug. I’m targeting a 0.2.0 release in September, and at that point I’ll start shouting from the rooftops. Not just online either. I want to publish a few papers and attend some conferences.

In short, we are up momentum. We’re still a ways from having a large user base, but we’ll get there. In a few months we’ll be unstoppable. There’s interest in Orbited and it’s here to stay.

Slimmer WebIRC = Orbited Live Help

July 24th, 2007 by Jacob Rus

As Mario already mentioned, I find his current WebIRC interface a bit confusing. It’s non-obvious how to log in, and how to join channels, and completely unintuitive for a click on the current channel’s name to close the channel. Moreover, Orbited could use a good IRC demo, to log people into the #orbited channel, so that we can help them. So I’m working on cutting the feature set down to a minimum, slapping on some nicer-looking CSS, and putting the thing up on the main Orbited website, where non-IRC-savvy users can open it up to get Orbited help.

WebIRC screenshot

I shouldn’t have peeked at Mario’s JavaScript code for WebIRC. It works, but it’s confusing (what in the world is data[0][n][1]? Who named these variables anyway?), and it’s far more verbose than necessary. I’ll probably end up completely rewriting it. Fortunately the server-side shouldn’t need to be altered much if at all.

Orbited + Cometd

July 18th, 2007 by Michael Carter

I had a revelation: Cometd and Orbited are not at odds. Cometd is not competing with Orbited — each provides a unique set of features, and they solve problems in different domains. Moreover, a hybrid of the two technologies could result in a system more flexible than either alone.

I want to back up for a moment and reference the SQLAlchemy site. This is the very first time I saw a software system praised as being able to “scale down.” What the heck is that? We’re all in a rat race to scale up and it never occurred to me that there could be benefit to moving in the opposite direction. But then I thought about my past introduction to Python. The year was 2002, and I wanted a “web framework”, and someone so politely pointed me towards Zope. I was intrigued as it was the first whole-solution system I’d ever used for building web pages. But within six months I stopped drinking the kool-aid and despaired. It was so hard to make simple apps with Zope. There were endless repositories of great Zope packages and features that eventually just got in the way of development. I had to work around them to make something simple.

So here’s my pitch for Orbited. It scales down. It allows you to use simple APIs to create as lightweight a solution as you like. For all the small-scale, in-house, or prototype applications, you just want something easy and fast to develop. That’s Orbited.

Of course, SQLAlchemy also claims to “scale up”. Turns out that the same is true about orbited. It is designed around libevent for maximum speed on a single node. More importantly though, its simplicity allows you to scale laterally such that 5 orbited nodes support exactly 5 times as many users as 1 orbited node.

Something is missing though, and it smacks of the word Enterprise. I personally detest the beastly word, but there are lessons to be learned from it. An Enterprise solution has three requirements, generally. 1) Clear Scaling Path, 2) All-in-One solution, and 3) Java, as far as I can tell anyway. Orbited has a clear scaling path, check. Orbited also has a Java client library in the works (check… basically). It is missing the second requirement.

Enter Cometd and the Bayeux Protocol. The great thing about Bayeux and Cometd is that they handle every conceivable aspect of HTTP Push that you could ever hope for. There is event confirmation, room for authentication, pubsub, pluggable transports, complete dojo integration, and much more. I think Orbited does pretty well itself in the areas of authentication and pluggable transports, but the real issue is that we are missing the publish/subscribe support. This pattern of development is an overkill for many HTTP Push apps, but it is mighty helpful for writing anything that resembles a set of chat rooms, as I suspect many comet apps will.

Which brings me back to my revelation: Cometd and Orbited are not competitors. Orbited handles the transport layer of HTTP Push. Cometd can be built on top of Orbited without much difficulty. It’s a bit of work to implement the Bayeux protocol, but it’s clear how to go about it. The real issue I have with Cometd is that its current architecture is inherently hard to distribute across server nodes. Please note: this is not a theoretical shortcoming of Cometd or Bayeux; rather, the technology is scalabilty-bound by current implementation details. I am ready to jump on the Cometd bandwagon, and I bring Orbited to the table. From Orbited to Cometd: Let us handle the difficulties of scaling, both vertically (libevent) and laterally (share nothing), and we’ll let you take care of the protocol, Bayeux, and the browser-side development, dojo toolkit.

The last piece of the puzzle is as easy to find as your favorite IRC daemon. Or perhaps, for the truly hip, Jabber conferencing. Any distributed, channel-based system that scales laterally can effectively handle the subscription functionality necessary to implement the Bayeux protocol. I’ve started work on a Jabber-Orbited interface, and Mario Balibrera has already made substantial progress on a IRC-Orbited interface. The jury is still out on names for these two interfaces, so send me an email if you have any ideas.

I’ve identified all the pieces, so at this point I give this fledgling union of cutting edge technologies a name: TailSpin. That’s certainly how I feel when I try to wrap my head around the scope of operations necessary to power a modern web application. And it takes elements of both Orbited and Cometd.

Here’s some visualization of the proposed system. You can see how it compares to other Comet-style stacks at our Stack Comparison page.

TailSpin stack

In summary, I don’t think Cometd is a bad idea. I think that packing so many layers of functionality into a single project is a bad idea. From Cometd we should take Bayeux, but leave the rest of the stack to technologies better suited to solve lower-level problems. The result of decoupling these layers is that experts can tackle each field directly without the headache of interaction bugs. Furthermore, end-developers can use whichever parts of the stack they feel they need. If they want low-level but easy-to-use HTTP Push, then they can just build on Orbited directly. If they are creating limited-user-base applications, then they might be better served by something like a current Cometd implementation—a single process solution for Bayeux. If they want publish subscribe and don’t need the browser at all, then they can run an IRC server or run Jabber.

Nobody Does It Better (than Jacob)

July 16th, 2007 by Mario Balibrera

So Jacob loves WebIRC. Seriously, he can’t get enough. I think we’re going to have an intervention soon.

The server’s going strong without any major changes since I wrote it two long months ago. In fact, today is WebIRC’s two-month birthday, and someone didn’t send a card. Anyway, it’s about as stable as you can get. I had a computer connected for 52 hours before it crashed for unrelated reasons. Seriously, it does that all the time, I need to reformat. In fact, 52 hours might be the longest it’s gone without crashing lately.

Jacob’s really been doing some exciting things with the application, taking it in a direction I never really considered. He’s simplifying it, for starters, stripping away all the functionality that seemed cool 15 years ago but just confuses people used to Google Talk. He’s developing it into something that should debut in the next month or so on Orbited.org: a very small (in pixels), lightweight little marvel that you can just stick in the corner of any old website. That’s just plain awesome.

Ok, if you really miss it that much, the old WebIRC will run here for a while.

HaloD — Dynamic Load Balancing for Orbited

July 10th, 2007 by Michael Carter

HaloD is a daemon that performs dynamic load scaling for an orbited cluster.

Before you get too excited, note that this is a proposed project. It is in the planning stage. No code exists yet, and I am not going to tackle the project until I feel that Orbited is 100% ready for production use. (What’s the point of scaling Orbited if you can’t use it?) I welcome any contributions or developers who wish to join the project. This isn’t for the faint of heart though. Implementing HaloD requires writing low-level C code that interfaces directly with the TCP/IP stack in the kernel via the Netfilter project. Another side-effect of this is that HaloD will be platform specific: Linux. This is acceptable to me though because most web apps that use open source software tend to be running on Linux anyway.

With that said, I’ll get on with it.

Name

The name HaloD comes from a combination of the terms “halo orbit” and “daemon”. It is pronounced like the english word “hallowed” but “halo dee” is also an acceptable pronunciation.

“A halo orbit is an orbit around a Lagrange point between two larger bodies. Because the orbit tends to be unstable stationkeeping is required to keep an object such as a satellite in this orbit” — Wikipedia.

How it Works

HaloD accepts incoming connections from the orbited nodes. Whenever an orbited node starts, if configured to do so, it immediately opens a connection with the HaloD server. The HaloD server decides to either 1. Keep the node in reserve, or 2. Add the node to the active cluster.

Browsers and Orbited clients connect to the cluster by connecting to the HaloD machine. For every port that HaloD exposes to browsers, there will be a corresponding port exposed to Orbited Clients (web apps).

The key to understanding HaloD is to think of a typical commodity router. It has a single ip exposed to the outside world, yet allows access to a multitude of intranet computers. It does this by designating certain ports to certain machines and keeping track of which ports map to which machines. So if machine1 makes a request to google.com, the router sends the request out on port 23432, and remembers that incoming traffic on port 23432 maps to a local port on machine1.

Similarly, HaloD is a custom router. It listens on a range of ports, and then maps them to the Orbited cluster nodes behind it. The reason this is helpful is that the end destination of a request to port 500, let’s say, might change, but the browser or web app is none the wiser. Hopefully the following diagram will help you visualize HaloD.

HaloD Architecture

The diagram only shows how browsers connect to HaloD and are forwarded to Orbited nodes, but the same method is used for web applications connecting to HaloD.

Routing Algorithm

Assume for a moment that there is a max of N nodes; let N be 24. For the first node, no matter which of the 24 ports you connect to HaloD on (for either the browser or the client), the packets will be routed to the one (and only) orbited node. So there are 24 routes to a single orbited node. When a second node is added, 12 of the routes are moved over. They are now directed at the second node. When a third node is added, 1/3 of the routes are moved to the third node.

It wouldn’t do the choose any arbitrary routes to move upon adding a node, we need to do it systematically. The formula to determine which routes are moved is as follows:

Algorithm setup

Assign an id from 1 to N to each route.

Adding each new node (algorithm)

  1. The new node is given an id of M + 1 where M is the largest id of an existing node (if no nodes are present then M is 0)
  2. If no other nodes exist, all N routes are assigned to the new node. Finish.
  3. Sort the nodes by number of routes assigned (k), and secondly by node id
  4. Moving in descending order down the list, move assignment of the highest route id for the given node to the new node until floor(N / (M + 1)) routes have been reassigned.
  5. If this quota hasn’t been met but the list has been compeltely traversed, then start at the top and repeat the process (step 4)
  6. For all routes reassigned, send a ‘route moved’ message to each route’s former orbited node. The orbited node will issue a reconnect event to any users connected via that route if necessary. (It is necessary for the Iframe transport, but not for the XHR transport. For XHR just close all open connections and the browsers will re-connect.)

Progression of Routes as Nodes are added

The result of the above algorithm can most easily be understood by examining a picture. If we have 24 routes, the following diagram shows which routes are assigned to each of n nodes, for n between 1 and 8.

HaloD node routing algorithm

Failover

A major reason to use dynamic load balancing is to provide a failover mechanism. If one of your nodes fails, you don’t want to know about it right away. Rather, you want the entire system to continue functioning and you want as little downtime for any user as possible. Here is how we do this with HaloD:

If HaloD loses connection with an orbited node and is not able to immediately re-establish connection, then it reassigns all routes from that node to a failover node. This is the quickest and easiest way to keep service up at all times.

But sometimes you don’t have a failover node available. In this case, HaloD reassigns the node with the highest id (node M) to replace the failed node, then uses a reverse of the route reassignment process describe above to reassign all former node M routes to the appropriate orbited nodes given that we now have total nodes = N = M-1. Note that the client-side orbited implementation (javascript) needs to support automatic reconnection if the connection is lost. Because HaloD immediately recognizes an Orbited node failing, it can swap the routes in a matter of a few cpu cycles, so before the browsers even realize their Orbited Node is down, it’ll be back up.

Implementation Details

As I said before, this project requires per-packet processing, which is only possible at the kernel level, either directly or via the Netfilter project. That means that HaloD must be written entirely in low-level C. We also need to work at a level a bit higher than packet processing, as HaloD needs to accept incoming connections from the Orbited nodes for monitoring purposes. Assuming we want to put 1024 nodes in a single cluster, then we need to quickly process heartbeat signals from each of the nodes. This isn’t a lot of bandwith, but we want it to be snappy. I think the best solution here is to use either libevent or epoll directly. This should work well with our low latency, high concurrency requirements. And who knows, eventually someone may want to put 2048 nodes or more in a single cluster.

The routing table itself is just a hash table in memory. It may be a good idea to dump the routing table to a database or network device every time it’s updated. This allows the possibility of creating real-time failover solutions for the HaloD server itself.

Conclusion

I know this may seem an ambitious project at first glance, but the hardest part is testing it with a realistic setup. The routing algorithm is simple, as will be the heartbeat protocol. The project is necessary though to the future of real-time web applications. Maybe not HaloD in particular, but a similar solution will be needed, and sooner than you might imagine. I think that the hardest part will be attracting web-developers who understand the need for such a system, and are comfortable working in C. Most web developers today have little history with low-level languages — as it should be.