Sean McGary

Software Engineer, builder of webapps

Goodbye LAMP, Hello NodeJS

When I first started developing web applications about five years ago, PHP was my language of choice because it was super easy to learn and a LAMP stack was trivial to set up on my local machine to get started. At the dawn of Web 2.0, PHP was booming as a web language. People began developing and rapidly prototyping applications with PHP and MySQL. You knew C or C++? PHP was even easier for you to dive into and get started with. When I started, I jumped on the Codeigniter bandwagon as it made development quick and provided the minimal amount of structure needed via the MVC design pattern so that you didn’t just end up with tons of files of unstructured spaghetti code.

After a while with Codeigniter, I realized it was too simple. At the time, it lacked an autoloader and didn’t play well with libraries that didn’t conform to its limited library interface. So I decided to build my own framework. Foundation-PHP (as I decided to call it) provided a similar structure to Codeigniter, but also allowed me to change how classes were loaded and how objects were created, adding the features that Codeigniter lacked. At this point, I had also been introduced to MongoDB, so I decided to make that the default database handler instead of MySQL or Postgres. My framework wasn’t quite on par with Codeigniter in terms of features, but it included everything that I needed to build applications quickly.

Javascript comes to the server

While I was hacking out applications built on PHP, other frameworks such as Ruby on Rails, Django, and Pylons were starting to mature and developers were starting to move away from the battle-tested LAMP stack. People started throwing around buzzwords like “high availability”, “big data”, and “NoSQL”. PHP started its descent as a popular web language, with people moving to languages like Ruby and Python. Things like HTML5 and CSS3 were skyrocketing in popularity and usage. Browsers, with tons of help from Google Chrome and its rapid release cycle, started to incorporate features that weren’t even finalized by the W3C into their new versions so that people could start using them. Javascript in particular started to become very popular as developers used it to create applications with an almost desktop-like experience.

Then in 2009 NodeJS, created by Ryan Dahl, appeared. NodeJS consists of Google’s V8 Javascript engine along with a set of core libraries providing APIs to expose common lower level features such as file I/O, sockets, IPC mechanisms, etc. Because NodeJS is built around Javascript, it is event driven and its I/O is asynchronous, which minimizes overhead compared to traditional blocking I/O patterns.

Due to the asynchronous nature of NodeJS, it performs very well as a platform for writing servers. A community quickly formed around NodeJS and started to contribute to its feature set. Developers started writing libraries and modules to work with databases, the underlying operating system, websockets, and graphics, just to name a few. Soon, developers jumped at the fact that NodeJS makes a fantastic web application server. Hell, an HTTP server library is built into Node’s core.

The great migration

Not too long ago, I decided to say farewell to PHP and begin moving over to NodeJS. The first thing I did? [Build a web framework](https://github.com/seanmcgary/NodeWebMVC). Node is still very young and does not (yet) have all the frameworks and tools that more mature languages such as PHP, Ruby, and Python have. To me, having a small framework to get up and running is a huge advantage for rapid prototyping and development of ideas.

Jumping in the express lane

By default, NodeJS comes with the necessary tools to create a webserver, albeit a simple one. In less than 20 lines of code, you can have a “functioning” webserver that will send data to a browser upon connection. That, however, doesn’t do us much good when trying to create an application. Fortunately, the guys over at Visionmedia decided to build a little library called expressjs to make developing HTTP servers a bit easier. Express is built on a number of connect libraries, giving you features like HTTP routing, sessions, and parsing POST, PUT, GET, and DELETE requests. Having these features puts express on par in terms of features and functionality with libraries such as Sinatra. You can quickly build a server that has routing and session handling, which is great for developing API servers, but it still lacks some of the structure needed for rich web applications.

Adding some structure

For my framework, I took express as a base and started to build around it. I very much liked the MVC pattern of Codeigniter, so I decided to sprinkle some MVC on top of express. Doing this allows the developer to clearly separate code into controllers, models and views instead of just setting functions for specific HTTP requests. This also allows the developer to write code that is a bit more modular, taking advantage of inheritance for controllers and models. The one thing that I am loving about Javascript that is making development much easier is using Mustache (in this case Handlebars.js, a fork of Mustache) for templating and view parsing. Now instead of needing to write code in view files, views are written in HTML with Mustache placeholders and then parsed serverside before being sent to the user. This makes everything much cleaner and less intrusive. This also makes the flow of data from datastore/database to view much simpler. View content can be fetched from the database and, with very minimal modifications, sent directly to the view to be parsed. This is a HUUUUGE advantage.

Let’s build some apps!

Since building this small framework to provide a bit of structure, I’ve been able to start rapidly prototyping applications. Right now, the app I am writing, [Markdownwiki.com](http://blog.markdownwiki.com), is a living test of my framework and allows me to constantly add features into the framework that I feel might be commonly needed by developers.

Now that my migration to NodeJS from PHP is nearly complete, I will be continually building out my NodeWebMVC framework for people to use. So if you’re looking for a framework to get started with building an app, I encourage you to check it out. I would love to get some feedback on it. As I develop it, I’ll tag stable points along the way so that you won’t be confused if for some reason it doesn’t work. If you run into bugs, file them here on github and I’ll get to fixing them. Alternatively, feel free to fork it, fix them yourself, and submit a pull request with your fix.

With NodeJS and server-side Javascript becoming a very prominent technology in web applications today, I figured I’d give an introduction to NodeJS to get everyone up and running. “But how can Javascript run in a server environment? I thought it required a browser and was only used to make my website interactive.” NodeJS allows you to build applications written in Javascript with the help of Google’s V8 Javascript engine, which is at the heart of the Google Chrome web browser. Coupled with the asynchronous nature of NodeJS, V8 contributes greatly to Node’s performance and scalability.

"Okay, that’s cool and all, but why should I use it?"

There are really two reasons: performance, and the fact that Javascript is becoming a pretty universal language. If you’ve ever dealt with web development of any kind, chances are you’ve had some experience with Javascript. The performance comes from a combination of two things: one is Google’s V8 engine, the other is the fact that Node is asynchronous and event driven, very much like JS applications that run in your browser. Your NodeJS code all runs on a single thread.

"But wait, how does that make it more efficient than multi-threaded applications? Wouldn’t a single thread be a bottleneck?"

If this were a traditional server implementation that would be true, but Node is a bit different. Instead of needing to manually identify events and spawn new threads based on those events, you simply register callbacks that will run when an event fires, very similar, if not identical, to most Javascript that runs in your browser. Since we’re not manually creating new threads for each new event, and each event is handled asynchronously, none of the events will block the NodeJS event loop. This allows NodeJS to handle concurrent connections very efficiently and quickly by receiving an event and then pushing the work to the background to run, with the process notifying the server when it’s done.
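
To make the idea concrete, here is a minimal sketch of the same pattern, written in Python with `asyncio` rather than in Node itself. The request names and delays are made up for illustration, and `asyncio.sleep` stands in for real I/O. Three simulated requests are started together, each one yields to the event loop while it waits, and they finish in order of how long their I/O takes, all on one thread.

```python
import asyncio
import threading

finished = []        # order in which the simulated requests complete
thread_ids = set()   # every thread that ran one of our handlers

async def handle_request(name, delay):
    # "await" hands control back to the event loop while the simulated
    # I/O is pending, so the other requests keep making progress.
    await asyncio.sleep(delay)
    thread_ids.add(threading.get_ident())
    finished.append(name)

async def main():
    # Start all three requests at once; none of them blocks the loop.
    await asyncio.gather(
        handle_request("slow", 0.3),
        handle_request("fast", 0.1),
        handle_request("medium", 0.2),
    )

asyncio.run(main())
print(finished)          # completion order, shortest wait first
print(len(thread_ids))   # number of threads that did the work
```

The requests complete shortest wait first even though "slow" was started first, and only one thread ever runs the handlers.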

"That’s pretty cool, how do I get started then?"

First off, you’re going to need a computer with a non-Windows operating system (so anything Unix, Linux, or Mac OS X will work). They’re working on a build to work with Windows, but it’s not quite done yet. Next, grab the source from Github here. We’re going to be building version v0.4.x. NodeJS only needs two things for building and installing: Python 2.4 (or higher) and libssl-dev (if you plan on using SSL encryption). Now to build Node, cd into the directory you checked out and run the following:

This will build Node and install it to your path. Now, all you need to do to run Node on the command line is just run: $ node <your file name>.

Now, let’s take a look at a simple NodeJS TCP server implementation:

Now let’s take a more detailed look at what this is doing. First, we need to include the “net” library that is built into Node. This is done using the “require” function. Node comes with a bunch of libraries that allow you to create sockets, interact with the filesystem, make system calls, and much much more. You can also create your own libraries, both in C++ and plain Javascript, to include in your Node applications. Now that we have the “net” library included, we can proceed with creating our server object. The callback we pass when creating the server takes a single argument; we’ll call this “socket” because that’s essentially what it is. We’ll use this to write and read data to/from our connection. Because Node is asynchronous, we need to add some listeners: connect, data, and end. The first two are pretty self explanatory, with the data event firing whenever the client sends data to the server. So, instead of creating a thread loop to block and wait for data from the client, we simply register an event for it that will listen while the rest of the server runs unblocked. When the server receives data, the callback function will be called and the logic it contains will be executed. The last line in our file simply tells the server to start listening on a provided port and host. And that’s it! It’s very simple: you set some listeners and then just forget about it while it runs.
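
The listener style described above can be sketched in Python as well. This is not the Node listing from the original post, just an analogous echo server built on `asyncio.Protocol`, whose `connection_made`, `data_received`, and `connection_lost` hooks play roughly the roles of the connect, data, and end listeners. The `EchoProtocol` and `demo` names and the echo behaviour are assumptions for illustration, and port 0 asks the OS for any free port.

```python
import asyncio

events = []  # which listeners fired, in order

class EchoProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        # Roughly Node's "connect" listener: keep the socket-like transport.
        self.transport = transport
        events.append("connect")

    def data_received(self, data):
        # Roughly Node's "data" listener: called whenever the client sends bytes.
        events.append("data")
        self.transport.write(data)  # echo the bytes straight back

    def connection_lost(self, exc):
        # Roughly Node's "end" listener: the client went away.
        events.append("end")

async def demo():
    loop = asyncio.get_running_loop()
    # Register the listeners and start listening on a host and port.
    server = await loop.create_server(EchoProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # A throwaway client so the sketch can be run on its own.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"ping\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    server.close()
    return reply

reply = asyncio.run(demo())
print(reply)
```

Nothing in the server loops or blocks waiting for the client; the event loop calls each method when the matching event happens.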

Send Mail Through Google Apps With PHP

Recently while I was working on a site that I’m creating, I needed a way to easily send email out to users. Like a lot of people that have domain names and don’t want to run their own mail server, I have decided to let Google Apps handle all of my email and app related needs. Now say you don’t have a mail server and don’t want to maintain one. Maybe you don’t know how or just don’t want to deal with maintaining such a service. Well as it turns out, you can use Google Apps as your mail server. In this example, I’m using PHP to send out a message using the SMTP server and an account that I’ve set up on my Google Apps domain (one that is not my primary, admin account).

First we have to install some libraries through PEAR

Now that we have our two dependencies installed, we can write our code to send our mail. Basically, this connects to Gmail’s (or Google Apps’) SMTP server and sends a message. This is pretty much the same thing that happens when you send an email using a desktop client like Thunderbird or Apple Mail. All you do is supply it with Google’s SMTP server, port, and your username and password. NOTE: if you are using Google Apps, the username is going to be your-account-name@your-domain.com.

And voila! You can now send mail through Google!

Threaded TCP Server in Python

Recently I finally decided to take some time to learn Python. So I figured the best way to learn something new is to dive right in and write an application. This application happened to be a new server for Computer Science House’s networked vending machine(s), ‘Drink’. The general idea behind Drink is that it’s a ‘communal refrigerator’ for CSH that the members ‘donate’ money to in order to stock it with delicious drinks (such as Coke products since RIT is exclusively a Pepsi campus). Being the geeks we are, these vending machines that we have on our floor must be accessible via the internet in some way, and that’s where the server comes in. The server needs to facilitate connections to each machine as well as accept incoming connections from clients that want to drop drinks and then shuffle those requests off to the tini-boards that control the physical machines. So immediately I was thrown into learning threading and sockets in Python.

Well that’s cool and all, but you’re probably asking why you’re here. I mean, the title does hint that you’ll be learning something. This is true, we’re getting there so just sit tight for another minute. One of the issues that needed to be overcome while writing this server was having a single server instance that can serve multiple clients at the same time without one client blocking the socket connection. Python, being the flexible language it is, offers you multiple ways to handle sockets, threads, and the combination of the two. What we want is a server bound to a specific address and port, but once a connection is accepted, we want the server to scoop that connection off to a semi-random port in its own thread so that we don’t block other clients from connecting. First let’s take a look at the server implementation:

First we start off by importing the socket and thread modules. Now, if we wanted to make this a threaded class, we could ‘from threading import Thread’ so that our server could inherit from Thread (this is an example of one of those many modules for threading I mentioned). Now if we look at our main “method” we define our host, port, and buffer size and we create a tuple called ‘addr’ to hold the host and port. Now in this implementation, we have created a SOCK_STREAM socket which is the same as a TCP socket. When making this a server, you have to remember to bind the socket to the addr tuple (this is not the case with the client we will implement). And finally we tell the socket to listen for connections. The 2 we pass to the listen function call tells the server that it can queue up 2 connections before it starts to refuse them.

Now for the magic/voodoo awesomeness of the server. In the while loop you’ll notice that when we call serversocket.accept() we get two variables back: a clientsocket and the clientaddr. Now, we take that socket connection and we hand it off to a thread by calling thread.start_new_thread. In this call, we pass it ‘handler’, which is the function you see defined at the top with the clientsocket and clientaddr as parameters. This function then runs, receiving and sending data with the client. Because we spawned a new thread to handle this connection, the server is free to keep accepting connections from other clients.
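
The original listing is not reproduced here, so the following is only a sketch of the pattern just described. It assumes Python 3, where the old `thread` module is gone, and uses `threading.Thread` in place of `thread.start_new_thread`. The `"echo: "` reply, the `accept_loop` helper, and the two throwaway clients at the bottom are made up for illustration; binding to port 0 lets the OS pick a free port.

```python
import socket
import threading

BUF = 1024

def handler(clientsocket, clientaddr):
    # Runs in its own thread: talk to one client until it disconnects.
    with clientsocket:
        while True:
            data = clientsocket.recv(BUF)
            if not data:
                break
            clientsocket.sendall(b"echo: " + data)

def accept_loop(serversocket):
    # Hand every accepted connection to a new thread, then go straight
    # back to accept() so other clients are never kept waiting.
    while True:
        clientsocket, clientaddr = serversocket.accept()
        threading.Thread(
            target=handler, args=(clientsocket, clientaddr), daemon=True
        ).start()

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
serversocket.listen(2)                # queue up to 2 pending connections
addr = serversocket.getsockname()
threading.Thread(target=accept_loop, args=(serversocket,), daemon=True).start()

# Two clients connected at the same time; the second is answered
# even though the first is still connected and idle.
first = socket.create_connection(addr)
second = socket.create_connection(addr)
second.sendall(b"B")
reply_second = second.recv(BUF)
first.sendall(b"A")
reply_first = first.recv(BUF)
first.close()
second.close()
print(reply_second, reply_first)
```

In a real server the accept loop would run in the main thread forever; it sits in a daemon thread here only so the sketch exits once the two test clients are done.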

Now let’s take a look at a simple client to interact with our server.

99% of this should look the same as the server we just created. The only difference here is that we aren’t creating new threads and we’re not binding our socket to our address. Instead we are just connecting to the server and then looping, sending whatever we read from standard input and receiving the server’s replies.
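
As a rough illustration of such a client (again a Python 3 sketch, not the original listing), the `talk` function below connects without binding and loops over a list of messages instead of reading standard input, so that it can run unattended. The small `socketserver` echo server is only a stand-in so the snippet is self-contained; both names are made up for this example.

```python
import socket
import socketserver
import threading

def talk(addr, messages, buf=1024):
    # Connect (no bind on the client side), then send and receive in a loop.
    replies = []
    clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clientsocket.connect(addr)
    try:
        for message in messages:
            clientsocket.sendall(message.encode())
            replies.append(clientsocket.recv(buf).decode())
    finally:
        clientsocket.close()
    return replies

class Echo(socketserver.BaseRequestHandler):
    # Stand-in server: echo everything back until the client disconnects.
    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:
                break
            self.request.sendall(data)

server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), Echo)
server.daemon_threads = True
threading.Thread(target=server.serve_forever, daemon=True).start()

replies = talk(server.server_address, ["hello", "drink"])
server.shutdown()
server.server_close()
print(replies)
```

To make it interactive, replace the list with lines read from `sys.stdin`.
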
Hopefully this has been helpful. There are a lot of Python resources out there, but it took me a while to find an implementation that worked in my situation. So hopefully this is simple enough for anyone to modify to fit their needs.

reCaptcha for Codeigniter 2

Every time I make a new website with user registration, I usually end up using a reCaptcha somewhere in the process. A while ago, I discovered a reCaptcha library on the Codeigniter forums. Since then, I’ve modified it a little bit to work with Codeigniter 2.0 and have placed it on Github where everyone can access it. Below is just an example of the Controller (included in the repository) so you can see how it all comes together.

Improving Database Performance With Memcached

When it comes to web application performance, your database will often be the largest bottleneck and can really slow you down. So how can you speed up performance when you have a site or application that is constantly hitting your database to either write new data or to fetch stored data? One of the easiest ways is to cache the data that is accessed the most. Today, I am going to show you a brief example of how to do this with Memcached using PHP and the Codeigniter framework. First off, what exactly is Memcached? “Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.” Basically Memcached is a distributed system that dumps key-value pairs to RAM for super fast access. If you need to scale, all you do is add more RAM or more nodes with more RAM. Let’s get started.

First, we need to install Memcached. I will show you how to do so on a Debian-based system (Debian/Ubuntu). Run the following from your command line.

sudo aptitude install memcached libmemcached2 libmemcached-dev

This installs the base Memcached server along with the development libraries. Now we need a way for PHP to interact with Memcached, so we’re going to go and grab the Memcache PHP extension from Pecl. If you haven’t installed anything via Pecl before, it’s basically a PHP extension repository and manager that works similarly to aptitude, but specifically for PHP. To use Pecl, you need the PHP5 dev package as well as Pear.

sudo aptitude install php5-dev php-pear

Once you have those installed, you can install the PHP Memcache extension:

sudo pecl install memcache

Depending on which operating system you’re using, you may need additional packages or libraries. For me on Ubuntu 9.10, I needed to install make and g++ in order to build the extension. So just take a look at the end of the output when you install the extension, and it will tell you what is missing.

Now we’re ready to start some coding. For this tutorial I’m going to be using Codeigniter since I really like its object oriented structure and its nice database class. I will be using the new (though not yet official) version 2.0. You can find it over on Bitbucket and either clone the repository (note, you’ll need Mercurial to do this) or just download the source in a compressed format. Now we’re going to need a database to grab data out of. You can call it whatever you want. The only thing that you need is a table called “users”. I did this with 1000ish “users”; you can make the table yourself, or you can grab a dump of the table here.

Now that we have everything set up, we can start some coding. Connecting to Memcache is really simple and is done in two lines:

So now, what we are going to do is create a controller and a model. In this case, the controller is just there to call our model and display the returned data. The model will be doing all actions associated with accessing the database and accessing Memcache. So we are going to create two files: a controller called main.php and a model called user_model.php:

In our controller we load the model up in the constructor since we will be using it with every function. The function “cache_users()” is used to call the model to tell it to take all the users in the database and put them into the cache. We’ll get to the specifics of that in just a minute. The function get_user tells the model to go into the cache and find a user based on their unique user_id.

The model is a bit more involved. First off in the constructor, we connect to Memcached and assign it to a class level variable so that all of our other methods can access it. The first function, “get_users”, is an all-in-one example. Before I explain the function, let’s first figure out how to access Memcached. Memcached stores things as key-value pairs in memory. Keys need to be unique and can be as large as 250 bytes. The value can be a string, array, or object; pretty much anything in PHP that can be serialized can be stored as a value. This however does not include database connections or file handles. For our purposes, we are going to be storing each row of the user table in Memcached. One way to give each row a unique key is to use a hashed version of the SQL query that we would use if we were fetching the user from the database. So for example, if our user had the user_id of 43, we would hash the query used to fetch them from the database:

"SELECT * FROM users WHERE user_id=43"

The get_users function is going to store ALL users in a single index in Memcached, so the first line we come to is the query to access all the users from the database. Then we perform an MD5 hash on it and assign that to a variable. Now we check and see if that key is in Memcached by performing $this->memcache->get($key). If that key does not exist, it will return FALSE. So we check for that. If the key is missing, we know that we have to hit the database to grab the data. So we do that and, while we’re at it, we put the resulting data into Memcache so that when we need to get it again, it’s now there. And of course if the key does exist, we don’t even touch the database. It’s all a pretty simple and straightforward process.
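
The flow of get_users can be sketched in a few lines. This is a Python illustration of the pattern rather than the PHP model from the post: a plain dict stands in for Memcached (so a miss shows up as `None`), `query_database` is a made-up placeholder that counts how often the "database" is hit, and the two user rows are invented sample data.

```python
import hashlib

cache = {}      # stand-in for the Memcached connection
db_calls = 0    # how many times we had to go to the "database"

def query_database(sql):
    # Placeholder for a real query; returns made-up rows.
    global db_calls
    db_calls += 1
    return [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]

def get_users():
    sql = "SELECT * FROM users"
    # The MD5 hex digest of the query is the cache key (32 characters,
    # comfortably under Memcached's 250-byte key limit).
    key = hashlib.md5(sql.encode("utf-8")).hexdigest()
    users = cache.get(key)
    if users is None:
        # Cache miss: hit the database, then store the result for next time.
        users = query_database(sql)
        cache[key] = users
    return users

first = get_users()    # miss: goes to the database
second = get_users()   # hit: served from the cache
print(db_calls)
```

Only the first call touches the database; the second is answered entirely from the cache.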

Let’s take a look at cache_users(). Here what we are doing is grabbing all the users from the database and looping over all of them. The idea behind this is that we want each user to be in the cache individually versus all together like in the previous example. So while we are looping over the returned users, we prepare a SQL statement for each of them as if we were going to get them from the database, and then we store the user under its own key in Memcache. Now to store something in Memcache, we call $this->memcache->set($key, $value, $compression, $time). $key and $value are pretty self explanatory. $compression is a flag that specifies whether you want your data compressed or not: pass 0 to store the value as is, or the MEMCACHE_COMPRESSED constant to have it compressed. $time is the amount of time that you want the data to stay in the cache (set in seconds). Once that time has expired, the entry is flushed from the cache. Now that we have all of our users in the cache, we can call fetch_user_from_cache and you will get your user!
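
Here is a small Python sketch of the per-user variant with expiry. The `TTLCache` class is a made-up in-memory stand-in for Memcached’s set/get with an expiry time, and the injectable clock exists only so the expiry can be shown without actually waiting. `cache_users` and `fetch_user_from_cache` mirror the names used above, but their bodies are assumptions, as is the sample user row.

```python
import hashlib
import time

class TTLCache:
    """Tiny in-memory stand-in for a cache with per-entry expiry."""

    def __init__(self, clock=time.monotonic):
        self._items = {}
        self._clock = clock

    def set(self, key, value, ttl):
        # Remember the value together with the moment it stops being valid.
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]   # expired: flush it and report a miss
            return None
        return value

def user_key(user_id):
    # Hash the query we would have run to fetch this user.
    sql = "SELECT * FROM users WHERE user_id=%d" % user_id
    return hashlib.md5(sql.encode("utf-8")).hexdigest()

def cache_users(cache, users, ttl=60):
    # Store every user under its own key.
    for user in users:
        cache.set(user_key(user["user_id"]), user, ttl)

def fetch_user_from_cache(cache, user_id):
    return cache.get(user_key(user_id))

now = [0.0]                                   # fake clock, in seconds
cache = TTLCache(clock=lambda: now[0])
cache_users(cache, [{"user_id": 43, "name": "example"}], ttl=60)
hit = fetch_user_from_cache(cache, 43)        # still fresh
now[0] = 61.0                                 # pretend 61 seconds passed
expired = fetch_user_from_cache(cache, 43)    # past its expiry time
print(hit, expired)
```

A real application would fall back to the database when the lookup misses, as in get_users.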

Hopefully this has given you a decent overview and an idea of how caching works so that you can apply it to your own applications. If you have any troubles or questions, leave a comment and I’ll help you out!

Scaling Apps With Message Queues

When it comes to scaling a web application, one of the easiest ways to boost performance is with an asynchronous queue. Since web apps have started to become as complex as native desktop applications, users are expecting them to perform like native applications too. This is where asynchronous queues come into play. Typically with high traffic sites like Facebook, Digg, Twitter, et al, not everything needs to happen instantaneously, it just needs to look like it. For example, when you choose to send a message to someone on Facebook, chances are that when you click that “send message” button, Facebook takes your message and shoves it in a queue of other messages to be processed. To you the user, it looks like it’s already been sent, but in reality there might be a little bit of a delay. This makes it so that Facebook doesn’t have to handle sending thousands of messages per minute the instant each button is clicked. Granted this is just a hypothetical example, and I honestly have no idea how they actually handle such requests, but it makes for a good working example so that you guys can have an idea of what’s going on.

So that’s the general gist of an asynchronous queue; now let’s dive in a little deeper. When looking for an asynchronous queue (or messaging system or message broker, as some are called), there are a variety of options to consider. Today we’re going to be looking at Apache ActiveMQ because it takes advantage of the Java Message Service (JMS) and also integrates with a wide variety of languages including Java, PHP, C++, C, C#/.NET, and others. The example I’m going to show you will be using PHP through the Stomp protocol. First off, let’s install ActiveMQ.

First off, we need a server environment. Currently I’m using Ubuntu Server 9.04. ActiveMQ is a Java application, so we need to install Java.

sudo aptitude install openjdk-6-jre

We’re also going to need Maven to pull in any necessary Java dependencies.

sudo aptitude install maven2

Now somewhere (I’d recommend your home directory) download the ActiveMQ source and unpack it:

wget http://mirrors.ecvps.com/apache/activemq/apache-activemq/5.3.2/apache-activemq-5.3.2-bin.tar.gz

tar xvf apache-activemq-5.3.2-bin.tar.gz

Now, cd into that directory and run the following:

chmod 755 bin/activemq

Now we’re ready to build ActiveMQ using Maven

mvn clean install

And that’s it! Simply run ./bin/activemq and ActiveMQ will start right up.

So now that we have a message broker set up, we need a way to start sending it messages. To do this, we are going to use the Stomp protocol with PHP. I’m going to show you a simple example that opens a Stomp connection, sends a message to the queue, and then retrieves the sent message and displays it on the screen. Typically you would separate this file into a producer (a script that enqueues messages) and a consumer (usually a daemonized background process) that retrieves the message and decides what to do with it. The nice thing about ActiveMQ is that your producers and consumers can be written in the same language, or in different languages altogether, depending on your processing needs.

To start, go and grab the Stomp PHP library and unpack it. Since we will be including Stomp.php in our script, you are probably going to want to add the Stomp library to your php.ini include_path.

Now with that all installed, we are going to create our script.

The code in this example pretty much explains itself. Since this is a simple example, it sends a message, then reads it back. Typically your consumer would be running in a loop, checking the queue for new messages and then processing the messages when it receives them.
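
The producer/consumer split mentioned above can be illustrated without a broker at all. The Python sketch below uses the standard library’s `queue.Queue` as an in-process stand-in for an ActiveMQ queue. It does not talk to ActiveMQ or speak Stomp, and the message texts, the `STOP` sentinel, and the upper-casing "work" are made up for illustration.

```python
import queue
import threading

messages = queue.Queue()   # in-process stand-in for the broker's queue
processed = []             # results of the consumer's work
STOP = object()            # sentinel telling the consumer to shut down

def producer(texts):
    # Enqueue and return immediately; the caller never waits for processing.
    for text in texts:
        messages.put(text)
    messages.put(STOP)

def consumer():
    # Loop forever, blocking until a message arrives, then process it.
    while True:
        message = messages.get()
        if message is STOP:
            break
        processed.append(message.upper())   # pretend this is the slow work

worker = threading.Thread(target=consumer)
worker.start()
producer(["hello", "world"])
worker.join()
print(processed)
```

With a real broker the consumer would be a separate long-running process, and the queue would survive either side restarting.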

That is pretty much it. I wanted this to be just a short introduction, so hopefully in a week or so, I’ll come back and give a more involved tutorial that will show how you can separate that script into a consumer and producer to really get work done.

Codeigniter Google Calendar Library

So I was playing around with the Google Calendar portion of the Gdata API the other day and did some searching and found that there wasn’t a Codeigniter library for it, probably because it seems that Google has teamed up with the guys that are working on the Zend Framework to bring Gdata to the PHP world. So I took the ZendGdata API for Google Calendar and implemented it in Codeigniter so that you just need to make a few simple function calls to gain authorization to a calendar, add events, query events, etc. This is my first attempt at writing a "library" for something, so hopefully it turned out well. If you find any errors in it or have any suggestions for improvement, just let me know!

Files you’ll need:

I also have it managed through a Git repository on Github. Feel free to clone it or pull from it.

Setup:

First off we need to edit your config file. Open up your application/config/config.php file. Scroll down to the uri_protocol option and change it to PATH_INFO. Then go to the permitted_uri_chars setting and place a question mark after ‘a-z’. This allows you to put question marks in the URL. That’s it!

  • Place the Gcal.php file in your Codeigniter system/application/libraries directory
  • Install the ZendGdata library with one of the two options:

    Option 1:
    *   Make a directory anywhere on your machine/server
    *   Place the ZendGdata directory in it
    *   Open your php.ini file and find the include_path line
    *   Add to the include_path the directory that you placed the ZendGdata directory in: `/your/directory/ZendGdata/library`
    *   Save and restart Apache

    Option 2:
    *   Place the ZendGdata directory where you want
    *   In the Gcal.php library file, modify the require_once so that it points at the directory where ZendGdata is located, e.g. require_once("/your/directory/ZendGdata/library/Zend/Loader.php");

  • Place the calendar.php file in your controllers directory

The calendar.php file is a sample controller. In it you’ll find an example of a call to each function in the library. Also with each one, I go through and show how to manipulate the data returned since it can get a little confusing at times (the return values are often many-dimensional arrays that are a bit difficult to interpret just by looking at them).

AuthSub

AuthSub is basically Google’s version of OAuth. Like OAuth, AuthSub sends the user to log into their Google account and then returns them to a specified URL with an access token as a GET variable.

ClientLogin

ClientLogin is your basic, straightforward authentication with a user’s Google account username and password. It’s usually used for installed applications, but I included it anyway if you want to use it.

And finally here is a list of the functions included in the library: