SeanMcGary.com

  • The state of RSS and news feeds

    The nice thing about standards is that you have so many to choose from.

    This quote comes to mind a lot while I have been trying to build an RSS/news feed reader. For something like RSS and Atom feeds that have nicely defined specs, it seems that hardly anyone really follows them. A lot of feeds have extra fields that are undefined, use custom namespaces, or are missing fields altogether. Why is it so hard to follow a spec?

    RSS vs. Atom

    Atom and RSS set out to solve what is effectively the same problem: provide a means for news syndication. Atom, supposedly, boasts an IETF spec making it better than RSS (whose spec wasn’t all that official and had numerous shortcomings that Atom sought to fix), yet I’ve seen the same problems and inconsistencies in both. Part of me thinks that these problems stem from the fact that both Atom and RSS are done using XML. At this point in time, XML seems like such an old and verbose markup language. Maybe it’s time to move on to something like JSON, which is more lightweight and can be parsed easily by basically every programming language, not to mention it’s easy to read.

    Lack of attention to detail

    I think the main problem however is that people simply don’t know what they are doing half the time. For example, one feed that I managed to pick up through an automated crawler I built was a feed hosted by Rackspace. This feed seemed to be used for keeping track of the status of something. Turns out they were serving all 13,000 entries (probably since the beginning of time) ALL AT ONCE. I was stumped for a little while as to why my feed collector was choking and taking forever until I realized that it was waiting to download each of these then insert them all one at a time into the database. By the way, this is all within the RSS 2.0 spec.

    A channel may contain any number of <item>s.

    Now if only there were a way to paginate through past entries…

    Pick a mime-type, any mime-type

    This makes me think that people set up their RSS feeds, build their own, use Wordpress or whatever, and never properly set up their web server to serve the correct type of content. I’ve seen feeds served up as:

    • application/xml
    • text/xml
    • application/rss+xml
    • text/html
    • application/rdf+xml
    • application/atom+xml

    Sure, all of these will produce some kind of valid XML document, but I’m of the belief that you should be sending the correct headers for your document. Sending an RSS feed? Your Content-Type should be application/rss+xml. Sending an Atom feed? You should be using application/atom+xml. C’mon, is it really that difficult? (Hint: no, it’s not.)
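
    In practice a feed reader has to cope with whatever header it is given. Here is a minimal sketch of how a crawler might map a Content-Type header to a feed format. The function name feedTypeFromContentType and the choice to treat the generic types as “unknown” are assumptions for illustration, not something any spec prescribes.

```javascript
// Hypothetical helper: map a Content-Type header value to a feed format.
function feedTypeFromContentType(header) {
  // Drop parameters such as "; charset=utf-8" and normalize the case.
  const mime = String(header || '').split(';')[0].trim().toLowerCase();

  switch (mime) {
    case 'application/rss+xml':
      return 'rss';
    case 'application/atom+xml':
      return 'atom';
    case 'application/rdf+xml':
      return 'rdf'; // RSS 1.0 feeds are RDF documents
    case 'application/xml':
    case 'text/xml':
    case 'text/html':
      // Too generic to tell; the reader has to sniff the document itself.
      return 'unknown';
    default:
      return 'unsupported';
  }
}
```

    Anything that comes back as “unknown” still has to be opened and inspected, which is exactly the extra work a correct header would have saved.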

    At least provide the important fields

    In the world of news, one of the more important fields is the date on which the article, story, event, item, whatever was published. Some feeds neglect to provide this important piece of information (that’s right, I’m looking at you, HackerNews).

    Defining the bare minimum

    The way we consume news and media has changed a lot in the last few years. No longer are we looking at just a page of words. As we can see with apps like Flipboard, content is king. People like to see pictures and images. RSS doesn’t have a per-item field to provide such images, and the one image its spec does define is capped at a size that is too small. The Media RSS extension provides a generic “media:thumbnail” element, but some people (*cough* Mashable *cough*) like to be difficult and define their own namespace for their thumbnails (e.g. “mash:thumbnail”). So let’s get some things straight here:

    On the top level, we need to describe the feed:

    • title
    • description
    • last updated
    • link
    • site logo

    These are pretty standard. It’s the feed/item/article definition where things get a bit messy. But here’s what we need in a world like today:

    • title
    • publish date
    • author’s name
    • tags/categories
    • content
    • description (should be shortened preview of content)
    • image
    • unique id
    • original link/permalink

    One of the more important fields in that list would be the unique id. Currently, it is rather difficult to determine if an article is unique. You can’t go on title alone, as someone could easily have two articles in the same feed with the same name. So it ends up being a matter of normalizing and comparing a bunch of fields, like the permalink/article link, the title, and the feed it came from, in order to tell if it’s unique or not. So why not include something like a UUID? With a UUID, you could then determine uniqueness on a feed-by-feed basis, which is more than acceptable.
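
    The normalize-and-compare approach can be sketched roughly as follows, using Node’s built-in crypto module. The function name entryKey and the entry field names (id, link, title) are assumptions for illustration, and the normalization is deliberately crude.

```javascript
const crypto = require('crypto');

// Hypothetical helper: fingerprint an entry within the feed it came from.
function entryKey(feedUrl, entry) {
  const normalize = (value) => String(value || '').trim().toLowerCase();

  // Use the feed-supplied id when there is one; otherwise fall back to
  // the normalized link and title.
  const parts = entry.id
    ? [feedUrl, entry.id]
    : [feedUrl, normalize(entry.link), normalize(entry.title)];

  return crypto.createHash('sha1').update(parts.join('\n')).digest('hex');
}
```

    Two entries with the same key are treated as the same article. With a real unique id in every feed, the fallback branch would never be needed.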

    Personally, in the end I’d love to see a new protocol built with JSON that people actually adhere to. The internet is already a series of APIs and web services using JSON as a payload medium, so why not extend that to RSS and other news type feeds? Why not make it more like an API where you can actually request a number of entries, or a date range for entries, or at the very least paginate through entries so that you aren’t sending 13,000 of them all at once?
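
    To make the idea concrete, here is a small sketch of what serving one page of a JSON feed could look like. This is not an existing protocol: the function pageOfEntries and the response fields (total, offset, limit, nextOffset, entries) are all hypothetical.

```javascript
// Hypothetical: return one page of entries, newest first.
function pageOfEntries(allEntries, offset, limit) {
  const sorted = allEntries
    .slice() // copy so the caller's array is left untouched
    .sort((a, b) => new Date(b.published) - new Date(a.published));

  const entries = sorted.slice(offset, offset + limit);
  // null tells the client there is nothing left to fetch.
  const nextOffset = offset + limit < sorted.length ? offset + limit : null;

  return { total: sorted.length, offset, limit, nextOffset, entries };
}

// Example: the first page of a three-entry feed, two entries per page.
const feed = [
  { id: 'a', title: 'First', published: '2013-01-01T00:00:00Z' },
  { id: 'b', title: 'Second', published: '2013-01-02T00:00:00Z' },
  { id: 'c', title: 'Third', published: '2013-01-03T00:00:00Z' },
];
console.log(JSON.stringify(pageOfEntries(feed, 0, 2), null, 2));
```

    A client would keep requesting pages until nextOffset comes back as null, instead of receiving every entry in one response.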

    • 3 weeks ago
    • #rss
    • #atom
    • #xml
    • #news feeds
    • #feedstash
    • #json
    • #syndication
    • #pit-falls
    • #standards
    • #web services
    • #api
  • Getting Started with Native NodeJS Modules

    NodeJS has quickly gained popularity since its inception in 2009 thanks to wide adoption in the web app community, in large part because if you already know javascript, very little has to be learned to begin developing with it. As is evident from its modules page on github, you can pretty much find the library you are looking for (it’s sort of starting to remind me of PHP where, if you could think of something, chances are there is a module for it).

    One of the things that I think people forget is the fact that you can develop NodeJS modules not only in javascript, but also in native C/C++. For those of you that forgot, NodeJS is possible because it uses Google’s V8 javascript engine. With V8 you can build extensions with C/C++ that will be exposed to javascript. Recently, I decided to dive into the world of native modules because I needed a way to use the imagemagick image manipulation library directly from javascript. All of the libraries currently listed on the NodeJS modules page take a rather roundabout approach by forking a new process and calling out to the command-line binaries provided by the imagemagick library. This is VERY VERY VERY SLOW, and since image manipulation can be very intense, being able to use the library directly will make things MUCH faster.

    Part one: babby’s first native module

    This is the first part in what will hopefully be a multipart tutorial as I write a native module for imagemagick. Today, we will take a look at making the most basic of native modules and how to use it with Node.

    To start off, let’s create a file called testModule.cpp. This is where everything (for now) will happen. Here’s what we need to start:

    Note, this is assuming you have NodeJS installed already and in your path (if you don’t, go do that!). We need to import both the Node header and the V8 header.

    To build our module, we will be using the node-waf tool that comes bundled with NodeJS. In the same directory as testModule.cpp create a file called wscript and put the following stuff in it:

    The wscript file sets up needed environment variables and libraries that need to be linked at compile time. Think of it as some kind of makefile. The t.target property needs to match up to the name of the export property in your module (I’ll point this out when we get there).

    Now, to build your module simply run the following:

    Alright, now that we have those basics out of the way, let’s make a module that, when called, returns the string “Hello World”.

    So to quote the V8 handbook:

    A handle is a pointer to an object. All V8 objects are accessed using handles, they are necessary because of the way the V8 garbage collector works.

    A scope can be thought of as a container for any number of handles. When you’ve finished with your handles, instead of deleting each one individually you can simply delete their scope.

    So if we were to think of this in javascript, we’d basically have something like:
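
    As a minimal sketch (the function name helloWorld is an assumption for illustration), the plain javascript analogue is just a function that returns the string. The handle and scope bookkeeping quoted above is what the engine takes care of for you on this side.

```javascript
// Plain-javascript analogue of a native function that returns a string.
// The name helloWorld is an assumption for illustration.
function helloWorld() {
  return 'Hello World';
}
```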

    So now that we have a function that can do some kind of work, how do we expose it to Node? Let’s take a look:

    The function TestModule takes an object handle and basically shoves our function in it. This is how exports work in C++. In javascript we’d have:

    Now, a note on the NODE_MODULE(…) line. Before when I said t.target needed to match, this is where it needs to. The first argument of NODE_MODULE needs to be the same as your target value.

    Once you have all of that, build your new node module. To try it out, run node and import your module.

    That’s it! You now have your first native module. In my next post, we’ll dig deeper into building something a bit more substantial. One of the main draws of NodeJS is its asynchronous nature, so next time we’ll take a look at how to go about building a module with asynchronous functions using libuv, which is at the very core of NodeJS.

    • 6 months ago
    • #nodejs
    • #c
    • #c++
    • #javascript
    • #modules
    • #native
  • Drag and Drop File Uploads with Javascript

    The other day I was working on building a file upload interface in Javascript where a user could drag and drop files to upload to a server. I already knew that this was possible using the drag and drop api. Users could drag files from their desktop or other folder to a defined dropzone on the page and it would pull a FileList from the drop event. I use Google Chrome as my default browser, so here’s what I started with:

    First thing to keep in mind is that by default, browsers will try and open a file when you drag it into the window. To prevent that, we need to prevent the default action as well as prevent the event from propagating up the DOM tree. After that, we are able to access the FileList. Then I fired up Firefox just to check to make sure it worked across different browsers, knowing that just because it works in Chrome doesn’t mean it will work in other browsers. Upon trying it in Firefox, it loaded up the file in the browser. Turns out, non-Chrome browsers require a bit of an extra step: you need to listen for the ‘dragover’ event and prevent that from propagating and taking effect. Here’s the revised code:
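
    A minimal sketch of that revised approach, written against the plain DOM event API rather than any particular library. The helper name attachDropzone and the ‘dropzone’ element id are assumptions for illustration.

```javascript
// Hypothetical helper: turn `element` into a dropzone and hand any
// dropped files to the `onFiles` callback.
function attachDropzone(element, onFiles) {
  // Stop the browser from opening the file and stop the event
  // from propagating up the DOM tree.
  function cancel(event) {
    event.preventDefault();
    event.stopPropagation();
  }

  // The extra step described above: cancel 'dragover' as well as 'drop'.
  element.addEventListener('dragover', cancel);

  element.addEventListener('drop', function (event) {
    cancel(event);
    // The FileList lives on the event's dataTransfer object.
    onFiles(event.dataTransfer.files);
  });
}

// In a page, something like:
// attachDropzone(document.getElementById('dropzone'), function (files) {
//   console.log(files.length + ' file(s) dropped');
// });
```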

    Now our drop event will work in Chrome, Firefox, and Safari. I haven’t had a chance yet to try it in Internet Explorer, but according to caniuse.com it looks like IE10 with Windows 8 will support drag and drop events. For those curious, here’s a jsfiddle of the above example.

    • 7 months ago
    • #drag
    • #drop
    • #javascript
    • #file
    • #upload
    • #FileList
    • #html5
    • #jquery
  • A Lack of Usability in the Photo Sharing World

    Recently I’ve noticed that photo sharing sites (e.g. flickr, Smugmug, etc.) have rather poor user interfaces and user experience. UIs seem to have become overly complex, pushing rudimentary features out of the way to places that are not immediately accessible or take a bit of work to find. When I’m using a photo sharing site, there are a couple of really basic features that I believe should be very prominent upon logging in.

    The “upload” button should be VERY easy to find.

    When I first log into flickr, it is not immediately obvious how I can upload photos or videos.

    Flickr

    When I first see this page, I immediately look at the top row where the various navigation items and menus are. There’s nothing at the root level that links to an upload page. However, under the “You” dropdown is an upload link. For a site that relies on users uploading photos, I’d think that they would make it a single click away and very obvious. Instead, they nest one of the most important actions in a dropdown menu. As you make it down the page, you’ll realize that there IS an upload link on the main page; however, it’s the same style as their section headers and doesn’t immediately stand out. This really should be styled more like a call to action so that it stands out better. Smugmug, in comparison, has an upload link on their top row of navigation, but they require you to first create a gallery if you don’t have one, or pick a gallery to upload to if you have already created one. This is definitely a step in the right direction. I’ll get to my issues with their gallery structure in a little bit.

    Smugmug

    Galleries, sets, and categories oh my!

    For those of you that are old enough, think back ten years or so. Chances are your mother, grandmother, or some other family member has a closet full of photo albums of photos from your childhood, or even their own childhood. Photo albums are the most basic and rudimentary method for organizing photos. Have a bunch of photos that happened all at once? Maybe a vacation, birthday party, or other special event? Put them all in one place, like a photo album, so that you can find them later! Even Facebook has this down. You can upload photos and then organize them into albums. It’s simple and easy. Flickr and Smugmug, on the other hand, make it a bit more difficult. Flickr has the notion of “sets”. Sets are essentially the same idea as an album; you name the set and select photos to put in the set. Beyond that it’s really easy to get lost. Organizing photos into sets is relatively simple; select “your sets” from the “organize” dropdown. Upon doing this, though, you are brought to an entirely different user interface than the one you were just on. The entire root site navigation has disappeared and you are shown a pseudo full-screen page. Adding photos is pretty easy - just drag and drop from your “photostream” on the bottom. The flow of this organize process could be better handled, as it seems like they are trying to cram too many features into one page and have thus made it a bit complex to navigate. As a note, my mother, who is not the greatest with computers, has never been able to figure out how to use this particular interface on flickr.

    Well, how about Smugmug? How does it stack up against flickr? The first thing that drives me nuts is that you HAVE to create a gallery (the equivalent of an album) in order to upload any photos at all. Flickr allows its users to simply upload photos then organize later. Smugmug also doesn’t allow you to include one photo in multiple galleries. There is absolutely no way to organize your photos a la flickr. Everything MUST go into a gallery, and one gallery only, and only into that gallery by uploading something to it. Smugmug also has categories. An extra level of hierarchy and organization seems like a good idea, but their flow is very limiting, much like their flow for adding photos to albums.

    Help help, my photos are being held hostage!

    One large point of contention on the internet right now is over reclaiming your own data from a website that you are using. Google+ does a great job of addressing this by allowing users to “liberate” their data by downloading a zip archive of it. Recently, my flickr pro subscription ran out. Currently I have a few hundred photos hosted through them. However, when your pro subscription runs out, flickr essentially holds your photos hostage, allowing you to only have access to the 200 most recent photos. What if I didn’t have a backup of my photos? (Yes, stupid, I know.) The only way to get them back would be to pay $25 to upgrade just to download them all. Flickr also doesn’t provide a batch download feature to reclaim all of your uploaded photos. Smugmug is a little bit different. They don’t follow a freemium model like flickr. They are purely a subscription service that you pay for yearly. So once your subscription runs out, you have to renew it to get back to your photos. Then again, Smugmug targets professional photographers that are using them as a showcase of their work as well as for white label printing. Smugmug, as far as I can tell, also doesn’t have a way to batch download photos that you have uploaded.

    How I would do it differently

    Both flickr and Smugmug have features that are good and features that are not so good. And if they were combined and implemented a little bit differently, then you would have one hell of a site. So I am going to attempt to do just that: take features and ideas from both and improve upon them. The internet is a much different place now than it was when flickr and Smugmug first launched in the early 2000s. There is now a larger focus on building social communities and applications that are incredibly easy to use. Here is how I would do it:

    Freemium Model

    This app is going to be a “pay for what you use” type of deal. There will be a pricing model with a free tier where the amount you pay is based on the amount of space that you are consuming. The free tier will offer a certain amount of space rather than a limit on the number of photos. If you want to compress your photos down to a few kilobytes and upload a few hundred, go for it! However, if you want to upload photos at their full resolution that are a few megabytes apiece, then you may want to look into one of the paid tiers. This way, people that are into “casual” photography have a place to upload and share photos for free or cheap, and professionals have an affordable way to host their photos as well. Pricing will be focused around space consumed, and not necessarily features available to you. There might be a point where some features appear that are more geared toward professionals and might be offered to paying users only, but for the most part everyone will be on an equal playing field as far as features are concerned.

    Focus on the basics

    As I explained above, doing the simple things, like uploading and organizing photos and albums, has gotten rather difficult. In this application, viewing photos and albums, uploading photos, and organizing your photos will be the primary focus. The interface is designed in a way that even my own mother will be able to use it without having to call me up to walk her through the process. I figure if she can do it without help, then most everyone else should be able to as well.

    Building communities

    What use would a hosting site be if you couldn’t interact with people and talk about your love for photography? Users will have the option of enabling commenting on albums and photos. Flickr has some very large communities because of their commenting system, but there is also a high level of spam comments. Users will be able to monitor and moderate comment threads on their own photos and albums to hopefully prevent spam from coming up. Users will also be able to favorite photos and albums as well as follow other users. If two users follow each other, they will be classified as friends. On your dashboard, you will see activity on the things you have followed: when a new photo is added to an album, when comments are made, when a user creates a new album or uploads some photos. User activity and engagement will play a key role in this new application.

    Liberate your data

    Users will be able to download a zip file containing all of the photos that they have uploaded. ‘nuff said.

    • 8 months ago
    • #photography
    • #sharing
    • #social
    • #network
    • #flickr
    • #smugmug
    • #interface
    • #design
    • #experience
    • #ux
    • #ui
    • #communities
  • Building an Editor for MarkdownWiki

    The MarkdownWiki Editor

    In my free time lately, I have been building a web application to refresh the wiki market. MarkdownWiki is a new cloud hosted platform that allows users to create and collaborate on wiki pages. It preserves the original purpose of wikis - provide a place for users to present their knowledge, information, notes, documentation. The possibilities are endless.

    The reason I started building MarkdownWiki was to build a wiki platform that was up-to-date with today’s latest and greatest technologies. The first thing that I decided to start with was building an editor that makes editing and creating wikis easy for everyone (maybe even for my own mother!). In this post on the MarkdownWiki blog, I talk about the built-in editor and how it will make creating wiki pages many times easier than it currently is.

    • 12 months ago
    • #markdown
    • #wiki
    • #refresh
    • #github
    • #editing
    • #interface
    • #design
    • #code mirror
    • #markdownwiki
    • #media wiki
    • #syntax
    • #highlighting
    • #javascript
    • #nodejs
    • #ruby
    • #sundown
    • #robotskirt
  • Goodbye LAMP, Hello NodeJS

    When I first started developing web applications about five years ago, PHP was my language of choice because it was super easy to learn and a LAMP stack was trivial to set up on my local machine to get started. At the dawn of Web 2.0, PHP was booming as a web language. People began developing and rapidly prototyping applications with PHP and MySQL. You knew C or C++? PHP was even easier for you to dive into and get started with. When I started, I jumped on the Codeigniter bandwagon as it made development quick and provided the minimal amount of structure needed via the MVC design pattern so that you didn’t just end up with tons of files of unstructured spaghetti code. After a while with Codeigniter, I realized it was too simple. At the time, it lacked an autoloader and didn’t play well with libraries that didn’t comply with the limited interface it provided for working with libraries. So I decided to build my own framework. Foundation-PHP (as I decided to call it) provided a similar structure to Codeigniter, but also allowed me to change how classes were loaded and how objects were created, including the features that Codeigniter lacked. At this point, I also had been introduced to MongoDB, so I decided to make that the default database handler instead of MySQL or Postgres. My framework wasn’t quite on par in terms of features compared to Codeigniter, but it included all the features that I needed to build applications quickly.

    Javascript comes to the server

    During this time that I was hacking out applications built on PHP, other frameworks such as Ruby on Rails, Django, and Pylons were starting to mature and developers were starting to move away from the battle-tested LAMP stack. People started throwing around buzzwords like “high availability”, “big data”, “NoSQL”. PHP started its descent from a popular web language, with people starting to move to languages like Ruby and Python. Things like HTML5 and CSS3 were skyrocketing in popularity and usage. Browsers, with tons of help from Google Chrome and its rapid release cycle, started to incorporate these features that weren’t even finalized by the W3C into their new versions so that people could start using these new exciting features. Javascript in particular started to become very popular due to developers using it to create applications with an almost desktop experience.

    Then in 2009 NodeJS, created by Ryan Dahl, appeared. NodeJS consists of Google’s V8 Javascript engine along with a set of core libraries providing APIs to expose common lower level features such as file I/O, sockets, IPC mechanisms, etc. Because NodeJS is built with the help of Javascript, it is event driven providing purely asynchronous I/O thus minimizing overhead compared to traditional blocking I/O patterns.

    Due to the asynchronous nature of NodeJS, it was able to perform very well when writing servers. A community quickly formed around NodeJS and started to contribute to its feature set. Developers started writing libraries and modules to work with databases, the underlying operating system, websockets, and graphics, just to name a few. Soon, developers jumped at the fact that NodeJS makes a fantastic web application server. Hell, an HTTP server library is built into Node’s core.

    The great migration

    Not too long ago, I decided to say farewell to PHP and begin moving over to NodeJS. The first thing I did? [Build a web framework](https://github.com/seanmcgary/NodeWebMVC). Node is still very young and does not (yet) have all the frameworks and tools that more mature languages such as PHP, Ruby, and Python have. To me, having a small framework to get up and running is a huge advantage to rapid prototyping and development of ideas.

    Jumping in the express lane

    By default, NodeJS comes with the necessary tools to create a webserver, albeit a simple one. In less than 20 lines of code, you can have a “functioning” webserver that will send data to a browser upon connection. That, however, doesn’t do us much good when trying to create an application. Fortunately, the guys over at Visionmedia decided to build a little library called expressjs to make developing HTTP servers a bit easier. Express is built on a number of connect libraries giving you features like HTTP routing, sessions, and parsing POST, PUT, GET, and DELETE requests. Having these features puts express on par in terms of features and functionality with libraries such as Sinatra. You can quickly build a server that has routing and session handling, great for developing API servers, but it still lacks a little bit more structure needed for rich web applications.
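
    For reference, the bare built-in webserver mentioned above can be sketched in well under 20 lines using nothing but Node’s core http module. The handler name, the greeting text, and the port and host in start() are arbitrary choices for illustration.

```javascript
const http = require('http');

// Answer every request with a plain-text greeting.
function handleRequest(request, response) {
  response.writeHead(200, { 'Content-Type': 'text/plain' });
  response.end('Hello from NodeJS\n');
}

const server = http.createServer(handleRequest);

// Call start() to begin accepting connections.
// The port and host are example values.
function start() {
  server.listen(8080, '127.0.0.1');
}
```

    Every request gets the same answer regardless of URL or method, which is the gap that routing libraries like express fill.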

    Adding some structure

    For my framework, I took express as a base and started to build around it. I very much liked the MVC pattern of Codeigniter, so I decided to sprinkle some MVC on top of express. Doing this allows the developer to clearly separate code into controllers, models and views instead of just setting functions for specific HTTP requests. This also allows the developer to write code that is a bit more modular, taking advantage of inheritance for controllers and models. The one thing that I am loving about Javascript that is making development much easier is using Mustache (in this case Handlebars.js, a fork of Mustache) for templating and view parsing. Now instead of needing to write code in view files, they are done in HTML with Mustache placeholders and then parsed serverside before being sent to the user. This makes everything much cleaner and less intrusive. This also makes the flow of data from datastore/database to view much simpler. View content can be fetched from the database and, with very minimal modifications, sent directly to the view to be parsed. This is a HUUUUGE advantage.

    Let’s build some apps!

    Since building this small framework to provide a bit of structure, I’ve been able to start rapidly prototyping applications. Right now, the app I am writing, [Markdownwiki.com](http://blog.markdownwiki.com), is a living test of my framework and allows me to constantly add features into the framework that I feel might be commonly needed by developers.

    Now that my migration to NodeJS from PHP is nearly complete, I will be continually building out my NodeWebMVC framework for people to use. So if you’re looking for a framework to get started with building an app, I encourage you to check it out. I would love to get some feedback on it. As I develop it, I’ll tag stable points along the way so that you won’t be confused if for some reason it doesn’t work. If you run into bugs, file them here on github and I’ll get to fixing them. Alternatively, feel free to fork it, fix them yourself, and submit a pull request with your fix.

    • 2 years ago
    • #nodejs
    • #php
    • #lamp
    • #mongodb
    • #markdownwiki
    • #server
    • #web application
  • Getting Started with NodeJS and Server Side Javascript

    With NodeJS and server-side Javascript becoming a very prominent technology in web applications today, I figured I’d give an introduction to NodeJS to get everyone up and running. “But how can Javascript run in a server environment? I thought it required a browser and was only used to make my website interactive.” NodeJS allows you to build applications written in Javascript with the help of Google’s V8 Javascript engine, which is at the heart of the Google Chrome web browser. Coupled with the asynchronous nature of NodeJS, V8 contributes greatly to Node’s performance and scalability.

    “Okay, that’s cool and all, but why should I use it?”

    There are really two reasons: performance, and the fact that Javascript is becoming a pretty universal language. If you’ve ever dealt with web development of any kind, chances are you’ve had some experience with Javascript. The performance comes from a combination of Google’s V8 engine and the fact that Node is asynchronous and event driven, very much like JS applications that run in your browser. NodeJS runs all of your code on a single thread.

    “But wait, how does that make it more efficient than multi-threaded applications? Wouldn’t a single thread be a bottleneck?”

    If this were a traditional server implementation that would be true, but Node is a bit different. Instead of needing to manually identify events and spawn new threads based on those events, you simply register callbacks that will be run when events fire - very similar, if not exactly the same, as most applications of Javascript that run in your browser. Since we’re not manually creating new threads for each new event, and each event is asynchronous, none of the events will block the NodeJS event loop. This allows NodeJS to handle concurrent connections very efficiently and quickly by receiving an event and then pushing it to the background to run, with the process notifying the server when it’s done.

    “That’s pretty cool, how do I get started then?”
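
    The register-a-callback pattern can be sketched with Node’s built-in EventEmitter and a timer. The event name ‘request’ and the log messages are made up for illustration; the point is only that registering a callback does not run it, and that deferred work does not hold up the lines after it.

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();
const log = [];

// Register a callback; nothing runs until the event fires.
emitter.on('request', function (name) {
  log.push('handling ' + name);
});

// Firing the event invokes the registered callback.
emitter.emit('request', 'first');

// Deferred work is queued on the event loop; the line after it
// runs first instead of waiting for the timer.
setTimeout(function () {
  log.push('timer fired');
  console.log(log.join(', '));
}, 0);
log.push('still running');
```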

    First off, you’re going to need a computer with a non-Windows operating system (so anything Unix, Linux, or Mac OS X will work). They’re working on a build to work with Windows, but it’s not quite done yet. Next, grab the source from Github here. We’re going to be building version v0.4.x. NodeJS only has two things needed for building and installing - python 2.4 (or higher) and libssl-dev (if you plan on using SSL encryption). Now to build Node, cd into the directory you checked out and run the following:

    This will build Node and install it to your path. Now, all you need to do to run Node on the command line is just run: $ node <your file name>.

    Now, let’s take a look at a simple NodeJS TCP server implementation:
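
    Here is a minimal sketch of such a server using Node’s core net module. The handler name, the messages, and the port and host in start() are assumptions for illustration. In current versions of Node the function passed to createServer is itself called on each new connection, so this sketch only registers the data and end listeners.

```javascript
const net = require('net');

// Called once per client; `socket` is the connection to that client.
function handleConnection(socket) {
  socket.write('Welcome\r\n');

  // Fires whenever the client sends data; here we simply echo it back.
  socket.on('data', function (data) {
    socket.write('You said: ' + data);
  });

  // Fires when the client closes its side of the connection.
  socket.on('end', function () {
    console.log('client disconnected');
  });
}

const server = net.createServer(handleConnection);

// Call start() to begin listening; the port and host are example values.
function start() {
  server.listen(8000, '127.0.0.1');
}
```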

    Now let’s take a more detailed look at what this is doing. First, we need to include the “net” library that is built into Node. This is done using the “require” function. Node comes with a bunch of libraries that allow you to create sockets, interact with the filesystem, make system calls, and much much more. You can also create your own libraries, both in C++ and plain javascript, to include in your Node applications. Now that we have the “net” library included, we can proceed with creating our server object. The server takes a single variable; we’ll call this “socket” because that’s essentially what it is. We’ll use this to write and read data to/from our connection. Because Node is asynchronous, we need to add some listeners - connect, data, and end. The first two are pretty self-explanatory, with the data event firing whenever the client sends data to the server. So, instead of creating a thread loop to block and wait for data from the client, we simply register an event for it that will listen while the rest of the server runs unblocked. When the server receives data, the callback function will be called and the logic contained will be executed. The last line in our file simply tells the server to start listening on a provided port and host. And that’s it! It’s very simple: you just set some listeners and then forget about it while it runs.

    • 2 years ago
    • 1 notes
    • #node
    • #nodejs
    • #google chrome
    • #google v8
    • #javascript
    • #browser
    • #asynchronous
  • Send Mail Through Google Apps With PHP

    Recently, while I was working on a site that I’m creating, I needed a way to easily send email out to users. Like a lot of people who have domain names and don’t want to run their own mail server, I have decided to let Google Apps handle all of my email and app-related needs. Now say you don’t have a mail server and don’t want to maintain one. Maybe you don’t know how, or just don’t want to deal with maintaining such a service. Well, as it turns out, you can use Google Apps as your mail server. In this example, I’m using PHP to send out a message using the SMTP server and an account that I’ve set up on my Google Apps domain (one that is not my primary, admin account).

    First we have to install some libraries through PEAR

    Now that we have our two dependencies installed, we can write our code to send our mail. Basically, it connects to Gmail’s (or Google Apps’) SMTP server and sends a message. This is pretty much the same thing that happens when you send an email using a desktop client like Thunderbird or Apple Mail. All you do is supply it with Google’s SMTP server, port, and your username and password. NOTE: if you are using Google Apps, the username is going to be your-account-name@your-domain.com.
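    The same flow can be sketched in Python with the standard library’s smtplib, as a rough illustration rather than the PEAR-based code this post is about. It assumes Gmail’s SMTP host smtp.gmail.com on port 465 (implicit SSL); the function names and the example addresses are made up for the sketch.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.gmail.com"
SMTP_PORT = 465  # implicit SSL/TLS


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_mail(msg, username, password):
    """Log in to the SMTP server and hand the message over for delivery.

    For Google Apps the username is the full address,
    e.g. your-account-name@your-domain.com.
    """
    with smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) as smtp:
        smtp.login(username, password)
        smtp.send_message(msg)
```

    Usage would look like send_mail(build_message("me@your-domain.com", "you@example.com", "Hello", "Test message"), "me@your-domain.com", "your-password").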

    And voila! You can now send mail through Google!

    • 2 years ago
    • #php
    • #email
    • #mail
    • #smtp
    • #google
    • #google apps
    • #gmail
    • #pear
  • Threaded TCP Server in Python

    Recently I finally decided to take some time to learn Python. I figured the best way to learn something new is to dive right in and write an application. This application happened to be a new server for Computer Science House’s networked vending machine(s), ‘Drink’. The general idea behind Drink is that it’s a ‘communal refrigerator’ for CSH that the members ‘donate’ money to in order to stock it with delicious drinks (such as Coke products, since RIT is exclusively a Pepsi campus). Being the geeks we are, these vending machines that we have on our floor must be accessible via the internet in some way; that’s where the server comes in. The server needs to facilitate connections to each machine as well as accept incoming connections from clients that want to drop drinks, and then shuffle those requests off to the tini-boards that control the physical machines. So immediately I was thrown into learning threading and sockets in Python.

    Well, that’s cool and all, but you’re probably asking why you’re here. I mean, the title does hint that you’ll be learning something. This is true; we’re getting there, so just sit tight for another minute. One of the issues that needed to be overcome while writing this server was having a single server instance that can serve multiple clients at the same time without one client blocking the socket for everyone else. Python, being the flexible language it is, offers you multiple ways to handle sockets, threads, and the combination of the two. What we want is a server bound to a specific address and port that, once a connection is accepted, hands that connection off to its own socket and thread so that we don’t block other clients from connecting. First let’s take a look at the server implementation:
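    A minimal sketch of such a server, assuming Python 3: the thread module used in this post is Python 2 only, so threading.Thread takes the place of thread.start_new_thread here. The host, port number, and the "echoed: " reply prefix are placeholders chosen for the sketch.

```python
import socket
import threading


def handler(clientsocket, clientaddr, buf=1024):
    """Talk to one client until it disconnects."""
    with clientsocket:
        while True:
            data = clientsocket.recv(buf)
            if not data:  # an empty read means the client closed the connection
                break
            clientsocket.sendall(b"echoed: " + data)


def make_server(host="localhost", port=55567):
    """Create a TCP socket, bind it to the address tuple, and start listening."""
    addr = (host, port)
    serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    serversocket.bind(addr)
    serversocket.listen(2)  # queue up to 2 pending connections
    return serversocket


def serve(serversocket):
    """Accept connections forever, giving each one its own thread."""
    while True:
        clientsocket, clientaddr = serversocket.accept()
        threading.Thread(
            target=handler, args=(clientsocket, clientaddr), daemon=True
        ).start()
```

    Calling serve(make_server()) starts the accept loop. Each accepted connection runs in its own daemon thread, so a slow client never stops the loop from accepting the next one.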

    First we start off by importing the socket and thread modules. Now, if we wanted to make this a threaded class, we could ‘from threading import Thread’ so that our server could inherit from Thread (this is an example of one of those many modules for threading I mentioned). Now if we look at our main “method” we define our host, port, and buffer size and we create a tuple called ‘addr’ to hold the host and port. Now in this implementation, we have created a SOCK_STREAM socket which is the same as a TCP socket. When making this a server, you have to remember to bind the socket to the addr tuple (this is not the case with the client we will implement). And finally we tell the socket to listen for connections. The 2 we pass to the listen function call tells the server that it can queue up 2 connections before it starts to refuse them.

    Now for the magic/voodoo awesomeness of the server. In the while loop you’ll notice that when we call serversocket.accept() we get two variables back: a clientsocket and the clientaddr. Now, we take that socket connection and we hand it off to a thread by calling thread.start_new_thread. In this call, we pass it ‘handler’, which is the function you see defined at the top with the clientsocket and clientaddr as parameters. This function then runs, receiving and sending data with the client. Because we spawned a new thread to handle this connection, the server is free to keep accepting connections from other clients.

    Now let’s take a look at a simple client to interact with our server.
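    A client along these lines might look like the following Python 3 sketch. The run_client name, the default host and port, and the lines parameter (which defaults to standard input but makes the function easy to drive from other code) are assumptions for illustration.

```python
import socket
import sys


def run_client(host="localhost", port=55567, lines=None, buf=1024):
    """Send each input line to the server and collect the replies."""
    lines = sys.stdin if lines is None else lines
    replies = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as clientsocket:
        # No bind() here: a client only connects to the server's address.
        clientsocket.connect((host, port))
        for line in lines:
            data = line.strip()
            if not data:  # a blank line ends the session
                break
            clientsocket.sendall(data.encode("utf-8"))
            replies.append(clientsocket.recv(buf).decode("utf-8"))
    return replies
```

    Run interactively, run_client() reads from standard input until it sees a blank line or end of file.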

    99% of this should look the same as the server we just created. The only difference here is that we aren’t creating new threads and we’re not binding our socket to our address. Instead we just connect to the server and then loop, sending whatever we read from standard in and receiving the server’s reply. Hopefully this has been helpful. There are a lot of Python resources out there, but it took me a while to find an implementation that worked in my situation. So hopefully this is simple enough for anyone to modify to fit their needs.

    • 2 years ago
    • #tutorial
    • #example
    • #connections
    • #sockets
    • #threads
    • #serve
    • #python
    • #tcp
  • reCaptcha for Codeigniter 2

    Every time I make a new website with user registration, I usually end up using a reCaptcha somewhere in the process. A while ago, I discovered a reCaptcha library on the Codeigniter forums. Since then, I’ve modified it a little bit to work with Codeigniter 2.0 and have placed it on Github where everyone can access it. Below is just an example of the Controller (included in the repository) so you can see how it all comes together.

    • 2 years ago
    • 1 notes
    • #codeigniter
    • #recaptcha
    • #github
  • Improving Database Performance With Memcached

    When it comes to web application performance, oftentimes your database will be the largest bottleneck and can really slow you down. So how can you speed up performance when you have a site or application that is constantly hitting your database to either write new data or fetch stored data? One of the easiest ways is to cache the data that is accessed the most. Today, I am going to show you a brief example of how to do this with Memcached using PHP and the Codeigniter framework. First off, what exactly is Memcached? “Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.” Basically Memcached is a distributed system that dumps key-value pairs to RAM for super fast access. If you need to scale, all you do is add more RAM or more nodes with more RAM. Let’s get started.

    First, we need to install Memcached, so I will show you how to do so on a Debian-based system (Debian/Ubuntu). First off, run the following from your command line.

    sudo aptitude install memcached libmemcached2 libmemcached-dev

    This installs the base Memcached server along with the development libraries. Now we need a way for PHP to interact with Memcached, so we’re going to go and grab the Memcache PHP extension from Pecl. If you haven’t installed anything via Pecl before, it’s basically a PHP extension repository and manager and works similarly to aptitude, but specifically for PHP. To use Pecl, you need the PHP5 dev package as well as Pear.

    sudo aptitude install php5-dev php-pear

    Once you have those installed, you can install the PHP Memcache extension:

    sudo pecl install memcache

    Depending on which operating system you’re using, you may need additional packages or libraries. For me on Ubuntu 9.10, I needed to install make and g++ in order to build the extension. So just take a look at the end of the output when you install the extension, and it will tell you what is missing.

    Now we’re ready to start some coding. For this tutorial I’m going to be using Codeigniter since I really like its object-oriented structure and its nice database class. I will be using the new (though not yet official) version 2.0. You can find it over on Bitbucket and either clone the repository (note, you’ll need Mercurial to do this) or just download the source in a compressed format. Now we’re going to need a database to grab data out of. You can call it whatever you want. The only thing that you need is a table called “users”. Since I did this with 1000ish “users”, you can make the table yourself, or you can grab a dump of the table here.

    Now that we have everything set up, we can start some coding. Connecting to Memcache is really simple and is done in two lines:

    So now, what we are going to do is create a controller and a model. In this case, the controller is just there to call our model and display the returned data. The model will be doing all actions associated with accessing the database and accessing Memcache. So we are going to create two files: a controller called main.php and a model called user_model.php:

    In our controller we load the model up in the constructor since we will be using it with every function. The function “cache_users()” is used to call the model to tell it to take all the users in the database and put them into the cache. We’ll get to the specifics of that in just a minute. The function get_user tells the model to go into the cache and find a user based on their unique user_id.

    The model is a bit more involved. First off, in the constructor, we connect to Memcached and assign the connection to a class-level variable so that all of our other methods can access it. The first function, “get_users”, is an all-in-one example. Before I explain the function, let’s first figure out how to access Memcached. Memcached stores things as key-value pairs in memory. Keys need to be unique and can be as large as 250 bytes. The value can be anything: a string, an array, an object, pretty much anything in PHP that can be serialized. This does not include database connections or file handles, however. For our purposes, we are going to be storing each row of the user table in Memcached. A way to do that with a unique key is to use a hashed version of the SQL query that we would run if we were fetching the user from the database. So for example, if our user had the user_id of 43, we would hash the query used to fetch him from the database:

    "SELECT * FROM users WHERE user_id=43"

    The get_users function is going to store ALL users under a single key in Memcached, so the first line we come to is the query to fetch all the users from the database. Then we perform an MD5 hash on it and assign that to a variable. Now we check whether that key is in Memcached by calling $this->memcache->get($key). If that key does not exist, the Memcache extension’s get() returns FALSE, so we check for that. If the key is missing, we know that we have to hit the database to grab the data. So we do that and, while we’re at it, we put the resulting data into Memcache so that when we need to get it again, it’s there. And of course if the key does exist, we don’t even touch the database. It’s all a pretty simple and straightforward process.
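    The check-the-cache-first flow can be sketched in Python as follows. This is not the Codeigniter model: FakeCache is a dict-backed stand-in for a real Memcached client (and it signals a miss with None), and run_query is a hypothetical callable that stands in for the database layer.

```python
import hashlib


class FakeCache:
    """Dict-backed stand-in for a Memcached client (illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._store[key] = value


def query_key(sql):
    """Use the MD5 hash of the SQL text as the cache key."""
    return hashlib.md5(sql.encode("utf-8")).hexdigest()


def get_users(cache, run_query):
    """Return all users, hitting the database only on a cache miss."""
    sql = "SELECT * FROM users"
    key = query_key(sql)
    users = cache.get(key)
    if users is None:
        users = run_query(sql)  # miss: go to the database...
        cache.set(key, users)   # ...and remember the result for next time
    return users
```

    Calling get_users twice with the same cache runs the query only once; the second call is served entirely from memory.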

    Let’s take a look at cache_users(). Here what we are doing is grabbing all the users from the database and looping over all of them. The idea behind this is that we want each user to be in the cache individually rather than all together like in the previous example. So while we are looping over the returned users, we prepare a SQL statement for each one as if we were going to fetch it from the database, and then we store the user under its own key in Memcache. Now to store something in Memcache, we call $this->memcache->set($key, $value, $compression, $time). $key and $value are pretty self-explanatory. $compression is a flag that specifies whether you want your data compressed: pass 0 for no compression or MEMCACHE_COMPRESSED to compress it. $time is the amount of time that you want the data to stay in the cache (set in seconds). Once that time has expired, the entry is flushed from the cache. Now that we have all of our users in the cache, we can call fetch_user_from_cache and you will get your user!

    Hopefully this has given you a decent overview and an idea of how caching works so that you can apply it to your own applications. If you have any troubles or questions, leave a comment and I’ll help you out!
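    As a rough Python illustration of per-user keys with an expiry time: ExpiringCache is a toy stand-in for Memcached (its injectable clock exists only so expiry can be demonstrated without waiting), and the one-hour default lifetime is an arbitrary assumption.

```python
import hashlib
import time


class ExpiringCache:
    """Toy in-memory cache with per-entry lifetimes (not a Memcached client)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def set(self, key, value, ttl):
        # Remember the value together with the moment it stops being valid.
        self._store[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries are dropped on access
            return None
        return value


def user_key(user_id):
    """Hash the query that would fetch this user from the database."""
    sql = "SELECT * FROM users WHERE user_id=%d" % user_id
    return hashlib.md5(sql.encode("utf-8")).hexdigest()


def cache_users(cache, users, ttl=3600):
    """Store every user under its own key."""
    for user in users:
        cache.set(user_key(user["user_id"]), user, ttl)


def fetch_user_from_cache(cache, user_id):
    """Return the cached user, or None if it is missing or expired."""
    return cache.get(user_key(user_id))
```

    Because the key is derived from the query text, the code that warms the cache and the code that reads from it agree on the key without sharing any other state.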

    • 3 years ago
    • #database
    • #performance
    • #memcache
    • #memcached
    • #php
    • #mysql
    • #codeigniter
    • #web application
  • Scaling Apps With Message Queues

    When it comes to scaling a web application, one of the easiest ways to boost performance is with an asynchronous queue. Since web apps have started to become as complex as native desktop applications, users expect them to perform like native applications too. This is where asynchronous queues come into play. Typically with high-traffic sites like Facebook, Digg, Twitter, et al., not everything needs to happen instantaneously; it just needs to look like it. For example, when you choose to send a message to someone on Facebook, chances are that when you click that “send message” button, Facebook takes your message and shoves it in a queue of other messages to be processed. To you, the user, it looks like it’s already been sent, but in reality there might be a little bit of a delay. This makes it so that Facebook doesn’t have to actually deliver thousands of messages per minute at the very instant each button is clicked. Granted, this is just a hypothetical example; I honestly have no idea how they actually handle such requests, but it makes for a good working example so that you can get an idea of what’s going on.
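    The idea can be sketched in a few lines of Python using an in-process queue and a background thread. This only illustrates the pattern inside a single process and is not how any real site or message broker is built; the function names and the None shutdown sentinel are choices made for the sketch.

```python
import queue
import threading


def start_worker(jobs, handle):
    """Process queued messages in the background, one at a time."""

    def run():
        while True:
            message = jobs.get()
            if message is None:  # sentinel: stop the worker
                jobs.task_done()
                break
            handle(message)      # the slow part happens here, later
            jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def send_message(jobs, message):
    """Enqueue the message and return right away, before it is delivered."""
    jobs.put(message)
    return "sent"
```

    The caller sees "sent" immediately, while the actual delivery happens whenever the worker gets around to it.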

    So that’s the general gist of an asynchronous queue; now let’s dive in a little deeper. When looking for an asynchronous queue (or messaging system, or message broker, as some are called), there are a variety of options to consider. Today we’re going to be looking at Apache ActiveMQ because it takes advantage of the Java Message Service (JMS) and also integrates with a wide variety of languages including Java, PHP, C++, C, C#/.NET, and others. The example I’m going to show you will be using PHP through the Stomp protocol. First off, let’s install ActiveMQ.

    First off, we need a server environment. Currently I’m using Ubuntu Server 9.04. ActiveMQ is a Java application, so we need to install Java.

    sudo aptitude install openjdk-6-jre

    We’re also going to need Maven to pull in any necessary Java dependencies.

    sudo aptitude install maven2

    Now somewhere (I’d recommend your home directory) download the ActiveMQ source and unpack it:

    wget http://mirrors.ecvps.com/apache/activemq/apache-activemq/5.3.2/apache-activemq-5.3.2-bin.tar.gz

    tar xvf apache-activemq-5.3.2-bin.tar.gz

    Now, cd into that directory and run the following:

    chmod 755 bin/activemq

    Now we’re ready to build ActiveMQ using Maven

    mvn clean install

    And that’s it! Simply run ./bin/activemq and ActiveMQ will start right up.

    So now that we have a message broker set up, we need a way to start sending it messages. To do this, we are going to use the Stomp protocol with PHP. I’m going to show you a simple example that opens a Stomp connection, sends a message to the queue, and then retrieves the sent message and displays it on the screen. Typically you would separate this file into a producer (a script that enqueues messages) and a consumer (usually a daemonized background process) that retrieves each message and decides what to do with it. The nice thing about ActiveMQ is that your producers and consumers can be written in the same language or in different languages altogether, depending on your processing needs.

    To start, go and grab the Stomp PHP library and unpack it. Since we will be including Stomp.php in our script, you are probably going to want to add the Stomp library to your php.ini include_path.

    Now with that all installed, we are going to create our script.

    The code in this example pretty much explains itself. Since this is a simple example, it sends a message, then reads it back. Typically your consumer would be running in a loop, checking the queue for new messages and then processing the messages when it receives them.
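    To make the send-then-read flow concrete at the protocol level, here is a Python sketch that speaks raw STOMP 1.0 frames over a socket instead of using the Stomp PHP library. It assumes a broker on localhost listening on ActiveMQ’s default STOMP port 61613 with no authentication configured; the queue name /queue/test and all function names are made up, and message bodies containing NUL bytes are not handled.

```python
import socket


def frame(command, headers=None, body=""):
    """Encode a STOMP frame: command, header lines, blank line, body, NUL."""
    lines = [command]
    for name, value in (headers or {}).items():
        lines.append(f"{name}:{value}")
    return ("\n".join(lines) + "\n\n" + body).encode("utf-8") + b"\x00"


def parse_frame(raw):
    """Decode one frame (without its trailing NUL) into its three parts."""
    text = raw.decode("utf-8").lstrip("\n")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


def recv_frame(sock, buffer=b""):
    """Read until a full frame has arrived; return it plus leftover bytes."""
    while b"\x00" not in buffer:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("broker closed the connection")
        buffer += chunk
    raw, _, rest = buffer.partition(b"\x00")
    return parse_frame(raw), rest


def roundtrip(host="localhost", port=61613, destination="/queue/test"):
    """Send one message to a queue, then subscribe and read it back."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(frame("CONNECT"))
        _connected, rest = recv_frame(sock)
        sock.sendall(frame("SEND", {"destination": destination}, "test"))
        sock.sendall(frame("SUBSCRIBE", {"destination": destination, "ack": "auto"}))
        (_command, _headers, body), rest = recv_frame(sock, rest)
        sock.sendall(frame("DISCONNECT"))
        return body
```

    In a real setup the SEND half would live in the producer, and the SUBSCRIBE half would run in a loop inside the consumer.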

    That is pretty much it. I wanted this to be just a short introduction, so hopefully in a week or so, I’ll come back and give a more involved tutorial that will show how you can separate that script into a consumer and producer to really get work done.

    • 3 years ago
    • #web application
    • #message queue
    • #jms
    • #java
    • #php
    • #c
    • #c++
    • #activemq
    • #stomp
    • #message broker
    • #facebook
    • #twitter
  • Codeigniter Google Calendar Library

    So I was playing around with the Google Calendar portion of the Gdata API the other day, did some searching, and found that there wasn’t a Codeigniter library for it, probably because it seems that Google has teamed up with the guys working on the Zend Framework to bring Gdata to the PHP world. So I took the ZendGdata API for Google Calendar and implemented it in Codeigniter so that you just need to make a few simple function calls to gain authorization to a calendar, add events, query events, etc. This is my first attempt at writing a "library" for something, so hopefully it turned out well. If you find any errors in it or have any suggestions for improvement, just let me know!

    Files you’ll need:

    • Codeigniter-Gcal (Link fixed)
    • ZendGdata

    I also have it managed through a Git repository on Github. Feel free to clone it or pull from it.

    Setup:

    First off we need to edit your config file. Open up your application/config/config.php file. Scroll down to the uri_protocol option and change it to PATH_INFO. Then go to the uri_allowed_chars setting and place a question mark after ‘a-z’. This allows you to put question marks in the URL. That’s it!

    • Place the Gcal.php file in your Codeigniter system/application/libraries directory
    • Install the ZendGdata library with one of the two options:
      • Option 1: add it to the PHP include_path
        • Make a directory anywhere on your machine/server
        • Place the ZendGdata directory in it
        • Open your php.ini file and find the include_path line
        • Add to the include_path the directory that you placed the ZendGdata directory in: /your/directory/ZendGdata/library
        • Save and restart Apache
      • Option 2: point the library at it directly
        • Place the ZendGdata directory where you want
        • In the Gcal.php library file, modify the require_once so that it points at the directory where ZendGdata is located, e.g.: require_once("/your/directory/ZendGdata/library/Zend/Loader.php");
    • Place the calendar.php file in your controllers directory

    The calendar.php file is a sample controller. In it you’ll find an example of a call to each function in the library. Also with each one, I go through and show how to manipulate the data returned since it can get a little confusing at times (the return values are often many-dimensional arrays that are a bit difficult to interpret just by looking at them).

    AuthSub

    AuthSub is basically Google’s version of OAuth. Like OAuth, AuthSub sends the user to log into their Google account and then returns them to a specified URL with an access token as a GET variable.

    ClientLogin

    ClientLogin is your basic, straightforward authentication with a user’s Google account username and password. It’s usually used for installed applications, but I included it anyway in case you want to use it.

    And finally here is a list of the functions included in the library:

    • 3 years ago
    • #google calendar
    • #library
    • #codeigniter
    • #gdata
    • #tutorial
    • #calendar
© 2008–2013 SeanMcGary.com