Archive for development

Dealing with darn data

My next mini-experiment is to explore different ways of getting data into Kinetic diagrams.

Kinetic diagrams are stored in plain text files. Initially, my idea (described in earlier posts) was to create a custom domain-specific language (DSL) using Ragel to define all the objects in the diagram. Each object can have metadata attributes. For example, suppose there’s a “Salary” box in my finance diagram. “Salary” might have the following attributes: label=Work salary, vendor=Coal Mines, and amt=3580. Here’s how the DSL would describe the objects in the finance diagram:

+ Salary    | label="Work salary",    vendor="Coal Mines",  amt=3580
+ Checking  | label="Checking",       vendor="BigBank",     amt=400
+ Emergency | label="Emergency fund", vendor=" ",           amt=150
+ Savings1  | label="Savings",        vendor="MegaSavings", amt=720
+ Savings2  | label="Savings",        vendor="BigBank",     amt=214
+ Stocks    | label="Stocks",         vendor="Tradetek",    amt=1196
+ Wallet    | label="Wallet",         vendor=" ",           amt=143
+ Purchases | label="Purchases",      vendor="stores",      amt=212
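
To make the format concrete, here is a minimal Ruby sketch of how one of these lines could be split into a name and an attribute hash. It is an illustration only, not the Ragel-generated parser; the helper name `parse_object_line` is hypothetical, and it assumes values are either double-quoted or bare tokens without commas.

```ruby
# Hypothetical helper (not the Ragel-generated parser): parse one
# '+ Name | key="value", key=value' line into [name, attributes].
def parse_object_line(line)
  name_part, attr_part = line.split("|", 2)
  name = name_part.sub(/\A\s*\+\s*/, "").strip
  attrs = {}
  # Each pair is either key="quoted value" or key=bare_value.
  attr_part.to_s.scan(/(\w+)\s*=\s*(?:"([^"]*)"|([^,\s]+))/) do |key, quoted, bare|
    attrs[key] = quoted || bare
  end
  [name, attrs]
end

name, attrs = parse_object_line('+ Salary    | label="Work salary",    vendor="Coal Mines",  amt=3580')
```

All values come back as strings, so a caller would still need to convert amt to a number.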

But something about this approach just feels wrong. After all, can I realistically expect users to type vendor="Coal Mines" in order to set an attribute value? Of course not. Most users aren’t programmers, nor do they care to be. So my next idea was to allow the user to type a textual data table:

-------------------------------------------------------------
name           | label              | vendor         | amt
-------------------------------------------------------------
Salary         | Work salary        | Coal Mines     | 3580
Checking       | Checking           | BigBank        | 400
Emergency      | Emergency fund     |                | 150
Savings1       | Savings            | MegaSavings    | 720
Savings2       | Savings            | BigBank        | 214
Stocks         | Stocks             | Tradetek       | 1196
Wallet         | Wallet             |                | 143
Purchases      | Purchases          | stores         | 212
-------------------------------------------------------------

This textual table is much clearer and more readable. The gotcha, however, is that this requires the user to format that textual table perfectly. I’d need to define some required format in order to distinguish the header values from the rest of the table, and separate columns from each other. So while readable, this approach is still too brittle for real-world use.
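
A rough Ruby sketch shows how much the parser has to assume. The format rules here are my assumptions, not a defined spec: rule lines contain only dashes, the first remaining line is the header, and "|" separates columns. The helper name `parse_text_table` is hypothetical.

```ruby
# Hypothetical helper: turn a pipe-delimited text table into an array of hashes.
# Assumed format: rule lines are all dashes, the first remaining line is the
# header, and "|" separates columns.
def parse_text_table(text)
  rows = text.lines.map(&:strip).reject { |l| l.empty? || l.match?(/\A-+\z/) }
  # A limit of -1 keeps trailing empty cells instead of dropping them.
  header, *body = rows.map { |l| l.split("|", -1).map(&:strip) }
  body.map { |cells| header.zip(cells).to_h }
end

table = <<~TABLE
  ----------------------------------------------
  name      | label          | vendor     | amt
  ----------------------------------------------
  Salary    | Work salary    | Coal Mines | 3580
  Emergency | Emergency fund |            | 150
  ----------------------------------------------
TABLE

rows = parse_text_table(table)
```

A single stray "|" inside a value, or a rule line drawn with a different character, is enough to break this, which is the brittleness in question.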

The new approach I’m exploring now is to just read the data from a spreadsheet or database directly. After all, wouldn’t it be great if the user could just edit the data in a spreadsheet?
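
As a first step in that direction, here is a small sketch using Ruby’s standard csv library. It assumes the spreadsheet has been exported as CSV with a header row; the column names are borrowed from the table above, and nothing here is settled design.

```ruby
require "csv"

# Assumption: the spreadsheet was exported as CSV with a header row.
csv_text = <<~CSV
  name,label,vendor,amt
  Salary,Work salary,Coal Mines,3580
  Emergency,Emergency fund,,150
CSV

# Each data row becomes a hash keyed by the header names.
objects = CSV.parse(csv_text, headers: true).map(&:to_h)
```

Empty cells come back as nil rather than an empty string, and quoting, commas inside values, and column separation are handled by the library instead of a hand-rolled format.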

This will be a fun mini-experiment…stay tuned.


The common ingredient behind software success

As a Certified Scrum Master, I continually re-assess how to best apply Scrum principles to deliver software projects. One of my main criticisms of Scrum is that it is often misunderstood and sold as a panacea for bad software practices. It seems dev managers want to believe that Scrum will allow them to extract stellar results from poor and mediocre performers.

What a real Scrum looks like:

(Photo by flickr user boocal used under a Creative Commons license.)

In my experience, the only common ingredient to all successful projects is: strong software developers. Scrum (or any other software methodology) is no substitute for solid software fundamentals (such as strong object-oriented design skills).

Katie Lucas expresses this more eloquently. Her entire post is worth reading for anyone in the software profession.

Every methodology I’ve come across has, at its kernel, a very small section labelled “do magic here.”

[A software methodology is sometimes] pushed as a way of getting normal people to do something normal people can’t do. Normal people can’t do OO design properly. I don’t mean that derogatively as such. I can’t draw still life, I can’t run 100m races…People have various different talents. One of those talents is doing OO design and some people just can’t do it. No matter how much paperwork you surround it with.

And at the core of [a software methodology] is a small area where you have to use OO design talents…. if you don’t have them, it’s like having a methodology for running the 100m.

“Step 1: write about running really fast. Step 2: Go and draw a plan of the racetrack. Step 3: go and buy really tight lycra shorts. Step 4: run really, really, really fast. Step 5: cross line first”

It’s that step 4 that’s the tough one. But if you put lots of emphasis on 1,2,3 and 5 it’s possible no-one will notice and then you could probably make a lot of money selling the methodology to would-be athletes who think there’s some “secret” to being a 100m runner over and above being born with the ability to run fast.


Mission Capistrano: Accomplished

A couple of weeks ago, I accepted “Mission Capistrano” to automate the deployment process. I figure I owe an update: Well, it’s “Mission accomplished!”…all thanks to Capistrano.


(Photo by pasukaru76, Creative Commons license)

Meet Capistrano, my new unpaid and overworked robotic intern. Today, instead of slogging through multiple tedious steps, I simply type:

cap deploy

and Capistrano automatically takes care of the dirty work. What used to take 15 minutes now takes less than 4 seconds.

Here’s my 8-line configuration file…feel free to use it as a starting point for your projects:

set :application,  "kinetic"
set :repository,   "."
set :scm,          :none
set :deploy_via,   :copy
set :copy_exclude, [".DS_Store", "vendor"]
set :deploy_to,    "/home/kinetic/kineticdiagrams.com/"
set :user,         "kinetic"
role :app,         "kineticdiagrams.com"

With Capistrano, I can painlessly deploy to multiple servers in parallel. And my favorite feature is that I can roll-back to any previously-deployed version. Let’s say I accidentally deploy a bug in my application code and expose my gaffe to the world. With Capistrano, I can easily roll-back to an earlier working version:

cap deploy:rollback

and continue with the rest of my day, dignity intact. If only my romantic relationships had a similar “undo” button…


Mission Capistrano

Old Mission Capistrano: Any child can prove: Every year, around March 19th, the famous swallows (the pride of Capistrano) return to deploy their mud nests in the ruins of this historical mission.


Photo by sp8254, Creative Commons license.

New Mission Capistrano: This year, by July 12th, my historic mission is to swallow my pride, and nest in my famously-muddy ruins, until my Capistrano deployments are child-proof.


Days of deployment drudgery are done

My web app deployment process currently looks like this:

  1. Fire up my FTP client.
  2. Fire up my SSH client.
  3. Type username, password, and server name, to login to my web hosting provider’s FTP server.
  4. Type username, password, and server name, to login to the web hosting provider’s SSH server.
  5. Navigate to the web directory where the web app is hosted.
  6. Open up the folder on my MacBook Pro, where the web app files are located.
  7. Carefully copy over each Ruby file to the corresponding directory on the FTP site.
  8. Carefully copy over all supporting assets (images, stylesheets) to the corresponding directories on the FTP site.
  9. Navigate (via typing) to the directory containing the server restart script.
  10. Run the server restart script.
  11. Validate that the updated version of the web app has indeed been deployed, by firing up a web browser to the web app’s location.
  12. If something is amiss, then revisit steps #5-#10 as necessary.
  13. Close the FTP client.
  14. Close the SSH client.


Photo by flickr user tzofia used under a Creative Commons license.

By next Monday, I’d like it to look like this:

  1. Run a single script.

The sooner I can streamline this error-prone process, the sooner I can re-focus on fun.


Heads down

…enjoying the creative process :)


Rack up another point for the Rubyists

When it comes to web apps, the Ruby community gets it. Rack is a great example of this. What is it? In a nutshell:

Any Ruby web framework now jives great with any Ruby web server. I can scale down from a fully load-balanced production system, to a local Apache, to a quick in-process web server without even thinking about it. I can also add components to the HTTP chain in a snap. Thanks to Rack’s amazing simplicity, it took less than one year to move from Rack 1.0 to widespread Rack compliance. As Sam Ruby put it: I love it when a plan comes together.


http://www.flickr.com/photos/russmorris/202752043/

Kinetic’s web servers run Phusion Passenger, which is Rack-compliant. Since both Rails and Sinatra are Rack-compliant, I can hook either web framework up to Passenger. Today, my choice is Sinatra for its bare-bones simplicity…perfect for prototyping! But let’s say, six months from now, I also hop on the Rails bandwagon. Thanks to Rack, that hop will be painless.

Or maybe, just maybe, I decide to swap out Phusion Passenger for Unicorn. Or add a caching server to speed up the site. With Rack, I can freely mix-and-match. I love it, it’s plain brilliant.
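
The contract that makes this mix-and-match possible is tiny: a Rack application is any object that responds to call(env) and returns a status, a hash of headers, and a body that responds to each. The sketch below exercises that contract in plain Ruby without loading the rack gem or a real server; the `AddHeader` middleware and the hand-built env hash are hypothetical, for illustration only.

```ruby
# A Rack application: responds to #call(env) and returns
# [status, headers, body], where body responds to #each.
app = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["Hello from #{env["PATH_INFO"]}"]]
end

# Hypothetical middleware: wraps another app and adds one response header.
class AddHeader
  def initialize(app, name, value)
    @app = app
    @name = name
    @value = value
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers.merge(@name => @value), body]
  end
end

stack = AddHeader.new(app, "x-served-by", "sketch")

# Call the stack directly with a minimal hand-built env; a real server
# (Passenger, Unicorn, ...) would build env from the HTTP request.
status, headers, body = stack.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/diagrams" })
```

Because the middleware only depends on call(env), it neither knows nor cares whether the wrapped app is Sinatra, Rails, or a bare lambda.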


Baby steps

I try to inch forward a bit each day. Steps so far towards getting the basic prototype working:

  • Set up a Ruby webserver
  • Installed Graphviz on my webserver
  • Developed a basic set of HTML templates for the initial primitive prototype
  • Created a rudimentary landing page
  • Re-familiarized myself with the core Ruby language
  • Learned the Sinatra web framework


Photo by flickr user used under a Creative Commons license.


Beware the Sirens of Technology Affluenza

Yesterday, I quoted Larry Ellison saying that software is the “only industry that’s more fashion-driven than women’s fashion.”

Peek at the current issue of any IT management magazine, and you’ll probably find an article or two covering “cloud computing” or “virtualization.” These disruptive technologies are deservedly topics du jour. I see Amazon’s EC2 as a bona-fide game-changer. I can’t live without VMware Fusion on my MacBook Pro.

However, look past the allure…these “new and hot” technologies will not fulfill their promise for every business. I think this is what Larry Ellison meant by the industry being “fashion-driven.” Many businesses will feel the push to adopt these “fashionable” technologies.

In the world of personal finance, they affectionately call it:

affluenza, n. 1. The bloated, sluggish and unfulfilled feeling that results from efforts to keep up with the Joneses. 2. An epidemic of stress, overwork, waste and indebtedness caused by the pursuit of the American Dream. 3. An unsustainable addiction to economic growth.

The problem with “keeping up with the Joneses” is that the Joneses are quickly going broke. Likewise, “technology affluenza” (my term for the uncritical adoption of “fashionable” technologies) is harmful when it doesn’t contribute to business value.

As I’m developing Kinetic, I’m often tempted to incorporate “fashionable” technologies. For me, these “Sirens of Technology Affluenza” come in varied guises: Ajax, Maven, Amazon EC2, Nutch and Hadoop, CouchDB, Google Wave, and on and on…

resist the siren's call
http://www.flickr.com/photos/gael_lin/2670110513/

For example: I’m excited about Google Wave. Not just because of the handsome instant-messaging/email/collaboration client; what really gets my creative juices going are the Google Wave APIs and the underlying Google Wave Federation Protocol.

Wow! If I implement the Google Wave APIs, multiple users can edit the same Kinetic diagram at the same time! Sally can format a box here, while Joseph can add a line there–and they’ll be able to see each other’s edits immediately! It’ll be so cool!

Last week, I officially purged this task from my TODO list.

I now realize: Core functionality must come first. I can’t (yet) justify the business value for using the Google Wave API. I respond to the other Sirens with the same reasoning: “Wouldn’t it be cool to–”

  • –use Ajax to make the website feel more responsive?” Only if the business value justifies it.
  • –deploy Amazon EC2 instances, so that my website can scale to a million users?” Only if the business value justifies it.
  • –use CouchDB to store my diagram data?” Only if the business value justifies it.

I’m naturally susceptible to “technology affluenza.” So, like Odysseus, I’ll need to plug my ears to the siren calls.

(Did you enjoy this blog post? I’d love to hear your feedback.)


Ragel rocks (part 2 of 2)

Yesterday, I touched on two strategies for developing domain-specific language (DSL) parsers: Hardcore DSLs may warrant using a full-fledged parser generator (such as JavaCC or ANTLR) with support for BNF grammar rules. For lighter-weight DSLs, regex hackery and substring functions often suffice to get the job done.

My challenge is–Kinetic’s language waffles somewhere in the middle of the “hardcore DSL” vs. “lightweight DSL” spectrum. Language design is largely about making hard choices about your language’s limitations. Right now, Kinetic is far from “hardcore.” It doesn’t support if/else/while control statements, exception handling, class definitions, and many amenities found in modern programming languages. And I’m ok with that: The target user for Kinetic isn’t the ninja coder; rather, Kinetic is intended to appeal to less-technical folks who want an easy way to describe complex diagrams.

But my ego won’t allow me to classify it as a “lightweight DSL” either. The current Kinetic language spec supports macro definitions, variable assignments, file references, embedded scripting, and a fairly expressive node-selection syntax. And new language features will be added as the project evolves.

Lucky for me–this middle-ground happens to be Ragel’s sweet spot.

So what exactly does Ragel do? It’s a tool to describe finite state automata (FSA). Basically, an FSA defines the DSL parser’s behavior as it encounters each and every character in the input text. An FSA describes “states” and “transitions”:


http://www.complang.org/ragel/number_lg.png

For example, if a parser sees an “a” followed by a “d” followed by another “d”, it can conclude that it saw the word “add.” As the parser hops from “state” to “state” (from letter to letter), we can instruct it to perform special actions during those “state transitions”. For example, the parser can increment a counter variable every time it transitions from “a” to “d”.
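
Written out by hand in Ruby, that tiny “add” machine might look like the sketch below. The state names, the `TRANSITIONS` table, and the `scan_add` method are all illustrative; Ragel would generate the equivalent tables from a grammar rather than have me type them.

```ruby
# Hand-written FSA that recognizes the word "add" and counts a->d transitions.
# State names and the counter are illustrative only.
TRANSITIONS = {
  start:  { "a" => :saw_a },
  saw_a:  { "d" => :saw_ad },
  saw_ad: { "d" => :accept }
}.freeze

def scan_add(input)
  state = :start
  a_to_d = 0
  input.each_char do |ch|
    next_state = TRANSITIONS.fetch(state, {})[ch]
    return [false, a_to_d] if next_state.nil? # no transition defined: reject
    # Action attached to the a->d transition: bump the counter.
    a_to_d += 1 if state == :saw_a && next_state == :saw_ad
    state = next_state
  end
  [state == :accept, a_to_d]
end

matched, count = scan_add("add")
```

Even for a three-letter word the table is fiddly to maintain, which hints at why generating it from a grammar is attractive.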

The elegance of Ragel actually makes it fun to create DSL parsers. Remember how earlier I noted that “hardcore parser-generators” offer the benefit of BNF notation? Well, Ragel’s syntax isn’t quite as powerful as BNF, but it gets us mostly there. These Ragel rules are more maintainable than regexes. And, since it turns out that every regex is in fact an FSA (every regex engine starts by converting a regex to an FSA), if you’re already familiar with regexes, you’ll find it incredibly easy to pick up Ragel’s slick syntax.

Here’s some Kinetic language grammar defined using Ragel (compare to BNF):

unquoted_string        = ([^"]+);
quoted_string          = (["] unquoted_string ["] | (["] ["]));
node                   = (upper (('_' | alpha | digit)*?));
tagname                = (lower+ ('_' | lower)*);
variable_name          = (alpha (('_' | alpha | digit)*?));

Then Ragel performs its magic (voilà!) and generates code for your custom DSL parser. Peek at the Ragel-generated code and you’ll see thousands upon thousands of lines like this:

self._parser_actions = [
 0, 1, 0, 1, 1, 1, 2, 1,
 2, 6, 22, 2, 7, 4, 2, 7,
 23, 2, 16, 24, 2, 19, 23, 2,
 19, 24, 2, 20, 24, 2, 23, 0,
 3, 1, 6, 22, 3, 6, 1, 22,
 3, 8, 13, 23, 3, 10, 18, 23,
 3, 10, 18, 24, 3, 11, 15, 23]

And that, dear friend, is precisely why you never want to write a finite state automaton by hand.

The generated Ruby/Java/C++ parser code is completely standalone. In other words, once you use Ragel to generate your parser, there’s no need to include Ragel libraries in your code distribution. This is simply brilliant.

If you’ve read this far, you may also want to check out Ragel’s other features. Frankly I think Ragel is underrated, and is worth adding to your quiver.

Thanks Adrian Thurston, for your gift to us. Ragel rocks!


Ragel rocks (part 1 of 2)

Zed Shaw is spot-on: “The Ragel State Machine Compiler is one kick-ass piece of software.”

As part of Kinetic development, I need a way to parse several custom-designed languages, each with its custom grammar. The term popularized by Martin Fowler is DSLs, short for “domain-specific languages.”


http://www.flickr.com/photos/dilaudid/278649026/

In the past, when faced with this task of parsing DSLs, I took one of two approaches:

Approach #1: For parsing complex DSLs, the typical starting point, and the one taught in college compiler design courses, is to first define the language’s custom syntax via a comprehensive set of BNF grammar rules.

In junior high, I was one of those kids who actually enjoyed diagramming sentences. Well, defining BNF grammars for DSLs requires a similar masochistic mindset. Parser generators such as ANTLR allow these BNF grammars to be defined, and custom logic to be executed when the constituent words/tokens are “found” in the input text.

I’m over-simplifying the process of course, as any CS major who wasted away Memorial Day weekend staring at lex/yacc debugging messages can attest. And that is in fact the primary drawback of this approach of using parser generators: it’s usually too heavy-handed unless you really want to design a fairly rich custom DSL.

It’s no surprise that parsers and compilers are written by a relatively-tiny and arcane community of computer engineers. That business developer you’re paying $90/hour for? Chances are, he or she has never written a custom DSL parser.

Thankfully, I’ve only had to go down this road twice in my programming career (both for personal projects). I absolutely love the power of this approach–using full-fledged parser generator tools gives me the freedom to define complex DSLs–but the price in terms of software complexity and time is difficult to justify for most projects.

Approach #2: For that reason, I generally opt to keep my DSLs simple. Simpler DSLs mean simpler parsing. This allows me to just cobble together straightforward text-parsing routines. These routines rely on basic multi-stage regular expression (regex) matchers and core string-manipulation functions. Nothing fancy; it’s something any engineer can do.

Yes, it’s simplistic and dirty with the taint of hackiness. But–more importantly, it gets the job done every time.

The ugly catch? Debugging and maintaining these simplistic parsers is no fun. There’s a pearl of programming wisdom:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

Step away from several complicated regexes, and try patching that code a few years later. I’ve had to do that on several occasions, and each time, I growled at my former self. I would’ve had more luck trying to decipher the green letters in the Matrix. Add a pair of parentheses to an existing regex, and you risk breaking your back-references (Ruby 1.9’s support of named groups only helps very slightly here). The pain of maintaining regexes and convoluted string-manipulation logic generally leads to parsing code collecting more than its fair share of dust.
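
Here is a small Ruby illustration of the parentheses problem. The regexes and the sample line are made up for this example; they are not taken from any of my actual parsers.

```ruby
line = "amt=3580"

# Numbered groups: the value is group 2.
numbered = /(\w+)=(\d+)/
value_before = line.match(numbered)[2]

# Wrapping the key in one more pair of parentheses shifts every later
# group, so group 2 silently stops being the value.
patched = /((\w+))=(\d+)/
value_after = line.match(patched)[2]

# Named groups keep their meaning after the same edit.
named = /((?<key>\w+))=(?<value>\d+)/
named_value = line.match(named)[:value]
```

After the patch, group 2 holds the key instead of the value, and nothing fails loudly to tell you so. Named groups survive the edit, but every group in the regex has to be converted to get that benefit.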

Which is why I was thrilled to discover Ragel.

(to be continued tomorrow…)
