personal September 10, 2010 Jack No comments

New Machine

This is my first post for a topic not intended for a larger audience =). I’m building a new machine. I ordered the parts from Newegg and they should be here on Monday. Although… they are here in TX already (yes, I have the tracking number stats open in my browser right now, did you think I wouldn’t?), so I might be making a roadtrip this weekend just to have something to do other than dream about hardware.

Needless to say, I’m looking forward to it. The last computer I built, in 2008, was primarily for media (our RAID box), so disk read performance took priority over any sort of gaming. Juliette uses it as a Sims 3 box (via PlayOnLinux), and Scarlett plays her PBS/Disney/Nickelodeon flash games on it, so it works pretty well for that and for serving up media to the PS3. But it’s an unorganized mess and even Juliette playing the Sims beats the shit out of the video card (a 7300 SE / 7200 GS I picked up for $30 just to be able to attach a monitor to the machine), so I decided to pick up an identical 460 GTX for it as well. I can’t tell you how badly I wanted to keep it a secret and SLI my box, but the Sims will stop cranking the box, and being able to use the HDMI output to the TV might cut the PS3 transcoding step out of watching media if I can figure out sound.

Just for shits though, I decided to look up the benchmarks. Ouch. According to a benchmark site run by “Passmark”, who I’ve never heard of (but whose relative numbers I have no reason to doubt), the card that’s in there (7300 SE / 7200 GS, according to lspci) got a 66 and is ranked 656 of all of the cards they ranked. To put that in context, the GeForce Ti 4400 I bought when I was… geez… a sophomore in high school (2002) got a 216 and is ranked 349. I couldn’t believe it. I mean, I knew it was a $30 card and it wasn’t going to impress anybody by any stretch of the imagination, but *damn*, being a third as powerful as a card I bought 6 years previously. Goddam. Now, I’m not sure how this benchmark really works, nor do I really care, but the 460 GTXs got 2296 for an overall rank of 7. And that’s vanilla (mine are overclocked) and with an unspecified amount of RAM (either 768M or 1G, mine are 1G).

So, quite a hefty upgrade for the RAID box. Assuming that the benchmark scales linearly (which might be a bad assumption), that means the card I’m putting into it is 35x better than the POS I’m taking out. Hope you enjoy actually being able to play the Sims, honey =).

Anyway, the stats: 3.0GHz Core i7-950 (Bloomfield), 4GB of Kingston DDR3 1800 RAM, an overclocked Gigabyte GTX 460 video card, a Western Digital 1TB SATA 3Gb/s drive, a USB 3.0 compatible ASUS Sabertooth motherboard, a 750W Corsair PSU and a full size heavy-as-balls Antec case. Topped off with dual ASUS 24″ 1920×1080 widescreen LCDs.

programming, python September 3, 2010 Jack No comments

Using Decorators for Flexible Prompts

As a general rule, anything that you’re going to need to do many times should be abstracted until it’s easy. When you’re coding something with a prompt, adding commands definitely falls into that category. With Python introspection and decorators, adding a new command to a text prompt can be as simple as writing a function and defining a simple format for it. Like this:

    @command_format(("terms", "listof_uint"))
    def sumstar(self, **kwargs):
        print sum(kwargs["terms"])

Which is safe, clear, and working. This method has a number of advantages:

  1. It easily allows for features like automatically prompting for missing arguments and argument defaults.
  2. The decorator for each command function doubles as documentation since it both simply lists the names of the keyword arguments as well as their types.
  3. Cleanly separates the function of the command from the validation of its arguments.
  4. Ideal for allowing users to write pluggable commands because actual command functions use native objects instead of strings, and they automatically receive the benefits of prompting, listing, defaults, etc. Plus, there’s only a single line of overhead when the validators are pre-defined.
  5. Allows validators to easily use each other so multiple minor variations of the same type of validation can be expressed simply.

The command_format Meta-Decorator

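import logging

log = logging.getLogger("prompt")  # assumed logger; configure a handler to see output

def command_format(*types):
    def cf(fn):
        def cfdec(self, **kwargs):

            # Everything after the command name arrives as one raw string.
            rem = kwargs["args"]
            realkwargs = {}

            for kw, validator in types:
                # Validators are named by string and looked up on the handler.
                validator = getattr(self, validator)
                valid, result, rem = validator(rem.lstrip())
                if not valid:
                    log.debug("Couldn't properly parse %s" % kwargs["args"])
                    # Bail out so fn never sees unvalidated arguments.
                    return

                realkwargs[kw] = result

            return fn(self, **realkwargs)
        return cfdec
    return cf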

It’s not every day that you see a triple nested def, but it’s a pretty standard meta-decorator. Here *types is a list of string tuples like [("keyword","validator")], where each keyword is the name of the keyword argument under which the parsed object will be passed to the final command function, and each validator is a string reference to a method of the class this command resides in. If you’re operating outside of a class, this validator could easily be a function reference as well.
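For the function-reference case just mentioned, here is a minimal sketch rather than anything from the post's own source files. It assumes validators are plain callables passed directly to the decorator; `word` and `swap` are made-up names for illustration, and the syntax is kept to what runs on both Python 2.6+ and Python 3.

```python
import functools
import logging

log = logging.getLogger("prompt")  # assumed logger name


def command_format(*types):
    """types is a sequence of (keyword, validator) pairs; validators are callables."""
    def cf(fn):
        @functools.wraps(fn)
        def cfdec(**kwargs):
            rem = kwargs["args"]
            realkwargs = {}
            for kw, validator in types:
                valid, result, rem = validator(rem.lstrip())
                if not valid:
                    log.debug("Couldn't properly parse %s", kwargs["args"])
                    return None  # never call fn with bad arguments
                realkwargs[kw] = result
            return fn(**realkwargs)
        return cfdec
    return cf


def word(args):
    """Hypothetical validator: take the first whitespace-delimited word."""
    terms = args.split()
    if not terms:
        return (False, None, None)
    return (True, terms[0], " ".join(terms[1:]))


@command_format(("first", word), ("second", word))
def swap(first, second):
    # Returns rather than prints so the result is easy to check.
    return "%s %s" % (second, first)
```

Calling `swap(args="hello world")` hands the function two already-validated words, while `swap(args="hello")` returns None without ever running the body.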

This meta-decorator is applied to every function that embodies a command. We’ll get to that later. For now, let’s look at what these validators look like.

A Simple Validator

Validators are passed one argument: the entire unparsed portion of the string you’re parsing, which can be an empty string. Each returns a three-tuple: a boolean saying whether the validator passed and, if it did, the resulting object and the remaining, unparsed portion of the string. Let’s take a look at a simple validator that will take a non-negative integer.

class MyCommandHandler(...):
    def uint(self, args):
        terms = args.split()
        if not terms:
            return (False, None, None)

        # int() raises ValueError on anything that isn't an integer literal.
        try:
            ret = int(terms[0])
        except ValueError:
            log.error("Couldn't parse %s as integer!" % terms[0])
            return (False, None, None)

        if ret < 0:
            log.error("Argument must be >= 0")
            return (False, None, None)

        # Hand back the parsed value plus whatever is left to parse.
        return (True, ret, " ".join(terms[1:]))

Extremely simple: it grabs the first whitespace-delimited argument from the string, attempts to parse it as an integer, and makes sure it’s >= 0 before reporting that the arg passes validation.

Let’s put our command_format and the uint validator to use by writing a simple sum that will take a string argument, attempt to parse out two integers and print the sum.

sum Example

class MyCommandHandler(...):
    @command_format(("term1", "uint"), ("term2", "uint"))
    def sum(self, **kwargs):
        print kwargs["term1"] + kwargs["term2"]

No need to do any extra validation on either argument, we know both that they exist and they’re integers, so we’re safe.

At this point, with a little wrapping code we can do the following:

jack@arpeggi:~ $ ./
sum 12 16

sum 1 b
Couldn't parse b as integer!
Couldn't properly parse 1 b
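The wrapping code isn't shown in this post, so here is one rough guess at its shape, not the actual script. It assumes the first word of a line names the command and the rest is handed over as `args`; `dispatch` and the `commands` tuple are hypothetical names, and `sum` returns its result instead of printing so it can be checked. It should run on Python 2.6+ and Python 3.

```python
import logging

log = logging.getLogger("prompt")  # assumed logger name


def command_format(*types):
    # Same shape as the meta-decorator above, repeated so this runs alone.
    def cf(fn):
        def cfdec(self, **kwargs):
            rem = kwargs["args"]
            realkwargs = {}
            for kw, validator in types:
                valid, result, rem = getattr(self, validator)(rem.lstrip())
                if not valid:
                    log.debug("Couldn't properly parse %s", kwargs["args"])
                    return None
                realkwargs[kw] = result
            return fn(self, **realkwargs)
        return cfdec
    return cf


class MyCommandHandler(object):
    commands = ("sum",)  # hypothetical whitelist of callable commands

    def uint(self, args):
        terms = args.split()
        if not terms:
            return (False, None, None)
        try:
            ret = int(terms[0])
        except ValueError:
            log.error("Couldn't parse %s as integer!", terms[0])
            return (False, None, None)
        if ret < 0:
            log.error("Argument must be >= 0")
            return (False, None, None)
        return (True, ret, " ".join(terms[1:]))

    @command_format(("term1", "uint"), ("term2", "uint"))
    def sum(self, **kwargs):
        return kwargs["term1"] + kwargs["term2"]

    def dispatch(self, line):
        """Split off the command name and pass the remainder as args."""
        name, _, args = line.strip().partition(" ")
        if name not in self.commands:
            log.error("Unknown command: %s", name)
            return None
        return getattr(self, name)(args=args)
```

A real prompt would sit in a loop, read a line, call `dispatch`, and print anything that comes back. Keeping a whitelist means a user can't invoke validators or other methods by typing their names.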

But that’s not very exciting. Let’s look a little harder and see how we can leverage these.

More Useful Validators

Automatic Argument Prompting

The first thing that pops to mind as being a useful addition to a command prompt is being a little more tolerant to missing arguments. Let’s make our uint validator prompt instead of bailing on an empty string.

class MyCommandHandler(...):
    def uint(self, args):
        # Nothing left to parse: ask the user instead of failing.
        if not args:
            args = raw_input("uint: ")

        terms = args.split()
        if not terms:
            return (False, None, None)

        try:
            ret = int(terms[0])
        except ValueError:
            log.error("Couldn't parse %s as integer!" % terms[0])
            return (False, None, None)

        if ret < 0:
            log.error("Argument must be >= 0")
            return (False, None, None)

        return (True, ret, " ".join(terms[1:]))

NOTE: One of the attributes of the decorator produced by command_format is that the arguments passed to each validator are stripped of leading whitespace, so we know that if not args will catch any empty string passed to the validator.

Now, again with a shade of wrapper code, we have a bit more functionality.

jack@arpeggi:~/blog $ ./
uint: 13
uint: 14

sum 13
uint: 20

sum 14 15

Function on Lists of Arbitrary Types

So, at this point, we’ve basically got a sum2 function and, even though it’s nice (=P), it’s pretty inflexible. Let’s write another validator that generates a list of uints and work on that, with another appropriate function. Better yet, let’s write a validator that will make a list of *any* consistent type.

class MyCommandHandler(...):
    def listof_(self, prompt, val, args):
        l = []
        if not args:
            args = raw_input(prompt)

        # Keep applying the validator until the input is used up.
        while args:
            v, term, args = val(args.strip())
            if v:
                l.append(term)
            else:
                # If you wanted to be really fault tolerant, you could 
                # remove the first term and try again. We'll be really 
                # strict here though.
                return (False, None, None)
        return (True, l, "")

    def listof_uint(self, args):
        return self.listof_("uints: ", self.uint, args)

There we go. As you can see, the listof_ function will attempt to use any given validator on the input until it’s used up. A smarter implementation might not use up the whole input, but expect lists to be bracketed by symbols, or a certain length. This also includes the arbitrary prompt that we used before. Let’s put it to use implementing sum*.
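As a rough idea of the "bracketed by symbols" variant, here is a sketch that only consumes what sits between an opening and closing bracket and hands the rest back for later validators. `bracketed_listof` and its `opener`/`closer` parameters are names made up for this example, the validators are plain functions to keep it self-contained, and nested brackets are not handled.

```python
def uint(args):
    """Non-prompting uint validator, written as a plain function."""
    terms = args.split()
    if not terms:
        return (False, None, None)
    try:
        ret = int(terms[0])
    except ValueError:
        return (False, None, None)
    if ret < 0:
        return (False, None, None)
    return (True, ret, " ".join(terms[1:]))


def bracketed_listof(val, args, opener="[", closer="]"):
    """Parse `[item item ...]` and return whatever follows the closer."""
    args = args.lstrip()
    if not args.startswith(opener):
        return (False, None, None)
    body, found, rest = args[len(opener):].partition(closer)
    if not found:
        return (False, None, None)  # no closing bracket
    items = []
    body = body.strip()
    while body:
        valid, term, body = val(body)
        if not valid:
            return (False, None, None)
        items.append(term)
    return (True, items, rest.strip())
```

Because the remainder is returned instead of discarded, a command could take a list followed by other arguments, which the greedy listof_ above can't do.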

    @command_format(("terms", "listof_uint"))
    def sumstar(self, **kwargs):
        print sum(kwargs["terms"])

Done. Now we have a much more flexible sum* command, as evidenced by:

jack@arpeggi:~/blog $ ./
uints: 4 7 8

sum* 3 8 10 11

sum* 4 5 6 a
Couldn't parse a as integer!
Couldn't properly parse  4 5 6 a

Closing Notes + Source

The above examples use non-negative integers (and yes, I’m aware that *sum* works on negative integers as well =P) because they’re a convenient type that everybody knows. One of the advantages of the system is that it doesn’t matter what sort of object the validator returns; the command functions can just assume everything is all right because they never get called if the args don’t parse correctly. I wrote a system very much like this for the new version of my side project Canto (an RSS reader) that allows users to give a list of story indexes, which are then converted from simple integers to actual story objects by the validators (listof_items) and passed to the corresponding function.
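To make the index-to-object idea concrete, here is a small sketch. It is not Canto's code: `Story`, `StoryHandler`, and `item` are stand-ins invented for the example, and only the listof_items name comes from the paragraph above.

```python
class Story(object):
    """Hypothetical stand-in for a feed item; not Canto's real class."""
    def __init__(self, title):
        self.title = title


class StoryHandler(object):
    def __init__(self, stories):
        self.stories = stories

    def item(self, args):
        """Validator: turn a leading index into the Story it refers to."""
        terms = args.split()
        if not terms:
            return (False, None, None)
        try:
            idx = int(terms[0])
        except ValueError:
            return (False, None, None)
        if not 0 <= idx < len(self.stories):
            return (False, None, None)  # index out of range
        return (True, self.stories[idx], " ".join(terms[1:]))

    def listof_items(self, args):
        """Validator: apply item() until the input is used up."""
        found = []
        while args:
            valid, story, args = self.item(args.strip())
            if not valid:
                return (False, None, None)
            found.append(story)
        return (True, found, "")
```

A command decorated with ("items", "listof_items") would then receive Story objects directly and never has to think about indexes or range checks.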

Also, this code is easily extensible to use any sort of input format (obviously raw_input isn’t exactly the most useful way to get a string from the user if you’re graphical or running ncurses, like I always am). Canto uses this via a Textbox class from a curses window.
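One way to swap the input source, sketched under my own assumptions: the handler takes a `read(prompt)` callable instead of calling raw_input directly. `PromptingHandler`, `read`, and `scripted` are hypothetical names; a curses textbox or GUI dialog would just be another callable with the same shape.

```python
class PromptingHandler(object):
    def __init__(self, read):
        # read(prompt) -> str; stands in for raw_input, a curses textbox,
        # a GUI dialog, and so on.
        self.read = read

    def uint(self, args):
        if not args:
            args = self.read("uint: ")
        terms = args.split()
        if not terms:
            return (False, None, None)
        try:
            ret = int(terms[0])
        except ValueError:
            return (False, None, None)
        if ret < 0:
            return (False, None, None)
        return (True, ret, " ".join(terms[1:]))


def scripted(answers):
    """Build a read() that replays canned answers, handy for testing."""
    pending = list(answers)

    def read(prompt):
        return pending.pop(0)
    return read
```

Passing the reader in also makes the prompting path testable without a terminal, since a scripted reader can stand in for the user.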

The examples used in this write-up are available as executable .py files (used for the output sections).

Basic sum2 validating prompt:
Added missing argument prompting:
Added sum*:

These were tested on Linux with Python 2.6, but should work on practically any platform with any relatively modern Python 2. They’re also in the public domain, if you actually care about the licensing of blog snippets =).


programming August 29, 2010 Jack No comments

Calm Down!

I’ve been ditching a lot of code lately. The first thing was my side-project Canto’s old codebase*. The second was this blog, which is now obviously a WordPress blog instead of the quirky but fun-loving git-based CMS I wrote before. Altogether, I’ve shed thousands of lines of code and it feels great. But as I toss this code to the wayside, I can’t help but wonder “How did things get this way? Where did I go wrong?”

One thing that I’ve learned over the course of my programming travails is that design is more important than implementation. As I maintained and extended and improved Canto over the two or three years since I first started it, I learned a lot about showing foresight. A lot about how getting a feature into a codebase right is a better long-term strategy than getting a feature into a codebase quick.

Programming is a tough task. It requires a lot of concentration and thought and, as if that wasn’t bad enough, it’s generally done under a lot of pressure, either from a boss’s deadlines or just from the expectations of your users (or your perception of either). A lot of choices I’ve made in the last couple of years have been wrong because they’ve been twisted by the perception of an audience. If someone sends you an email and they’re in distress because of a bug, you feel as if they’re waiting for your response, constantly refreshing their mail client or feeling frustration with your software the whole time the flaw goes unpatched. You feel as though there are a thousand other users that have run into the same bug and just never reported it, only to move on to greener pastures. You see the 10 day (or more!) turnaround time on your packages in repositories. The fact that Ubuntu is now actively serving up a broken version to its users and there’s nothing you can do but fix it for the next iteration (Salacious Salamander or whatever) six months later. Suddenly it’s as if everyone using your software is secretly unhappy with it.

It’s hard not to face these troubles and feel an urgency beyond reality. Feel the need to find it, fix it, and push it to the repo and make a release before anybody else reports the same problem. But, unless you’re dealing with a critical bug and your software is immensely popular it’s probably not as urgent as you imagine. Most likely, the user reported the bug and moved on with their lives. They either uninstalled your software, will use something else in the meantime, or will patiently wait for a fix. If you’re lucky, maybe they included a patch. The point is, what the user does is what the user does, it’s up to you to take your time and really analyze a problem before you fix it. There are plenty of potential users out there and for any user you lose over a single bug there’s another who will try your software after it’s patched. That’s why it’s important to find the right solution instead of the quick solution because the right solution will keep you from getting into this situation again (or make it easier to fix correctly when you do) and the quick solution increases the probability that you’re going to feel that sort of pressure, and lose another user over essentially the same bug.

Coding is all about trying to maximize the time you spend writing features and feeling rewarded, and minimize the time you spend bug fixing and feeling like your users are searching you out with machetes. Anything you do to sway that balance in your favor should be done without hesitation. But what can you do? I might not be the best person to ask, but these are some improvements I’ve been trying over the last few weeks and so far so good. I call my new philosophy “Calm The Fuck Down”:

  1. Start over. Either on the project, subsystem, or feature level. Even with recognition of the fact that the current implementation has become unmanageable, it’s always hard to look at functioning code and say “This is crap! Tear it out!” but sometimes that’s the only thing to say. Coders tend to look at their own code and not see its utility, or its elegance, but the weight of all of the obstacles they’ve overcome, or the bugs they’ve ironed out. To be passionate about good code, you have to be dispassionate about your own code. As you gain experience you see the error of your ways and have to re-evaluate practically every decision you’ve made in the light of new information and the easiest way to do this is to start over. It’s not always the right thing to do, but when you suddenly realize that your code has turned into a chain of hacks then you have no other path forward. Some developers believe they can just do a quick stub-out of code, or a quick and dirty narrow implementation now and return later. Maybe you can. I can’t. My thought is that if you can’t take the time to do it right, it’s time to hold off and give it a bit more thought.
  2. Put everything on paper first. This is something that I find helps a lot, particularly if you’ve really got a good handle on the language you’re using, because it helps you lay out your data structures beforehand and anticipate your needs before you even open your editor. Sometimes it doesn’t work out (I actually just rebased out of existence a commit that I had planned out on paper but that had fallen victim to a number of unexpected twists), but most of the time an hour of planning can save you a full day of coding. If you get into the habit of putting everything on paper first, your first step in responding to a feature request or a bug report is grabbing a piece of paper and a pen, instead of impulsively taking a hatchet to your codebase.
  3. Don’t let the phantom specter of imagined users fleeing over bugs or missing features rush your design decisions. I’ve fought this battle for so long that it’s become meaningless. There’s always another feature, always another bug. If you rush to fix a bug, you’re probably simultaneously laying the groundwork for the next one. Getting a fix or a feature out means nothing if it bites you in the ass down the line. So unless people’s boxes are being enslaved into a malicious porn spam botnet because of a security flaw, it can wait.

As for all of that code I’ve scrapped over the last few months, I chalk it up to one thing: practice. If the wise men of yore are to be trusted, that apparently makes perfect…

Reddit tl;dr : Regardless of perceived pressure, calm the fuck down and take your goddam time.

*Tangentially, Canto 0.8.x is much stronger and in alpha for harder-core IRC lurkers.