Skis / Toys / Fun

Appeal to greatness not guilt

Skis / Toys / Fun

Before you go half cocked and build a company…

July 11th, 2008

Found this on a old blog that I’m decommissioning — insight from an old WSJ article.

  • Tip 1: Don’t slap your name on just anything to make a buck.
  • Tip 2: Find weaknesses with existing products and fix them. The you can charge more.
  • Tip 3: But raise prices cautiously; make consumers feel they got more than expected and don’t hide company inefficiencies behind price hikes.
  • Tip 4: Seek new consumer bases, and design for them.
  • Tip 5: Invest in products that consumers will always need.

Tags:

How to handle Settings for Accounts.

June 26th, 2008

I’ve run into this problem a few times, but I’m putting out a call to see if anybody has some great insight into a good solution. To date, I’ve come up with three different approaches to handling user settings, but none of which I’m in love with. Other ideas, other comments, other ways?

Working assumptions

  • It’s handy to access specific elements outside of the database (e.g. who has XYZ preference set)
  • Databases appear to be better at storing small data than large (varchar < 255 vs. BLOB of 64k)
  • There is no right answer

Small records only

First encountered this at Excite with Excite Mail, since we were using OpenWave as the backend we could store 255 bytes in one of their database tables, and it became a preferences store. Though of course it was a bit more restrive than the below SQL, it got the job done. With a bit of class wrapping ontop, it really works.

Pro:
  • Only one record type
  • Con:
  • End up with a chained dkey (signature_0, signature_1, etc.)
  • CREATE TABLE IF NOT EXISTS setting (
        account_id      int unsigned NOT NULL,
    
        dkey            varchar(20) NOT NULL,
        data            tinyblob NOT NULL,
        created_on      TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
        PRIMARY KEY     (account_id, dkey),
        FOREIGN KEY     (account_id) REFERENCES account(account_id) ON DELETE CASCADE
    ) ENGINE=InnoDB;
    

    Small and Big records

    Ok, so when you don’t want to deal with chaining of records you can create a BIG and SMALL store of data.

    Pro:

  • No chaining of records
  • Easily delete keys (just delete from both and ignore errors)
  • Con:

  • Lots of extra model level helpers to determine where data should go
  • CREATE TABLE IF NOT EXISTS user_data (
        user_id         varchar(16) NOT NULL,
    
        dkey            varchar(20) NOT NULL,
        dtype           int DEFAULT 0,
        data            tinyblob,
    
        created_on      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
        PRIMARY KEY     (user_id, dtype, dkey),
        INDEX           (user_id, dtype),
        FOREIGN KEY     (user_id) REFERENCES user(user_id) ON DELETE CASCADE
    ) ENGINE=InnoDB;
    
    CREATE TABLE IF NOT EXISTS user_data_big (
        user_id         varchar(16) NOT NULL,
    
        dkey            varchar(20) NOT NULL,
        dtype           int DEFAULT 0,
        data            blob,
    
        created_on      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
        PRIMARY KEY     (user_id, dtype, dkey),
        INDEX           (user_id, dtype),
        FOREIGN KEY     (user_id) REFERENCES user(user_id) ON DELETE CASCADE
    ) ENGINE=InnoDB;
    

    Blob of doom

    What about the idea that you don’t want your SQL store to have a clue, so save settings in one big blob (json, xml encoded, etc.)

    Pro:

  • All the parsing is at code level, record management is easy (1 per user)
  • Con:

  • it’s a blob, you can’t really access it easily outside of your program
  • CREATE TABLE IF NOT EXISTS user_blob (
        user_id         varchar(16) NOT NULL,
    
        data            blob,
    
        created_on      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
        INDEX           (user_id),
        FOREIGN KEY     (user_id) REFERENCES user(user_id) ON DELETE CASCADE
    ) ENGINE=InnoDB;
    

    Any other ideas? Cool suggestions, the magic solution I’ve forgotten?

    Tags:

    Bike shirt poetry

    June 25th, 2008

    I bike… I bike more than average…  While a Poseur Jersey would be fun, I was thinking on my ride to work today that a shirt/jersey that said (primarily on my ride to work):

         The more I bike
         The less gas I use
         The more there is
         The cheaper it is
         The happier you are

    Maybe then a few cars would be a bit more considerate.

    Tags:  

    JavaScript (spidermonkey) version

    June 13th, 2008

    Way to carried away with this… here’s the SpiderMonkey version (as opposed to the rhino) one. Biggest complaint is the File object is a bit of a pain to use…

    fd = File('p.test');
    fd.open("read",'text');
    
    mre = /^__MULTI_TOKEN__\s+(\S+)\s+(.*)\t?\s*(\d+)\s*$/;
    sre = /^__SINGLE_TOKEN__\s+(\S+)\s*\t?\s*(\d+)\s*$/;
    
    var ofd = File('full.txt');
    ofd.open('write,create,replace','text');
    first = {}
    
    while (line = fd.readln()) {
        if (m = line.match(mre)) {
            first[m[1]] = (first[m[1]] ? first[m[1]] : 0) + parseInt(m[3]);
            ofd.writeln(m[1] + ” ” + m[2] + “\t” + m[3]);
        } else if (m = line.match(sre)) {
            first[m[1]] = (first[m[1]] ? first[m[1]] : 0) + parseInt(m[2]);
        } else {
            print(”Unknown: ” + line);
        }
    }
    
    ofd.close()
    
    ofd = File(’first.txt’);
    ofd.open(’write,create,replace’,'text’);
    for (key in first) {
        ofd.writeln(key + “\t” + first[key]);
    }
    ofd.close()
    

    This is all based on the PHP/Perl/Python performance code.

    Tags:

    JavaScript version of the Perl/PHP performance test

    June 13th, 2008

    Using the Rhino compiler… sorry no benchmark numbers, but just as a placeholder. To pick a nit, wanted to do first[m[1]] += parseInt(m[3]) but that yeilded a NaN since first[m[1]] was undefined… You would think that JavaScript would make undefined == 0.

    fd = java.io.BufferedReader(java.io.FileReader('p.test'))
    
    mre = /^__MULTI_TOKEN__\s+(\S+)\s+(.*)\t?\s*(\d+)\s*$/;
    sre = /^__SINGLE_TOKEN__\s+(\S+)\s*\t?\s*(\d+)\s*$/;
    
    ofd = java.io.PrintWriter(java.io.FileWriter('full.txt'));
    first = {}
    
    while (line = fd.readLine()) {
        if (m = line.match(mre)) {
            first[m[1]] = (first[m[1]] ? first[m[1]] : 0) + parseInt(m[3]);
            ofd.println(m[1] + ” ” + m[2] + “\t” + m[3]);
        } else if (m = line.match(sre)) {
            first[m[1]] = (first[m[1]] ? first[m[1]] : 0) + parseInt(m[2]);
        } else {
            print(”Unknown: ” + line);
        }
    }
    
    ofd.close()
    
    ofd = java.io.PrintWriter(java.io.FileWriter(’first.txt’));
    //  Sigh… rhino needs an update…
    //   first.forEach(function(e, i, a) { ofd.println(i + “\t” + e); });
    for (key in first) {
        ofd.println(key + “\t” + first[key]);
    }
    ofd.close()
    This is all based on the PHP/Perl/Python performance code.
    

    Tags:

    When’s a string not a string

    June 13th, 2008

    When it’s intertwined — ok, bad joke.

    I’m currently “fighting” (playing) around with the python storm ORM system and had the following problem (not the real code, but demonstration).

    What I wanted to say is:

    find(..., employee_title == 'CEO')

    However this didn’t work, I had to say:

    find(..., employee_title == u'CEO')

    After a bit of discussion down a mailing list, I got the “classic” answer back.

    Maybe you think it’s annoying, but it’s correct, and doing it with automatic conversion is error-prone.

    While I can see the point, it’s one of my pet peeves with Python specifically… Which is that there are quite a few programmers who are overly type aware, but yet are working in a language that is “late typing”.   You would think that in a world where one of the contributions to design is duck typing, would be the idea that last mile type enforcement is a bad thing.

    What’s really funny is that when you put this in context of this post I read this morning about 100% code coverage the following can be observed.

       To simply state a goal of 100% code coverage for a test suite in the presence of late typing is a red herring. 

    def foo(flg) :
       if flg :
           x = u'TEST'
       else :
           x = 'TEST'
    
    def bar(i) :
       x = ""
       x += foo(False)
       return x
    
    def unitTest() :
      j = foo(True)
      for i in range(1,3) :
         k = bar(i)
    

    I’ve got 100% code coverage in (flawed) unit test, but the problem is that since the typing is late there is the potential for lots of latent bugs. Eiffel! Since python isn’t going to give into true typing, it might as well adopt PEP 316: Programming by Contract to enable a test suite to enforce pre/post conditions to insure a function can only return specific types.

    Parting thought after I did a bit of investigation: Python Contracts - plus it improves the documentation of functions (another personal pain point)

    Tags: