Back to basics with Linear Algebra

At work, we’re at a point where we’ve come into possession of a huge volume of data – data that is just begging to be sliced and diced, with the promise of unveiling secrets that remain hidden from us for now.

My brief fling with Machine Learning, along with conversations with a respected colleague, led me to explore Singular Value Decomposition (SVD). The application of this technique supposedly played a significant role in helping team “BellKor’s Pragmatic Chaos” take home the Netflix Prize.

So I started at the Wikipedia page for SVD and found myself clueless as soon as the first equation appeared. No worries. Taking a step back, I found out that SVD is a factorization technique within a branch of mathematics called Linear Algebra.

Linear Algebra it shall be.

My daily train commutes, where possible, have found me following the MIT lectures of one sufficiently old and distinguished Gilbert Strang teaching Introduction to Linear Algebra. The lecturer I never had.

I’m also having a go at working through the course textbook Introduction to Linear Algebra and doing the chapter-end problem sets. So far, I’ve started to grasp vector arithmetic, and I have a cursory idea of how to compute with matrices. As a bonus, the concept of “singular, non-invertible matrices” has emerged in Lecture 3.
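
To make the “singular” idea concrete for myself, here is a minimal sketch in plain Python for the 2x2 case only. The helper names det2 and inverse2 are my own, not from the lectures or the textbook, and the two example matrices are made up for illustration. A 2x2 matrix is singular exactly when its determinant ad - bc is zero, in which case no inverse exists.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Return the inverse of a 2x2 matrix, or None if the matrix is singular."""
    det = det2(m)
    if det == 0:
        # Singular: the rows are multiples of each other, so there is no inverse.
        return None
    (a, b), (c, d) = m
    # Standard 2x2 formula: swap the diagonal, negate the off-diagonal, divide by det.
    return [
        [Fraction(d, det), Fraction(-b, det)],
        [Fraction(-c, det), Fraction(a, det)],
    ]


invertible = [[2, 1], [1, 1]]  # determinant is 2*1 - 1*1 = 1
singular = [[1, 2], [2, 4]]    # second row is twice the first, determinant is 0

print(inverse2(invertible))
print(inverse2(singular))
```

The second matrix squashes the whole plane onto a single line, which is why there is no way to undo it.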

Hopefully I’ll have this SVD thing down pat in due time.

A cogent case for polyglotism in programming

When a dog owner wants to train his dog, the procedure is well-known and quite simple. The owner runs two loops: one of positive feedback and one of negative ditto. Whenever the dog does something right, the positive feedback loop is invoked and the dog is treated with a snack. Whenever the dog does something wrong, the dog is scolded and the negative feedback loop is used.

The result is positive and negative reinforcement of the dog’s behavior. The dog will over time automatically behave as the owner wants and never even think of misbehaving.

When a programming language trains its leashed programmer, it likewise uses positive and negative feedback. Whenever a problem is easily solvable in the constructs and features of said language, it reinforces the use of those features and constructs. And also in the same vein, if something is hard to do in the language, the programmer will shy away from thinking the idea, since it may be too hard to do in the language. Another negative feedback loop is when resource usage of a program is bad. Either it will use too much memory or too many CPU resources to carry out its work. This discourages the programmer from using that solution again.

The important point is that while all practical general purpose languages are Turing complete, the way they train programmers to behave as they want is quite different. In an Object Oriented language for instance, the programmer is trained to reframe most – if not all – questions as objects of bundled fields and methods. A functional programmer is trained to envision programs as transformations of data from the form X into the form Y through the application of a function. And so on.

(excerpt from One major difference – ZeroMQ and Erlang by Jesper Louis Andersen)

While I wouldn’t readily admit to the pertinence of comparing a programming language to a canine owner in the way they interact with their primary subjects, the principle of positive and negative feedback loops extends beyond programming languages to the IDEs and platforms that we develop on. Beyond the industry, it is embedded in every tool and device we come in contact with, chipping away at every facet of daily life as we know it.

This is why I’ve spent significant pockets of my two-week holiday wrapping my mind around the LISP-y Clojure and forcing myself to do it in Emacs (non-trivial, coming from many years of vimming). From the point of view of the final product, I’m hard pressed to point out any significant differences, i.e. it doesn’t matter what editor you use, or what language you code in; but from an educational and developmental angle, the benefits are many and immense.

Nothing is like software

It’s not the first time this has happened: I get into a conversation about the work that I do, concepts in designing and building software, methodologies that the industry lives and dies by, then it comes like a whistling blow dart flying through the air:

“<insert skill or discipline or knowledge> is just like software”

Examples: “Cooking is just like software”, “Art History is just like software”, “Driving is just like software”, “Organizing a party is just like software”.

Let me catch you right there, before you think I’m some arrogant bit-pusher who claims the subject of his line of work is incomparably superior to any other discipline. It is quite the opposite. Notice that the phrase was,

“<thing> is just like software”

not

“Software is just like <thing>”

To put it in more concrete form, it is the difference between

“A human is just like a mannequin”

and

“A mannequin is just like a human”

See, the point I am trying to make is that software is a lump of plasticine. It has no form of its own, no identity, no agenda and no bias. Software is what people use when they try to teach silicon wafers how to add numbers up, or coordinate a taxi fleet, or try to understand human behavior – the great codifier of all things. It is puzzling to me when someone claims to derive anything original from a craft whose ultimate goal is mimicry.

So please, if you are looking for something original, and the glint of software catches your eye, don’t stop there. Instead, look through software, and find the thing that inspired it.

You owe me an icecream

We’ve been running TDD at work over the last 5 weeks or so since we started on this new e-commerce platform that we’re building for the business.

Five weeks in, our unit and functional tests have started to take more than 5 minutes to run, and this is after tweaks by yours truly, such as running the tests against MySQL MEMORY tables instead of InnoDB tables. That’s 405 tests and just over 1800 assertions, consuming upward of 1.2GB of memory (bad PHP… bad bad PHP…).

Because there’s a team of 5 of us actively working on the same codebase, performing a complete git pull potentially breaks our local build/test cycles for various reasons. When that happens, the natural reaction is to point a finger at someone. Given 5 people in the team (including oneself), one has a high 4 in 5 probability of nailing the wrong person. If it turns out that the accuser was at fault, what ensues is a 100% chance of public humiliation.
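
For the sceptics, the 4 in 5 figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch under two assumptions of mine: the culprit is equally likely to be any of the 5 team members (the accuser included), and the accuser picks one of the other 4 at random. The variable names are made up for illustration.

```python
from fractions import Fraction

team_size = 5
others = team_size - 1  # everyone except the accuser

# Assumption: the culprit is equally likely to be any of the 5 people.
p_culprit_is_someone_else = Fraction(others, team_size)

# Assumption: the accuser blames one of the other 4 uniformly at random.
# The accusation is right only if the culprit is someone else AND the guess hits them.
p_accuse_right = p_culprit_is_someone_else * Fraction(1, others)
p_accuse_wrong = 1 - p_accuse_right

print(p_accuse_right)
print(p_accuse_wrong)
```

Under those assumptions the accusation lands on the right person 1 time in 5, which leaves the other 4 in 5 for awkward apologies.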

So to soften the blow, I came up with the concept of owing an icecream. For example, when a colleague wrongly accuses you of breaking the build only to find that the cause was him leaving something out, you’d say “now you owe me an icecream”.

So far it’s worked very well to express a combination of “it’s not that big a deal, we all make mistakes” and “hey, you dumba**, don’t blame me for your incompetence” in a ratio that the receiver can tweak to taste.

“Icecream” evokes fun, a carefree mood and light-heartedness. “Owe” bears the seriousness of the matter, and reeks of mortgage and interest rates. So the phrase becomes the conversational equivalent of gunning down someone with a NERF gun or pillow-whacking someone over the head.

Symfony2 ContainerAware Callback Validation

Today I had a big win at work.

We’re building an SOA platform for the business, so different bits of data end up sitting on different services, contactable only via a RESTful API.

One of the roadblocks we hit was validating some Service B data entered into Service A. Symfony2’s built-in validators do a great job when everything is available on the same box, in the same database, but fall completely flat when they need to validate across a service boundary.

So I figured out how to inject a DI container into a Callback validator on Service A, and proceeded to summon some other validator services that would query the existence of a particular resource entity residing on Service B via HTTP.

It’s late now, but I’ll share the code when I get a little more time. Leave a comment if you’re really desperate to find out how it was done.