Back to basics with Linear Algebra

At work, we’ve come into possession of a huge volume of data – data that is just begging to be sliced and diced, with the promise of unveiling secrets that remain unknown to us for now.

My brief fling with Machine Learning, along with conversations with a respected colleague, led me to explore Singular Value Decomposition (SVD). This technique supposedly played a significant role in helping team “BellKor’s Pragmatic Chaos” take home the Netflix Prize.

So I started at the Wikipedia page for SVD and found myself clueless as soon as the first equation appeared. No worries. Taking a step back, I found out that SVD is a factorization technique within a branch of mathematics called Linear Algebra.
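For reference (and as best I can tell at this early stage), the equation that threw me is the factorization itself: any real m × n matrix A can be written as

    A = U Σ V^T

where U and V are orthogonal matrices and Σ is a diagonal matrix whose entries are the non-negative “singular values” of A.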

Linear Algebra it shall be.

My daily train commutes, where possible, have found me following the MIT lectures of one sufficiently old and distinguished Gilbert Strang teaching Introduction to Linear Algebra. The lecturer I never had.

I’m also having a go at working through the course textbook, Introduction to Linear Algebra, and doing the chapter-end problem sets. So far, I’ve started to grasp vector arithmetic and have a cursory idea of matrix computation. As a bonus, the concept of “singular, non-invertible matrices” emerged in Lecture 3.
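(A toy example of that last concept, of my own making rather than from the lectures: a matrix whose second row is just twice its first has determinant zero, and so has no inverse.)

    | 1 2 |
    | 2 4 |        det = 1*4 - 2*2 = 0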

Hopefully I’ll have this SVD thing down pat in due time.

A cogent case for polyglotism in programming

When a dog owner wants to train his dog, the procedure is well-known and quite simple. The owner runs two loops: one of positive feedback and one of negative ditto. Whenever the dog does something right, the positive feedback loop is invoked and the dog is treated with a snack. Whenever the dog does something wrong, the dog is scolded and the negative feedback loop is used.

The result is positive and negative reinforcement of the dog’s behavior. The dog will, over time, automatically behave as the owner wants and never even think of misbehaving.

When a programming language trains its leashed programmer, it likewise uses positive and negative feedback. Whenever a problem is easily solvable in the constructs and features of said language, it reinforces the use of those features and constructs. In the same vein, if something is hard to do in the language, the programmer will shy away from even thinking of the idea, since it may be too hard to do in the language. Another negative feedback loop is when the resource usage of a program is bad: either it uses too much memory or too many CPU resources to carry out its work. This discourages the programmer from using that solution again.

The important point is that while all practical general purpose languages are Turing complete, the way they train programmers to behave as they want is quite different. In an Object Oriented language for instance, the programmer is trained to reframe most – if not all – questions as objects of bundled fields and methods. A functional programmer is trained to envision programs as transformations of data from the form X into the form Y through the application of a function. And so on.

(excerpt from One major difference – ZeroMQ and Erlang by Jesper Louis Andersen)

While I wouldn’t readily admit to the pertinence of comparing a programming language to a canine owner in the way they interact with their primary subjects, the principle of positive and negative feedback loops extends beyond programming languages to the IDEs and platforms that we develop on. Beyond the industry, it is embedded in every tool and device we come in contact with, chipping away at every facet of daily life as we know it.

This is why I’ve spent significant pockets of my two-week holiday wrapping my mind around the LISP-y Clojure and forcing myself to do it in Emacs (non-trivial, coming from many years of vimming). From the point of view of the final product, I’m hard-pressed to point out any significant differences, i.e. it doesn’t matter what editor you use or what language you code in; but from an educational and developmental angle, the benefits are immense.


Six hundred thousand

I boarded the Flinders Street train heading back to the city at the end of another work week. Sitting right across from my usual spot on the train was a young girl, no more than 7 or 8 years old. She was holding up her toddler brother while her mother spooned what appeared to be the final teaspoonfuls of pumpkin puree into his mouth.

The daughter was very chatty. The whole time, the two of them were discussing a game that both were playing. It involved breeding dragons, hatching eggs and collecting dragoncash, treats and gems (after a bit of googling, I’m led to believe that the game in question is DragonVale).

On the topic of the resources they’d stockpiled so far, the daughter mentioned that she had accumulated more than 600,000 treats and was consulting her mother as to what she could best use them for.

Six hundred thousand.

This got me thinking. When I was her age, I didn’t have any tangible concept of enumeration beyond perhaps a thousand. Granted, there was a vague awareness of millions, billions and googolplexes (which likely only emerged in my consciousness when I was 11 or 12 years old), but even those were just words we used in bouts of “who can say the bigger number”. I doubt I could’ve strung number scales together (e.g. “hundred thousand”, “ten million”), and I certainly haven’t painstakingly amassed 600,000 of anything, even to this day.

I wonder what such an awareness does to a person, and how it will go on to shape the way one interprets the world around them.

Learning by reimplementing

I’ve been trying to pick up Clojure. My key motivating factor is the hope of one day being able to appreciate the gravity of the LISP family of languages, unveil the power behind S-expressions, and dabble in the somewhat black art of concurrency and functional programming.

Thus far, it’s taken me the first half of three books on Clojure to get to the point where I’m relatively comfortable with the syntax and some of the idioms of the language. But because of its terseness, Clojure doesn’t afford a newbie very much “typing time” to digest and soak in the language. In Clojure, you open your first paren, and before you know it, you’ve executed the code in the REPL successfully – no mistakes to learn from, no cryptic error messages to keep in mind.
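A trivial example of what I mean: a complete, working computation is a single form typed straight into the REPL.

    ;; sum of the squares of 1 to 10
    user=> (reduce + (map #(* % %) (range 1 11)))
    385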

I’ve recently picked up a book called Programming Collective Intelligence. It’s a book about data mining, machine learning and other meaty AI stuff, with all its code samples in Python.

So what I’ve been doing as my latest stage of Clojure learning is going through the book, digesting the Python code samples, and reimplementing them in Clojure.

Here’s an example of a snippet in Python:
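(The snippet below is a representative sketch of my own, not the exact one: a Euclidean-distance similarity score of the kind the book opens with. The sim_distance name and the shape of the prefs dictionary are my assumptions.)

    from math import sqrt

    # Illustrative sketch: Euclidean-distance similarity between two people's ratings.
    # prefs maps each person to a dict of {item: rating}.
    def sim_distance(prefs, person1, person2):
        shared = [item for item in prefs[person1] if item in prefs[person2]]
        if not shared:
            return 0
        sum_of_squares = sum(
            (prefs[person1][item] - prefs[person2][item]) ** 2
            for item in shared
        )
        return 1 / (1 + sqrt(sum_of_squares))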

And my version in Clojure:
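(Again a sketch, translating the same idea with the same assumed shape for prefs: a map of person to a map of item to rating.)

    ;; Illustrative sketch: Euclidean-distance similarity between two people's ratings.
    ;; prefs maps each person to a map of item -> rating.
    (defn sim-distance [prefs person1 person2]
      (let [r1     (prefs person1)
            r2     (prefs person2)
            shared (filter r2 (keys r1))]
        (if (empty? shared)
          0
          (/ 1 (+ 1 (Math/sqrt
                      (reduce + (map #(let [d (- (r1 %) (r2 %))] (* d d))
                                     shared))))))))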

I’ve only done about three snippets so far, but I feel like I’ve learned a lot, because I haven’t simply copied code out of a book (which hasn’t really worked, for the reasons stated above), nor have I had to “come up” with things to implement. By reimplementing code from another language in Clojure, I get to practice the syntax while copying the algorithms.