Friday, June 12, 2009

Bird Tracks Through Math Land: Basic Matrix Ops

Math scares me. I hate to say it but it's true. I write software for super smart researchers/scientists who are solving all kinds of complex mathematical problems in computational biology and this discomfort with math has become a limiting factor in how much I can contribute to my group. So, it's time for me to fix the problem.

I don't learn well from just reading books on math. I have tried it. I always end up feeling like I've learned a few rules but I don't really "get" it. I need to have a sense for what the small pieces are and how they all fit together to form a larger system of high level abstractions/components. That's how I build software and it's the only way I am comfortable learning technical concepts. That's what got me thinking about "Bird tracks through math land". The best way for me to learn these concepts is to implement them! If you build something you have to understand it right?

I am going to do this is through a series of Literate Haskell posts. The idea is to take it slow, and explain things as fully as possible. In the end I would like to have a series of math tutorials/notes that also happen to be working code (there is nothing like a compiler to keep you honest!). My hope is that these notes will be almost completely self contained and should not require much more math background than what you get in high school. You will have to be able to read the Haskell source code though (I very highly recommend Learn You a Haskell for anyone new to Haskell).

A bit of background for anyone who is not familiar with Literate Haskell:

  • Haskell is a purely functional programming language. That means that the output of a function is completely determined by it's inputs (no cheating with hidden state). This aligns closely with the mathematical definition of a function, but is very different from main stream languages like C, Java ...
  • Haskell's syntax is beautiful and extremely concise. It is nearly as expressive as writing pseudocode which makes it perfect for this kind of project.
  • You're reading Literate Haskell source right now... Seriously! I wouldn't lie to you. Literate Haskell treats every line that does not start with a '>' (AKA bird track) as a comment. Quoting Dr. Knuth, who conceived Literate Programming "The main idea is to regard a program as a communication to human beings rather than as a set of instructions to a computer." Which is exactly what I am trying to achieve.

The road map for "Bird tracks through math land" is still up in the air, but for now I plan on starting with Linear Algebra and working my way toward its applications in probability and statistics. I am also very interested in Bayesian Networks so I think we'll head down that path later.

OK, enough blah blah. Lets get to it. We'll start by defining the basic matrix operations.

> module Math.BTTML.BasicMatrixOps (
>           dotProd,
>           matAdd,
>           matSub,
>           matMult,
>           eqZipWith,
>           matZipWith) where
>
> import Data.List

Let's define our own version of Haskell's zipWith function. eqZipWith applies the function f to each corresponding pair of elements in the given lists. It is just like Haskell's version, except that eqZipWith requires that the list lengths match.

> eqZipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
> eqZipWith f (x:xs) (y:ys) = (f x y):(eqZipWith f xs ys)
> eqZipWith f [] [] = []
> eqZipWith _ _ _ = error "List sizes should match"

Now let's create a matrix zip function. It applies a function f to each pair of corresponding elements from the two given matrices.

> matZipWith :: (a -> b -> c) -> [[a]] -> [[b]] -> [[c]]
> matZipWith f matrix1 matrix2 = eqZipWith (eqZipWith f) matrix1 matrix2

Subtract matrix1 from matrix2. This is an element-by-element operation.

> matSub :: (Num a) => [[a]] -> [[a]] -> [[a]]
> matSub = matZipWith (-)

Matrix addition works pretty much the same way

> matAdd :: (Num a) => [[a]] -> [[a]] -> [[a]]
> matAdd = matZipWith (+)

The dot product (AKA scalar product) of two vectors is the sum of the product of each pair of elements. The vectors have to be the same length.

> dotProd :: (Num a) => [a] -> [a] -> a
> dotProd vector1 vector2 = sum $ eqZipWith (*) vector1 vector2

The wikipedia page on matrix multiplication is great: http://en.wikipedia.org/wiki/Matrix_multiplication

The most important points are:

  • The row count of the resulting matrix is determined by the row count of matrix1
  • The column count of the resulting matrix is determined by the column count of matrix2
  • element[i, j] of the resulting matrix is calculated by taking the dot product of the ith row vector from matrix1 and the jth column from matrix2. Also note that because these two vectors must be the same length, the column count of matrix1 must equal the row count of matrix2.

In order to simplify the matrix multiplication we will transpose matrix2 (moves element[i,j] to element[j,i]) which makes it trivial to pull out column vectors

> matMult :: (Num a) => [[a]] -> [[a]] -> [[a]]
> matMult matrix1 matrix2 =
>   [[row1 `dotProd` col2 | col2 <- transpose matrix2] | row1 <- matrix1]

Now here are a couple of tests:

> testMat1 =
>   [[2, 0, -1, 1],
>    [1, 2, 0, 1]]
> testMat2 =
>   [[1, 5, -7],
>    [1, 1, 0],
>    [0, -1, 1],
>    [2, 0, 0]]
> testAdd = testMat1 `matAdd` testMat1
> testSub = testAdd `matSub` testMat1
> testMult = testMat1 `matMult` testMat2

Next time we'll solve linear equations with gaussian elimination

Updates: Applied CSS colors and made some code changes suggested by Conal

12 comments:

  1. Yo Keith, I think you should pluralize math like the Brits (e.g., Maths scares me.) They way I had it explained to me was mathematics is plural (the singular form, mathematic is obsolete), so anywhere you use math in place of mathematics it would be more correct to say maths.
    ReplyDelete
  2. Many of your definitions can be shorted by thinking/writing at the function level. For instance, you can replace

    > matAdd matrix1 matrix2 = matZipWith (+) matrix1 matrix2

    with

    > matAdd = matZipWith (+)

    As another example, change

    > matZipWith f matrix1 matrix2 = eqZipWith (eqZipWith f) matrix1 matrix2

    to

    > matZipWith f = eqZipWith (eqZipWith f)

    then to

    > matZipWith f = (eqZipWith . eqZipWith) f

    and finally to

    > matZipWith = eqZipWith . eqZipWith

    It will probably take some practice to develop function-level intuition. One trick that may help is to add parentheses in the type of eqZipWith:

    > eqZipWith :: (a -> b -> c) -> ([a] -> [b] -> [c])

    Now you can see that eqZipWith maps a binary function to another binary function (both curried) and therefore can be composed with itself any number of times.
    ReplyDelete
  3. Check out the hlint tool - it can give quite nice hints on code improvement, and would have suggested at least some of the above. The hints at what's possible can give you directions for deepening your insight. :-)
    ReplyDelete
  4. Hey, thanks to both of you for the tips. I can't believe you left out matMult. That's the ugliest one to my eye. I was thinking I need to redo that one with maps or something...

    I'm going to use your rewrite of matAdd/matSub. I like the curried versions. The illustration you gave was helpful but composition with functions that have > 1 argument is still kind of a reach for me.
    ReplyDelete
  5. For multi-argument composition and other generalizations, see http://conal.net/blog/posts/semantic-editor-combinators/. For instance, your dotProd def has a lovely little point-free form.

    matMult can be put in point-free form (as can everything), but in playing a bit, I didn't see a very pretty version.

    Expanding out your inner defs and tweaking names, I get

    matrix1 `matMult` matrix2 = [[row1 `dotProd` col2 | col2 <- transpose matrix2] | row <- matrix1]

    which I like. How about you?

    Is there a way to format code in your blog's comments?
    ReplyDelete
  6. yes, that reads much more clearly. Blogger hosts this blog and I don't know of any way to change the formatting in comments :-(
    ReplyDelete
  7. Here's a cleaned up multiply off the top of my head and 30s of scribbling. You have to write a 'transpose' function to use it.

    m1 `mul` m2 = transpose $ map (\q -> map (q`innerProd`) $ transpose m2) m1

    But there's an important distinction to be made here. Matrix operations are well and good, but it's important to know the underlying concept of linear maps. A linear map is a function from one vector space to another. When you put a coordinate system on each vector space, then you can construct a tuple of coordinates for each vector in each space. Every linear map between the two spaces has a corresponding matrix for the particular coordinate systems you use.

    Why is this important to care about? Partially because it really clarifies your thinking when you're working, since you stop worrying about rows vs columns and just think about functions from here to there. But also because it makes it more natural to express symmetries of the vector space as constraints on the form of the matrix entries.

    So you might amuse yourself by writing a set of functions for linear maps, projections onto subspaces, direct products, and a function that takes a map and a basis and produces a matrix.
    ReplyDelete
  8. That sounds like a good suggestion since this exercise is about building an intuition for what's "really" going on. I'll try exploring linear maps after the next post and see what I can come up with. Thanks!
    ReplyDelete
  9. Great. You have some fascinating posts. I have to compliment you on the writing in "What is a derivative, really?". The style is very clear and you do a good job of building intuition with examples (even so I will have to read some of it more than once for it to really sink in ;-)
    ReplyDelete
  10. Ever heard of dyscalculia? "Math dyslexia". Check out http://dyscalculiaforum.com
    ReplyDelete
  11. I hadn't, so thanks for the link. The way I see things is that we all have strengths and limitations and so all we can do is understand those and work with them the best way we can. I looked at the site and I think about 1/2 of the signs sound like me but the other 1/2 ... not so much (I can't split a check, but I was actually pretty good at high school physics. I think that's because the math in physics has a natural intuition built into the math)
    ReplyDelete