Sunday, May 31, 2009

TxtSushi 0.3.0

I just released TxtSushi 0.3.0, which adds:
  • Improved SQL parsing and error reporting
  • GROUP BY ... HAVING support with several aggregate functions (see home page)
  • A little utility, "namecolumns", which adds sequential column names to a table. This is useful if your flat file doesn't have a header row with column names.

Sunday, May 24, 2009

Haskell & music

This guy created a pretty cool command-line interface for making music with text: http://yaxu.org/haskell-hack

Thursday, May 21, 2009

TxtSushi 0.2

I just released TxtSushi 0.2 with the following updates:
  1. Improved type coercion. Some of the rules I was using before did not make sense. At some point I will write down what the rules are.
  2. Added some extra functions/operators, including a regex matcher. Here is the full list: SUBSTRING, UPPER, LOWER, TRIM, *, /, +, - (binary and unary), =, <> (not-equal test), <, <=, >, >=, AND, OR, || (string concatenation), =~ (regex matching)

Saturday, May 16, 2009

Yay! TxtSushi 0.1

I just released my first Haskell application, called TxtSushi. It's mainly useful for processing comma-delimited tables with SQL SELECT statements, and it also includes some small conversion and formatting utilities. Here's an example that I just tried out with real data and ... it works!
wget -q -O - ftp://ftp.informatics.jax.org/pub/reports/MRK_List2.rpt | tabtocsv - \
| tssql -table mgi - \
'select `MGI Accession ID`, Symbol, Chr, trim(`cM Position`)
from mgi where (Chr = 1 or Chr = 8 or Chr = 19) and trim(`cM Position`) = "N/A"
order by Chr+0, Symbol' \
| csvtopretty -
Which gives you:
MGI Accession ID|Symbol                     |Chr|TRIM(cM Position)
MGI:3829209     |100039643                  |1  |N/A
MGI:3823247     |100042382                  |1  |N/A
MGI:3826364     |665246                     |1  |N/A
MGI:3828086     |667118                     |1  |N/A
MGI:3829949     |Bmd5a                      |1  |N/A
MGI:3829954     |Bmd5b                      |1  |N/A
MGI:3829959     |Bmd5c                      |1  |N/A
MGI:3762525     |Drinkcacl24                |1  |N/A
MGI:3762526     |Drinkmgcl21                |1  |N/A
MGI:3762535     |Drinkmgcl24                |1  |N/A
MGI:3762388     |Drinksac5                  |1  |N/A
MGI:3836959     |Mir1927                    |1  |N/A
MGI:3836960     |Mir1928                    |1  |N/A
MGI:3837225     |Mir1981                    |1  |N/A
MGI:98018       |OTTMUSG00000002279         |1  |N/A
MGI:3840135     |OTTMUSG00000020948         |1  |N/A
MGI:3834078     |OTTMUSG00000026591         |1  |N/A
MGI:3826770     |Qrr1                       |1  |N/A
MGI:3826773     |Qrr1d                      |1  |N/A
MGI:3826772     |Qrr1p                      |1  |N/A
MGI:3844119     |Sfp1                       |1  |N/A
MGI:3832320     |T(1E2.1;8B1.2)2Lub         |1  |N/A
MGI:3843694     |Tg(tetO-Chrnb2*V287L)H3Gica|1  |N/A
MGI:3720916     |Tgq9                       |1  |N/A
MGI:3640786     |lrm1                       |1  |N/A
MGI:3822907     |384645                     |8  |N/A
MGI:1924337     |Ankrd11                    |8  |N/A
MGI:3844136     |Arrh1                      |8  |N/A
MGI:3705791     |Defa-ps3                   |8  |N/A
MGI:3837211     |Mir1966                    |8  |N/A
MGI:3837213     |Mir1967                    |8  |N/A
MGI:3837215     |Mir1968                    |8  |N/A
MGI:3837216     |Mir1969                    |8  |N/A
MGI:3833469     |OTTMUSG00000016477         |8  |N/A
MGI:3833836     |OTTMUSG00000031120         |8  |N/A
MGI:3844123     |Sfp3                       |8  |N/A
MGI:3628904     |T(Tp(1E2.1);8B1.2)2Lub     |8  |N/A
MGI:3720925     |Tgq18                      |8  |N/A
MGI:3640782     |gpg6                       |8  |N/A
MGI:88396       |Chrm1                      |19 |N/A
MGI:3762554     |Drinkqhcl2                 |19 |N/A
MGI:3762516     |Drinksac2                  |19 |N/A
MGI:3720096     |Hdlq59                     |19 |N/A
MGI:3837023     |Mir1950                    |19 |N/A
MGI:1914960     |Polr2g                     |19 |N/A
MGI:3843453     |Prdt5                      |19 |N/A
MGI:3828068     |Tgq29                      |19 |N/A
I created this because it will be useful in my own work (flat files are just about everywhere you turn in bioinformatics), but I'm really hoping that it will be generally useful to other people too.

Wednesday, May 6, 2009

Happy to be in Maine

I moved to Maine over a year ago and it is a great place to live for a lot of reasons, but today it's an even better place to live. The governor just signed a bill that makes gay marriage legal! It is also really encouraging to know that this change is taking place within the elected branches of government, which shows that we are seeing a popular shift. It looks like opponents of gay marriage are trying to organize a "people's veto", so we have to keep pushing forward, whether that just means talking to friends and family to explain how you feel or possibly volunteering time or money to organizations like EqualityMaine. Also, those of us lucky enough to see things changing in our own states need to keep doing what we can to change things for gay and lesbian families in the rest of the country.

Saturday, May 2, 2009

Learning about monads: some suggestions

So, I'm relatively new to Haskell, and I have to say that monads were a part of the language that took me a while to get comfortable with. This post is not a tutorial... there are already many monad tutorials out on the web that do a better job than I would. Instead this post contains a few of my "lessons learned" about how to best use your time when you are learning monads.
  1. My biggest suggestion is that you understand that there is nothing "special" about monad types. In my initial attempts to learn monads a lot of the material I read left me with the impression that monads somehow play by different rules than the rest of the Haskell language. The monad class is a plain old Haskell class that just happens to define an interface that is very useful for implementing a particular design pattern that shows up often with things like IO and error propagation.
  2. Don't expect to fully understand monads until you feel comfortable with higher-order functions (functions that take functions as arguments) and Haskell's type system (classes). The Learn You a Haskell tutorial is a lot of fun and provides a good introduction to these prerequisites.
  3. The "do" notation is useful for formatting reasons, but it can also reinforce the mistaken impression that monads are somehow special. I recommend that you start learning how to use the >> and >>= functions for monads sooner rather than later so that you will see how monads work in a way that is consistent with the rest of Haskell.
  4. I also think it's useful to read some opinions that are critical of monad tutorials so that you can make better decisions about how to spend your time. See: Brent's post on monad tutorials and a blog post on why monad tutorials are awful.
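
To make suggestions 1 and 3 a bit more concrete, here is a small sketch using the Maybe monad. The names safeDiv, calcDo, calcBind, bindMaybe and calcPlain are made up for this illustration and don't come from any library. The first two calc functions are the same computation written with "do" notation and with >>=. The last one replaces >>= with an ordinary hand-written function, which is roughly what the Monad instance for Maybe does.

```haskell
-- Hypothetical helper: integer division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Two divisions chained with "do" notation.
calcDo :: Int -> Int -> Int -> Maybe Int
calcDo a b c = do
  q <- safeDiv a b
  r <- safeDiv q c
  return (q + r)

-- The same computation with explicit >>= and lambdas.
calcBind :: Int -> Int -> Int -> Maybe Int
calcBind a b c =
  safeDiv a b >>= \q ->
  safeDiv q c >>= \r ->
  return (q + r)

-- What >>= does for Maybe, written as a plain function.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing
bindMaybe (Just x) f = f x

-- The same computation again, with no Monad class involved at all.
calcPlain :: Int -> Int -> Int -> Maybe Int
calcPlain a b c =
  safeDiv a b `bindMaybe` \q ->
  safeDiv q c `bindMaybe` \r ->
  Just (q + r)
```

If you load this into GHCi, all three functions should agree on every input: a zero divisor at either step gives Nothing, and otherwise you get the sum of the two quotients wrapped in Just. Seeing calcPlain work without any monad machinery is what finally made the "nothing special" point click for me.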