My areas of focus are building and running computing infrastructure for bioinformatics research and writing software tools that let users scale an analysis from one core to hundreds of cores. Our computing environment consists mostly of Linux (CentOS) servers networked to operate as a single cluster using the Rocks cluster management toolkit (http://www.rocksclusters.org/). The resource manager/scheduler combination is Torque/Maui. Much of the software that I write for this system consists of pipelines that split, distribute, and merge the analysis of large genomic data sets across the cluster. These are essentially wrappers for common third-party bioinformatics tools (e.g. BLAST, InterProScan, HMMER, BWA, Velvet) and in-house custom software, and they take advantage of the embarrassingly parallel nature of many problems in genome research. My specific research interests are the quality processing, alignment, and assembly of next-generation sequencing data, with a focus on color space sequencing technology (SOLiD).
The languages I use include Python, shell, AWK, C, R, and Perl. Version control is handled with Subversion and Git (with gitolite).
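
As a rough illustration of the split, distribute, and merge pattern described above, here is a minimal Python sketch. It is not the actual pipeline code: the function names (`split_fasta`, `analyze_chunk`, `run_pipeline`) are hypothetical, the per-chunk GC-content calculation stands in for a real tool such as BLAST, and a local thread pool stands in for submitting jobs to the Torque/Maui scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def split_fasta(text, n_chunks):
    """Split FASTA text into at most n_chunks lists of (header, sequence)."""
    records = []
    for block in text.split(">")[1:]:
        header, _, body = block.partition("\n")
        # Join wrapped sequence lines into one string.
        records.append((header.strip(), body.replace("\n", "").strip()))
    # Deal records out round-robin so chunks are similar in size.
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    return [chunk for chunk in chunks if chunk]


def analyze_chunk(chunk):
    """Hypothetical per-chunk analysis: GC fraction of each sequence."""
    result = {}
    for header, seq in chunk:
        gc = sum(seq.upper().count(base) for base in "GC")
        result[header] = gc / len(seq) if seq else 0.0
    return result


def run_pipeline(text, n_workers=4):
    """Split the input, analyze chunks concurrently, and merge the results."""
    chunks = split_fasta(text, n_workers)
    # Assumption: a thread pool is only a local stand-in for cluster jobs.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(analyze_chunk, chunks))
    merged = {}
    for partial in partials:
        merged.update(partial)
    return merged


if __name__ == "__main__":
    fasta = ">a\nGGCC\n>b\nATAT\n>c\nGCAT\nAT\n"
    print(run_pipeline(fasta, n_workers=2))
```

Because each chunk is analyzed independently, the same structure applies whether the work runs on one core or is spread across many cluster nodes; only the distribution step changes.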