As a Bioinformatics Research Consultant for the Lab, I'm responsible for making sure that our members have the tools they need to make sense of their data. Over the course of my first year here in the lab, I've had the privilege to:
- teach a weekly one-hour "Data Management, Analysis, and Visualization" class using the R statistical programming language.
- become intimately familiar with the bioinformatics inner workings of our Applied Biosystems SOLiD sequencing machine.
- assess the lab's data archival needs and help our system administrator design appropriate directory structures and pipelines. While constructing our 454 cleanup pipeline, we've become involved with the leading edge of the BioPython and BioSQL software development communities.
- work closely with lab members to help them answer their research questions using prepackaged bioinformatics toolkits as well as our own R and Python scripts.
- work in-house and in collaboration with Armbrusters, other CEG labs, and UW e-science to design and build a cyberinfrastructure solution for our lab. Our current database houses SOLiD library preparation and run information as well as results from our in-house BLASTs (on our 256-CPU Linux cluster). This database also serves as a backend for our AnnoJ genome browser installation, allowing high-speed, AJAX-powered interactivity with multiple tracks of experimental data ranging from microarrays and ESTs to gene models and SOLiD alignment coverage levels.
- develop automated tools for scrubbing, subsetting, population tracking, and plankton density visualization for the lab's continuous seawater flow cytometry device.
- construct web interfaces (in Django and PHP) to enable user-friendly access to the aforementioned tools.
- use the lab's 61-megapixel visualization wall for teaching class, presenting at lab meetings, and displaying high-resolution data in formats ranging from PDFs of R plots to interactive whole-organism metabolic pathway diagrams with programs like BioCyc.