A long-standing tradition in scientific research is to keep detailed notes on everything as it happens. This studious attention to detail not only makes analysis and paper writing much easier, but also serves as a record of exactly how an experiment was performed should it need to be repeated in the future. By looking through a lab notebook, an experiment can be repeated exactly, and results can be verified.
Figure 1: A laboratory notebook used to record experiment setup, observations, ideas, data, and analysis results. Laboratory notebooks are permanent records of the events that transpired during an experiment, an experimenter's thoughts and observations during an experiment, and the experimental results. These records are an invaluable resource when communicating research, and are often a legally binding record of research that was conducted.
Although I have since moved away from the laboratory, I still keep detailed records of my work in a series of markdown files detailing the steps taken as I perform data analyses and develop software. Over the last few months, however, I've found that the volume of my notes has grown too large to simply grep for keywords.
To make it easier for me to find project- or task-specific development notes, I developed a Rust-based CLI tool called Rememberall, which uses term frequency and Bayesian inference to retrieve documents relevant to a query of keywords.
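As a rough illustration of the idea (this is not Rememberall's actual code, which is written in Rust), ranking notes by term frequency with Bayesian inference can be sketched as a multinomial naive Bayes scorer. The function names, the uniform prior over documents, and the add-one smoothing constant `alpha` are all assumptions made for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def rank(docs, query, alpha=1.0):
    """Rank document names by log P(query | doc), assuming a uniform prior.

    docs  -- dict mapping a (hypothetical) file name to its text
    query -- string of keywords
    alpha -- additive smoothing so unseen terms do not zero out a score
    """
    counts = {name: Counter(tokenize(body)) for name, body in docs.items()}
    terms = tokenize(query)

    # Vocabulary covers every document term plus the query terms.
    vocab = set(terms)
    for c in counts.values():
        vocab.update(c)
    v = len(vocab)

    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        # Sum of log smoothed term frequencies for each query term.
        scores[name] = sum(
            math.log((c[t] + alpha) / (total + alpha * v)) for t in terms
        )
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    notes = {
        "rust.md": "cargo build failed borrow checker lifetime error",
        "stats.md": "fit a regression model and plot residuals",
    }
    print(rank(notes, "borrow checker"))
```

With a uniform prior, the posterior ordering depends only on how often the query terms appear in each note relative to its length, which is why longer notes do not automatically win.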
It is a common theme in non-linear modeling that small perturbations in initial conditions can result in massive deviations in the outcome of a simulation. In 1969, Nobel laureate Thomas Schelling explored how even small biases can have large sociological effects. In his paper "Models of segregation", Schelling described how a model in which individuals prefer that their neighbors be of a specific mixture can lead to total segregation regardless of intent.
Figure 1: A 500x500 Schelling segregation model with 3 races. This simulation was conducted with a maximum minority threshold of 0.2 for 1000 ticks.
Using this concept as a starting point, I developed a Java-based application to simulate a Schelling segregation model for an arbitrary number of races (n>=1).
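For readers unfamiliar with the mechanics, one tick of a Schelling-style model can be sketched as follows. This is a minimal Python sketch and not the Java application described above: the update rule (an agent moves to a random empty cell when the fraction of like neighbors falls below `threshold`), the 8-cell neighborhood, and all names are assumptions chosen for illustration.

```python
import random

EMPTY = 0  # races are labeled 1..n; 0 marks a vacant cell


def unhappy(grid, r, c, threshold):
    """True if the share of same-race neighbors is below the threshold."""
    me = grid[r][c]
    same = other = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                neighbor = grid[rr][cc]
                if neighbor == EMPTY:
                    continue
                if neighbor == me:
                    same += 1
                else:
                    other += 1
    occupied = same + other
    return occupied > 0 and same / occupied < threshold


def tick(grid, threshold, rng):
    """Move every currently unhappy agent to a random empty cell.

    Returns the number of agents that were unhappy at the start of the tick.
    """
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    movers = [(r, c) for r, c in cells
              if grid[r][c] != EMPTY and unhappy(grid, r, c, threshold)]
    empties = [(r, c) for r, c in cells if grid[r][c] == EMPTY]
    rng.shuffle(movers)
    for r, c in movers:
        if not empties:
            break
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec] = grid[r][c]
        grid[r][c] = EMPTY
        empties[i] = (r, c)  # the vacated cell becomes available
    return len(movers)


if __name__ == "__main__":
    rng = random.Random(0)
    grid = [[rng.choice([EMPTY, 1, 2, 3]) for _ in range(20)] for _ in range(20)]
    for _ in range(50):
        if tick(grid, 0.5, rng) == 0:
            break
```

Even with a modest threshold, repeated ticks tend to sort the grid into same-race clusters, which is the effect Schelling described.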
In addition to the epidemiology presentation this morning at SUNY Geneseo's 9th Annual GREAT Day symposium, I also presented a poster with Matthew Taylor on the use of computer vision in the localization of fluorescently labeled genomes in the extremely polyploid Epulopiscium sp. Type B. As a test case for this technology, we used the coordinates of the localized chromosomes to estimate the chromosome density for cells during different life stages. For those interested, the poster is a good read.
As a part of our presentation, we presented a 3D model of chromosomes localized from a cell that is forming daughters. The model is rendered in WebGL using the three.js library with support for both mouse and Leap Motion control.
Figure 1: Live demonstration of 3D chromosome distribution generated using computer vision. This figure is a live 3D demonstration of the spatial structure of an Epulopiscium cell's chromosomes. This model can be rotated by clicking and dragging or by using a Leap Motion. No, seriously. Try it.
Today was SUNY Geneseo's 9th Annual GREAT Day, a college-wide symposium of creativity and academic research. This morning Matthew Taylor and I presented our metapopulation network model for simulating the international spread of Ebola via aviation. Using real flight data donated by FlightAware, airport data from OpenFlights.org, and a gridded population of the world, we constructed a model consisting of over 3,000 airports, 82,000 routes, and over 4 billion individuals. Using this model, we tested the efficacy of country-based flight regulations in preventing the international spread of Ebola.
This past weekend was UP-Stat 2015, and with it came this year's data competition. In keeping with this year's theme of "statistical modeling in the era of data science," the competition was an analysis of traffic data collected from the intersection of Culver Road and East Main Street in Rochester, NY. I decided to take a stab at analyzing the data with my friends Matthew Taylor and Tom Hartvigsen, and our analysis was presented as a finalist at the conference!
Feel free to read our report. If you are interested in our analysis or the competition data, both are available on GitHub.