This is a place where I attempt to form coherent thoughts about current technology, computer science, math, and the things happening on the Internet in general.
A feed of the most recent posts is available.
Clustering senatorial speeches from 2008 by topic using t-distributed stochastic neighbor embedding (t-SNE) and latent Dirichlet allocation (LDA).
An analysis of which Unix commands appear together more often than random chance would suggest.
I recently gave a talk on an NLP project that I worked on for Kent's ACM