Buster – a new R package for bagging hierarchical clustering

I recently found myself a bit stuck. I needed to cluster some data, but the distances between the data points were not representable in Euclidean space, so I had to use hierarchical clustering. I also wanted stable clusters that would retain their shape as I updated the data set with new observations.
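To make the first half of that concrete: hierarchical clustering in base R needs only a matrix of pairwise dissimilarities, not coordinates. The snippet below is a minimal sketch on a made-up four-item matrix; it uses only base R's `hclust` and `cutree`, and it does not show the bagging step or any function from the Buster package itself.

```r
# Toy dissimilarity matrix (assumed values, purely for illustration).
# It is symmetric with a zero diagonal; no coordinates are needed.
d <- matrix(c(0, 1, 4, 5,
              1, 0, 3, 6,
              4, 3, 0, 2,
              5, 6, 2, 0),
            nrow = 4,
            dimnames = list(letters[1:4], letters[1:4]))

# Average-linkage hierarchical clustering on the dissimilarities.
hc <- hclust(as.dist(d), method = "average")

# Cut the tree into two clusters.
clusters <- cutree(hc, k = 2)
print(clusters)
```

Here `a` and `b` end up in one cluster and `c` and `d` in the other. Stability under new observations is a separate problem, which is where resampling approaches such as bagging come in.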

Include uncertainty in a financial model

Here's a post that appears on my new website, coppelia.io. The problem: You've been asked to calculate some figure or other (e.g. end of year revenue, average customer lifetime value) based on numbers supplied from various parts of the business. You know how to make the calculation but what bothers you is that some of

Box Me

Here's a short R function I wrote to turn a long data set into a wide one for viewing. It's not the most exciting function ever but I find it quite useful when my screen is wide and short. It simply cuts the data set horizontally into equal-sized pieces and puts them side by
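As a rough illustration of the idea described here (not the function from the post, which isn't reproduced on this page), the sketch below splits a data frame into row blocks of equal height and binds them next to each other. The name `box_me`, the `n_pieces` argument, the NA padding and the `_1`, `_2`, ... column suffixes are all assumptions made for this example.

```r
# Hypothetical helper: cut a long data frame into `n_pieces` row blocks
# of equal height and place the blocks side by side.
box_me <- function(df, n_pieces = 3) {
  rows_per_piece <- ceiling(nrow(df) / n_pieces)

  # Pad the row index with NAs so every block has the same height;
  # indexing a data frame with NA produces rows filled with NA.
  n_pad <- rows_per_piece * n_pieces - nrow(df)
  idx <- c(seq_len(nrow(df)), rep(NA_integer_, n_pad))
  padded <- df[idx, , drop = FALSE]

  # Split into consecutive blocks and drop row names so they bind cleanly.
  block_id <- rep(seq_len(n_pieces), each = rows_per_piece)
  pieces <- lapply(split(padded, block_id), function(p) {
    rownames(p) <- NULL
    p
  })

  # Bind the blocks column-wise and label columns by block number.
  wide <- do.call(cbind, unname(pieces))
  names(wide) <- paste0(rep(names(df), times = n_pieces), "_",
                        rep(seq_len(n_pieces), each = ncol(df)))
  wide
}

long <- data.frame(id = 1:7, v = letters[1:7])
wide <- box_me(long, n_pieces = 3)
print(wide)
```

With seven rows and three pieces, the result has three rows and six columns, and the last block is padded with two rows of `NA`.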

Mahout for R Users

I have a few posts coming up on Apache Mahout, so I thought it might be useful to share some notes. I came at it primarily as an R coder with some very rusty Java and C++ somewhere in the back of my head, so that will be my point of reference. I've also included

R: Dealing with package updates

Here's a very short post to highlight one of the "highlights" of my week that I thought was worth sharing with the wider community. One of the things I find great about R is the rapidly evolving ecosystem, where new packages are constantly being created and others are being updated. Up until now, I've found

Visualising the Path of a Genetic Algorithm

We quite regularly use genetic algorithms to optimise over the ad-hoc functions we develop when trying to solve problems in applied mathematics. However, it's a bit disconcerting to have your algorithm roam through a high-dimensional solution space while not being able to picture what it's doing or how close one solution is to another.

