This category contains 7 posts

Mahout for R Users

I have a few posts coming up on Apache Mahout, so I thought it might be useful to share some notes. I came at it primarily as an R coder, with some very rusty Java and C++ somewhere in the back of my head, so that will be my point of reference. I’ve also included … Continue reading

Two Quick Recipes: Ubuntu and Hadoop

There are so many flavours of everything, and things are changing so quickly, that I find every task researched online ends up being a set of instructions stitched together from several blogs and forums. Here are a couple of recent ones. Ubuntu on AWS (50 mins): Was going to buy a new laptop but it made … Continue reading

EC2 Tutorials: rJava – annoying enough to have its own blog post

One of the most frustrating items that I’ve been trying to install on my EC2 instance is rJava. It’s an R package that lots of other packages have as a dependency, including glmulti and MongoDB. I’ve spent a fair few hours trying to get this installed, constantly receiving the error message: configure: error: Java Development … Continue reading

EC2 Tutorial: NumPy and SciPy

Another quick note for getting set up on your EC2 instance. To install SciPy, you first need to install ATLAS and LAPACK. The following few lines of code, run as root (sudo bash), should sort you out:

    yum -y install atlas-devel
    yum install lapack
    pip install scipy
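If you would rather see what is about to run before touching the box, the three commands from this note can be wrapped in a small script. This is only a sketch: the `DRY_RUN` switch and the `run` and `install_scipy_stack` helper names are my own additions for illustration, and `-y` has been added to the lapack step so it does not stop to prompt. By default it just prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: install SciPy's build dependencies on Amazon Linux, then SciPy itself.
# DRY_RUN is a hypothetical switch added for illustration. It defaults to 1,
# so the commands are only printed. Set DRY_RUN=0 and run as root to execute them.
set -u

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # show the command without running it
    else
        "$@"               # run the command exactly as given
    fi
}

install_scipy_stack() {
    run yum -y install atlas-devel   # ATLAS development package
    run yum -y install lapack        # LAPACK (-y added so the step does not prompt)
    run pip install scipy            # install SciPy once the libraries are present
}

install_scipy_stack
```

Once the printed commands look right, something like `sudo DRY_RUN=0 bash install_scipy.sh` (the file name is just an example) would run them for real.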

EC2 Tutorials: Scheduling tasks on EC2 using Crontab

One of my main reasons for wanting an EC2 instance was to be able to automatically run scripts at certain times, normally to collect data and save it to a database. As my EC2 instance is always running, I can forget about it for a month and have a month’s worth of data ready and … Continue reading

EC2 Tutorials: Installing new software; yum, pip, easy_install, sudo-apt

For anyone familiar with Python and easy_install, Amazon Linux uses “yum” as its easy installation system, and it is possible to install “pip” and “easy_install” to install new Python packages. As I’ve tried to install new software on my box, I’ve found lots and lots of references to “sudo apt-get” as the standard way to install … Continue reading

EC2 Tutorials: Getting Started on Amazon Web Services

I’ve been interested in setting up an Amazon Web Services EC2 instance for a while – essentially a remote desktop in the cloud, which can be handy when you want an always-on machine (say, to run scripts at particular times, or to have easy access to a particular machine setup). Over the next few weeks, … Continue reading
