MapReduce

RHIPE – “Big Data” analytics made easy

October 20th, 2010

As I was browsing the Hadoop conference that was in town on October 12, 2010, I came across a session about using Hadoop natively from the R environment for statistical analytics of “Big Data”. After pausing for a few slides of the presentation (I was actually on my way to another one), I experienced déjà vu: I had discussed such capabilities in an email I wrote internally at the Lab a couple of years prior.

Long story short: one of our clients asked for advice on how to achieve fast, near-real-time analytics of Level-2 tick data from a large exchange, a stream of 400 TB/year (at the time) with 3+ years of history. It sounds like a tough problem, yet it is in fact commonplace in exchange-traded product analytics.