Category Archives: Hadoop and Mapreduce

Combining Hadoop/Elastic Mapreduce with AWS Redshift Data Warehouse

There are currently interesting developments in scalable (up to petabytes), low-latency and affordable data warehouse solutions, e.g. AWS Redshift (cloud-based) [1], Cloudera’s Impala (open source) [2,3] and Apache Drill (open source) [4]. This posting shows how one of them – AWS … Continue reading

Posted in analytics, cloud computing, Hadoop and Mapreduce | 5 Comments

Mapreduce & Hadoop Algorithms in Academic Papers (4th update – May 2011)

It’s been a year since I last updated the mapreduce algorithms posting, and it has been a truly excellent year for mapreduce and hadoop: the number of commercial vendors supporting it has multiplied, e.g. with 5 … Continue reading

Posted in Atbrox, cloud computing, Hadoop and Mapreduce | 16 Comments

Mapreduce in Search

I wrote about mapreduce in search in a presentation for next week: Mapreduce in Search (a more up-to-date PDF version of the presentation). Best regards, Amund Atbrox

Posted in Atbrox, Hadoop and Mapreduce, infrastructure, search | 2 Comments

Programmatic Deployment to Elastic Mapreduce with Boto and Bootstrap Action

A while back I wrote about How to combine Elastic Mapreduce/Hadoop with other Amazon Web Services. This posting is a small update to that, showing how to deploy extra packages with Boto for Python. Note that Boto can deploy mappers … Continue reading

Posted in cloud computing, Hadoop and Mapreduce | 4 Comments

Recommended Mapreduce Workshop

If you are interested in Hadoop or Mapreduce, I would like to recommend participating in or submitting a paper to the First International Workshop on Theory and Practice of Mapreduce (MAPRED’2010) (held in conjunction with the 2nd IEEE International Conference on … Continue reading

Posted in cloud computing, Hadoop and Mapreduce