Monthly Archives: November 2009

Atbrox Customer Case Study – Scalable Language Processing with Elastic Mapreduce (Hadoop)

We developed a scalable language-processing tool for our customer Lingit using Amazon’s Elastic Mapreduce; the AWS case study at http://aws.amazon.com/solutions/case-studies/atbrox/ has more details. Contact us if you need help with Hadoop/Elastic Mapreduce.

Posted in cloud computing | 2 Comments

How to combine Elastic Mapreduce/Hadoop with other Amazon Web Services

By default, Elastic Mapreduce reads its input from and writes its output to S3. When you need to access other AWS services, e.g. SQS queues or the database services SimpleDB and RDS (MySQL), the best approach from Python is to use Boto. To … Continue reading
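
As a rough illustration of the idea in this excerpt, here is a minimal sketch of a Hadoop streaming mapper in Python that forwards each input line to an SQS queue through Boto while still emitting ordinary key/value output. It is not the code from the full post. The function names `make_sqs_sender` and `run_mapper` and the queue name `example-queue` are hypothetical, and the sketch assumes Boto 2 is installed on the cluster nodes and can find AWS credentials in its usual configuration.

```python
import sys


def make_sqs_sender(queue_name):
    """Return a function that writes one string to an SQS queue via Boto."""
    # Imported lazily so the mapper logic can be exercised without Boto.
    import boto
    from boto.sqs.message import Message

    # Assumption: credentials come from the environment or a Boto config file.
    conn = boto.connect_sqs()
    queue = conn.create_queue(queue_name)

    def send(body):
        msg = Message()
        msg.set_body(body)
        queue.write(msg)

    return send


def run_mapper(lines, send, out=sys.stdout):
    """Emit word<TAB>1 pairs for Hadoop streaming and pass each line to send."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank input lines
        send(line)
        for word in line.split():
            out.write("%s\t1\n" % word)


# Hypothetical use as a streaming mapper script:
#   run_mapper(sys.stdin, make_sqs_sender("example-queue"))
```

Passing the sender in as a plain function keeps the mapper logic independent of AWS, so it can be run locally with a stand-in such as `list.append` before it is deployed.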

Posted in cloud computing, Hadoop and Mapreduce, infrastructure | 4 Comments

Preliminary Experiences Crawling with 80legs

80legs is a company specializing in the crawling and preprocessing part of search. You can upload your seed URLs (where to start crawling), configure your crawl job (depth, domain restrictions, etc.), and also run existing or custom analysis code … Continue reading

Posted in cloud computing | 3 Comments