Tag Archives: aws
An example of using F# and C# (.net/mono) with Amazon’s Elastic Mapreduce (Hadoop)
This posting gives an example of how F# and C# can potentially scale to thousands of machines with Mapreduce in order to efficiently process terabyte (TB) and petabyte (PB) volumes of data. It shows a C# (C sharp) mapper … Continue reading
Programmatic Deployment to Elastic Mapreduce with Boto and Bootstrap Action
A while back I wrote about How to combine Elastic Mapreduce/Hadoop with other Amazon Web Services. This posting is a small update to that, showing how to deploy extra packages with Boto for Python. Note that Boto can deploy mappers … Continue reading
Posted in cloud computing, Hadoop and Mapreduce
Tagged amazon, automation, aws, deployment, elastic mapreduce, hadoop, mapreduce
4 Comments
Atbrox Customer Case Study – Scalable Language Processing with Elastic Mapreduce (Hadoop)
We developed a tool for scalable language processing for our customer Lingit using Amazon’s Elastic Mapreduce. More details: http://aws.amazon.com/solutions/case-studies/atbrox/ Contact us if you need help with Hadoop/Elastic Mapreduce.
Posted in cloud computing
Tagged amazon, aws, data processing, elastic mapreduce, hadoop, language processing, nlp
2 Comments
How to combine Elastic Mapreduce/Hadoop with other Amazon Web Services
Elastic Mapreduce's default behavior is to read from and write to S3. When you need to access other AWS services, e.g. SQS queues or the database services SimpleDB and RDS (MySQL), the best approach from Python is to use Boto. To … Continue reading
Posted in cloud computing, Hadoop and Mapreduce, infrastructure
Tagged amazon, aws, hadoop, mapreduce, python, simpledb, sqs
4 Comments
Unstructured Search for Amazon’s SimpleDB
SimpleDB is a service primarily for storing and querying structured data (it can be used, for example, for a product catalog with descriptive features per product, or an academic event service with extracted features such as event dates, locations, organizers and topics). … Continue reading
Posted in cloud computing
Tagged amazon, aws, hadoop, latency, python, s3, search, simpledb, storage, structured search, unstructured search
2 Comments