Tag Archives: python

atbr now has Apache Thrift support

atbr (a large-scale, low-latency in-memory key-value pair store) now supports Apache Thrift for easier integration with other Hadoop services. Thrift Example: check out and install atbr. Prerequisite: install/compile Apache Thrift (http://thrift.apache.org/). Compile an atbr Thrift server and connect using Python … Continue reading

Posted in cloud computing

atbr – supports websocket-based sharding

atbr (a large-scale, low-latency in-memory key-value pair store) now supports websocket-based sharding for parallel deployments. Websocket Sharding Example: check out and install atbr; start 3 servers loaded with data; start shard server talking to shards; connect to shard server and lookup … Continue reading

Posted in cloud computing | 1 Comment

atbr – large-scale in-memory hashtables (in Python)

Large-scale in-memory key-value stores are universally useful (e.g. to load and serve tsv-data created by hadoop/mapreduce jobs); in-memory key-value stores have low latency, and modern boxes have lots of memory (e.g. EC2 instances with 70GB RAM). If you look closely … Continue reading
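To make the use case in the excerpt above concrete, here is a minimal sketch of loading tab-separated key/value lines into an in-memory table and looking up a key. It uses a plain Python dict as a stand-in and is not atbr's API; the function name `load_tsv` and the sample data are assumptions made for illustration only.

```python
import io


def load_tsv(lines):
    """Build an in-memory key-value table from 'key<TAB>value' lines.

    Plain-dict stand-in for illustration; this is not atbr's API.
    """
    table = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) == 2:  # skip blank or malformed lines
            key, value = parts
            table[key] = value
    return table


# Hypothetical sample standing in for the output of a mapreduce job.
sample = io.StringIO("apple\t3\nbanana\t7\n")
table = load_tsv(sample)
print(table.get("banana"))
```

A dict keeps every key and value as a separate Python object, so this sketch only shows the load-and-lookup pattern, not the memory footprint a dedicated store would aim for.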

Posted in cloud computing | 5 Comments

Parallel Machine Learning for Hadoop/Mapreduce – A Python Example

Atbrox is a startup providing technology and services for Search and Mapreduce/Hadoop. Our background is from Google, IBM and research. (Update 2010-June-17: code for this posting is now on github, http://github.com/atbrox/Snabler) This posting gives an example of how to use … Continue reading

Posted in cloud computing, Hadoop and Mapreduce, infrastructure | 14 Comments

How to combine Elastic Mapreduce/Hadoop with other Amazon Web Services

Elastic Mapreduce's default behavior is to read from and store to S3. When you need to access other AWS services, e.g. SQS queues or the database services SimpleDB and RDS (MySQL), the best approach from Python is to use Boto. To … Continue reading

Posted in cloud computing, Hadoop and Mapreduce, infrastructure | 4 Comments