May 14

I attended the Accel Partners Big Data conference last week. It was a good event with many interesting people; a very crude estimate of the audience is roughly 1/3 VCs/investors, 1/3 startup tech people, and 1/3 big-corp tech people.

My two key takeaways from the conference:

  1. Realtime processing: a hot topic, with many companies building their own custom solutions, though they wouldn’t object to having an exceptionally good open-source solution to gather around.
  2. Low-latency storage: an emerging topic, or as Andy Bechtolsheim (Sun/Arista/Granite/Kealia/HighBAR co-founder and early Google investor) put it in his talk: “Hard Disk Drives are not keeping up. Flash solving this problem just in time”. The academic session also had interesting discussions about RAM-based storage.

I think Andy Bechtolsheim’s table titled “Memory Hierarchy is Not Changing” sums up the low-latency storage discussion quite well. I have taken the liberty of adding a column with rough prices per petabyte-month for RAM and SSD, the only two types fit for both low latency AND big data. The calculation is the estimated purchase price divided by 12; note that it covers only the storage itself, not the hardware and network needed to run it. Note: I think Mr. Bechtolsheim could have written “up to petabytes” for both SSD and RAM.

| Type of memory | Size | Latency | Price per petabyte-month* (k$) |
|---|---|---|---|
| L1 cache | 64 KB | ~4 cycles (2 ns) | |
| L2 cache | 256 KB | ~10 cycles (5 ns) | |
| L3 cache (shared) | 8 MB | 35-40+ cycles (20 ns) | |
| Main memory | GBs up to terabytes | 100-400 cycles | 411 (non-ECC), 1,197 (ECC) |
| Solid state memory | GBs up to terabytes | 5,000 cycles | 94 |
| Disk | Up to petabytes | 1,000,000 cycles | |
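
The cycle counts in the table can be turned into rough wall-clock times. Below is a minimal sketch, assuming a 2 GHz clock, which is what the table's own pairs (e.g. 4 cycles = 2 ns) imply; the function name `cycles_to_ns` is made up for illustration.

```python
# Assumption: 2 GHz clock, inferred from the table's "4 cycles (2 ns)" pairs.
CLOCK_HZ = 2e9


def cycles_to_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a CPU cycle count to nanoseconds at the given clock rate."""
    return cycles * 1e9 / clock_hz


# Cycle counts from the table (upper bound where a range was given).
latency_cycles = {
    "L1 cache": 4,
    "L2 cache": 10,
    "L3 cache": 40,
    "Main memory": 400,
    "Solid state memory": 5_000,
    "Disk": 1_000_000,
}

latency_ns = {name: cycles_to_ns(c) for name, c in latency_cycles.items()}

if __name__ == "__main__":
    for name, ns in latency_ns.items():
        print(f"{name:20s} {ns:>12,.0f} ns")
```

At this assumed clock rate, the SSD figure of 5,000 cycles corresponds to about 2.5 microseconds and the disk figure of 1,000,000 cycles to about 0.5 milliseconds.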

*Storage price sources and calculations used

RAM (non-ECC): 16GB non-ECC (2x8GB) – price: $79, i.e. $79/16 per GB, $(79/16)K per TB, $(79/16)M per PB, $(79/16)M/12 ≈ $411K per PB-month.
RAM (ECC): 16GB ECC (1x16GB) – price: $229.98, i.e. $230/16 per GB, $(230/16)K per TB, $(230/16)M per PB, $(230/16)M/12 ≈ $1.2M per PB-month.
SSD: 512GB – price: $579.99, i.e. $580/512 per GB, $(580/512)K per TB, $(580/512)M per PB, $(580/512)M/12 ≈ $94K per PB-month.
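
The calculation above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes decimal units (1 PB = 1,000,000 GB) and spreads the purchase price over 12 months; the function name `price_per_pb_month_kusd` is made up for illustration.

```python
def price_per_pb_month_kusd(price_usd, capacity_gb, months=12):
    """Return k$ per petabyte-month for a part of the given price and size.

    Uses decimal units (1 PB = 1,000,000 GB) and divides the purchase
    price evenly over `months` months.
    """
    usd_per_gb = price_usd / capacity_gb
    usd_per_pb = usd_per_gb * 1_000_000
    return usd_per_pb / months / 1_000


# Retail prices quoted in the post: (price in $, capacity in GB).
parts = {
    "RAM (non-ECC)": (79.00, 16),
    "RAM (ECC)": (229.98, 16),
    "SSD": (579.99, 512),
}

prices_kusd = {
    name: price_per_pb_month_kusd(price, gb) for name, (price, gb) in parts.items()
}

if __name__ == "__main__":
    for name, kusd in prices_kusd.items():
        print(f"{name:15s} {kusd:>10,.1f} k$ per PB-month")
```

The results agree with the table's price column up to rounding.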


Since RAM-based storage is up to 50 times faster than SSD (latency-wise) but only roughly 4.4 to 12.7 times more expensive (411/94 and 1,197/94 from the table), it is likely to move high on the agenda in settings where latency matters (all types of serving infrastructure, search, finance, etc.). In absolute terms, the cost of a petabyte of RAM has come within reach of all Fortune 1000 companies: about $1.2M per month for the storage alone (ECC RAM). One interesting aspect of going RAM-only is that most systems built on SSDs or disks also carry a big RAM component, e.g. memcached or the caches of various NoSQL stores. Moving to RAM-only might therefore make things simpler, i.e. it avoids dealing with memory-vs-disk/SSD coherency and with latency variations when a request misses the memory cache.
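
To make the trade-off concrete, here is a rough sketch that recomputes the latency and price ratios from the table and estimates the average read latency of an SSD store fronted by a RAM cache. The hit ratios and the simple weighted-average model are illustrative assumptions, not figures from the talk.

```python
# Figures from the table: latency in CPU cycles, price in k$ per PB-month.
RAM_LATENCY_CYCLES = (100, 400)
SSD_LATENCY_CYCLES = 5_000
RAM_NON_ECC_KUSD, RAM_ECC_KUSD, SSD_KUSD = 411, 1_197, 94

# Best case for RAM: its fastest access compared with the SSD figure.
max_speedup = SSD_LATENCY_CYCLES / RAM_LATENCY_CYCLES[0]
price_ratio_low = RAM_NON_ECC_KUSD / SSD_KUSD
price_ratio_high = RAM_ECC_KUSD / SSD_KUSD


def mean_latency_cycles(hit_ratio, ram_cycles=400, ssd_cycles=SSD_LATENCY_CYCLES):
    """Average read latency for a RAM cache in front of SSD.

    Simplified model (an assumption): a hit costs one RAM access and a
    miss costs one SSD access. Real systems add network and software
    overhead on top of this.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ram_cycles + (1.0 - hit_ratio) * ssd_cycles


if __name__ == "__main__":
    print(f"RAM is up to {max_speedup:.0f}x faster than SSD")
    print(f"RAM costs {price_ratio_low:.1f}x to {price_ratio_high:.1f}x as much")
    for hit_ratio in (0.90, 0.99, 1.00):  # hypothetical cache hit ratios
        cycles = mean_latency_cycles(hit_ratio)
        print(f"hit ratio {hit_ratio:.2f}: {cycles:,.0f} cycles on average")
```

Under this model, even a 90% hit ratio gives an average of about 860 cycles, more than twice the RAM-only figure, and individual misses still pay the full SSD latency.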

Note 1: If you have other sources for interesting large-scale RAM and SSD prices, I would appreciate it if you could add links to them in the comments below.

Note 2: If you’re interested in large-scale RAM-based key-value stores, check out our opensource project Atbr – github page:

Best regards,

Amund Tveit, co-founder of Atbrox (@atbrox)
