Mapreduce & Hadoop Algorithms in Academic Papers (5th update – Nov 2011)

The prior update of this posting was in May, and a lot has happened related to MapReduce and Hadoop since then, e.g.:
1) big software companies have started offering Hadoop-based software (Microsoft and Oracle), 2) Hadoop startups have raised record amounts of funding, and 3) the NoSQL landscape is becoming increasingly data-warehouse-like and SQL-like, with the focus on high-level data processing platforms and query languages.

Personally I have rediscovered Hadoop Pig and combine it with UDFs and streaming as my primary way to implement MapReduce algorithms here at Atbrox.
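For readers who have not used the streaming approach mentioned above: Hadoop Streaming runs any executable as the mapper or reducer, piping input records over stdin and collecting tab-separated key/value pairs from stdout. Below is a minimal sketch in Python, assuming a simple word-count task; the `wc.py` script name and the `hadoop jar` invocation in the comment are illustrative, not from the post:

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming word count. Hadoop pipes input lines to the
# mapper on stdin, sorts the mapper's tab-separated output by key, and
# pipes the sorted stream to the reducer.
import sys
from itertools import groupby

def mapper(lines):
    """Emit one tab-separated (word, 1) pair per word in the input."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word

def reducer(pairs):
    """Sum the counts per word; assumes input is sorted by key,
    which Hadoop guarantees between the map and reduce phases."""
    split = (p.split("\t", 1) for p in pairs)
    for word, group in groupby(split, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical invocation: Hadoop would run this script twice, e.g.
    #   hadoop jar hadoop-streaming.jar -mapper 'wc.py map' \
    #       -reducer 'wc.py reduce' -input in/ -output out/
    step = mapper if sys.argv[1] == "map" else reducer
    for record in step(line.rstrip("\n") for line in sys.stdin):
        print(record)
```

Pig UDFs cover the cases where built-in operators fall short, while a streaming script like this is the quickest way to drop an arbitrary program into a pipeline.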

Best regards,
Amund Tveit (twitter.com/atveit)

The change from the prior postings is that this posting only includes _new_ papers (2011):

Artificial Intelligence/Machine Learning/Data Mining

Bioinformatics/Medical Informatics

Image and Video Processing

Statistics and Numerical Mathematics

Search and Information Retrieval

Sets & Graphs

Simulation

Social Networks

Spatial Data Processing

Text Processing

This entry was posted in hadoop, machine learning, mapreduce.

7 Responses to Mapreduce & Hadoop Algorithms in Academic Papers (5th update – Nov 2011)

  1. Pingback: 30 Hadoop and Big Data Spelunkers Worth Following | My Blog

  2. Pingback: Mapreduce & Hadoop Algorithms in Academic Papers (5th update – Nov 2011) « Another Word For It

  3. Nikzad says:

    For those who are interested in MapReduce, these two papers may be interesting:
    1) “A Study on Using Uncertain Time Series Matching Algorithms in Map-Reduce Applications”
    http://arxiv.org/abs/1112.5505

    2) “MapReduce Implementation of Prestack Kirchhoff Time Migration (PKTM) on Seismic Data”
    http://www.computer.org/portal/web/csdl/doi/10.1109/PDCAT.2011.50

  4. Karthikeyan says:

    Hello,
    can anyone help me find the code or methodology for Hadoop clustering in text mining? Please.

  5. Pingback: 09CST-FYP交流平台 » 数据分析与数据挖掘相关资源整理

  6. Pingback: Hadoop Learning Resources | hadoop4u

  7. Ratnesh says:

    Thanks! Hadoop enables resilient, distributed processing of massive unstructured data sets across commodity computer clusters, in which each node of the cluster includes its own storage. MapReduce serves two essential functions: it parcels out work to the various nodes within the cluster (the map step), and it organizes and reduces the results from each node into a cohesive answer to a query (the reduce step). More at http://www.youtube.com/watch?v=1jMR4cHBwZEa
