The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant.

HDFS (Hadoop Distributed File System) is designed to run on commodity hardware, i.e. low-cost hardware.

HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware.

HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data.
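The streaming-access model can be illustrated with a small sketch (this is plain Python, not the actual HDFS client API): HDFS favors large, sequential reads over low-latency random access, so a reader consumes a file in big chunks rather than seeking around in it. The 4 MB chunk size here is an arbitrary illustrative value.

```python
# Illustrative sketch (not the HDFS client API): stream a file as large
# sequential chunks, the access pattern HDFS is optimized for.
import io

CHUNK_SIZE = 4 * 1024 * 1024  # illustrative chunk size: 4 MB

def stream_file(fileobj, chunk_size=CHUNK_SIZE):
    """Yield the file's contents as a sequence of large sequential chunks."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

data = b"x" * (10 * 1024 * 1024)  # stand-in for a large file
chunks = list(stream_file(io.BytesIO(data)))
assert b"".join(chunks) == data   # streaming preserves the content
print(len(chunks))                # 3 chunks: 4 MB + 4 MB + 2 MB
```

Because each read is large and sequential, throughput stays high even on cheap disks; this is the trade-off behind relaxing POSIX semantics.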

HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is now an Apache Hadoop subproject. The project URL is http://hadoop.apache.org/hdfs/.


Goals of HDFS

  • Hardware Failure Is the Norm Rather than the Exception
  • Streaming Data Access
  • Large Data Sets
  • Simple Coherency Model
  • Moving Computation is Cheaper than Moving Data
  • Portability Across Heterogeneous Hardware and Software Platforms

 

Data Replication

HDFS is designed to reliably store very large files across machines in a large cluster. Each file is stored as a sequence of equally sized blocks, and the blocks are replicated across multiple DataNodes for fault tolerance (the default replication factor is 3).
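The idea can be sketched with a toy placement routine (this is not the real NameNode placement policy, just an illustration): each block is copied to several distinct DataNodes, so the file survives the loss of any single node. The node names and round-robin scheme are assumptions for the example.

```python
# Toy sketch of HDFS-style block replication (not the real NameNode logic):
# assign each block to `replication` distinct DataNodes, round-robin.
from itertools import cycle

def place_replicas(num_blocks, datanodes, replication=3):
    """Return {block_id: [datanode, ...]} with `replication` distinct nodes per block."""
    ring = cycle(range(len(datanodes)))
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [datanodes[next(ring)] for _ in range(replication)]
    return placement

nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = place_replicas(num_blocks=5, datanodes=nodes, replication=3)

# Every block keeps at least one surviving replica if node "dn2" fails:
assert all(any(n != "dn2" for n in replicas) for replicas in placement.values())
```

With three replicas per block, any single-node failure leaves two live copies, which is why commodity (failure-prone) hardware is acceptable.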


MapReduce Software Framework

MapReduce offers a clean abstraction between data analysis tasks and the underlying systems challenges involved in ensuring reliable large-scale computation.


- Processes large jobs in parallel across many nodes and combines results.
- Eliminates the bottlenecks imposed by monolithic storage systems.
- Results are collated and digested into a single output after each piece has been analyzed.
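The three points above can be sketched as a minimal in-memory word count (the classic MapReduce example). This is plain Python showing the map, shuffle, and reduce stages; real Hadoop MapReduce runs these phases in parallel across many nodes, with HDFS providing input and output storage.

```python
# Minimal in-memory sketch of the MapReduce pattern (word count), showing the
# map -> shuffle -> reduce stages that Hadoop distributes across a cluster.
from collections import defaultdict

def map_phase(records):
    """Map: each input line is turned into (word, 1) pairs."""
    for line in records:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into a single result per key."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores data", "hadoop processes data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

In the real framework, the map tasks run where the input blocks already live (moving computation to the data), and the reduce output is the single collated result described above.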

 



