INTRODUCTION
Need for large data processing
Challenges in distributed computing: meeting Hadoop
COMPARISON WITH OTHER SYSTEMS
Comparison with RDBMS
ORIGIN OF HADOOP
SUBPROJECTS
THE HADOOP APPROACH
Data distribution
MapReduce: Isolated Processes
INTRODUCTION TO MAPREDUCE
Programming model
Types
HADOOP MAPREDUCE
Combiner Functions
HADOOP STREAMING
HADOOP PIPES
HADOOP DISTRIBUTED FILESYSTEM (HDFS)
ASSUMPTIONS AND GOALS
Hardware Failure
Streaming Data Access
Large Data Sets
Simple Coherency Model
“Moving Computation is Cheaper than Moving Data”
Portability Across Heterogeneous Hardware and Software Platforms
DESIGN
HDFS Concepts
Blocks
Namenodes and Datanodes
The File System Namespace
Data Replication
Replica Placement
Replica Selection
Safemode
The Persistence of File System Metadata
The Communication Protocols
Robustness
Data Disk Failure, Heartbeats and Re-Replication
Cluster Rebalancing
Data Integrity
Metadata Disk Failure
Snapshots
Data Organization
Data Blocks
Staging
Replication Pipelining
Accessibility
Space Reclamation
File Deletes and Undeletes
Decrease Replication Factor
Hadoop Archives
Using Hadoop Archives