HADOOP is an open-source software framework that enables the distributed processing of large data sets across clusters of commodity servers. With its two core modules, the HADOOP Distributed File System (HDFS) and the MapReduce framework, it is designed to scale from single servers to thousands of machines, each providing local computation and storage. More than a decade has passed since HADOOP emerged at the forefront of open-source Big Data analysis. Its growth has prompted several adaptations for particular data-processing needs, driven by the processing conditions prevailing at various stages of computation. This paper, through a review of several studies, presents the basic HADOOP system structure and describes MapReduce and HDFS efficiency. Explai...