Apache Hadoop YARN is the modern distributed operating system for big data applications. It morphed the Hadoop compute layer to be a common resource-management platform that can host a wide variety of applications. Many organizations leverage YARN in building their applications on top of Hadoop without repeatedly worrying about resource management, isolation, multitenancy issues, etc. The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications. It employs a NameNode and DataNode architecture to implement a distributed file system that provides high-performance access to data across highly scalable Hadoop clusters. Wangda Tan and Wei-Chiu Chuang the current status of Apache Hadoop 3.x—how it’s used today in deployments large and small, and they dive into the exciting present and future of Hadoop 3.x—features that further strengthen Hadoop as the primary resource-management platform and the storage system for enterprise data centers. They explore the current status and the future promise of features and initiatives for both YARN and HDFS of Hadoop 3.×. For YARN 3.x, there is powerful container placement, global scheduling, support for machine learning (Spark) and deep learning (TensorFlow) workloads through GPU and field-programmable gate array (FPGA) scheduling and isolation support, extreme scale with YARN federation, containerized apps on YARN, support for long-running services (alongside applications) natively without any changes, seamless application/services upgrades, powerful scheduling features like application priorities, intra-queue preemption across applications, and operational enhancements including insights through Timeline Service v2, a new web UI, better queue management, etc. Also, HDFS 3.0 announced GA for erasure coding, which doubles the storage efficiency of data and thus reduces the cost of storage for enterprise use cases. HDFS added support for multiple standby NameNodes for better availability