Course Outline
Big Data Overview:
- Definition of Big Data
- Reasons behind the growing popularity of Big Data
- Case studies on Big Data
- Key characteristics of Big Data
- Solutions for managing Big Data
Hadoop and Its Components:
- Understanding Hadoop and its core components
- Hadoop architecture and the characteristics of data it can handle and process
- A brief history of Hadoop, including companies that use it and the motivations behind their adoption
- Detailed explanation of the Hadoop framework and its components
- Explanation of HDFS and the read/write operations within the Hadoop Distributed File System
- Procedures for setting up a Hadoop cluster in standalone, pseudo-distributed, and multi-node modes (covers setting up the cluster on VirtualBox, KVM, or VMware, configuring the critical network settings, running the Hadoop daemons, and testing the cluster)
- Overview of the MapReduce framework and its operational mechanisms
- Executing MapReduce jobs on a Hadoop cluster
- Concepts of replication, mirroring, and rack awareness within Hadoop clusters
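As a taste of the MapReduce topics above, the following is a minimal sketch of the map, shuffle, and reduce phases using the classic word-count example. It runs in plain Python on a single machine and does not use the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Hadoop or MapR library.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, the step a MapReduce
    framework performs between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

On a real cluster the mappers and reducers run in parallel on different nodes and the framework handles the shuffle; here the phases run one after another only to make the data flow visible.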
Hadoop Cluster Planning:
- Strategies for planning your Hadoop cluster
- Aligning hardware and software requirements for effective cluster planning
- Analyzing workloads to plan a cluster that prevents failures and ensures optimal performance
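The kind of sizing arithmetic involved in cluster planning can be sketched as follows. This is a rough back-of-the-envelope estimate under stated assumptions, not a sizing method taken from the course: it assumes a replication factor of 3 (the HDFS default) and reserves a hypothetical 25% of each node's disk as headroom for temporary and intermediate data. The function name `nodes_needed` and its parameters are illustrative.

```python
import math


def nodes_needed(data_tb, disk_per_node_tb, replication=3, headroom=0.25):
    """Estimate how many data nodes are needed to store `data_tb` of data.

    replication: copies kept of each block (3 is the HDFS default).
    headroom:    fraction of each node's disk kept free (assumed value).
    """
    if data_tb < 0 or disk_per_node_tb <= 0:
        raise ValueError("data_tb must be >= 0 and disk_per_node_tb > 0")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    raw_tb = data_tb * replication                      # storage after replication
    usable_per_node_tb = disk_per_node_tb * (1 - headroom)
    return math.ceil(raw_tb / usable_per_node_tb)
```

For example, `nodes_needed(100, 24)` gives 17: 100 TB of data becomes 300 TB after replication, and each 24 TB node contributes 18 TB of usable space. Real planning also has to account for CPU, memory, network, and data growth, which this sketch ignores.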
Introduction to MapR and Its Advantages:
- Overview of MapR and its architecture
- Understanding and working with the MapR Control System, MapR Volumes, snapshots, and mirrors
- Planning a cluster specifically for MapR environments
- Comparing MapR with other distributions and Apache Hadoop
- MapR installation and cluster deployment processes
Cluster Setup and Administration:
- Managing services, nodes, snapshots, mirrored volumes, and remote clusters
- Understanding and managing nodes
- Understanding Hadoop components and installing them alongside MapR services
- Accessing cluster data, including via NFS
- Managing data through volumes
- Managing users and groups
- Assigning roles to nodes, and commissioning and decommissioning nodes
- Administering the cluster and monitoring performance, including configuring and analyzing performance metrics
- Administering MapR security
- Understanding and working with M7 native storage for MapR tables
- Configuring and tuning the cluster for optimal performance
Cluster Upgrade and Integration with Other Setups:
- Upgrading the MapR software version and types of upgrades
- Configuring the MapR cluster to access an HDFS cluster
- Setting up a MapR cluster on Amazon Elastic MapReduce
All topics include demonstrations and practice sessions to provide learners with hands-on experience with the technology.
Requirements
- Fundamental knowledge of Linux file systems
- Basic Java programming skills
- Familiarity with Apache Hadoop (recommended)
28 Hours
Testimonials (1)
practical things of doing, also theory was served good by Ajay