Administrator Training for Apache Hadoop Training Course

Target Audience:

This course is designed for IT professionals seeking solutions to store and process large datasets within a distributed system environment.

Learning Objectives:

Gain in-depth knowledge of Hadoop cluster administration.

This course is available as onsite live training in Norway or online live training.

Course Outline

1: HDFS (17%)

Explain the roles of HDFS daemons.
Describe the standard operation of an Apache Hadoop cluster regarding both data storage and processing.
Recognize key characteristics of modern computing systems that necessitate a solution like Apache Hadoop.
Outline the primary objectives of HDFS design.
Select appropriate use cases for HDFS Federation based on specific scenarios.
Identify the components and daemons required for an HDFS HA-Quorum cluster.
Evaluate the role of HDFS security mechanisms, specifically Kerberos.
Select the optimal data serialization method for a given scenario.
Describe the file read and write paths.
Identify commands for manipulating files using the Hadoop File System Shell.

2: YARN and MapReduce version 2 (MRv2) (17%)

Comprehend the impact of upgrading a cluster from Hadoop 1 to Hadoop 2 on cluster settings.
Understand the deployment of MapReduce v2 (MRv2 / YARN), including all associated YARN daemons.
Grasp the fundamental design strategy of MapReduce v2 (MRv2).
Determine how YARN manages resource allocations.
Identify the workflow of a MapReduce job executing on YARN.
Identify necessary file modifications to migrate a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) on YARN.

3: Hadoop Cluster Planning (16%)

Identify key considerations when selecting hardware and operating systems to host an Apache Hadoop cluster.
Analyze options for selecting an operating system.
Understand kernel tuning and disk swapping mechanisms.
Identify hardware configurations suitable for a given scenario and workload pattern.
Determine the ecosystem components required for a cluster to meet SLA requirements in a given scenario.
Cluster Sizing: Identify workload specifics, including CPU, memory, storage, and disk I/O, based on a scenario and execution frequency.
Disk Sizing and Configuration: Understand JBOD versus RAID, SANs, virtualization, and disk sizing requirements within a cluster.
Network Topologies: Understand network usage in Hadoop (for HDFS and MapReduce) and propose or identify essential network design components for a given scenario.
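As a rough illustration of the cluster and disk sizing topics above, the following Python sketch estimates raw storage and DataNode count from a logical data volume. It is a back-of-the-envelope calculation, not a sizing method prescribed by the course. The replication factor of 3 matches the HDFS default; the 25% allowance for temporary and intermediate data, the function names, and the example hardware figures are all assumptions chosen for illustration.

```python
import math


def raw_storage_tb(logical_tb, replication=3, temp_overhead=0.25):
    """Estimate raw disk capacity (TB) needed to hold `logical_tb` of data.

    replication   -- HDFS replication factor (3 is the HDFS default).
    temp_overhead -- assumed extra fraction reserved for temporary and
                     intermediate data (illustrative value, not a standard).
    """
    return logical_tb * replication * (1 + temp_overhead)


def datanodes_needed(logical_tb, disks_per_node, disk_tb,
                     replication=3, temp_overhead=0.25):
    """Estimate how many DataNodes are needed for the given JBOD layout."""
    per_node_tb = disks_per_node * disk_tb
    total_tb = raw_storage_tb(logical_tb, replication, temp_overhead)
    # Round up: a fraction of a node still requires a whole machine.
    return math.ceil(total_tb / per_node_tb)


if __name__ == "__main__":
    # Hypothetical scenario: 100 TB of data, nodes with 12 x 4 TB disks.
    print(raw_storage_tb(100))
    print(datanodes_needed(100, disks_per_node=12, disk_tb=4))
```

A real sizing exercise would also account for data growth, compression, and the CPU, memory, and disk I/O profile of the workload, which this sketch leaves out.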

4: Hadoop Cluster Installation and Administration (25%)

Identify how the cluster handles disk and machine failures in a given scenario.
Analyze logging configuration and the format of logging configuration files.
Understand the fundamentals of Hadoop metrics and cluster health monitoring.
Identify the functions and purposes of available cluster monitoring tools.
Install all ecosystem components in CDH 5, including (but not limited to): Impala, Flume, Oozie, Hue, Cloudera Manager, Sqoop, Hive, and Pig.
Identify the functions and purposes of available tools for managing the Apache Hadoop file system.

5: Resource Management (10%)

Understand the overarching design goals of each Hadoop scheduler.
Determine how the FIFO Scheduler allocates cluster resources in a given scenario.
Determine how the Fair Scheduler allocates cluster resources under YARN in a given scenario.
Determine how the Capacity Scheduler allocates cluster resources in a given scenario.
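To make the idea of fair resource allocation more concrete, here is a minimal Python sketch of max-min fair sharing, the general principle behind fair scheduling. It is a simplified illustration only, not the YARN Fair Scheduler's implementation: the real scheduler also supports features such as queue weights and minimum shares, which are omitted here. The function name, queue names, and resource figures are hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` among queues using max-min fairness.

    demands -- mapping of queue name to the amount of resource it requests.
    Queues that request less than an equal share receive their full demand;
    the capacity they leave unused is redistributed among the others.
    """
    allocation = {}
    pending = dict(demands)
    remaining = float(capacity)
    while pending:
        share = remaining / len(pending)
        # Queues whose whole demand fits within the current equal share.
        fits = [name for name, need in pending.items() if need <= share]
        if not fits:
            # Every remaining queue wants more than an equal share,
            # so each one simply receives that equal share.
            for name in pending:
                allocation[name] = share
            break
        for name in fits:
            allocation[name] = float(pending.pop(name))
            remaining -= allocation[name]
    return allocation


if __name__ == "__main__":
    # Hypothetical cluster with 100 units of memory and three queues.
    print(max_min_fair_share(100, {"etl": 20, "adhoc": 50, "ml": 70}))
```

In the hypothetical scenario, the small "etl" queue gets everything it asks for, and the two larger queues split the remainder equally.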

6: Monitoring and Logging (15%)

Understand the functions and features of Hadoop’s metric collection capabilities.
Analyze the NameNode and JobTracker Web UIs.
Understand methods for monitoring cluster daemons.
Identify and monitor CPU usage on master nodes.
Describe how to monitor swap and memory allocation on all nodes.
Identify methods to view and manage Hadoop’s log files.
Interpret log files effectively.

Requirements

Foundational Linux administration skills
Basic programming proficiency

35 Hours

Open Training Courses require 5+ participants.

Courses take place between 09:30 and 16:30. Dates are subject to availability.

Testimonials (3)

I genuinely enjoyed the many hands-on sessions.

Jacek Pieczatka

Course - Administrator Training for Apache Hadoop

I genuinely enjoyed the trainer's great competence.

Grzegorz Gorski

Course - Administrator Training for Apache Hadoop

I mostly liked the trainer giving real-life examples.

Simon Hahn

Course - Administrator Training for Apache Hadoop

8500 EUR (Classroom)

Related Courses

Advanced R

14 Hours

This instructor-led, live training in Norway (online or onsite) is aimed at intermediate to advanced R users who wish to use R to build faster workflows, improve code quality, and handle more complex analysis tasks.

By the end of this training, participants will be able to: create reusable functions, improve data workflows, debug and optimize code, and produce reproducible reports.

Algorithmic Trading with Python and R

14 Hours

This instructor-led live training in Norway (online or onsite) is designed for business analysts who wish to automate trading using algorithmic trading, Python, and R.

Upon completing this training, participants will be able to:

Use algorithms to rapidly buy and sell securities at specialized increments.
Lower the costs associated with trading through the use of algorithmic trading.
Automatically monitor stock prices and execute trades.

Programming with Big Data in R

21 Hours

Big Data encompasses solutions designed for the storage and processing of vast datasets. Originally pioneered by Google, these Big Data frameworks have evolved and inspired numerous similar open-source projects. R has established itself as a preferred programming language within the financial sector.

Introductory R (Basic to Intermediate)

14 Hours

This instructor-led, live training in Norway (online or onsite) is aimed at beginner-level data analysts who wish to use R programming to manipulate data, perform basic data analysis, and create compelling visualizations for insights.

By the end of this training, participants will be able to:

Understand the basics of R Programming.
Apply fundamental data science processes.
Create visual representations of data.

R Fundamentals

21 Hours

R is a free, open-source programming language designed for statistical computing, data analysis, and visualization. Its adoption is expanding among corporate managers, data analysts, and academic researchers. Additionally, R has gained a following among statisticians, engineers, and scientists who may lack formal programming training but appreciate its user-friendly nature. This growing popularity stems from the increasing reliance on data mining to achieve diverse objectives, such as optimizing pricing strategies, accelerating drug discovery, and refining financial models. R supports these efforts through a comprehensive ecosystem of packages tailored for data mining.

Cluster Analysis with R and SAS

14 Hours

This instructor-led live training in Norway (online or onsite) is aimed at data analysts who wish to program with R in SAS for cluster analysis.

By the end of this training, participants will be able to:

Use cluster analysis for data mining.
Master R syntax for clustering solutions.
Implement hierarchical and non-hierarchical clustering.
Make data-driven decisions that help improve business operations.

Data and Analytics - from the ground up

42 Hours

Data analytics is a crucial tool in business today. Throughout the course, we will focus on developing skills for practical, hands-on data analysis. The aim is to help delegates give evidence-based answers to the following questions:

What has happened?

processing and analyzing data
producing informative data visualizations

What will happen?

forecasting future performance
evaluating forecasts

What should happen?

turning data into evidence-based business decisions
optimizing processes

Data Analysis with Python, R, Power Query, and Power BI

21 Hours

This guided, live training in Norway (available online or at your location) targets professionals at the beginner level who want to learn how to clean and analyze data, forecast statistical trends, and generate meaningful visual representations using these tools.

Upon completing this training, attendees will be capable of:

Gaining foundational knowledge of Python, R, Power Query, and Power BI for data analysis.
Cleaning and organizing datasets using Python and Power Query.
Executing statistical analysis and forecasting with R.
Developing professional dashboards and reports using Power BI.
Effectively integrating and analyzing data from various sources.

Data Analytics With R

21 Hours

R is a widely used, open-source environment designed for statistical computing, data analytics, and graphics. This course provides an introduction to the R programming language, covering essential language fundamentals, libraries, and advanced concepts. Participants will explore advanced data analytics and graphing techniques using real-world datasets.

Target Audience

Developers and data analytics professionals

Duration

3 days

Format

Interactive lectures combined with hands-on exercises

Econometrics: Eviews and Risk Simulator

21 Hours

This instructor-led, live training in Norway (online or onsite) is aimed at anyone who wishes to learn and master the fundamentals of econometric analysis and modeling.

By the end of this training, participants will be able to:

Learn and understand the fundamentals of econometrics.
Utilize Eviews and risk simulators.

Foundation R

7 Hours

This instructor-led, live training in Norway (online or onsite) is designed for beginner-level professionals who wish to master the fundamentals of R and learn how to work with data effectively.

By the end of this training, participants will be able to:

Understand the R programming environment and RStudio interface.
Import, manipulate, and explore datasets using R commands and packages.
Perform basic statistical analysis and data summarization.
Generate visualizations using both base R and ggplot2.
Manage workspaces, scripts, and packages effectively.

Forecasting with R

14 Hours

This instructor-led live training in Norway (online or onsite) is intended for intermediate-level data analysts and business professionals seeking to perform time series forecasting and automate data analysis workflows using R.

Upon completion of this training, participants will be able to:

Grasp the fundamentals of forecasting techniques in R.
Apply exponential smoothing and ARIMA models for time series analysis.
Leverage the 'forecast' package to create accurate forecasting models.
Automate forecasting workflows for business and research applications.

HR Analytics for Public Organisations

14 Hours

This instructor-led, live training (online or onsite) is aimed at HR professionals who wish to use analytical methods to improve organisational performance. The course covers qualitative as well as quantitative, empirical, and statistical approaches.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.

Course Customization Options

To request customized training for this course, please contact us.

Statistical Analysis using SPSS

21 Hours

This instructor-led, live training in Norway (online or onsite) is designed for beginner to intermediate-level professionals who aim to perform statistical analysis using SPSS to accurately interpret data, execute complex statistical tests, and derive meaningful insights.

By the end of this training, participants will be able to:

Navigate the SPSS interface and manage datasets efficiently.
Perform descriptive and inferential statistical analyses.
Conduct t-tests, ANOVA, MANOVA, regression, and correlation analyses.
Apply non-parametric tests, principal component analysis, and factor analysis for advanced data interpretation.

Introduction to Data Visualization with Tidyverse and R

7 Hours

In this instructor-led, live training session, attendees will acquire the skills to manipulate and visualize data using the tools provided within the Tidyverse.

The Tidyverse comprises a suite of flexible R packages designed for data cleaning, processing, modeling, and visualization. Key packages include: ggplot2, dplyr, tidyr, readr, purrr, and tibble.

Target Audience

Individuals new to the R programming language
Those new to data analysis and visualization techniques

Course Format

A blend of lectures, discussions, exercises, and extensive hands-on practice

Upon completing this training, participants will be able to:

Execute data analysis and produce compelling visualizations
Derive meaningful insights from various sample datasets
Filter, sort, and summarize data to address exploratory questions
Convert processed data into informative line charts, bar charts, histograms, and other plots
Import and filter data from diverse sources, such as Excel, CSV, and SPSS files
