Hadoop for Developers (4 days) Training Course

28 hours (usually 4 days including breaks)


  • comfortable with the Java programming language (most programming exercises are in Java)
  • comfortable in a Linux environment (able to navigate the Linux command line and edit files using vi / nano)

Lab environment

Zero Install: There is no need to install Hadoop software on students’ machines! A working Hadoop cluster will be provided for students.

Students will need the following:

  • an SSH client (Linux and Mac already have SSH clients; for Windows, PuTTY is recommended)
  • a browser to access the cluster (we recommend Firefox)


Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. This course introduces a developer to the various components of the Hadoop ecosystem (HDFS, MapReduce, Pig, Hive and HBase).

    Section 1: Introduction to Hadoop

    • Hadoop history, concepts
    • ecosystem
    • distributions
    • high level architecture
    • Hadoop myths
    • Hadoop challenges
    • hardware / software
    • lab : first look at Hadoop

    Section 2: HDFS

    • Design and architecture
    • concepts (horizontal scaling, replication, data locality, rack awareness)
    • Daemons : NameNode, Secondary NameNode, DataNode
    • communications / heartbeats
    • data integrity
    • read / write path
    • NameNode High Availability (HA), Federation
    • labs : Interacting with HDFS

    Section 3: MapReduce

    • concepts and architecture
    • daemons (MRv1) : JobTracker / TaskTracker
    • phases : driver, mapper, shuffle/sort, reducer
    • MapReduce Version 1 and Version 2 (YARN)
    • Internals of MapReduce
    • Introduction to Java MapReduce programs
    • labs : Running a sample MapReduce program
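
The phases listed in this section (driver, mapper, shuffle/sort, reducer) can be pictured with a small in-memory word count. This is only a sketch in plain Java under stated assumptions: it does not use the Hadoop API, and the class and method names (`WordCountSketch`, `map`, `shuffle`, `reduce`, `run`) are hypothetical names chosen for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration of the MapReduce phases.
// It does not use the Hadoop API; every name here is made up for the sketch.
public class WordCountSketch {

    // Mapper: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle/sort: group all values by key; the TreeMap keeps keys sorted.
    public static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducer: collapse the list of values for one key into a single count.
    public static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: wire the phases together and return the final counts.
    public static TreeMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        TreeMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hadoop stores data", "hadoop processes data");
        // prints {data=2, hadoop=2, processes=1, stores=1}
        System.out.println(run(lines));
    }
}
```

In a real job the framework performs the shuffle/sort step across the cluster between the map and reduce tasks; here a sorted map on a single machine stands in for it.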

    Section 4: Pig

    • Pig vs Java MapReduce
    • Pig job flow
    • Pig Latin language
    • ETL with Pig
    • Transformations & Joins
    • User defined functions (UDF)
    • labs : writing Pig scripts to analyze data

    Section 5: Hive

    • architecture and design
    • data types
    • SQL support in Hive
    • Creating Hive tables and querying
    • partitions
    • joins
    • text processing
    • labs : various labs on processing data with Hive

    Section 6: HBase

    • concepts and architecture
    • HBase vs RDBMS vs Cassandra
    • HBase Java API
    • Time series data on HBase
    • schema design
    • labs : Interacting with HBase using the shell; programming with the HBase Java API; schema design exercise


