Certificate Program on

Big Data Engineering

Leveraging Cloud for Big Data Analytics

3 Weeks Live Sessions | 4th Weekend Campus Immersion

Certification Bootcamp Focus

This Bootcamp provides a comprehensive introduction to Big Data Engineering technologies, tools, and the corresponding cloud-based services.
The focus is on building skills and best practices around cloud-based Big Data infrastructure and analytics solutions, and on how cloud-based services can be integrated into a company’s IT and data infrastructure. Students will learn the core functionality of the major Big Data infrastructure components and how they integrate to form a coherent solution with business benefit, including insights into compliance with the EU General Data Protection Regulation (GDPR).
Hands-on exercises show how cloud-based services and tools for Hadoop, MapReduce, Spark, HBase, Hive, Pig, Machine Learning, and general data analytics can simplify the processing of Big Data.

Learning Outcomes

Outline the basic concepts of Big Data and related technologies, and apply them to analyze general use cases and those related to their organizations
Compare and select the Big Data Infrastructure services from the major Cloud Service Providers to use them for enterprise data management and analysis
Describe the main properties of SQL and NoSQL databases, and select the appropriate database type depending on the data and the analysis required
Outline the major components and processes of the Enterprise Data Governance Architecture and corresponding organizational roles; develop the company’s Data Management Plan (DMP) and a corresponding implementation plan
Select, assess, and deploy a Hadoop or Spark cluster on one of the cloud platforms (Azure HDInsight, Amazon EMR, Google Cloud Platform, or others); become acquainted with the functionality and programming model of the main Hadoop ecosystem components (MapReduce, Spark, HBase, Hive, Pig, Kafka, and others); program simple tasks using a scripting or programming language (e.g. Hive SQL, Pig Latin, Python, Java)
Outline the main security and privacy challenges in using Big Data technologies; apply industry best practices and existing applications to protect companies’ data and customers’ personal data

Why this Bootcamp

Work on Real-life Data Science Problems

Take your career head-on by working on projects using a competency-based learning paradigm. The quality of the time spent and the outcome matter far more than the quantity.

Work 1:1 with a Mentor

We pair you with a mentor who has extensive professional and academic knowledge of the field. You’ll have one-on-one conversations with your mentor, and receive useful feedback on improving your work.

We Will Keep You Engaged

Our mentors are here to keep you motivated, answer questions, provide feedback, and help deepen your understanding of essential tools and techniques. Learn with live online classes and face-to-face sessions. Learning works best when you can ask questions and clarify your doubts with the faculty.

What You Will Learn

■ Cloud Service Models and Operation, Cloud Resources, Multitenancy
■ Virtual Hybrid/Dynamic Cloud Datacenter, and Outsourcing Enterprise IT Infrastructure to the Cloud
■ Cloud use cases and scenarios for enterprise
■ Cloud Economics and Pricing Model

■ Overview of major Cloud-based Big Data Platforms: AWS, Microsoft Azure, Google Cloud Platform (GCP)
■ Introduction to MapReduce/Hadoop
■ Hadoop Ecosystem and Components
■ HDFS and Cloud Based File Systems
■ HBase, Hive, Pig, and YARN
■ MapReduce/Hadoop Programming and Tools

■ SQL basics (recollection from Database and SQL course)
■ NoSQL Database types and overview
■ Column-based databases and their use (e.g. HBase)
■ Modern large-scale databases: AWS Aurora, Azure Cosmos DB, Google Spanner

■ Data Streams and Stream Analytics
■ Spark Architecture and Components
■ Popular Spark platforms (e.g. Databricks), Spark Programming and Tools

■ Enterprise Big Data Architecture and Large Scale Data Management 
■ Data Structures, Data Warehouses, Distributed Systems
■ CAP Theorem, ACID and BASE Properties
■ Cloud Based Services, Data Lakes
■ Big Data Security challenges, Data Protection 
■ Access Control and Identity Management

1. Run MapReduce tasks, e.g. word count; run a ranking algorithm; run a Pregel graph (shortest path) algorithm.

2. For a given enterprise profile, select and propose the enterprise Big Data infrastructure, services, and components. Also create a Data Management Plan (DMP), a cost assessment, and a deployment plan.
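
To give a feel for the word count exercise, here is a minimal sketch of the MapReduce pattern in plain Python. It only imitates the map, shuffle, and reduce phases in a single process; it does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions, not part of the course material.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data on the cloud", "Big Data engineering"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run in parallel on different nodes and the framework performs the shuffle; the structure of the two user-written functions stays much the same.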

Projects and Skillathons

Capstone Skillathon: group project on enterprise Big Data infrastructure, covering the Data Management Plan (DMP), cost assessment and deployment plan, security and compliance issues, and data protection.
Learn to work with the Amazon Web Services cloud: cloud services overview, EC2, S3, VM instance deployment, and access.
Run MapReduce tasks, e.g. word count; run a simple ranking algorithm; run a Pregel graph (shortest path) algorithm (individual assignment).
Work with Big Data analytics services: deploy and run an HDInsight Hadoop cluster, test HBase and Hive queries, and run simple data analysis tasks.
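
As a rough illustration of the Pregel-style shortest path assignment, the sketch below simulates vertex-centric supersteps in plain Python. It is an assumption-laden toy, not the implementation used in the course: the function name `pregel_shortest_paths`, the adjacency-list format, and the example graph are all hypothetical, every vertex must appear as a key in the graph, and edge weights are assumed to be non-negative.

```python
import math


def pregel_shortest_paths(graph, source):
    """Single-source shortest paths in a Pregel-like, message-passing style.

    graph maps each vertex to a list of (neighbor, weight) pairs.
    Assumes every vertex is a key in graph and weights are non-negative.
    """
    distance = {vertex: math.inf for vertex in graph}
    # Superstep 0: only the source receives a message (distance 0).
    inbox = {source: [0]}
    while inbox:  # stop when no vertex has pending messages
        outbox = {}
        for vertex, messages in inbox.items():
            best = min(messages)
            # A vertex updates and notifies neighbors only if it improved.
            if best < distance[vertex]:
                distance[vertex] = best
                for neighbor, weight in graph[vertex]:
                    outbox.setdefault(neighbor, []).append(best + weight)
        inbox = outbox  # messages are delivered in the next superstep
    return distance


if __name__ == "__main__":
    # Hypothetical example graph, for illustration only.
    example = {
        "A": [("B", 1), ("C", 4)],
        "B": [("C", 2)],
        "C": [("D", 1)],
        "D": [],
    }
    print(pregel_shortest_paths(example, "A"))
```

In a distributed engine each vertex would run the same update logic in parallel on its own partition; the loop over `inbox` here stands in for one synchronized superstep.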

The ability to learn hands-on with real industry data and deliver insights to an industry jury is the best part of the program. Data Science and its application to Decision Science with practitioner faculty is the biggest highlight of the program. Strongly recommend it.

Vinod Tiwari

Senior Analyst, TCS

Facilitators

Prof. Yuri Demchenko

University of Amsterdam

Prof. Yuri Demchenko of the University of Amsterdam, Netherlands, is an internationally recognized expert on Big Data, Cloud Computing, and Application Security, and has published in various international Data Science journals as an educator and as an industry practitioner. He is a member of the NIST Big Data Working Group and project leader for the prestigious EDISON H2020 project. As a coach and faculty member at the Institute of Product Leadership, he teaches Big Data courses in the MBA in Applied Data Science.

Is this program right for you? Get advice from a Senior Counselor.
