Apache Spark for Data Scientists

Name: Apache Spark for Data Scientists
SKU: 566219
Price: 1995.00 USD
Availability: InStock

*Looking for a flexible schedule (after hours or weekends)? Please call or email us: 858-208-4141 or sales@ccslearningacademy.com.

Student financing options are available.
Looking for group training? Contact Us

Category: Analytics & BI

Course Description:

Learn Spark skills from a data science perspective to build unified big data applications combining batch, streaming, and interactive analytics on your data.

Apache Spark is a powerful, open-source engine for processing data in a Hadoop cluster, optimized for speed, ease of use, and sophisticated analytics. The Spark framework supports streaming data processing and complex iterative algorithms, enabling applications to run up to 100x faster than traditional Hadoop MapReduce programs. With Spark, you can write sophisticated applications that drive faster decisions and real-time actions across a wide variety of use cases, architectures, and industries.

This hands-on course explores using Spark for common data-related activities from a data science perspective. You will learn to build unified big data applications combining batch, streaming, and interactive analytics on your data.

Format: Instructor-Led
Topic: Big Data
Length: 3 days

Course Outline

Spark

Data Science: The State of the Art
Hadoop, YARN, and Spark
Architectural Overview
Spark and Storm
MLlib and Mahout
Distributed vs. Local Run Modes
Hello, Spark

Spark Overview

Spark Core
Spark SQL
Spark and Hive
MLlib
Mahout
Spark Streaming
Spark API

DataFrames

DataFrames and Resilient Distributed Datasets (RDDs)
Partitions
DataFrame Types
DataFrame Operations
Map/Reduce with DataFrames

Spark SQL

Spark SQL Overview
Data stores: HDFS, Cassandra, HBase, Hive, and S3
Table Definitions
ETL in Spark
Queries

Spark MLlib

MLlib overview
MLlib Algorithms Overview

Spark Streaming

Streaming overview
Real-time data ingestion
State
Window Operations

Spark GraphX

GraphX overview
ETL with GraphX
Graph computation

Performance and Tuning

Broadcast variables
Accumulators
Memory Management

Cluster Mode

Standalone Cluster
Masters and Workers
Configurations
Working with large data sets

Target Audience

Data Scientists, System Administrators, Testers, and other technical business professionals who seek to use Spark for data processing and analysis.

What You'll Learn

Join an engaging hands-on learning environment, where you’ll learn:

The essentials of Spark architecture and applications
How to execute Spark programs
How to create and manipulate both RDDs (Resilient Distributed Datasets) and DataFrames
How to integrate machine learning into Spark applications
How to use Spark Streaming

Prerequisites

Before attending this course, you should have:

Introduction to Java Programming (at least exposure to basic Java syntax)
Introduction to SQL (familiarity with SQL basics)
Basic knowledge of Statistics and Probability
Data Science background

Inclusions

With CCS Learning Academy, you’ll receive:

Instructor-led training
Training Seminar Student Handbook
Collaboration with classmates (not currently available for self-paced courses)
Real-world learning activities and scenarios
Exam scheduling support*
Job placement assistance for the first 12 months after course completion
This course is eligible for CCS Learning Academy’s Learn and Earn Program: get a tuition fee refund of up to 50% if you are placed in a job through CCS Global Tech’s Placement Division*
Government and Private pricing available.*

*For more details call: 858-208-4141 or email: training@ccslearningacademy.com; sales@ccslearningacademy.com
