Developing with Spark for Big Data (TTSK7505)
* Looking for a flexible schedule (after hours or weekends)? Please call 858-208-4141 or email us: sales@ccslearningacademy.com.
Student financing options are available.
Transitioning military and Veterans, please contact us to sign up for a free consultation on training and hiring options.
Looking for group training? Contact Us
About This Course
Course Description
Learn advanced Big Data and Spark skills to access disparate databases, integrate Machine Learning (ML), and establish streaming solutions.
Apache Spark is a cluster computing engine for Big Data and an important component of the Hadoop ecosystem. Building on Hadoop YARN and HDFS, Spark offers faster in-memory processing for computing tasks than MapReduce. It can be programmed in Java, Scala, Python, and R, along with SQL-based front ends.
With advanced libraries like Mahout and MLlib for Machine Learning, GraphX or Neo4j for rich graph data processing, and access to other NoSQL data stores, rules engines, and components, Spark is a lynchpin in modern Big Data and Data Science computing.
This course introduces you to enterprise-grade Spark programming and the components needed to craft complete data science solutions. You’ll learn core big data and Spark development techniques and industry practices. This course is offered in Java and, with some alterations, in Python, Scala, and R.
Learning Objectives
The essentials of Spark architecture and applications
How to execute Spark Programs
How to create and manipulate both RDDs (Resilient Distributed Datasets) and DataFrames
How to persist and restore data frames
Essential NoSQL access
How to integrate machine learning into Spark applications
How to use Spark Streaming and Kafka to create streaming applications
Inclusions
- Instructor-led training
- Training Seminar Student Handbook
- Collaboration with classmates (not currently available for self-paced course)
- Real-world learning activities and scenarios
- Exam scheduling support*
- Job placement assistance for the first 12 months after course completion
- This course is eligible for CCS Learning Academy’s Learn and Earn Program: get a tuition fee refund of up to 50% if you are placed in a job through CCS Global Tech’s Placement Division*
- Government and Private pricing available.*
Pre-requisites
- Java programming experience
- Python programming experience
- Basic understanding of SQL
- Comfort with navigating the Linux command line
- Basic knowledge of Linux editors (such as vi or nano) for editing code
Target Audience
- Experienced Developers and Architects who seek proficiency in working with Apache Spark in an enterprise data environment.
Curriculum
73 Lessons, 40h