Course Description:
This 3-day class is ideal for Snowflake users who want to develop the skills and experience needed to use the Snowflake Data Cloud for data science workloads. The course covers key concepts, features, considerations, and best practices for building data science solutions within Snowflake.
Course Outline
Snowflake Data Cloud Architecture and Overview
- Snowflake Data Cloud overview
- Three-tiered architecture
- Snowflake UI and core capabilities, including elasticity, workload separation, data security, and simplicity of performance
Data Exchange and Data Marketplace
- Private and Public Data Exchange
- Data Marketplace with ready-to-use and third-party datasets for data augmentation
- Diverse data including customer demographic data, time-series data, geospatial data
- Exploration and visualization using Snowsight
Data Lake for Machine Learning and Analytics
- Raw and external data sets in object stores
- External tables and direct queries in data lakes
- Native data formats such as CSV, JSON, and Parquet
Data Ingestion service and Continuous Data Pipelines
- Serverless continuous ingestion service Snowpipe
- Data ingestion best practices
- Bulk ingestion and scheduling data loads with tasks
- Table streams for capturing change data
Working with Semi-Structured Data
- Ingesting into native semi-structured data types without pre-processing
- Built-in functions for traversing, flattening, and nesting of semi-structured data
- Leveraging semi-structured state data for JavaScript Stored Procedures
- Complement learning with topics like geospatial data
Data Science and Machine Learning Concepts and Applications
- Data science applications
- Common machine learning vocabulary
- Machine learning workflow and pipeline
- Supervised and unsupervised machine learning
Data Science Toolset and Ecosystem
- Seamless connectivity using Snowflake connectors for languages such as Python, Spark, and R
- Notebook-based data science development environments
- Many open source machine learning libraries, including scikit-learn
- Partner platforms for data science automation and democratization around AutoML
- Partner platforms for deployment and practices with MLOps
Exploratory Data Analysis and Feature Engineering
- Descriptive exploratory data analysis using statistical and analytic functions
- Visual exploratory data analysis using popular and relevant libraries
- Employ common feature selection and feature engineering techniques
- Advanced SQL functions for data transformation at scale
Machine Learning Model Development and Tuning
- Supervised learning: linear regression with popular ML libraries
- Supervised learning: classification using techniques such as logistic regression, random forests, gradient boosting, and more
- Identifying, using, and interpreting metrics to evaluate models and performance
- Unsupervised learning
Model Management and Deployment at Scale
- Deploying machine learning models using a scalable framework
- External functions to support prediction and data augmentation through APIs
- Extensive partner ecosystem for automation around AutoML and operationalization using MLOps practices
- Using Snowflake capabilities including Snowpipe, table stream, and tasks for continuous data pipelines to update machine learning models
- Storing machine learning results in Snowflake
Visualizing and Collaborating on Data Science and Machine Learning Results
- Seamless connectivity to BI tools for reporting and analytics
- Communicating machine learning results
- Collaborating on models by sharing results with data sharing techniques
- Replicating your raw and processed data across regions and cloud providers, including AWS, Azure, and GCP
Course Objectives
By the end of this class you will be able to:
- Collect and access data from Snowflake Data Marketplace and other sources
- Manage and architect data lakes and real-time streams
- Employ Snowflake best practices for developing or querying semi-structured and other data types
- Work with supervised and unsupervised machine learning models using some of the most relevant open source frameworks and libraries
- Formulate data science and machine learning workflows and data pipelines
- Manage and deploy machine learning models at scale with APIs
- Visualize and collaborate on machine learning results
Target Audience
Who should attend this course?
- Data scientists who build and train machine learning models
- Data scientists and data analysts who use the machine learning models to conduct predictive and prescriptive analytics
Inclusions
With CCS Learning Academy, you’ll receive:
- Instructor-led training
- Training Seminar Student Handbook
- Collaboration with classmates (not currently available for self-paced courses)
- Real-world learning activities and scenarios
- Exam scheduling support*
- Job placement assistance for the first 12 months after course completion
- This course is eligible for CCS Learning Academy’s Learn and Earn Program: get a tuition fee refund of up to 50% if you are placed in a job through CCS Global Tech’s Placement Division*
- Government and Private pricing available.*
*For more details, call 858-208-4141 or email training@ccslearningacademy.com; sales@ccslearningacademy.com