
Introduction to Big Data with Apache Spark

Class Sessions

Date: Thu Aug 12, 9:00 am to 12:00 pm
Location: Live Online
Cost: $300

Class Code


Class Description

Effective immediately in response to COVID-19, all Technology Training classes will be delivered online until further notice.

In advance of each session, Tech Training will provide you with a Zoom link to your class, along with any required class materials.


This half-day course will introduce you to open-source Big Data technologies, including Apache Spark, and shed light on how enterprise companies use and tame large data sets to drive problem-solving and decision-making efforts.

The training session is intended for software engineers and software architects. It provides a practical learning experience through a combination of about 70% lecture and 30% hands-on demo work with learner participation. The session will include examples of how companies have solved Big Data problems, and the application of Big Data within the industry.

Learning Objectives:

During this course, you will have the opportunity to learn how to:

  • Understand big data ecosystems and data distributions in the industry
  • Consider the different libraries associated with Apache Spark
  • Use Apache Spark and work with data structures

Topics include:

  • History and background of Big Data
  • Understanding Big Data ecosystems
  • Industry uses for Big Data distributions
  • Why use Apache Spark?
  • Comparing MapReduce vs Apache Spark
  • Apache Spark Architecture
  • Understanding libraries associated with Spark: Streaming, Machine Learning
  • Using Spark (Cloud or On Premises)
  • Working with Spark data structures used for handling data





University IT Technology Training classes are only available to Stanford University staff, faculty, students and Stanford Hospitals & Clinics employees. A valid SUNet ID is needed in order to enroll in a class.