
Introduction to Big Data with Apache Spark

New session times will be displayed below upon confirmation.

Effective immediately in response to COVID-19, all Technology Training classes will be delivered online until further notice.


In advance of each session, Tech Training will provide you with a Zoom link to your class, along with any required class materials.
 


 

This half-day course will introduce you to open-source Big Data technologies, including Apache Spark, and shed light on how enterprise companies often utilize and tame large data sets to drive problem-solving and decision-making efforts.

Abstract
The training session is intended for software engineers and software architects. It provides a practical learning experience through a combination of about 70% lecture and 30% hands-on demo work with learner participation. The session will include examples of how companies have solved Big Data problems, and the application of Big Data within the industry.

Learning Objectives:

During this course, you will have the opportunity to learn how to:

  • Understand big data ecosystems and data distributions in the industry
  • Consider the different libraries associated with Apache Spark
  • Use Apache Spark and work with data structures

Topics include:

  • History and background of Big Data
  • Understanding the Big Data Ecosystems
  • Industry uses for Big Data Distributions
  • Why use Apache Spark?
  • Comparing MapReduce vs Apache Spark
  • Apache Spark Architecture
  • Understanding libraries associated with Spark -- Streaming, Machine Learning
  • Using Spark (Cloud or On Premises)
  • Working with Spark data structures used for handling data

 

 

 


 

University IT Technology Training classes are only available to Stanford University staff, faculty, students and Stanford Hospitals & Clinics employees. A valid SUNet ID is needed in order to enroll in a class.

Custom training workshops are available for this program

Custom workshops are technology training sessions structured around individual or group learning objectives. Learn more about custom training.


University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.

Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.