Data Science Bootcamp - Earn Your Data Science Proficiency Certification

Class Sessions

Dates:
  • Wed Oct 25, 1:00 pm to 4:00 pm
  • Thu Oct 26, 1:00 pm to 4:00 pm
  • Fri Oct 27, 1:00 pm to 4:00 pm
  • Wed Nov 8, 1:00 pm to 4:00 pm
  • Thu Nov 9, 1:00 pm to 4:00 pm
  • Fri Nov 10, 1:00 pm to 4:00 pm
Delivery Method: Live Online - 6 sessions
Cost: $1,200

Class Code


Class Description

Most Technology Training classes will be delivered online until further notice.

Before each session, Tech Training will provide a Zoom link for live online classes, along with any required class materials.


This course provides a thorough understanding of each of the key Python libraries used for data science -- NumPy, Pandas, Matplotlib and Scikit-learn, known as the Python data stack. We will perform data exploration, analysis, visualization and modeling.

Prerequisite: Basic Python Programming

In six half-day sessions of hands-on training, you can quickly become a knowledgeable, productive, and efficient Data Science professional and earn a Stanford Technology Training Certificate of Proficiency in Data Science.

We will begin by discussing the data science process and how to work through a data science problem effectively. We'll talk about how to clean, transform, and prepare data for analysis. We will cover descriptive and inferential statistics, which will enable you to perform hypothesis testing and better interpret the significance of your analysis. We will then focus on machine learning and predictive analytics: the various ways to measure model performance, how to select the best model for your project, and ways to refine that model.


Learning Objectives:

During this course, you will have the opportunity to:

  • Install Anaconda on a personal computer
  • Have a clear understanding of data science and its role
  • Understand the data science process
  • Understand foundational descriptive statistics
  • Understand foundational inferential statistics
  • Understand the reasons for Python's popularity in data science
  • Learn the primary libraries for data science in Python, including NumPy, Pandas, Matplotlib and Scikit-learn
  • Interact with and manipulate data arrays and matrices using NumPy
  • Perform exploratory data analysis using Pandas
  • Use Matplotlib and Seaborn to perform data visualization
  • Properly clean and prepare data for machine learning
  • Apply machine learning on a variety of datasets
  • Complete a data science project, end to end
  • Understand the big picture and the importance of data science in industry, research and technology


Topic Outline:

  • Course introduction
  • Install Anaconda
  • Overview of Data Science
  • The data science process
  • Identifying a problem and asking good questions
  • Descriptive statistics
  • Milestone 1: Learn how to use Jupyter Notebooks
  • Essential libraries
  • NumPy
  • Pandas
  • Matplotlib
  • Milestone 2: Exploratory data analysis
  • Getting data
  • Feature selection
  • Strategies for imputing missing data
  • Inferential statistics
  • Essential libraries
  • Statsmodels
  • Scikit-learn
  • Confidence intervals
  • Hypothesis testing
  • Milestone 3: Significance testing
  • Transforming data
  • Binary encoding
  • One-hot encoding
  • Feature engineering
  • Training and test sets
  • Standardizing data
  • Milestone 4: Data modeling
  • Machine learning
  • K-fold cross-validation
  • Box plot
  • Measuring performance
  • Milestone 5: Model selection
  • Refining the model
  • Hyperparameter tuning
  • Grid search
  • Milestone 6: End-to-end project
  • Next steps
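
To give a feel for how several of the later outline topics fit together (one-hot encoding, standardizing data, training and test sets, K-fold cross-validation, and grid search), here is a minimal sketch using Pandas and Scikit-learn. It is not taken from the course materials. The dataset is synthetic, and the column names (hours, score, track), the choice of logistic regression, and the grid of C values are all assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data, invented for illustration only (not course data).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "hours": rng.normal(10, 3, n),           # hypothetical numeric feature
    "score": rng.normal(70, 10, n),          # hypothetical numeric feature
    "track": rng.choice(["a", "b", "c"], n)  # hypothetical categorical feature
})
# Hypothetical binary target: a noisy function of the numeric features.
signal = 0.4 * (df["hours"] - 10) + 0.1 * (df["score"] - 70)
y = (signal + rng.normal(0, 1, n) > 0).astype(int)

# Training and test sets: hold out 25% of the rows for the final check.
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)

# Standardize numeric columns and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["hours", "score"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["track"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search over the regularization strength C, scored with
# 5-fold cross-validation on the training set only.
search = GridSearchCV(
    model, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5, scoring="accuracy"
)
search.fit(X_train, y_train)

best_C = search.best_params_["clf__C"]
test_accuracy = search.score(X_test, y_test)  # accuracy on held-out rows
print(f"best C: {best_C}, test accuracy: {test_accuracy:.2f}")
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are refit on each cross-validation fold, so no information from the validation fold or the test set leaks into training.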

University IT Technology Training classes are only available to Stanford University staff, faculty, students, and Stanford Hospitals & Clinics employees, including Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health. A valid SUNet ID is needed to enroll in a class.