Data Science Bootcamp - Earn Your Data Science Proficiency Certification

Before each session, Tech Training will provide a Zoom link for the live online class, along with any required class materials.

This course provides a thorough understanding of each of the key Python libraries used for data science -- NumPy, Pandas, Matplotlib and Scikit-learn, known as the Python data stack. We will perform data exploration, analysis, visualization and modeling.

In six half-day sessions of hands-on training, you can quickly become a knowledgeable, productive, and efficient data science professional and earn a Stanford Technology Training Certificate of Proficiency in Data Science.

We will begin by discussing the data science process and how to work through a data science problem effectively. We'll talk about how to clean, transform, and prepare data for analysis. We will cover descriptive and inferential statistics, which will enable you to perform hypothesis testing so that you can better interpret the significance of your analysis. We will then focus on machine learning and predictive analytics: the various ways to measure model performance, how to select the best model for your project, and ways to refine that model.
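To give a sense of how those steps fit together, here is a minimal sketch using Pandas and Scikit-learn. It is not taken from the course materials: the data is synthetic, and the column names (`hours`, `group`), the choice of logistic regression, and the grid of `C` values are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic, hypothetical data (not a course dataset).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "hours": rng.normal(5.0, 2.0, n),
    "group": rng.choice(["a", "b", "c"], n),
})
# The label depends on "hours" plus some noise.
y = (df["hours"] + rng.normal(0.0, 1.0, n) > 5.0).astype(int)
# Blank out 20 values to imitate missing data that needs imputing.
df.loc[rng.choice(n, 20, replace=False), "hours"] = np.nan

# Clean and transform: impute then standardize the numeric column,
# and one-hot encode the categorical column.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["hours"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["group"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hold out a test set before any fitting.
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)

# Refine the model: grid search over C with 5-fold cross-validation.
search = GridSearchCV(model, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

best_c = search.best_params_["clf__C"]
test_accuracy = search.score(X_test, y_test)
print("best C:", best_c, "test accuracy:", round(test_accuracy, 3))
```

Keeping the imputing, scaling, and encoding inside the pipeline means each cross-validation fold fits those steps on its own training portion only, so the held-out fold does not influence the preprocessing.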

Antony Ross

Antony originally attained a degree in psychology with an emphasis in sport psychology. He began working with athletes and eventually chose to pursue a graduate degree in exercise physiology. He conducted research in muscle physiology while teaching at USC and, subsequently, UCLA.

Learning Objectives

During this course, you will have the opportunity to:

Install Anaconda on a personal computer
Gain a clear understanding of data science and its role
Understand the data science process
Understand foundational descriptive statistics
Understand foundational inferential statistics
Understand the reasons for Python's popularity in data science
Learn the primary libraries for data science in Python including NumPy, Pandas, Matplotlib and Scikit-learn
Interact with and manipulate data arrays and matrices using NumPy
Perform exploratory data analysis using Pandas
Use Matplotlib and Seaborn to perform data visualization
Properly clean and prepare data for machine learning
Apply machine learning on a variety of datasets
Complete a data science project, end to end
Understand the big picture and the importance of data science in industry, research and technology

Topic Outline

Course introduction
Install Anaconda
Overview of Data Science
The data science process
Identifying a problem and asking good questions
Descriptive statistics
Milestone 1: Learn how to use Jupyter Notebooks
Essential libraries
NumPy
Pandas
Matplotlib
Milestone 2: Exploratory data analysis
Getting data
Feature selection
Strategies for imputing missing data
Inferential statistics
Essential libraries
Statsmodels
Scikit-learn
Confidence intervals
Hypothesis testing
Milestone 3: Significance testing
Transforming data
Binary encoding
One-hot encoding
Feature Engineering
Training and test sets
Standardizing data
Milestone 4: Data modeling
Machine learning
K-fold cross-validation
Box plot
Measuring performance
Milestone 5: Model selection
Refining the model
Hyperparameter tuning
Grid search
Milestone 6: End-to-end project
Next steps
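
As one illustration of the inferential statistics items in the outline (confidence intervals and hypothesis testing), the sketch below estimates a 95% confidence interval and a p-value for a difference in group means. It uses resampling with NumPy only, which may differ from the methods taught in class; the two samples are synthetic, and the names `control` and `treated` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic, hypothetical samples (not course data).
control = rng.normal(10.0, 2.0, 80)
treated = rng.normal(12.0, 2.0, 80)

# Descriptive statistic of interest: the difference in sample means.
observed = treated.mean() - control.mean()

# Bootstrap 95% confidence interval: resample each group with
# replacement and recompute the difference many times.
n_boot = 5000
boot = np.empty(n_boot)
for i in range(n_boot):
    c = rng.choice(control, size=control.size, replace=True)
    t = rng.choice(treated, size=treated.size, replace=True)
    boot[i] = t.mean() - c.mean()
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

# Permutation test of the null hypothesis that both groups share one
# distribution: shuffle the pooled values and count how often the
# shuffled difference is at least as extreme as the observed one.
pooled = np.concatenate([control, treated])
n_perm = 5000
extreme = 0
for _ in range(n_perm):
    shuffled = rng.permutation(pooled)
    diff = shuffled[control.size:].mean() - shuffled[:control.size].mean()
    if abs(diff) >= abs(observed):
        extreme += 1
# Add-one correction so the estimated p-value is never exactly zero.
p_value = (extreme + 1) / (n_perm + 1)

print("difference:", round(observed, 3))
print("95% CI:", round(ci_low, 3), round(ci_high, 3))
print("p-value:", round(p_value, 4))
```

A confidence interval that excludes zero and a small p-value point the same way here: the observed difference would be unlikely if the two groups really came from the same distribution.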

Prerequisites

Basic Python Programming

Custom training workshops are available for this program

Technology training sessions can be structured around individual or group learning objectives.

Special Group Rates

For groups of 5 or more within the same team or department, special rates are available. Please contact techtraining@stanford.edu for more details.

University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.

Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.

