
Data Science Bootcamp - Earn Your Data Science Proficiency Certification

Most Technology Training classes will be delivered online until further notice.

Before each session, Tech Training will provide a Zoom link for live online classes, along with any required class materials.

 


This course provides a thorough understanding of the key Python libraries used for data science: NumPy, Pandas, Matplotlib, and Scikit-learn, known collectively as the Python data stack. We will perform data exploration, analysis, visualization, and modeling.


Prerequisite: Basic Python Programming
 


In six half-day sessions of hands-on training, you can quickly become a knowledgeable, productive, and efficient Data Science professional and earn a Stanford Technology Training Certificate of Proficiency in Data Science.


We will begin by discussing the data science process and how to work through a data science problem effectively. We'll talk about how to clean, transform, and prepare data for analysis. We will also cover descriptive and inferential statistics, which will enable you to perform hypothesis testing so that you can better interpret the significance of your analysis. Finally, we will focus on machine learning and predictive analytics: the various ways to measure model performance, how to select the best model for your project, and ways to refine that model.
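To give a flavor of the descriptive and inferential statistics mentioned above, here is a minimal sketch that summarizes two samples and computes Welch's t statistic for the difference in their means. It is an illustration only, not course material: the data values and the helper names `describe` and `welch_t` are invented for this example, and it uses only the Python standard library so it runs without installing anything.

```python
import math
import statistics

# Hypothetical sample data, invented for illustration only.
control = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
treated = [12.6, 12.9, 12.4, 13.1, 12.8, 12.7, 13.0, 12.5]


def describe(sample):
    """Return the mean, sample standard deviation, and standard error."""
    mean = statistics.mean(sample)
    stdev = statistics.stdev(sample)
    return mean, stdev, stdev / math.sqrt(len(sample))


def welch_t(a, b):
    """Welch's t statistic for the difference in means (b minus a)."""
    mean_a, _, se_a = describe(a)
    mean_b, _, se_b = describe(b)
    return (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)


mean_c, sd_c, se_c = describe(control)
mean_t, sd_t, se_t = describe(treated)
t_stat = welch_t(control, treated)

print(f"control: mean={mean_c:.2f}, sd={sd_c:.2f}")
print(f"treated: mean={mean_t:.2f}, sd={sd_t:.2f}")
print(f"Welch t statistic: {t_stat:.2f}")
```

A large absolute t statistic suggests the difference in means is unlikely to be due to chance alone. In practice a library routine such as `scipy.stats.ttest_ind(control, treated, equal_var=False)` would also report the p-value.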

 

Learning Objectives:


During this course, you will have the opportunity to:

  • Install Anaconda on a personal computer
  • Have a clear understanding of data science and its role
  • Understand the data science process
  • Understand foundational descriptive statistics
  • Understand foundational inferential statistics
  • Understand the reasons for Python's popularity in data science
  • Learn the primary libraries for data science in Python including NumPy, Pandas, Matplotlib and Scikit-learn
  • Interact with and manipulate data arrays and matrices using NumPy
  • Perform exploratory data analysis using Pandas
  • Use Matplotlib and Seaborn to perform data visualization
  • Properly clean and prepare data for machine learning
  • Apply machine learning on a variety of datasets
  • Complete a data science project, end to end
  • Understand the big picture and the importance of data science in industry, research and technology

 

Topic Outline:

  • Course introduction
  • Install Anaconda
  • Overview of Data Science
  • The data science process
  • Identifying a problem and asking good questions
  • Descriptive statistics
  • Milestone 1: Learn how to use Jupyter Notebooks
  • Essential libraries
  • NumPy
  • Pandas
  • Matplotlib
  • Milestone 2: Exploratory data analysis
  • Getting data
  • Feature selection
  • Strategies for imputing missing data
  • Inferential statistics
  • Essential libraries
  • Statsmodels
  • Scikit-learn
  • Confidence intervals
  • Hypothesis testing
  • Milestone 3: Significance testing
  • Transforming data
  • Binary encoding
  • One-hot encoding
  • Feature Engineering
  • Training and test sets
  • Standardizing data
  • Milestone 4: Data modeling
  • Machine learning
  • K-fold cross-validation
  • Box plot
  • Measuring performance
  • Milestone 5: Model selection
  • Refining the model
  • Hyperparameter tuning
  • Grid search
  • Milestone 6: End-to-end project
  • Next steps
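Several of the data-preparation topics in the outline above (one-hot encoding, training and test sets, standardizing data, and k-fold cross-validation) can be sketched in a few lines. The version below is an illustration under stated assumptions, not the course's own code: the function names and toy data are hypothetical, and it uses only the standard library so the mechanics are visible. In the course itself these steps would more likely be done with Pandas and Scikit-learn.

```python
import random
import statistics


def one_hot(values):
    """Map each category to a 0/1 indicator row, one column per category."""
    categories = sorted(set(values))
    return [[int(v == c) for c in categories] for v in values], categories


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


def standardize(train, test):
    """Scale both sets with the mean and standard deviation of the training set only."""
    mean = statistics.mean(train)
    stdev = statistics.pstdev(train)
    return ([(x - mean) / stdev for x in train],
            [(x - mean) / stdev for x in test])


def k_fold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    for fold in range(k):
        val = [i for i in range(n_rows) if i % k == fold]
        train = [i for i in range(n_rows) if i % k != fold]
        yield train, val


# Hypothetical toy data, invented for illustration only.
colors = ["red", "green", "blue", "green"]
encoded, categories = one_hot(colors)

heights = [150.0, 160.0, 170.0, 180.0, 165.0, 175.0, 155.0, 185.0]
train, test = train_test_split(heights)
train_scaled, test_scaled = standardize(train, test)
folds = list(k_fold_indices(len(train), k=3))

print(categories, encoded)
print(len(train), len(test), len(folds))
```

Fitting the scaling parameters on the training set alone keeps information from the test set out of the model, which is why `standardize` never looks at the test values when computing the mean and standard deviation.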
Antony Ross

Antony originally attained a degree in psychology with an emphasis in sport psychology. He began working with athletes and eventually chose to pursue a graduate degree in exercise physiology. He conducted research in muscle physiology while teaching at USC and, subsequently, UCLA.

Custom training workshops are available for this program

Technology training sessions can be structured around individual or group learning objectives.


University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.

Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.