
Python for Data Science

New session times will be displayed below upon confirmation.

Most Technology Training classes will be delivered online until further notice.

Before each session, Tech Training will provide a Zoom link for live online classes, along with any required class materials.

 




Python is a leading language for data science, and this class introduces its most important libraries (NumPy, Pandas, Matplotlib, and Scikit-learn) so that you can do data science effectively in Python.

Prerequisite: Basic Python Programming 

In this course, you will have an opportunity to:

  • Install Anaconda on a personal computer
  • Understand the various options for performing data science
  • Understand the reasons for Python's popularity in data science
  • Learn the primary libraries for data science in Python including NumPy, Pandas, Matplotlib and Scikit-learn
  • Perform exploratory data analysis using Pandas
  • Use Matplotlib and Seaborn to perform data visualization
  • Prepare data for machine learning
  • Apply machine learning on a variety of datasets
  • Understand the data science process
  • Understand the big picture and the importance of data science in business, industry, and technology
     

We will begin by installing Anaconda, which provides the libraries required for most data problems. We will discuss the focus and strengths of the most important libraries and how they enable data analysis and the application of machine learning to defined data problems. We will then use these libraries to perform data exploration, visualization, analysis and modeling on a variety of datasets as we work through the data science process.
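As a rough illustration of the workflow described above, here is a minimal sketch of exploration, preparation, and modeling with Pandas and Scikit-learn. It assumes the Iris dataset bundled with Scikit-learn and a logistic regression classifier; neither is named in this description, and the class may use different datasets and models.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small example dataset bundled with scikit-learn (an assumption;
# the course datasets are not named in this description).
iris = load_iris(as_frame=True)
df = iris.frame  # pandas DataFrame: four feature columns plus "target"

# Exploration: summary statistics and class balance.
summary = df.describe()
class_counts = df["target"].value_counts()

# Preparation: separate features from labels, then hold out a test set.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Modeling: fit a simple classifier and evaluate it on the held-out data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Evaluating on rows the model did not see during fitting is what makes the accuracy figure a fair estimate; the visualization step (Matplotlib, Seaborn) is omitted here to keep the sketch short.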

 

Topics covered in this class include: 

  • Course Introduction
  • Overview of data science
  • Reasons for Python's popularity in data science
  • Installing Anaconda
  • Milestone 1: Learn how to use Jupyter Notebooks
  • The data science process
  • Essential Python data science libraries
    - NumPy
    - Pandas
    - Matplotlib
    - Scikit-learn
  • Data Visualization
    - Line Chart
    - Scatterplot
    - Pairplot
    - Histogram
    - Density Plot
    - Bar Chart
    - Boxplot
  • Customizing Charts
  • Prepare data for machine learning
  • Milestone 2: Perform exploratory data analysis using Pandas
  • Milestone 3: Apply machine learning algorithms using Scikit-learn
  • Conclusion: Data science in the real world, next steps
Antony Ross

Antony originally attained a degree in psychology with an emphasis in sport psychology. He began working with athletes and eventually chose to pursue a graduate degree in exercise physiology. He conducted research in muscle physiology while teaching at USC and, subsequently, UCLA.

Custom training workshops are available for this program

Technology training sessions can be structured around individual or group learning objectives. Learn more about custom training.


University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.

Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.