
Fundamentals of Data Science (Live Online)

New session times will be displayed below upon confirmation.

This live online course introduces real-world applications of data science and explains why it has become such an integral part of business and academia. We will discuss the data science process and the tools used to analyze data sets.


Prerequisite: Basic Python Programming training or equivalent experience.

 

In this class, you will have the opportunity to:

  • Install Anaconda on a personal computer.
  • Understand the Data Science Field.
  • Become familiar with Descriptive and Inferential Statistics and statistical analysis.
  • Learn the primary toolkit for data science in Python including NumPy, Pandas, Matplotlib and Scikit-learn.
  • Learn how to perform exploratory data analysis.
  • Learn the importance of data cleaning.
  • Utilize common Machine Learning algorithms such as Linear and Logistic Regression.
  • Learn how to evaluate models and choose the most effective one.
  • Understand how to interpret a Confusion Matrix.
  • Understand the uses of the AUC-ROC curve in model evaluation.
  • Solidify understanding by completing hands-on exercises and milestones.
  • Create two data science projects.
  • Understand the big picture and the importance of data science in business, industry, and technology.

Topic Outline:

  • Course Introduction
  • Installing Anaconda
  • Overview of Data Science
  • The Difference Between Business Intelligence (BI), Data Analytics and Data Science
  • The Field of Data Science
  • The Data Science Process
    - Define the Problem
    - Get the Data
    - Explore the Data
    - Clean the Data
    - Model the Data
    - Communicate the Findings
  • Descriptive Statistics Fundamentals
  • Central Tendency
    - Mean
    - Median
    - Mode
  • Spread of the Data
    - Variance
    - Standard Deviation
    - Range
  • Relative Standing
    - Percentile
    - Quartile
    - Inter-quartile Range
  • Inferential Statistics Fundamentals
    - Normal Distribution
    - Central Limit Theorem
    - Standard Error
    - Confidence Intervals
    - Other Distributions
    - Samples
    - Hypothesis Testing
  • Milestone 1: Perform statistical analysis on a given data set.
  • Essential Python Data Science Libraries
    - NumPy
    - Pandas
    - Matplotlib
    - Scikit-learn
    - Statsmodels
  • Data Exploration
    - Describe
    - Merging
    - Grouping
    - Evaluating Features
  • Data Visualization
    - Line
    - Scatterplot
    - Pairplot
    - Histogram
    - Density Plot
    - Bar Chart
    - Boxplot
    - Customizing Charts
  • Milestone 2: Perform Exploratory Data Analysis
  • Data Cleaning
    - Dropping Rows
    - Imputing Missing Values
    - Evaluating Features
  • Feature Engineering
  • Data Transformation
    - One-Hot Encoding
    - Standardization
    - Normalization
  • Train/Test Split
  • Model Training
  • Machine Learning
    - Linear Regression
    - Logistic Regression
    - Support Vector Machine
    - Decision Tree
    - K-Means Clustering
  • Milestone 3: Apply machine learning algorithms, select and refine the best model.
  • Conclusion: Data Science in the real world, next steps.

University IT Technology Training classes are only available to Stanford University staff, faculty, or students. A valid SUNet ID is needed in order to enroll in a class.

Custom training workshops are available for this program.

These technology training sessions are structured around individual or group learning objectives.


University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.

Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.