Fundamentals of Data Science

Most Technology Training classes will be delivered online until further notice.

Before each session, Tech Training will provide a Zoom link for live online classes, along with any required class materials.

This course introduces real-world applications of data science and explains why it has become an integral part of business and academia. We will discuss the data science process and the tools used for data exploration, analysis, and modeling.

Prerequisite: Basic Python Programming training, or equivalent experience

Learning Objectives

In this class, you will have the opportunity to:

Install Anaconda on a personal computer
Understand the Data Science Field
Become familiar with Descriptive and Inferential Statistics and statistical analysis
Learn the primary tools used for data science in Python, including Pandas and Scikit-learn
Learn how to perform exploratory data analysis
Learn the importance of data cleaning
Utilize common Machine Learning algorithms such as Linear and Logistic Regression
Solidify understanding by completing hands-on exercises and milestones
Walk through two data science projects
Understand the big picture and the importance of data science in learning from data

Course Outline

Course Introduction
Install Anaconda
Review the Essentials of Python
Overview of Data Science
The Difference Between Business Intelligence (BI), Data Analytics, and Data Science
Descriptive Statistics Fundamentals
Central Tendency
- Mean
- Median
- Mode
Spread of the Data
- Variance
- Standard Deviation
- Range
Relative Standing
- Percentile
- Quartile
- Inter-quartile Range
Inferential Statistics Fundamentals
Data Distributions
- Normal Distribution
- Uniform Distribution
The Data Science Process
- Define the Problem
- Get the Data
- Explore the Data
- Clean the Data
- Model the Data
- Communicate the Findings
Feature Selection
Data Cleaning
- Dropping Rows
- Imputing Missing Values
Data Transformation
- Binary Encoding
- One-Hot Encoding
- Standardization
- Normalization
Machine Learning Overview
Introduction to Pandas
Milestone 1: Use Pandas to perform data analysis on a real-world dataset.
Data Exploration
- Describe
- Merge
- Group
- Feature Evaluation
Feature Engineering
Milestone 2: Perform exploratory data analysis and feature engineering.
Test/Train Split
Model Training
Basic Machine Learning Implementation
- Linear Regression
- Logistic Regression
- Support Vector Machine
- Decision Tree
Milestone 3: Perform an end-to-end project of the data science process.
Conclusion: Next steps
Structured Activity/Exercises/Case Studies
- Milestone Project 1: Use Pandas to perform data analysis on a real-world dataset.
- Milestone Project 2: Perform exploratory data analysis and feature engineering.
- Milestone Project 3: Perform an end-to-end project of the data science process.
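
For a sense of what the Descriptive Statistics Fundamentals topics in the outline look like in Python, here is a minimal sketch using only the standard library. It is not taken from the course materials: the function name `summarize` and the sample values are hypothetical, and the class itself may well use Pandas for the same calculations. Note that quartile values depend on the interpolation method; this sketch relies on the default ("exclusive") method of `statistics.quantiles`.

```python
import statistics


def summarize(values):
    """Return common descriptive statistics for a list of numbers.

    Hypothetical helper for illustration only.
    """
    # Quartile cut points (Q1, Q2, Q3) using the default "exclusive" method.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        # Central tendency
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Spread of the data (sample variance and sample standard deviation)
        "variance": statistics.variance(values),
        "std_dev": statistics.stdev(values),
        "range": max(values) - min(values),
        # Relative standing
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


if __name__ == "__main__":
    # Made-up sample values, not a course dataset.
    sample = [2, 4, 4, 5, 7, 9, 11]
    for name, value in summarize(sample).items():
        print(f"{name}: {value}")
```

The sample variance and standard deviation divide by n - 1; `statistics.pvariance` and `statistics.pstdev` would give the population versions instead.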


Custom training workshops are available for this program
