
Applied Data Science Lab

Gain the fundamental data science skills required for this growing field.

  • Completely Online
  • 100% Free of Cost
  • Rigorous Focus on Applied Learning

Learn the fundamentals of data science, entirely free.


By completing a series of end-to-end data science projects, students build the wrangling, analysis, model-building, and communication skills to prepare for success in data-centric careers. They use their skills to build models that predict everything from real estate prices to customer retention, create interactive dashboards that run statistical experiments, and build APIs to incorporate their insights into web applications.

Data science is becoming a cornerstone of modern business. Practitioners use analytical tools and techniques to extract meaningful insights from data that drive critical business decisions. Data scientists are in demand across industries, and the number of positions is projected to grow by 35% through 2030.

“WQU helped me massively in my ML/AI journey. It gave me all the fundamental knowledge I needed in my career.

From WQU, I had more confidence in taking on ML projects. It was a lot easier to continue studying myself as well, and I also got a few decent roles due to the knowledge I gained here.”

Applied Data Science Lab Graduate, Nigeria
2023

Applied Data Science Lab

Next Deadline: Rolling Admissions
Lab Start Date: Upon Acceptance
Cost: Entirely Free
Length: 16 weeks (recommended)

Applicant Requirements

  • Beginner-level Python skills
  • Familiarity with basic statistics
  • Passing score on Admissions Quiz (66% or higher)

Commitment: 10-15 hours per week

Credentials Awarded

  • WQU Applied Data Science Certificate
  • Verified Digital Badge

Learn How to Apply


Upon successful completion of the Applied Data Science Lab, students receive both a digital certificate and a sharable, verified credential.

What You Will Learn

In this dynamic learning environment, students get real-time feedback and opportunities to collaborate with their peers and participate in live office hours with their instructor. After successfully completing the Lab, students earn an easily shareable WQU badge issued by Credly.

Project Descriptions

Project 1 analyzes a dataset of 21,000 properties for sale in Mexico from Properati.com to investigate whether property prices are more strongly influenced by size or location. Learners organize information using Python data structures and import and clean CSV data with the pandas library. They create data visualizations such as scatter plots and box plots to explore the data, then examine the relationships between variables through correlation analysis to answer the central research question.
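The correlation analysis described above can be sketched in a few lines. This is a minimal illustration, not the Lab's notebook code: the listings, state names, and prices are invented, and the Pearson coefficient is computed by hand with the standard library rather than with pandas so the example runs without dependencies.

```python
import math
from collections import defaultdict
from statistics import fmean

# Invented listings: (state, area in square meters, price in thousands of USD).
listings = [
    ("Distrito Federal", 50, 100),
    ("Distrito Federal", 70, 135),
    ("Morelos", 90, 190),
    ("Morelos", 110, 210),
    ("Yucatán", 130, 265),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

areas = [area for _, area, _ in listings]
prices = [price for _, _, price in listings]
size_price_r = pearson(areas, prices)

# Location is categorical, so compare mean prices per state instead.
prices_by_state = defaultdict(list)
for state, _, price in listings:
    prices_by_state[state].append(price)
mean_price_by_state = {state: fmean(p) for state, p in prices_by_state.items()}

print(f"size vs. price correlation: {size_price_r:.3f}")
print(mean_price_by_state)
```

A coefficient near 1 indicates a strong linear relationship between size and price in this toy data. Real listings would need cleaning first, and a correlation alone does not separate the effect of size from that of location.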

Building on the data wrangling and visualization skills from Project 1, Project 2 marks a transition from descriptive to predictive data science. Focusing on real estate in Buenos Aires, Argentina, learners create a machine learning model to predict apartment prices. The project covers building linear regression models using the scikit-learn library and constructing data pipelines for imputing missing values and encoding categorical features. Learners also explore techniques to improve model performance by reducing overfitting and conclude by creating a dynamic dashboard for interacting with their completed prediction model.
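A pipeline of the kind described above might look like the following sketch. The column names, neighborhoods, and prices are invented for illustration, and `Ridge` (a regularized linear regression) is one assumed way to reduce overfitting; the source does not say which model or settings the Lab uses.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Invented toy data; one surface value is missing on purpose.
df = pd.DataFrame({
    "surface_m2": [40.0, 55.0, np.nan, 80.0, 95.0, 120.0],
    "neighborhood": ["Palermo", "Recoleta", "Palermo",
                     "Belgrano", "Recoleta", "Belgrano"],
    "price_usd": [90_000, 130_000, 110_000, 170_000, 210_000, 250_000],
})
X = df[["surface_m2", "neighborhood"]]
y = df["price_usd"]

# Impute missing numeric values and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="mean"), ["surface_m2"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
])

# Ridge adds an L2 penalty to ordinary linear regression (assumed choice).
model = make_pipeline(preprocess, Ridge(alpha=1.0))
model.fit(X, y)

# The fitted pipeline applies the same imputation and encoding to new rows.
new_listing = pd.DataFrame({"surface_m2": [np.nan], "neighborhood": ["Palermo"]})
prediction = model.predict(new_listing)
print(prediction)
```

Keeping the preprocessing inside the pipeline means the imputation mean and the category list are learned from training data only, which avoids leaking information from the rows being predicted.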

Project 3 utilizes data from openAfrica, one of Africa's largest open data platforms, to analyze air quality measurements from Nairobi, Lagos, and Dar es Salaam. Learners build a time series model to predict PM 2.5 readings throughout the day, gaining experience in querying MongoDB databases and preparing time series data for analysis. The project covers building autoregression models and improving performance through hyperparameter tuning. These time series modeling skills are valuable beyond public health applications, serving as foundational concepts for financial engineering and natural language processing work.
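To make the idea of an autoregression model concrete, here is a minimal sketch that fits a first-order model, y_t = intercept + slope * y_(t-1), by ordinary least squares. It is hand-rolled with the standard library, not the modeling library used in the Lab, and the readings are generated from a known rule rather than taken from real air quality data.

```python
from statistics import fmean

def fit_ar1(series):
    """Fit y_t = intercept + slope * y_(t-1) by ordinary least squares."""
    previous, current = series[:-1], series[1:]
    mean_prev, mean_curr = fmean(previous), fmean(current)
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr)
              for p, c in zip(previous, current))
    slope = sxy / sxx
    intercept = mean_curr - slope * mean_prev
    return intercept, slope

# Synthetic readings that follow y_t = 5 + 0.8 * y_(t-1) exactly (no noise).
readings = [10.0]
for _ in range(20):
    readings.append(5.0 + 0.8 * readings[-1])

intercept, slope = fit_ar1(readings)

# One-step-ahead forecast from the last observed reading.
next_reading = intercept + slope * readings[-1]
print(intercept, slope, next_reading)
```

Because the synthetic series has no noise, the fit recovers the generating coefficients up to floating-point error. In a real model the number of lags is a hyperparameter, and tuning it means comparing forecast error across candidate lag counts on held-out data.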

Project 4 uses data from Open Data Nepal to build a model that predicts building damage from the 2015 Nepal earthquake, focusing primarily on the Gorkha district with additional examples from Ramechhap. Learners query SQL databases to retrieve data and develop classification models using both logistic regression and decision trees. The project emphasizes incorporating ethical considerations into model building, giving learners experience with responsible machine learning practices in a real-world disaster response scenario.
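The SQL retrieval step can be sketched with Python's built-in `sqlite3` module. The table name, columns, rows, and the damage threshold used to define the binary target are all hypothetical; the actual schema of the Lab's database is not given in the source.

```python
import sqlite3

# In-memory database with an invented table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE building_damage (
        building_id  INTEGER PRIMARY KEY,
        district     TEXT,
        age_years    INTEGER,
        damage_grade INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO building_damage VALUES (?, ?, ?, ?)",
    [
        (1, "Gorkha", 12, 5),
        (2, "Gorkha", 30, 4),
        (3, "Ramechhap", 8, 2),
        (4, "Gorkha", 3, 1),
    ],
)

# Select one district and derive a binary target for classification.
# The cutoff (grade above 3 counts as severe) is an assumption.
rows = conn.execute(
    """
    SELECT building_id, age_years, damage_grade > 3 AS severe_damage
    FROM building_damage
    WHERE district = ?
    ORDER BY building_id
    """,
    ("Gorkha",),
).fetchall()
conn.close()

print(rows)
```

Passing the district as a `?` parameter instead of formatting it into the query string keeps the query safe from SQL injection. The resulting rows could then be loaded into a DataFrame and passed to a classifier.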

Project 5 explores bankruptcy data collected by a team of Polish economists to build a predictive model that determines whether a company will go bankrupt. Learners develop skills in navigating file systems from the Linux command line and loading and saving files using Python. The project addresses the challenge of imbalanced datasets through resampling techniques and covers model evaluation using classification metrics such as precision and recall, providing practical experience with real-world financial data analysis.
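The sketch below illustrates two ideas from this project with invented labels: random oversampling, which is one possible resampling technique, and precision and recall computed from their definitions. It uses only the standard library and is not the Lab's own code.

```python
import random

def oversample_minority(labels, seed=0):
    """Return row indices in which minority-class rows are repeated at
    random until both classes have the same number of rows."""
    rng = random.Random(seed)
    positives = [i for i, label in enumerate(labels) if label == 1]
    negatives = [i for i, label in enumerate(labels) if label == 0]
    minority, majority = sorted([positives, negatives], key=len)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return positives + negatives + extra

def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1).
    Assumes at least one predicted and one actual positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)

# Invented labels: 1 = bankrupt, 0 = not bankrupt (3 positives, 7 negatives).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]

indices = oversample_minority(y_true)
resampled = [y_true[i] for i in indices]
precision, recall = precision_recall(y_true, y_pred)

print(resampled.count(1), resampled.count(0))
print(precision, recall)
```

Resampling should be applied to the training split only, so that the test set keeps the original class balance. On imbalanced data, precision and recall are more informative than accuracy because a model that always predicts the majority class can still score high accuracy.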

Project 6 uses data from the 2019 Survey of Consumer Finances to identify households facing credit access challenges and build a model to segment them into distinct subgroups. As an example of unsupervised learning through clustering, the project has applications in commercial marketing, customer segmentation, and sociological studies of social stratification. Learners compare subgroup characteristics using side-by-side bar charts and build k-means clustering models. The project covers feature selection for clustering based on variance and dimensionality reduction through principal component analysis (PCA). Finally, learners design, build, and deploy an interactive Dash web application to share their findings.
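Variance-based feature selection, mentioned above, can be illustrated as follows. The feature names and values are invented, and plain sample variance is an assumed criterion; the source does not specify which variance measure the Lab applies.

```python
from statistics import variance

# Invented household features, one list of values per feature.
features = {
    "debt": [0, 50, 200, 10, 400],
    "income": [30, 35, 40, 32, 38],
    "age": [25, 40, 33, 51, 29],
}

# Rank features by sample variance, highest first.
variances = {name: variance(values) for name, values in features.items()}
ranked = sorted(variances, key=variances.get, reverse=True)

# Keep the two highest-variance features as clustering inputs.
selected = ranked[:2]
print(variances)
print(selected)
```

High-variance features are the ones along which households differ most, so they tend to be the most useful for separating clusters. Because variance depends on units, features are usually put on a common scale before k-means or PCA is applied.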

Project 7 involves designing and conducting an A/B test to determine whether WQU can increase quiz completion rates. Learners explore the Applied Data Science Lab applicant pool to formulate research questions and hypotheses, then run and analyze their experiments. The project demonstrates randomized controlled experimentation techniques used across industries, from email marketing to campaign testing and scientific research. Learners build choropleth maps to visualize the global distribution of Applied Data Science Lab students and create custom Python classes for ETL processes. The project covers experimental design and chi-square test analysis, culminating in an interactive web application that follows a three-tiered design pattern.
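The chi-square analysis of an A/B test can be sketched by hand for a 2x2 table. The counts below are invented, and the function computes the Pearson statistic without a continuity correction, so a statistics library that applies one by default would report a slightly different p-value.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.
    Returns (statistic, p_value) with one degree of freedom and
    no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # For one degree of freedom, P(X > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Invented counts: rows are groups, columns are (completed, did not complete).
observed = [
    [30, 70],  # control group
    [45, 55],  # treatment group
]
statistic, p_value = chi_square_2x2(observed)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value suggests that completion rates differ between the groups by more than chance alone would explain. The significance threshold and sample size should be fixed before the experiment runs, not chosen after looking at the results.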

The final project focuses on building a model to predict volatility on the Bombay Stock Exchange. Learners make HTTP requests to retrieve stock data for two companies from the AlphaVantage stock API, then transform and load that data into SQL databases using custom Python classes. They use the data to calculate asset volatility and build GARCH models for prediction, then build and deploy a web API and server to serve model predictions. As volatility models are essential tools in econometrics and financial engineering, this project provides practical experience with real-world financial applications while reinforcing time series concepts from earlier coursework.
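The volatility calculation and the variance recursion behind a GARCH(1,1) model can be sketched as follows. The prices are invented, the 252 trading days per year is a common convention assumed here, and the `omega`, `alpha`, and `beta` values are hypothetical: in practice they are estimated from data by a fitting library rather than set by hand.

```python
import math
from statistics import fmean, pvariance, stdev

# Invented daily closing prices.
closing_prices = [100.0, 102.0, 101.0, 105.0, 104.0, 108.0]

# Daily percentage returns.
returns = [
    (today / yesterday - 1) * 100
    for yesterday, today in zip(closing_prices, closing_prices[1:])
]

# Volatility as the standard deviation of returns, then annualized.
daily_volatility = stdev(returns)
annual_volatility = daily_volatility * math.sqrt(252)

def garch11_forecast(returns, omega, alpha, beta):
    """Run the GARCH(1,1) recursion
    variance_t = omega + alpha * shock_(t-1)^2 + beta * variance_(t-1)
    over the observed returns and return the next-day variance."""
    mean_return = fmean(returns)
    current_variance = pvariance(returns)  # starting value for the recursion
    for r in returns:
        shock = r - mean_return
        current_variance = omega + alpha * shock ** 2 + beta * current_variance
    return current_variance

# Hypothetical parameters, not fitted to any data.
next_day_variance = garch11_forecast(returns, omega=0.1, alpha=0.1, beta=0.8)
next_day_volatility = math.sqrt(next_day_variance)
print(daily_volatility, annual_volatility, next_day_volatility)
```

The recursion shows why GARCH captures volatility clustering: a large shock raises the next day's variance through `alpha`, and `beta` lets that elevated variance decay gradually instead of vanishing at once.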

Lab Outcomes

1

Database Management

Extract data from SQL and NoSQL databases

2

Data Cleaning and Preprocessing

Clean authentic, messy datasets

3

Regression & Classification Modeling

Build predictive models for regression and classification

4

Data Visualization

Create compelling visualizations to explain data characteristics and model performance

5

Ethics in Machine Learning

Discuss the ethical implications of deploying models in the real world and the environmental impact of machine learning models

6

Business Insight & Intelligence

Learn how to apply machine learning to business problems

Frequently Asked Questions

1

How can I prepare for the Admissions Quiz?

Before you can start the Applied Data Science Lab, you need to take a short Admissions Quiz. We want to make sure you have a solid foundation on which you c

2

How does the Data Science Lab work?

The Applied Data Science Lab is divided into eight two-week projects and was designed to be completed in approximately 16 weeks. Following a prescribed sequence, students complete one project at a time. The projects range from exploring housing prices in Mexico to predicting air quality in Kenya. Students work with publicly available datasets, upon which they can develop larger portfolio projects. The curriculum is deployed on virtual machines

3

What happens if I fail the Admissions Quiz?