Our Applied Data Science Module consists of two eight-week units that challenge students to solve real-world problems through data analysis. Our hands-on approach ensures the skills students acquire translate seamlessly into the workplace.
Across both units in the Module, students gain a comprehensive introduction to scientific computing, Python, and the related tools data scientists use to succeed in their work. Students will develop machine learning and statistical analysis skills through hands-on practice with open-ended investigations of real-world data.
All students receive complimentary access to a ready-to-use Python environment for the entire Module. This allows students to gain first-hand experience with Python, pandas, and Jupyter Notebooks, and allows for immediate immersion into novel data science problems.
The Applied Data Science Module is built by WorldQuant University’s partner, The Data Incubator, a fellowship program that trains data scientists. Graduates earn a Credly badge upon completion of each unit to share and celebrate their professional development.
The Applied Data Science Module is delivered online to enable students to participate in a flexible yet rigorous continuing education program to amplify their skills and knowledge. To apply, applicants fill out a profile on their educational history and technical skillset, which takes about 20 minutes to complete.
Across two units and sixteen weeks, students learn to source data relevant to a business problem or task, to summarize data in aggregate statistics and visualizations, and to model trends to showcase insights and make practical business decisions.
Students who successfully complete Unit I are eligible to enroll in Unit II. Students who complete either Unit earn a badge from Credly, the recognized leader in skills credentialing.
In Unit I, students gain a comprehensive introduction to scientific computing, Python, and the related tools data scientists use to succeed in their work. Successful completion of Unit I is a prerequisite for enrollment in Unit II.
In this project students use Python to compute Mersenne numbers, using the Lucas-Lehmer test to identify Mersenne numbers that are prime. They use Python data structures and core programming principles such as loops to implement their solution. In addition, students learn to implement the Sieve of Eratosthenes as a faster solution for checking if numbers are prime, learning about the importance of algorithm time complexity.
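As a rough illustration of the two algorithms named above, the following minimal sketch pairs a Sieve of Eratosthenes with the Lucas-Lehmer test. The function names and the bound of 31 are assumptions chosen for this example, not the project's actual solution.

```python
import math


def sieve(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime (p itself must be prime)."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the recurrence below assumes p > 2
    m = 2**p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


# Exponents p <= 31 (an arbitrary bound for this sketch) whose Mersenne number is prime.
mersenne_prime_exponents = [p for p in sieve(31) if lucas_lehmer(p)]
print(mersenne_prime_exponents)
```

The sieve does its work once for a whole range of numbers, which is why it compares favorably with testing each candidate separately by trial division.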
In this project students use object-oriented programming to create a class that represents a geometric point. They define methods for common operations on points, such as adding two points together and finding the distance between two points. Finally, they write a K-means clustering algorithm that uses the previously defined point class.
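A point class and a K-means loop along these lines might look like the sketch below. The class layout, the `k_means` signature, and the choice to pass in starting centroids explicitly are all assumptions made for illustration; the project's own design may differ.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float

    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)

    def __truediv__(self, scalar):
        return Point(self.x / scalar, self.y / scalar)

    def distance(self, other):
        """Euclidean distance between this point and another."""
        return math.hypot(self.x - other.x, self.y - other.y)


def k_means(points, centroids, n_iter=10):
    """Run a fixed number of K-means iterations from the given starting centroids."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: p.distance(centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster, Point(0, 0)) / len(cluster) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [Point(0, 0), Point(0, 2), Point(10, 10), Point(10, 12)]
centroids, clusters = k_means(points, [Point(0, 0), Point(10, 10)])
print(centroids)
```

Defining `__add__` and `__truediv__` on the class lets the centroid update read as ordinary arithmetic, which is one reason a point class pairs naturally with this algorithm.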
In this project students use basic Python data structures, functions, and control flow to answer questions about prescription drug data from the British NHS. They also work with fundamental data wrangling techniques such as joining data sets together, splitting data into groups, and aggregating data into summary statistics.
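The group, aggregate, and join steps can be sketched with nothing more than dictionaries and loops. The records, field names, and locations below are made up for this example and do not reflect the real NHS data or its schema.

```python
from collections import defaultdict

# Hypothetical toy records; the real data set has a different schema.
prescriptions = [
    {"practice": "A", "drug": "drug_1", "items": 10},
    {"practice": "A", "drug": "drug_2", "items": 5},
    {"practice": "B", "drug": "drug_1", "items": 7},
]
# Hypothetical lookup table mapping a practice code to its location.
practice_locations = {"A": "Town X", "B": "Town Y"}

# Split into groups by practice and aggregate the item counts.
totals = defaultdict(int)
for row in prescriptions:
    totals[row["practice"]] += row["items"]

# Join the aggregated totals to the lookup table on the practice code.
items_by_town = {practice_locations[code]: total for code, total in totals.items()}
print(items_by_town)
```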
In this project students use the Python package pandas to perform data analysis on a prescription drug data set from the British NHS. They answer questions such as which medical practices prescribe opioids at an unusually high rate and which practices prescribe substantially more rare drugs than other practices. They also use statistical concepts such as the z-score to help identify these practices.
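One simple way to flag an unusually high prescribing rate is to compute each practice's rate and standardize it against the other practices. The sketch below assumes a made-up table with hypothetical `practice` and `is_opioid` columns, and it uses the plain sample z-score; the project may define its score differently.

```python
import pandas as pd

# Made-up rows: one per prescription, flagged as opioid or not.
df = pd.DataFrame(
    {
        "practice": ["A", "A", "B", "B", "C", "C"],
        "is_opioid": [True, False, False, False, True, True],
    }
)

# Fraction of each practice's prescriptions that are opioids.
opioid_rate = df.groupby("practice")["is_opioid"].mean()

# z-score: how many sample standard deviations each rate sits from the mean rate.
z_scores = (opioid_rate - opioid_rate.mean()) / opioid_rate.std()

# The practice with the largest z-score prescribes opioids at the most unusual rate.
most_unusual = z_scores.idxmax()
print(z_scores.sort_values(ascending=False))
```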
In Unit II, students learn how to build machine learning models to make predictions based on real-world data. They learn how to treat, clean, and encode data and how to choose an appropriate machine learning model for the task. They learn to tune a model so that it generalizes, performing well on both the training set and out-of-sample data. They also learn how to build models using text and time series data.
In this project students work with nursing home inspection data from the United States, predicting which providers may be fined and for how much. They use the scikit-learn Python package to construct progressively more complicated machine learning models. They also impute missing values, apply feature engineering, and encode categorical data.
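Imputation, categorical encoding, and a model can be chained together with a scikit-learn `Pipeline`. The sketch below is one possible arrangement under stated assumptions: the `beds` and `ownership` columns, the toy values, and the choice of logistic regression are all hypothetical and not taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up data: column names and values are illustrative only.
X = pd.DataFrame(
    {
        "beds": [50, np.nan, 120, 80, np.nan, 30],
        "ownership": ["profit", "nonprofit", "profit", "government", "nonprofit", "profit"],
    }
)
y = [1, 0, 1, 0, 0, 1]  # 1 = fined, 0 = not fined (hypothetical labels)

preprocess = ColumnTransformer(
    [
        # Fill missing numeric values with the column median.
        ("numeric", SimpleImputer(strategy="median"), ["beds"]),
        # One-hot encode the categorical column; ignore unseen categories at predict time.
        ("categorical", OneHotEncoder(handle_unknown="ignore"), ["ownership"]),
    ]
)

model = Pipeline(
    [
        ("preprocess", preprocess),
        ("classify", LogisticRegression(max_iter=1000)),
    ]
)
model.fit(X, y)
predictions = model.predict(X)
```

Keeping the preprocessing inside the pipeline means the same imputation and encoding are applied at prediction time as at training time.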
In this project students use natural language processing to train various machine learning models to predict an Amazon review rating based on the text of the review. They then use one of the trained models to gain insight into the reviews by identifying highly polar words, that is, the words that most strongly influence the model's prediction.
This is a true introduction to data science and can accommodate beginners with the right amount of foundational knowledge.
The Module runs four times each calendar year. We strongly encourage all qualified candidates to apply soon, as space in each class is limited.
|Start Date|Application Deadline|
|---|---|
|January 5|December 22|
|April 12|March 23|
|July 5|June 13|
|September 27|September 5|