

Computer Vision as a Gateway to Deep Learning
Computer vision, which uses neural networks to interpret complex visual data, stands out as one of the most accessible and impactful applications of deep learning. Its ability to address real-world problems makes it the ideal starting point for those ready to master artificial intelligence in an applied setting. Beyond specialized roles such as computer vision engineering, AI research, and robotics development, these skills are increasingly valuable in healthcare, where medical imaging aids diagnostics; in agriculture, for monitoring crop health; and in security, with applications in surveillance and biometrics.
Our self-paced Applied AI Lab focuses on practical applications, using computer vision as a hands-on framework for building essential deep learning skills. Through six real-world projects, you’ll learn to clean and transform visual data, train custom computer vision models, and apply advanced techniques like transfer learning. By the end of the program, you’ll be equipped with end-to-end computer vision skills, from data preparation to model deployment, ready to tackle complex challenges across industries.
"Mastering deep learning for computer vision empowers young professionals with practical tools to solve real-world challenges across industries, from healthcare to agriculture, positioning them to lead with expertise in ethical, sustainable AI, and to tackle complex, meaningful problems."
Dr. Iván Blanco
Associate Finance Professor, CUNEF University; Founder & Director, NOAX Trading
Applied AI Lab:
Deep Learning for Computer Vision
Applicant Deadline | Rolling Admissions |
---|---|
Program Start Date | Upon Acceptance |
Cost | Entirely Free |
Length | 10-16 weeks |
Applicant Requirements | |
Commitment | Self-paced, 10-15h per week |
Credentials Awarded | |
Project Descriptions
The Applied AI Lab comprises six end-to-end Computer Vision projects.
Each successful project completion unlocks registration for the next.
In this project, learners examine a data science competition helping scientists track animals in a wildlife preserve. The goal is to take images from camera traps and classify which animal, if any, is present. To complete the competition, learners expand their machine learning skills by creating more powerful neural network models that can take images as inputs and classify them into one of multiple categories.
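For illustration, here is a minimal sketch of the kind of multi-class image classifier this project works toward, written in PyTorch. The framework, class count, input size, and layer widths are assumptions for the sketch, not the course's exact solution.

```python
# Minimal multi-class image classifier sketch (illustrative, not the course solution).
import torch
import torch.nn as nn

class CameraTrapCNN(nn.Module):
    def __init__(self, num_classes: int = 8):  # assumed number of animal classes
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 56 * 56, num_classes),  # assumes 224x224 input images
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = CameraTrapCNN()
logits = model(torch.randn(4, 3, 224, 224))  # batch of 4 dummy camera-trap images
print(logits.shape)  # torch.Size([4, 8]): one score per animal class
```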
Working with a dataset of crop disease images from Uganda, learners build and train a convolutional neural network to classify images into five categories. They explore how to improve the performance of a computer vision model by using pre-trained models and optimizing training with techniques like Callbacks.
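As a rough sketch of the transfer-learning idea described here, the snippet below reuses a pre-trained torchvision backbone and trains only a new five-class head, with learning-rate scheduling and early stopping standing in as callback-style training aids. The backbone choice and hyperparameters are assumptions, not prescribed by the course.

```python
# Transfer-learning sketch: frozen pre-trained backbone + new 5-class head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():                 # freeze the pre-trained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 5)    # five crop-disease categories

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
# Callback-style training aids: lower the learning rate when validation loss
# plateaus, and stop early if it fails to improve for several epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=2)
best_loss, patience, bad_epochs = float("inf"), 5, 0
# Inside the training loop, after computing val_loss each epoch:
#     scheduler.step(val_loss)
#     if val_loss < best_loss: best_loss, bad_epochs = val_loss, 0
#     else: bad_epochs += 1
#     if bad_epochs >= patience: break          # early stopping
```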
Using traffic video feed data from Dhaka, Bangladesh, learners develop real-time object detection systems to identify and label vehicles, pedestrians, and other traffic elements. They work with pre-trained models and extend existing architectures to detect custom objects specific to urban traffic analysis, creating solutions that can monitor traffic flow and congestion patterns.
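The snippet below sketches what running a pre-trained detector on a traffic frame can look like. Ultralytics YOLO is used as one plausible example; the weights file, frame name, and dataset config are illustrative assumptions rather than course materials.

```python
# Object-detection sketch: run a pre-trained detector, then fine-tune on custom classes.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                      # small pre-trained detector
results = model("dhaka_traffic_frame.jpg")      # hypothetical video frame
for box in results[0].boxes:
    print(model.names[int(box.cls)], box.xyxy.tolist())  # label + bounding box

# Extending to custom traffic classes (e.g., rickshaws) by fine-tuning on a
# labelled dataset described in a hypothetical YAML config:
# model.train(data="dhaka_traffic.yaml", epochs=50, imgsz=640)
```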
In this project, learners perform face detection and recognition tasks using a video of an interview with Indian Olympic boxer Mary Kom. They use a state-of-the-art pre-trained Multi-task Cascaded Convolutional Network (MTCNN) model together with an Inception-ResNet model to perform face recognition. The goal is to use selected video frames of Mary Kom and her interviewer to create a face embedding for each of them, which allows learners to detect their faces in new images. Learners conclude the project by wrapping their code in a Flask app that allows a user to upload an image and perform face recognition.
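A hedged sketch of such a face-embedding pipeline, using the facenet-pytorch implementations of MTCNN and Inception-ResNet, is shown below. The file names and the match threshold are illustrative assumptions.

```python
# Face-recognition sketch: detect a face, embed it, compare embeddings by distance.
import torch
from facenet_pytorch import MTCNN, InceptionResnetV1
from PIL import Image

mtcnn = MTCNN(image_size=160)                              # face detector/aligner
resnet = InceptionResnetV1(pretrained="vggface2").eval()   # embedding model

def embed(path: str) -> torch.Tensor:
    """Detect the face in an image and return its 512-d embedding."""
    face = mtcnn(Image.open(path))            # cropped, aligned face tensor
    return resnet(face.unsqueeze(0)).detach()

reference = embed("mary_kom_frame.jpg")       # hypothetical reference frame
query = embed("uploaded_image.jpg")           # image uploaded via the Flask app
distance = (reference - query).norm().item()
print("match" if distance < 1.0 else "no match")  # threshold is an assumption
```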
Working with medical imaging data, learners explore using neural networks to generate new images such as X-rays and MRIs. They accomplish this using Generative Adversarial Network (GAN) systems, both by building custom architectures and leveraging pre-trained models. Learners also create a web app using Streamlit to allow users to interact with the GAN. Additionally, learners use Git and GitHub to track the app's code.
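Below is a bare-bones sketch of the two GAN components this project builds, with the image size (64x64 grayscale, as for small X-ray crops) and layer widths chosen purely for illustration.

```python
# GAN sketch: a generator maps noise to an image; a discriminator scores real vs. fake.
import torch
import torch.nn as nn

latent_dim = 100

generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 64 * 64), nn.Tanh(),        # pixel values in [-1, 1]
)

discriminator = nn.Sequential(
    nn.Linear(64 * 64, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),           # probability the image is real
)

noise = torch.randn(8, latent_dim)
fake_images = generator(noise).view(8, 1, 64, 64)
scores = discriminator(fake_images.view(8, -1))
# In training, the discriminator learns to tell real scans from fakes while the
# generator learns to fool it; a Streamlit front end can then call
# generator(torch.randn(1, latent_dim)) to serve new synthetic images.
```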
Lab Outcomes

Map Challenges and Tasks
Map real-world challenges to machine learning tasks.

Dataset Preparation
Assess datasets and prepare them for model training.

Neural Networks
Identify the core concepts behind neural networks, such as model components, optimizers, loss functions and performance metrics.

Model Building
Build, train, and evaluate deep neural networks for computer vision tasks.

Model Deployment
Deploy models and serve their output through web applications.

Debugging
Select appropriate resources and strategies when debugging a project.

AI Ethics
Summarize the main ethical and environmental issues confronting deep learning, as well as model-building techniques that favor fairness and sustainability.

Community of Practice
Deconstruct underlying values, areas of focus, and professional concerns of data science practitioners.
In this project, learners use Stable Diffusion to create images from text descriptions. They assemble the Stable Diffusion pipeline from several pre-trained neural networks and learn how to fine-tune the networks to include new image information. With the goal of generating meme-worthy images, learners also create and deploy a Streamlit app as a front end for their fine-tuned Stable Diffusion model, allowing a non-technical marketing team to generate such images easily.
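A hedged sketch of such a Streamlit front end, built on the Hugging Face diffusers library, is shown below. The model ID and app layout are assumptions, a fine-tuned checkpoint would be loaded in place of the base model, and a CUDA GPU is assumed to be available.

```python
# Streamlit front end sketch for a text-to-image Stable Diffusion pipeline.
import streamlit as st
import torch
from diffusers import StableDiffusionPipeline

@st.cache_resource  # load the pipeline once and reuse it between requests
def load_pipeline():
    return StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # placeholder; swap in the fine-tuned weights
        torch_dtype=torch.float16,
    ).to("cuda")

st.title("Meme image generator")
prompt = st.text_input("Describe the image you want")
if st.button("Generate") and prompt:
    image = load_pipeline()(prompt).images[0]
    st.image(image, caption=prompt)
# Run locally with: streamlit run app.py
```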