Roadmap to a job

Machine Learning / AI Engineer

Train on this path

A 2026 ML/AI Engineer designs, ships, and operates intelligent systems end-to-end, not just notebooks

7 stages · 33 skills · 75 free resources

Core stack

PythonPyTorchTensorFlowJupyterHugging Face

Track your progress

0 / 38 done

Stage 01
Stage 0, Software Engineering Foundations
Write clean, tested Python and operate like an engineer (version control, environments, CLI) before touching any ML. This is the strongest predictor of hireability and the most under-rated step.
Python (core + intermediate)Essential3 links
Python is a high-level, dynamically typed programming language widely used in data science, machine learning, and backend development. It supports object-oriented, functional, and procedural styles, and its extensive ecosystem of libraries makes it the primary language for building and deploying ML systems.
Why it matters · The lingua franca of ML/AI; everything downstream assumes fluency (OOP, typing, error handling, comprehensions).
docsThe Python Tutorial (official docs)coursefreeCodeCamp, Scientific Computing with Python articleReal Python, tutorials and guides
Git & GitHubEssential2 links
Git is a distributed version control system that tracks changes to source code over time, enabling collaboration and rollback. GitHub is a cloud-based hosting platform built on Git that adds pull requests, code review workflows, issue tracking, and CI/CD integrations for managing software projects.
Why it matters · Every team runs on version control, and your public repos double as portfolio and hiring signal.
docsPro Git (free online book)courseGitHub Skills, hands-on interactive courses
Command line, virtual environments & dependency managementEssential2 links
The command line (shell) provides direct control over the operating system for running scripts, managing files, and automating tasks. Virtual environments (venv, conda, uv) isolate project dependencies, while tools like pip, requirements.txt, and pyproject.toml specify and lock package versions for reproducibility.
Why it matters · Reproducible environments (venv/conda/uv, requirements/pyproject) prevent the 'works on my machine' failures that sink ML projects.
docsMDN, Command line crash course docsPython venv, virtual environments (official docs)
SQL & relational databasesEssential3 links
SQL (Structured Query Language) is the standard language for querying and manipulating data stored in relational databases such as PostgreSQL, MySQL, and SQLite. It is used to filter, aggregate, join, and transform tabular data, and forms the foundation for data extraction in analytics and ML pipelines.
Why it matters · Most real ML data lives in databases; SQL is a near-universal requirement and needed for nearly every data-pulling task.
courseMode, SQL Tutorial docsPostgreSQL, The SQL Language (official tutorial)courseKaggle Learn, Intro to SQL
Testing, clean code & code review basicsRecommended2 links
Software testing involves writing automated checks (unit, integration) using frameworks like pytest to verify that code behaves correctly. Clean code practices emphasize readability, modularity, and consistent style, while code review is a collaborative process where teammates inspect changes before they are merged.
Why it matters · ML code that ships needs tests and readability; teams screen for software discipline, not just model accuracy.
docspytest, documentation articleReal Python, Getting Started With Testing in Python
Stage 02
Stage 1, Math & Statistics for ML (right-sized)
Build enough intuition to understand WHY models work, debug them, and read papers, without a year of pure theory. Learn it as a tool, alongside code.
Linear algebra (vectors, matrices, dot products)Essential2 links
Linear algebra is a branch of mathematics concerned with vectors, matrices, and linear transformations. In machine learning, it underlies data representations, dimensionality reduction, and the core computations in neural networks, including matrix multiplications used in forward and backward passes.
Why it matters · Embeddings, neural nets, and attention are all linear algebra; you can't reason about model internals without it.
video3Blue1Brown, Essence of Linear Algebra courseKhan Academy, Linear Algebra
Probability & statisticsEssential2 links
Probability theory describes the likelihood of events and forms the mathematical foundation for statistical inference. In ML, it is applied to modeling uncertainty, understanding data distributions, designing experiments, and evaluating models through metrics such as confidence intervals and hypothesis tests.
Why it matters · Distributions, hypothesis testing, and Bayes underpin model evaluation, uncertainty, and experiment design.
courseKhan Academy, Statistics & Probability articleSeeing Theory, a visual intro to probability & statistics
Calculus & gradients (intuition)Recommended2 links
Calculus studies rates of change and accumulation, with derivatives measuring how a function's output changes with its input. In machine learning, gradient descent relies on partial derivatives to iteratively minimize a loss function, and backpropagation uses the chain rule to compute gradients through a neural network.
Why it matters · Gradient descent and backprop are calculus; you need the intuition, not the ability to do proofs by hand.
video3Blue1Brown, Essence of Calculus courseKhan Academy, Calculus 1
Mathematics for Machine Learning (consolidated path)Optional2 links
Mathematics for Machine Learning is a structured curriculum that unifies linear algebra, multivariate calculus, probability, and principal component analysis (PCA) into a single coherent learning path. It is commonly taught through the book of the same name (Deisenroth et al.) and associated online courses, providing the mathematical grounding needed to understand modern ML algorithms.
Why it matters · A single track ties linear algebra, calculus, and PCA together if you prefer one structured route.
courseMathematics for Machine Learning, Imperial College London (Coursera, free to audit)docsMathematics for Machine Learning, free online book
Build itRecall: AI Flashcard Generatorbeginner · 8-12 hours
Stage 03
Stage 2, Data Wrangling & Classical Machine Learning
Turn messy data into features and train/evaluate classical models well. This is still the bread-and-butter of most production ML and the base that deep learning builds on.
NumPy, pandas & data visualizationEssential3 links
NumPy provides efficient N-dimensional array operations and numerical computing primitives in Python. Pandas builds on NumPy to offer DataFrame-based data manipulation for tabular datasets, while visualization libraries such as Matplotlib and Seaborn generate charts and plots for exploratory data analysis.
Why it matters · Most real ML work is cleaning, joining, and shaping data; pandas/NumPy are non-negotiable daily tools.
docspandas, Getting started tutorials (official docs)courseKaggle Learn, Pandas docsNumPy, the absolute basics for beginners
scikit-learn & classical algorithmsEssential3 links
scikit-learn is a Python library providing consistent implementations of classical machine learning algorithms including linear regression, decision trees, gradient boosting (XGBoost, LightGBM), k-nearest neighbors, k-means clustering, and support vector machines. It also supplies utilities for preprocessing, pipelines, and model selection.
Why it matters · Regression, trees, gradient boosting, KNN, k-means, and the train/validate/test workflow are foundational and still widely deployed.
docsscikit-learn, User Guide courseKaggle Learn, Intro to Machine Learning courseGoogle, Machine Learning Crash Course
Feature engineering & model evaluationEssential2 links
Feature engineering is the process of transforming raw data into informative inputs for a model, including encoding categorical variables, scaling, and creating interaction terms. Model evaluation assesses predictive performance using techniques such as cross-validation, precision-recall curves, ROC-AUC, and careful separation of training and test sets to prevent data leakage.
Why it matters · Cross-validation, leakage avoidance, and metric choice (precision/recall/ROC) separate working models from misleading ones.
courseKaggle Learn, Feature Engineering docsscikit-learn, Metrics and scoring (model evaluation)
ML theory (Andrew Ng specialization)Recommended1 link
The Machine Learning Specialization by Andrew Ng (DeepLearning.AI on Coursera) covers supervised learning, unsupervised learning, and best practices for model building including regularization and the bias-variance trade-off. It provides a widely recognized theoretical foundation through video lectures, assignments, and practical exercises in Python.
Why it matters · Gives a coherent mental model of supervised/unsupervised learning and the bias-variance trade-off; free to audit and widely respected.
courseMachine Learning Specialization, DeepLearning.AI (Coursera, free to audit)
Build itDataLab: Clean and Chart a Real Datasetbeginner · 6-10 hours
Stage 04
Stage 3, Deep Learning with PyTorch
Build, train, and debug neural networks for vision, sequence, and tabular problems, using PyTorch, the framework most new ML postings ask for.
Neural network fundamentals & PyTorchEssential3 links
Neural networks are computational models composed of layers of parameterized units that learn representations from data through gradient-based optimization. PyTorch is an open-source deep learning framework that provides dynamic computation graphs, automatic differentiation (autograd), and GPU-accelerated tensor operations used to define, train, and deploy neural network models.
Why it matters · PyTorch now leads TensorFlow in new ML job postings; fluency with tensors, autograd, and training loops is core.
docsPyTorch, Learn the Basics (official tutorial)coursefast.ai, Practical Deep Learning for Coders courseDeep Learning Specialization, DeepLearning.AI (Coursera, free to audit)
CNNs, RNNs & the Transformer architectureEssential3 links
Convolutional Neural Networks (CNNs) apply learned filters to extract spatial features and are the standard architecture for image tasks. Recurrent Neural Networks (RNNs) process sequential data with shared weights over time steps, while Transformers replace recurrence with self-attention mechanisms, enabling parallel processing and capturing long-range dependencies, which is the foundation for modern large language models.
Why it matters · Transformers (attention) power modern AI; understanding them is required to fine-tune, debug, and reason about LLMs.
articleThe Illustrated Transformer (Jay Alammar)video3Blue1Brown, Neural Networks (incl. attention)videoAndrej Karpathy, Neural Networks: Zero to Hero
Hugging Face Transformers (using & fine-tuning models)Essential2 links
Hugging Face Transformers is an open-source Python library that provides a unified API to load, run, and fine-tune thousands of pretrained models for NLP, vision, and multimodal tasks. It integrates with PyTorch and JAX and supports parameter-efficient fine-tuning methods such as LoRA through the PEFT library.
Why it matters · The standard way to load, run, and fine-tune open models (Llama, Mistral, Qwen, etc.); ubiquitous in industry.
courseHugging Face, LLM Course (free)docsHugging Face, Transformers documentation
TensorFlow / KerasOptional2 links
TensorFlow is an open-source machine learning framework developed by Google that supports building and training neural networks at scale, with deployment options across servers, mobile, and browsers. Keras is its high-level API that simplifies model construction through layer-based abstractions, functional and sequential interfaces, and built-in training loops.
Why it matters · Still present in many enterprise/legacy stacks, but PyTorch is the better primary investment in 2026.
docsKeras, Developer guides docsTensorFlow, tutorials
Build itAsk-a-Podcast RAG Searchintermediate · 12-20 hours
Checkpoint
Don't wait, start applying
You don't have to finish the path to begin. Early applications and interviews show you exactly what to learn next.
Start applying to ML / AI roles nowReal applications and interviews tell you what to learn next. Begin before you finish.Browse jobs
Stage 05
Stage 4, LLM & AI Application Engineering
Build production AI features on foundation models: prompting, RAG over your own data, tool-using agents (including MCP for tool/data wiring), and, critically, evaluation. In 2026 this is a mainstream requirement, not a specialty.
LLM APIs, prompting & structured outputsEssential3 links
LLM APIs (such as those from Anthropic, OpenAI, and providers on OpenRouter) expose large language models over HTTP for tasks including text generation, summarization, and reasoning. Prompting techniques shape model behavior through system messages and few-shot examples, while structured outputs constrain responses to JSON schemas using tool/function calling or response format parameters.
Why it matters · The entry point to AI engineering: chat/completions APIs, tool/function calling, structured (JSON) outputs, and prompt caching.
docsOpenAI, Text generation & prompting guide docsClaude, Prompt engineering overview courseDeepLearning.AI, ChatGPT Prompt Engineering for Developers
Embeddings, vector databases & RAGEssential3 links
Embeddings are dense numerical representations of text, images, or other data produced by neural encoders that place semantically similar content close together in vector space. Vector databases (Pinecone, Weaviate, pgvector) index these embeddings for fast approximate nearest-neighbor retrieval. Retrieval-Augmented Generation (RAG) combines a retrieval step over a vector index with an LLM to ground responses in specific documents.
Why it matters · RAG is the most-deployed LLM pattern in 2026; you must handle chunking, hybrid search, reranking, and corpus drift.
articlePinecone, Learn (embeddings, vector search, RAG)docsLlamaIndex, documentation docspgvector, open-source vector search for Postgres
AI agents & orchestrationEssential3 links
AI agents are systems where a language model iteratively reasons, selects actions, calls external tools, and updates its plan based on intermediate results. Orchestration frameworks such as LangGraph, CrewAI, and PydanticAI provide abstractions for defining agent graphs, managing state across steps, handling retries, and composing multi-agent workflows.
Why it matters · The fastest-growing AI skill: systems where an LLM plans, calls tools, holds state, and recovers from failure (LangGraph/CrewAI/PydanticAI).
docsLangGraph, documentation courseHugging Face, Agents Course (free)articleAnthropic, Building effective agents
Model Context Protocol (MCP) & tool integrationRecommended2 links
Model Context Protocol (MCP) is an open standard introduced by Anthropic in 2024 and adopted across the industry for connecting AI agents to external tools, data sources, and services through a standardized client-server interface. MCP servers expose resources and callable tools, allowing agents to retrieve data, execute actions, and integrate with APIs in a consistent, composable way.
Why it matters · MCP became the de-facto open standard in 2025-2026 for connecting agents to tools and data (backed by Anthropic, OpenAI, Google, Microsoft); increasingly expected for agentic roles.
docsModel Context Protocol, Getting started (official docs)articleAnthropic, Introducing the Model Context Protocol
LLM evaluation & guardrailsEssential2 links
LLM evaluation encompasses methods for measuring the quality, accuracy, and safety of language model outputs, including LLM-as-judge scoring, retrieval metrics (precision, recall, MRR), and task-specific benchmarks. Guardrails are validation layers applied at input and output to detect prompt injection, enforce output schemas, filter harmful content, and prevent model behavior from drifting outside acceptable boundaries.
Why it matters · Without evals (LLM-as-judge, retrieval metrics) and prompt-injection defense, AI features silently regress; eval rigor is exactly what employers screen for.
docsRagas, RAG/LLM evaluation framework (docs)articleOWASP, Top 10 for LLM Applications
Fine-tuning (SFT / LoRA / DPO)Recommended2 links
Fine-tuning adapts a pretrained language model to a specific task or style by continuing training on a curated dataset. Supervised fine-tuning (SFT) trains on labeled examples, LoRA (Low-Rank Adaptation) injects small trainable weight matrices to reduce compute cost, and DPO (Direct Preference Optimization) aligns model outputs to human preferences without a separate reward model.
Why it matters · Useful for shaping behavior/format and cutting cost/latency with smaller models, but it's for behavior, not for teaching new facts (use RAG for that).
docsHugging Face, PEFT (LoRA and friends) docs courseHugging Face, Fine-tuning (LLM Course)
Build itChatbot That Remembers Youintermediate · 10-16 hours
Stage 06
Stage 5, MLOps, Deployment & Production Systems
Ship models/AI systems as reliable services: containerize, serve via an API, deploy to a cloud, version data/models, and monitor in production. This is what turns a notebook into a hireable skill set.
Serving models as APIs (FastAPI) + DockerEssential2 links
FastAPI is a modern Python web framework for building HTTP APIs with automatic OpenAPI documentation and async support, commonly used to expose ML models as prediction endpoints. Docker packages an application and its dependencies into a portable container image, ensuring consistent behavior across development, testing, and production environments.
Why it matters · The standard pattern for exposing a model/agent; containerization is assumed for essentially any deployment.
docsFastAPI, Tutorial - User Guide docsDocker, Get started
Cloud platform (AWS, GCP, or Azure), pick oneEssential2 links
AWS, GCP, and Azure are the three leading public cloud platforms, each offering managed compute, storage, networking, databases, and ML-specific services (SageMaker, Vertex AI, Azure ML). Cloud platforms provide the infrastructure for training, deploying, and scaling ML models without managing physical hardware.
Why it matters · Fluency in one cloud is expected for nearly all ML roles; AWS leads in 2026 with GCP and Azure close behind.
courseAWS Skill Builder, Machine Learning Learning Plan (free)courseGoogle Cloud Skills Boost, Professional ML Engineer path
Experiment tracking & data/model versioningEssential2 links
Experiment tracking tools such as MLflow and Weights and Biases (W&B) record hyperparameters, metrics, artifacts, and code for each training run, enabling comparison and reproducibility. Data and model versioning with tools like DVC (Data Version Control) applies Git-like semantics to large datasets and model checkpoints stored in remote storage.
Why it matters · MLflow/W&B + DVC make experiments reproducible and models auditable, core to any real ML pipeline.
docsMLflow, documentation docsDVC, data and model versioning (docs)
End-to-end MLOps (pipelines, CI/CD, monitoring)Essential3 links
MLOps applies software engineering practices to the full ML lifecycle, from data ingestion and model training to deployment and monitoring. Pipeline orchestrators (Kubeflow, Prefect, Airflow) automate workflow steps, CI/CD systems (GitHub Actions) run tests and deploy on code changes, and monitoring tools track prediction drift, data distribution shift, latency, and cost in production.
Why it matters · Orchestration, GitHub Actions, and monitoring (drift, latency, cost) are the 'boring' skills that most distinguish hireable ML engineers.
courseMLOps Zoomcamp, DataTalks.Club (free, self-paced)courseMade With ML, production ML course (free)articleGoogle Cloud, MLOps: Continuous delivery and automation pipelines
LLM serving & inference optimization (vLLM, quantization)Recommended2 links
LLM serving frameworks such as vLLM use techniques like PagedAttention and continuous batching to maximize GPU throughput when hosting large language models. Quantization reduces model weight precision (FP8, INT4, GPTQ) to decrease memory footprint and increase inference speed, enabling larger models to run on fewer or less expensive accelerators.
Why it matters · Self-hosting LLMs cost-effectively (PagedAttention, continuous batching, FP8/INT4 quantization) is an increasingly demanded edge.
docsvLLM, documentation docsHugging Face, Quantization (concepts & guide)
Kubernetes, Terraform & big data (Spark)Optional2 links
Kubernetes is a container orchestration system that automates deployment, scaling, and management of containerized workloads across clusters. Terraform is an infrastructure-as-code tool for provisioning and managing cloud resources declaratively. Apache Spark is a distributed computing engine for processing large-scale datasets in parallel across many nodes.
Why it matters · Needed at scale and in platform/infra-heavy roles, but not required to land a first ML/AI engineering job.
docsKubernetes, tutorials docsApache Spark, documentation
Build itLLM Trip Planner With Live Toolsintermediate · 8-14 hours
Stage 07
Stage 6, Portfolio, Specialization & Job Readiness
Prove you can deliver end-to-end and get hired. Build 2-3 deployed, documented projects, pick a depth area, and prepare for ML system design + coding interviews.
End-to-end portfolio projects (deployed + documented)Essential2 links
End-to-end portfolio projects demonstrate the ability to take a problem from raw data through modeling, serving, and monitoring to a live, accessible application. Documentation covers architecture decisions, dataset sources, model choices, and performance metrics, while deployment to a public URL or API makes the work verifiable and shareable.
Why it matters · Employers hire demonstrated delivery (data -> model/RAG/agent -> live API -> monitoring) over certificates; this is your strongest signal.
projectKaggle, Competitions (practice + visibility)projectMade With ML, project-based MLOps curriculum
ML system design interview prepEssential2 links
ML system design interviews assess the ability to architect complete machine learning systems, covering problem framing, data collection and labeling, feature pipelines, model selection, training infrastructure, serving, and monitoring. Preparation involves studying canonical systems (recommendation engines, search ranking, fraud detection) and practicing structured trade-off discussions.
Why it matters · Mid/senior ML interviews center on designing data-to-serving systems (trade-offs, scaling, monitoring), not just algorithms.
articleChip Huyen, Introduction to ML Interviews Book (free online)articleEugene Yan, Start Here (applied ML & system design writing)
Coding & DSA interview practiceRecommended2 links
Coding interviews for ML and AI engineering roles test general software engineering proficiency through algorithmic problems involving data structures (trees, graphs, hash maps) and algorithm design (sorting, dynamic programming, two-pointer techniques). Practice platforms such as LeetCode provide a large bank of problems organized by topic and difficulty.
Why it matters · ML/AI engineering roles still run software-engineering coding rounds; steady practice keeps you competitive.
projectNeetCode, coding interview practice & roadmap projectLeetCode, practice problems
Pick a depth specialization (NLP/LLMs, CV, RecSys, or platform/MLOps)Recommended2 links
Depth specialization means developing concentrated expertise in one ML subfield: Natural Language Processing and LLMs (text understanding, generation, fine-tuning), Computer Vision (image classification, detection, segmentation), Recommender Systems (collaborative filtering, ranking, retrieval), or ML platform and MLOps (infrastructure, pipelines, tooling). A clear specialization complements a general ML foundation and aligns with specific team needs.
Why it matters · A generalist foundation plus one credible depth area makes you memorable and matches how teams actually hire.
courseStanford CS231n, Deep Learning for Computer Vision (notes free)courseStanford CS224n, NLP with Deep Learning (materials free)
Build itLocal LLM Workstationintermediate · 4-8 hours
Land the job
Turn these skills into offers
ResuMax takes you from skilled to hired: a resume that proves it, applications tailored per role, and interview reps.
Build a resume that proves these skillsIn ResuMaxOpen builder
Tailor it to each ML / AI postingIn ResuMaxTailor
Apply to ML / AI jobs matched to youIn ResuMaxBrowse jobs
Practice ML / AI interviewsIn ResuMaxStart prep

Browse all coding projects

Train on this path

Atlas reads your resume, shows what you already have on this path, and coaches the gaps in order.

Map my resume

Stage 0, Software Engineering Foundations

Stage 1, Math & Statistics for ML (right-sized)

Stage 2, Data Wrangling & Classical Machine Learning

Stage 3, Deep Learning with PyTorch

Don't wait, start applying

Stage 4, LLM & AI Application Engineering

Stage 5, MLOps, Deployment & Production Systems

Stage 6, Portfolio, Specialization & Job Readiness

Turn these skills into offers