Roadmap to a job
Cloud Engineer
A Cloud Engineer designs, provisions, automates, and operates the infrastructure that runs applications on AWS, Azure, or GCP.
8 stages · 20 skills · 51 free resources
Stage 01
Stage 1, Computing Foundations (Linux, Networking, Git, Scripting)
Get comfortable on a Linux command line, understand how networks actually move traffic, track changes with Git, and automate small chores with scripts. These are the bedrock under every cloud skill.
Linux command line & shell · Essential · 3 links
The Linux command line is a text-based interface for interacting with a Linux operating system through a shell such as Bash or Zsh. It is used to navigate file systems, manage processes, configure services, and automate repetitive tasks. Nearly all cloud infrastructure runs on Linux, making shell proficiency foundational to operating and troubleshooting cloud workloads.
Why it matters · The overwhelming majority of cloud workloads run on Linux; you cannot operate, debug, or automate them without being at home in the terminal.
Networking fundamentals (IP, DNS, routing, subnets, TCP/HTTP, load balancing) · Essential · 3 links
Networking fundamentals cover the protocols and concepts that govern how data moves between machines: IP addressing and subnetting define host identity and segmentation, DNS translates names to addresses, TCP and HTTP structure reliable and web-layer communication, routing determines traffic paths, and load balancing distributes requests across servers. These concepts map directly to cloud constructs such as VPCs, security groups, and managed load balancers.
Why it matters · Cloud is networking wearing new names: VPCs, subnets, security groups, and DNS are these fundamentals applied. Shaky networking is the single most common thing that stalls junior cloud engineers.
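The subnetting idea above can be tried locally before touching any cloud console. This is a minimal sketch using Python's standard `ipaddress` module. The `10.0.0.0/16` range and the host address are assumptions picked only for illustration, not values from any real VPC.

```python
import ipaddress

# Hypothetical VPC range, chosen only for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the /16 into /24 subnets and keep the first four.
subnets = list(vpc.subnets(new_prefix=24))[:4]

for net in subnets:
    # num_addresses counts every address in the block, including the
    # network and broadcast addresses.
    print(net, net.num_addresses)

# Membership test: which subnet does a given host address fall into?
host = ipaddress.ip_address("10.0.2.17")
owner = next(net for net in subnets if host in net)
print(f"{host} belongs to {owner}")  # 10.0.2.17 belongs to 10.0.2.0/24
```

Cloud providers usually reserve a few addresses in every subnet, so the usable host count in a real VPC is slightly lower than `num_addresses`.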
Git & GitHub (version control) · Essential · 3 links
Git is a distributed version control system that tracks changes to files over time, enabling multiple contributors to collaborate without overwriting each other's work. GitHub is a hosted platform built on Git that adds pull requests, issue tracking, and Actions-based CI/CD workflows. Together they form the foundation for managing infrastructure code, application source, and team collaboration.
Why it matters · Infrastructure as code, CI/CD, and GitOps all live in Git, and collaborating through pull requests is a daily expectation on any infrastructure team.
Scripting: Python + Bash · Essential · 3 links
Bash is a Unix shell scripting language used to chain commands, automate system tasks, and write portable scripts that run on any Linux host. Python is a general-purpose language with a rich ecosystem of cloud SDKs (boto3 for AWS, azure-sdk, google-cloud libraries) and is the dominant choice for automation, data processing, and tooling across cloud platforms. Combining both covers the full range of scripting tasks a cloud engineer encounters daily.
Why it matters · Automation is the heart of the role; Python is the dominant glue language across cloud tooling and SDKs, while Bash ties together everything that happens on a Linux host.
Stage 02
Stage 2, Pick ONE Cloud Platform & Go Deep (AWS recommended)
Master the core services of a single provider (compute, storage, networking, identity, and databases) and earn an entry-level certification to confirm the fundamentals.
Cloud core services (compute, storage, networking, IAM, databases) · Essential · 3 links
Cloud core services are the primary building blocks offered by providers such as AWS, Azure, and GCP: compute services (EC2, virtual machine scale sets) run application workloads, object storage (S3, GCS) persists unstructured data, managed databases (RDS, Cloud SQL) handle relational data, networking constructs (VPCs, subnets) isolate resources, and IAM controls who and what can access them. Deep familiarity with one provider's versions of these services is the basis for all more advanced cloud work.
Why it matters · Depth in one provider (EC2 / S3 / VPC / IAM / RDS on AWS) is far more hireable than thin familiarity with three; the underlying concepts transfer to Azure or GCP later with little friction.
Cloud networking & IAM basics (VPC, subnets, security groups, routing, least-privilege) · Essential · 2 links
A Virtual Private Cloud (VPC) is a logically isolated network within a cloud provider where resources are deployed across subnets (subdivided IP ranges). Security groups act as stateful firewalls controlling inbound and outbound traffic at the resource level, while routing tables direct traffic between subnets, gateways, and the internet. Least-privilege IAM means granting identities only the permissions they require, which is the primary control for limiting blast radius from misconfigurations or breaches.
Why it matters · Standing up secure, well-segmented networks with sane access controls is a defining daily task and appears in nearly every cloud-engineer posting; IAM is learned here, not bolted on later.
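To make least privilege concrete, here is a small sketch that builds an AWS-style IAM policy document granting read-only access to a single S3 bucket and nothing else. The helper name `read_only_bucket_policy` and the bucket name `example-app-logs` are hypothetical. The snippet only prints JSON and does not call AWS.

```python
import json


def read_only_bucket_policy(bucket: str) -> dict:
    """Build an IAM-style policy allowing read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


# Hypothetical bucket name, used only for illustration.
policy = read_only_bucket_policy("example-app-logs")
print(json.dumps(policy, indent=2))
```

The point of the sketch is what is absent: no wildcard actions, no write or delete permissions, and no other buckets, so a leaked credential with this policy can read one bucket and nothing more.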
Entry cloud certification (AWS Solutions Architect Associate) · Recommended · 2 links
The AWS Certified Solutions Architect Associate (SAA-C03) is an entry-to-mid level credential that validates knowledge of AWS core services, architecture best practices, and cost and reliability trade-offs. The exam covers compute, storage, database, networking, security, and high-availability patterns across AWS. It is among the most widely recognized cloud certifications and is commonly listed as a baseline requirement in cloud engineering job postings.
Why it matters · Certs don't replace projects, but AWS SAA is a widely recognized screening signal for junior and mid cloud roles; treat Cloud Practitioner as an optional warm-up rather than a hiring credential.
Stage 03
Stage 3, Infrastructure as Code (Terraform)
Stop provisioning by hand: define, version, and reproduce entire environments in code. This is the single most-requested modern cloud skill.
Terraform (provision real infrastructure as code; OpenTofu aware) · Essential · 3 links
Terraform is an infrastructure-as-code tool that uses a declarative configuration language (HCL) to define, provision, and version cloud resources across providers including AWS, Azure, and GCP. It maintains a state file to track real infrastructure and plans changes before applying them, enabling safe, repeatable deployments. OpenTofu is the MPL-licensed, Linux Foundation fork of Terraform created after HashiCorp relicensed Terraform under the Business Source License in 2023, and shares the same HCL syntax.
Why it matters · IaC is non-negotiable in 2026 postings, and Terraform's HCL is the de facto standard. HashiCorp relicensed Terraform under the BSL in 2023, and OpenTofu, the MPL-licensed Linux Foundation fork, is now common in enterprises; the HCL you learn carries to both.
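The "plan before apply" idea can be illustrated with a toy model in Python. This is not how Terraform is implemented: the real tool also resolves dependencies between resources and checks actual infrastructure through provider APIs. The `plan` function, the resource addresses, and the attributes below are all hypothetical. The sketch only shows the core comparison between recorded state and desired configuration.

```python
def plan(state: dict, desired: dict) -> dict:
    """Compare recorded state with desired config and list proposed actions."""
    return {
        # In the config but not yet in state: needs to be created.
        "create": sorted(desired.keys() - state.keys()),
        # In state but no longer in the config: would be destroyed.
        "destroy": sorted(state.keys() - desired.keys()),
        # Present in both, but the attributes differ: would be updated.
        "update": sorted(
            name
            for name in desired.keys() & state.keys()
            if desired[name] != state[name]
        ),
    }


# Hypothetical state and configuration, for illustration only.
state = {
    "aws_s3_bucket.logs": {"versioning": False},
    "aws_instance.web": {"instance_type": "t3.micro"},
    "aws_instance.old": {"instance_type": "t2.micro"},
}
desired = {
    "aws_s3_bucket.logs": {"versioning": True},
    "aws_instance.web": {"instance_type": "t3.micro"},
    "aws_security_group.web": {"ingress_port": 443},
}

changes = plan(state, desired)
print(changes)
```

Nothing is modified until the proposed actions are reviewed and applied, which is what makes the workflow safe to repeat.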
Configuration management (Ansible) · Recommended · 2 links
Ansible is an agentless configuration management and automation tool that uses YAML-based playbooks to describe the desired state of servers: installed packages, configuration files, and running services. It connects to hosts over SSH and applies changes idempotently, making it suitable for provisioning application environments on top of infrastructure that tools like Terraform have already created. It is also used for ad-hoc task execution and rolling deployments.
Why it matters · Terraform provisions the infrastructure; Ansible configures what runs on it. The pairing is still listed across many infrastructure and DevOps postings.
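Idempotency, the property that running the same change twice leaves the system in the same state, is easier to grasp with a small example. The sketch below is plain Python, not Ansible. The `ensure_line` helper is hypothetical and only loosely mirrors what a configuration-management task does when it ensures a line exists in a file.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Ensure `line` is present in the file; return True only if it changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # Already in the desired state: do nothing.
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical config file inside a throwaway directory.
    config = Path(tmp) / "sshd_config"
    first = ensure_line(config, "PermitRootLogin no")
    second = ensure_line(config, "PermitRootLogin no")
    final_text = config.read_text()

print(first, second)  # True False
```

The first run reports a change and the second reports none, which is the behavior that makes it safe to re-run a playbook against hosts that are already configured.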
Stage 04
Stage 4, Containers & Orchestration (Docker → Kubernetes)
Package applications into containers and run them reliably at scale. Containers are the default deployment model for cloud-native workloads.
Docker (containerization) · Essential · 3 links
Docker is a containerization platform that packages an application and its dependencies into a portable image, which runs as an isolated container on any host with the Docker runtime installed. Dockerfiles define the image build process, and the Docker CLI manages building, running, tagging, and pushing images to registries such as Docker Hub or Amazon ECR. Containers provide consistent environments across development, testing, and production.
Why it matters · Containers are the standard unit of deployment; you need to build, run, and ship images comfortably before orchestration makes any sense.
Kubernetes (container orchestration) · Essential · 3 links
Kubernetes is an open-source container orchestration system that automates the deployment, scaling, self-healing, and networking of containerized workloads across a cluster of nodes. Core objects include Pods (the smallest deployable unit), Deployments (desired-state replicas), Services (stable network endpoints), and ConfigMaps or Secrets (configuration data). Managed distributions such as Amazon EKS, Azure AKS, and Google GKE handle control-plane operations so teams focus on workloads.
Why it matters · Kubernetes has shifted from nice-to-have to expected across most cloud and platform roles, and managed flavors (EKS / AKS / GKE) are everywhere.
Stage 05
Stage 5, Automated Delivery: CI/CD & DevOps Practices
Ship infrastructure and applications automatically and safely through pipelines. This is the workflow employers expect you to live inside.
CI/CD pipelines (GitHub Actions / GitLab CI / Jenkins) · Essential · 3 links
CI/CD pipelines automate the process of building, testing, and deploying code whenever changes are pushed to a repository. GitHub Actions uses YAML workflow files stored in the repository to define jobs that run on GitHub-hosted or self-hosted runners; GitLab CI uses a similar pipeline-as-code model within GitLab repositories; Jenkins is a self-hosted, plugin-based automation server with a long history in enterprise environments. All three reduce manual deployment steps and enforce quality gates before changes reach production.
Why it matters · Automated build, test, and deploy is a baseline expectation in 2026; pipelines are how your IaC and applications actually reach production.
GitOps (Argo CD / Flux) · Recommended · 2 links
GitOps is an operational model where a Git repository is the single source of truth for the desired state of a system, and an automated agent continuously reconciles the live environment to match it. Argo CD and Flux are the two primary Kubernetes-native GitOps controllers: both watch Git repositories for changes and apply them to a cluster, with Argo CD providing a web UI for visualization and Flux offering a more declarative, composable toolkit. The model improves auditability and reduces configuration drift.
Why it matters · Git-as-source-of-truth deployment is fast becoming the norm for Kubernetes; it's a strong differentiator and is rising quickly in postings.
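The reconciliation loop at the heart of GitOps can be sketched in a few lines of Python. This is a toy model, not Argo CD's or Flux's implementation. The `reconcile` function and the object names are hypothetical, dictionaries stand in for the Git repository and the cluster, and deleting objects that are missing from Git is assumed to be enabled.

```python
def reconcile(desired: dict, live: dict) -> list:
    """Mutate `live` until it matches `desired`; return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            actions.append(f"create {name}")
        elif live[name] != spec:
            actions.append(f"update {name}")
        live[name] = spec
    # Anything running that Git does not describe is drift: remove it.
    for name in list(live):
        if name not in desired:
            actions.append(f"delete {name}")
            del live[name]
    return actions


# Hypothetical desired state (from Git) and live state (from the cluster).
desired = {"deployment/web": {"replicas": 3}, "service/web": {"port": 80}}
live = {"deployment/web": {"replicas": 5}, "deployment/debug": {"replicas": 1}}

actions = reconcile(desired, live)
print(actions)
# ['update deployment/web', 'create service/web', 'delete deployment/debug']
print(reconcile(desired, live))  # [] because nothing is left to fix
```

A real controller runs this comparison continuously, so a manual change to the cluster is reverted on the next pass unless it is also committed to Git.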
Checkpoint
Don't wait: start applying
You don't have to finish the path to begin. Early applications and interviews show you exactly what to learn next.
Stage 06
Stage 6, Cloud Security & Secrets
Apply least-privilege access, encryption, secrets hygiene, and secure network design. Security is now embedded in the cloud-engineer role, not someone else's job.
IAM hardening, encryption, secrets management · Essential · 3 links
IAM hardening involves applying least-privilege policies, enforcing multi-factor authentication, rotating credentials, and auditing access to limit the impact of compromised identities in cloud environments. Encryption protects data at rest (using provider-managed or customer-managed keys via services such as AWS KMS) and in transit (via TLS). Secrets management tools such as HashiCorp Vault, AWS Secrets Manager, and Azure Key Vault store and programmatically distribute sensitive values like API keys and database passwords, keeping them out of source code and environment variables.
Why it matters · Misconfigured IAM and leaked secrets are among the leading causes of cloud breaches; postings now assume a working security mindset, not just functional infrastructure.
Stage 07
Stage 7, Observability, Cost (FinOps) & Multi-Cloud (Seniority Levers)
Make systems visible, keep their spend under control, and broaden beyond a single provider. These are the skills that lift you from junior toward senior and raise pay.
Monitoring & observability (Prometheus, Grafana, CloudWatch) · Essential · 3 links
Prometheus is an open-source time-series metrics database that scrapes instrumented endpoints and supports a query language (PromQL) for alerting and analysis. Grafana is a visualization platform that queries Prometheus (and many other data sources) to build dashboards for infrastructure and application health. Amazon CloudWatch is the native AWS observability service that collects logs and metrics from AWS resources, enabling alarms and log queries through CloudWatch Logs Insights; distributed tracing is handled by AWS X-Ray, which integrates with CloudWatch.
Why it matters · You operate what you build; metrics, logs, and dashboards are required to keep production healthy and to debug incidents when they happen.
Cloud cost optimization (FinOps) · Recommended · 2 links
FinOps (Financial Operations) is a practice that combines engineering, finance, and business functions to understand and reduce cloud spending without sacrificing reliability or performance. It involves tagging resources for cost attribution, right-sizing compute instances, using reserved or spot capacity where appropriate, setting budget alerts, and reviewing usage data through tools such as AWS Cost Explorer or Azure Cost Management. The goal is to make cost a first-class consideration at provisioning time, not an afterthought.
Why it matters · Cost-aware engineers are scarce and well compensated; building cost guardrails at provisioning time is a clear 2026 platform-team trend.
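Cost attribution through tags, the first FinOps habit listed above, comes down to a group-by over billing data. The sketch below uses made-up line items, a hypothetical `team` tag, and an arbitrary budget threshold. Real data would come from a billing export or a tool such as AWS Cost Explorer.

```python
from collections import defaultdict

# Hypothetical monthly line items, for illustration only.
line_items = [
    {"resource": "i-web-1", "cost": 42.50, "tags": {"team": "payments"}},
    {"resource": "i-web-2", "cost": 17.25, "tags": {"team": "search"}},
    {"resource": "bucket-logs", "cost": 8.00, "tags": {}},
    {"resource": "db-main", "cost": 120.00, "tags": {"team": "payments"}},
]


def cost_by_tag(items: list, key: str) -> dict:
    """Sum cost per tag value; untagged resources get their own bucket."""
    totals = defaultdict(float)
    for item in items:
        owner = item["tags"].get(key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


totals = cost_by_tag(line_items, "team")
print(totals)  # {'payments': 162.5, 'search': 17.25, 'untagged': 8.0}

# Assumed budget threshold per team, purely illustrative.
BUDGET = 100.0
over_budget = sorted(team for team, cost in totals.items() if cost > BUDGET)
print(over_budget)  # ['payments']
```

The `untagged` bucket is worth watching: spend that nobody owns is usually the first thing a tagging policy is meant to eliminate.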
Second cloud provider / multi-cloud literacy · Recommended · 2 links
Multi-cloud literacy refers to functional knowledge of more than one major cloud provider (typically AWS, Azure, and GCP) and the ability to map equivalent services between them (for example, AWS S3 to GCS, AWS IAM to GCP IAM, EKS to AKS to GKE). Enterprises often distribute workloads across providers for resilience, vendor negotiation leverage, or regulatory reasons. Core concepts such as VPCs, object storage, and managed Kubernetes transfer readily once one provider is mastered in depth.
Why it matters · Many enterprises run more than one cloud, so cross-provider fluency widens your options and pay, but only attempt it after one provider is genuinely solid.
Go programming language · Optional · 2 links
Go (also called Golang) is a statically typed, compiled language developed at Google, designed for simplicity, fast compilation, and efficient concurrency via goroutines and channels. It is the implementation language for Kubernetes, Terraform, Docker, Prometheus, and many other cloud-native tools, making it the natural choice for extending or contributing to that ecosystem. Go is also used to write cloud operators, CLI tools, and high-throughput microservices.
Why it matters · Go underpins much of the cloud-native toolchain (Kubernetes, Terraform); valuable for platform and tooling roles but not required to get hired.
Stage 08
Stage 8, Portfolio, Certifications & Job Search
Prove you can do the work with real deployed projects, validate it with targeted certs, and convert that into a junior cloud or cloud-support offer.
Portfolio projects (3-5 real, deployed, Terraform + CI/CD) · Essential · 2 links
Portfolio projects for cloud engineers are publicly visible, deployed infrastructure builds that demonstrate end-to-end skills: provisioning resources with Terraform, securing them with proper IAM, and delivering code through automated CI/CD pipelines. Effective projects include a README explaining architecture decisions, a diagram, and links to the live environment or recorded demo. Three to five well-documented projects spanning compute, networking, containers, and automation give reviewers concrete evidence of applied knowledge.
Why it matters · Hiring managers weight demonstrated, deployed infrastructure over paper credentials; well-documented projects are what actually land interviews.
Targeted certifications (Terraform Associate, CKA) · Recommended · 2 links
The HashiCorp Terraform Associate validates practical knowledge of infrastructure-as-code concepts, HCL syntax, state management, modules, and provider configuration, most of which carries over to OpenTofu. The Certified Kubernetes Administrator (CKA), issued by the Cloud Native Computing Foundation, is a hands-on, performance-based exam that tests the ability to deploy, configure, troubleshoot, and manage Kubernetes clusters and workloads. Both are widely recognized signals in cloud and platform engineering hiring pipelines.
Why it matters · After AWS SAA, the Terraform Associate and the Certified Kubernetes Administrator are the highest-signal certs for modern cloud roles.