
Roadmap to a job

Backend Engineer

A backend engineer builds the server-side systems behind every application: APIs, data stores, business logic, and authentication.

9 stages · 32 skills · 77 free resources

Core stack

Go · Postgres · Docker · Redis · GraphQL


  1. Stage 01

    Stage 1, Computing & Internet Fundamentals

    Build the mental models every later topic rests on: how the web moves data, how an operating system behaves, and how teams track changes, before writing a line of server code.

    How the internet and HTTP work (request/response, methods, status codes, headers, DNS, TLS) · Essential

    HTTP is the application-layer protocol that powers data exchange on the web, structuring communication as request/response pairs with methods (GET, POST, PUT, DELETE), status codes, and headers. DNS translates domain names into IP addresses, while TLS encrypts the connection between client and server. Together these mechanisms form the foundational plumbing that every networked service depends on.

    Why it matters · Every backend you write is ultimately an HTTP server answering requests; you cannot reason about APIs without this foundation.
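To make the request/response structure concrete, here is a minimal sketch that splits a raw HTTP/1.1 request into its parts. The `parse_request` helper, the host name, and the payload are made up for illustration; real servers use a hardened parser rather than string splitting.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")       # a blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return {"method": method, "path": path, "version": version,
            "headers": headers, "body": body}


# Hypothetical request, written out byte for byte as it would travel over the wire.
raw = (
    "POST /orders HTTP/1.1\r\n"
    "Host: api.example.com\r\n"
    "Content-Type: application/json\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    '{"sku": "A1"}'
)
request = parse_request(raw)
```

The method and path come from the first line, the headers follow one per line, and everything after the blank line is the body.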

    Linux and the command line (shell, files, processes, permissions, environment variables) · Essential

    Linux is a family of open-source Unix-like operating systems that power the vast majority of servers and cloud infrastructure. The command line provides direct control over the file system, running processes, user permissions, and environment configuration through a shell such as Bash or Zsh. Proficiency here enables navigation, scripting, and administration of any Linux-based environment.

    Why it matters · Production services run on Linux, and you deploy, inspect, and debug all of it from a terminal.

    Git and GitHub (commits, branches, merge and rebase, pull requests) · Essential

    Git is a distributed version control system that tracks changes to source code across time and across collaborators. GitHub is a hosting platform built on Git that adds pull requests, code review workflows, and CI integration. Together they form the standard mechanism for managing code history, branching, and collaborative software delivery.

    Why it matters · Version control is how all real code ships and collaborates; fluency here is assumed on day one of any team.

  2. Stage 02

    Stage 2, Pick ONE Language & Go Deep

    Reach genuine fluency in a single backend language: its types, error handling, package ecosystem, and concurrency model. Depth here compounds through every later stage.

    Choosing your primary language (Go / Python / Java / Node + TypeScript) · Essential

    Go, Python, Java, and Node.js with TypeScript are each mature, widely adopted backend languages with distinct runtime models and ecosystems. Go compiles to a small native binary with built-in concurrency primitives; Python offers rapid development with a rich data-science ecosystem; Java provides the JVM's performance and extensive enterprise tooling; Node.js enables non-blocking I/O using the same language as the browser, with TypeScript adding static types. Selecting one and developing deep fluency in its idioms, toolchain, and standard libraries is more productive than spreading effort across several.

    Why it matters · Hiring is ecosystem-specific: one strong stack earns interviews while dabbling across four does not. Pick by the market you are targeting.

    Core language proficiency (syntax, types, error handling, modules, package manager) · Essential

    Core language proficiency covers the syntax, type system, error handling patterns, module organization, and package management that govern day-to-day code in a given backend language. Each language has specific conventions (for example, Go's explicit error return values, Python's virtual environments and pip, Java's checked exceptions, or TypeScript's structural type system) that must be internalized to write idiomatic, maintainable code. This foundation underpins every library, framework, and system built on top of it.

    Why it matters · This is what you write business logic in every day; idiomatic fluency is what employers actually mean when they say you 'know' a language.

    Concurrency and async (threads/goroutines, async-await, the event loop) · Essential

    Concurrency refers to a program's ability to make progress on multiple tasks at the same time, either truly in parallel across CPU cores or interleaved on a single core. Go achieves this with lightweight goroutines and channels; JavaScript and Python use an async/await model layered on an event loop; Java uses threads managed by the JVM. Understanding which model a runtime uses determines how to write code that handles simultaneous requests without deadlocks, race conditions, or blocking.

    Why it matters · Backends handle many requests at once; a real grasp of concurrency is a key line between junior and senior engineers.
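The async/await model described above can be sketched with Python's `asyncio`. The `fetch` coroutine and its delays are placeholders for real network or database calls; this illustrates the event-loop model only, not threads or goroutines.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for I/O; while one coroutine waits,
    # the event loop is free to run the others.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # gather() runs the coroutines concurrently and returns results in call order.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1), fetch("c", 0.1))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start  # close to 0.1 s, not the 0.3 s a sequential run would take
```

Because the three waits overlap on a single thread, total time tracks the slowest call rather than the sum of all calls.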

    Data structures, algorithms, and Big-O (working knowledge) · Recommended

    Data structures are organized containers for data (arrays, hash maps, trees, graphs, heaps) and algorithms are the procedures that operate on them. Big-O notation describes how an algorithm's time or space requirements grow relative to input size, enabling comparison of efficiency. A working knowledge of these fundamentals guides decisions about which standard library type to use and whether a given solution will remain fast as data volumes grow.

    Why it matters · You rarely hand-roll a balanced tree at work, but interviews test this and it drives choosing the right data structure and writing efficient queries.
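As a small illustration of how growth rates differ, the sketch below counts the probes a linear search and a binary search need to find the last element of a sorted list. Both functions are written out by hand for clarity; in practice the standard library's `bisect` module covers the binary case.

```python
def linear_search(items: list, target: int) -> tuple:
    """Return (index, probes). Worst case is O(n)."""
    probes = 0
    for i, value in enumerate(items):
        probes += 1
        if value == target:
            return i, probes
    return -1, probes


def binary_search(items: list, target: int) -> tuple:
    """Return (index, probes) for a sorted list. Worst case is O(log n)."""
    lo, hi, probes = 0, len(items) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1      # discard the lower half
        else:
            hi = mid - 1      # discard the upper half
    return -1, probes


data = list(range(1_000_000))
_, linear_probes = linear_search(data, 999_999)   # inspects every element
_, binary_probes = binary_search(data, 999_999)   # halves the range each step
```

For a million sorted items the linear scan needs a million probes while the binary search needs at most twenty, which is the practical meaning of O(n) versus O(log n).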

  3. Stage 03

    Stage 3, Build Real APIs

    Stand up an HTTP server, design clean endpoints, validate input, version and document them, and ship a working API in your chosen framework.

    A web framework for your language (Express/NestJS, FastAPI/Django, Spring Boot, Gin/Echo) · Essential

    Web frameworks provide pre-built abstractions for routing HTTP requests, applying middleware, parsing request bodies, and sending responses, removing the need to write that plumbing from scratch. Express and NestJS serve Node.js; FastAPI and Django serve Python; Spring Boot serves Java; Gin and Echo serve Go. Each framework shapes the conventions for project structure, dependency injection, and request lifecycle within its ecosystem.

    Why it matters · Real backends are built on frameworks: routing, middleware, and request handling come ready-made, so you can focus on the domain.

    REST API design + OpenAPI (resources, verbs, status codes, pagination, versioning, idempotency) · Essential

    REST (Representational State Transfer) is an architectural style for designing networked APIs around resources identified by URLs, manipulated with standard HTTP verbs, and represented as JSON or other formats. OpenAPI (formerly Swagger) is a specification format for describing REST APIs in a machine-readable way, enabling documentation generation and client code generation. Well-designed REST APIs apply consistent status codes, pagination strategies, versioning schemes, and idempotency guarantees to remain predictable for consumers.

    Why it matters · REST remains the pragmatic default for external APIs; clean, documented endpoints are a core daily deliverable.
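One of the pagination strategies mentioned above, page/per-page (offset-based) pagination, might look like the following sketch. The `paginate` helper and the `data`/`meta` response shape are assumptions chosen for illustration, not a standard; cursor-based pagination is a common alternative for large or fast-changing collections.

```python
def paginate(items: list, page: int = 1, per_page: int = 20) -> dict:
    """Return one page of items plus metadata the client can use to navigate."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    start = (page - 1) * per_page
    total = len(items)
    total_pages = (total + per_page - 1) // per_page  # ceiling division
    return {
        "data": items[start:start + per_page],
        "meta": {"page": page, "per_page": per_page,
                 "total": total, "total_pages": total_pages},
    }


orders = [{"id": n} for n in range(1, 46)]        # 45 hypothetical records
page_3 = paginate(orders, page=3, per_page=20)    # the last, partial page
```

Returning the totals alongside the data lets a client render page controls without issuing a second request.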

    Input validation and error handling (schemas, 4xx vs 5xx, consistent error shapes) · Essential

    Input validation is the practice of verifying that incoming request data conforms to expected types, formats, and ranges before processing, typically using schema libraries such as Zod, Pydantic, or Jakarta Bean Validation. Error handling covers classifying failures as client errors (4xx) or server errors (5xx) and returning consistent, structured error response bodies. These practices prevent both security vulnerabilities from malformed inputs and confusing, inconsistent API behavior.

    Why it matters · A large share of security and reliability bugs trace back to unvalidated input or sloppy, inconsistent error responses.
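A hand-rolled sketch of the idea, using only the standard library: validate a payload, collect every problem, and return one consistent error shape with a 4xx status. The `validate_signup` function, the field rules, and the error envelope are illustrative assumptions; a schema library such as Pydantic would normally replace the manual checks.

```python
def validate_signup(payload: dict) -> tuple:
    """Return (status_code, body) for a hypothetical signup request."""
    errors = []
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append({"field": "email", "message": "must be a valid email address"})
    age = payload.get("age")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 13 <= age <= 120:
        errors.append({"field": "age", "message": "must be an integer from 13 to 120"})
    if errors:
        # 400: the client sent bad data. Every failure uses the same shape.
        return 400, {"error": {"code": "validation_failed", "details": errors}}
    return 201, {"email": email, "age": age}


bad_status, bad_body = validate_signup({"email": "nope", "age": "12"})
ok_status, ok_body = validate_signup({"email": "a@example.com", "age": 30})
```

Reporting all field errors at once, in a predictable structure, saves the client from fixing problems one round trip at a time.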

    GraphQL and/or gRPC (knowing which to reach for) · Recommended

    GraphQL is a query language and runtime for APIs that lets clients specify exactly which data fields they need, reducing over- and under-fetching, and is well suited to multi-client products with heterogeneous data needs. gRPC is a high-performance remote procedure call framework using Protocol Buffers for serialization and HTTP/2 for transport, optimized for low-latency service-to-service communication. Choosing between them (or defaulting to REST) depends on the communication pattern, client diversity, and performance requirements of the system.

    Why it matters · gRPC is increasingly the default for fast internal service-to-service calls; GraphQL earns its keep for complex multi-client products. Know the trade-offs rather than every detail.

  4. Stage 04

    Stage 4, Databases (Relational First)

    Model data, write correct and fast SQL, understand indexes, transactions, and normalization, then learn when a non-relational store is the right tool. This is the heart of backend work.

    SQL and relational modeling (joins, normalization, constraints, schema design) · Essential

    SQL (Structured Query Language) is the standard language for creating, querying, and manipulating data in relational databases such as PostgreSQL and MySQL. Relational modeling is the discipline of organizing data into tables with defined columns, primary and foreign keys, normalization rules to reduce redundancy, and constraints to enforce integrity. Core SQL operations including joins, aggregations, subqueries, and transactions are used daily to retrieve and transform data.

    Why it matters · Relational databases back most systems; confident SQL and data modeling come up in nearly every backend interview.
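The sketch below shows keys, constraints, a join, and an aggregation together. It uses SQLite through Python's built-in `sqlite3` module purely so the example runs without a server; the tables and data are invented, and PostgreSQL or MySQL syntax differs in small ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders (customer_id, total_cents) VALUES (1, 1500), (1, 2500), (2, 900);
""")

# Join the two tables and aggregate per customer.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS order_count, SUM(o.total_cents) AS spent_cents
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY spent_cents DESC
""").fetchall()

# The foreign key rejects an order that points at a customer who does not exist.
try:
    conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (99, 100)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Keeping customer names in one table and referencing them by key is normalization in miniature: the name is stored once, and the constraint guarantees every order points at a real customer.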

    Indexing, query performance, and EXPLAIN · Essential

    Database indexes are auxiliary data structures (most commonly B-trees) that allow the database engine to locate rows matching a query predicate without scanning the entire table, dramatically reducing query time. The EXPLAIN (or EXPLAIN ANALYZE) command reveals the query execution plan chosen by the planner, showing which indexes are used, estimated row counts, and where cost is concentrated. Reading and acting on execution plans is the primary technique for diagnosing and fixing slow queries.

    Why it matters · The gap between a 2 ms and a 2 s endpoint is very often a missing index or a bad query plan; reading plans is core senior-level value.
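To see a plan change when an index appears, here is a sketch using SQLite's `EXPLAIN QUERY PLAN` via the built-in `sqlite3` module. This is SQLite's variant, chosen only because it runs without a server; PostgreSQL's `EXPLAIN ANALYZE` output looks different and includes costs and timings. The table and index names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{n}@example.com", f"User {n}") for n in range(1000)],
)


def plan(sql: str) -> str:
    # Each plan row has four columns; the last one is the human-readable detail.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM users WHERE email = 'user500@example.com'"
before = plan(query)   # without an index the planner scans the whole table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query)    # with the index the planner searches it instead
```

Printing `before` and `after` shows the plan switching from a scan of `users` to a search that names `idx_users_email`; the exact wording varies between SQLite versions.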

    Transactions, ACID, and isolation levels (MVCC, concurrency) · Essential

    A database transaction groups a set of operations into a single atomic unit that either fully succeeds or fully rolls back, preserving data integrity. ACID properties (Atomicity, Consistency, Isolation, Durability) define the correctness guarantees a database provides. Isolation levels (read committed, repeatable read, serializable) control how concurrent transactions see each other's in-progress changes, with MVCC (Multi-Version Concurrency Control) being the mechanism most modern databases use to implement isolation without blocking reads.

    Why it matters · Money, orders, and inventory must stay correct under concurrent writes; getting transactions wrong silently corrupts data.
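Atomicity is easiest to see with a transfer that fails halfway. The sketch below uses SQLite through `sqlite3`; the `accounts` table and `transfer` function are hypothetical, and the example covers only atomic rollback, not isolation levels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id            INTEGER PRIMARY KEY,
        balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 1000), (2, 0)")
conn.commit()


def transfer(src: int, dst: int, amount_cents: int) -> bool:
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance_cents = balance_cents + ? WHERE id = ?",
                (amount_cents, dst))
            conn.execute(
                "UPDATE accounts SET balance_cents = balance_cents - ? WHERE id = ?",
                (amount_cents, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit was undone as well.
        return False


ok = transfer(1, 2, 400)          # both updates apply
rejected = transfer(1, 2, 5000)   # would overdraw account 1, so neither update applies
balances = dict(conn.execute("SELECT id, balance_cents FROM accounts"))
```

The credit is deliberately issued first: without a transaction, the failed debit would leave money created out of nothing in account 2.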

    NoSQL and when to use it (document, key-value, search; CAP and trade-offs) · Recommended

    NoSQL databases encompass a broad category of non-relational data stores optimized for specific access patterns: document stores (MongoDB) for flexible JSON-like records, key-value stores (Redis, DynamoDB) for fast single-key lookups, and search engines (Elasticsearch, OpenSearch) for full-text and faceted search. The CAP theorem states that when a network partition occurs, a distributed system must give up either consistency or availability (often summarized as "pick two of Consistency, Availability, and Partition tolerance"), and each NoSQL system makes its own trade-off between them. Selecting a NoSQL store appropriately depends on the query patterns, consistency requirements, and scale of the application.

    Why it matters · Document and key-value stores fit specific access patterns; recognizing when NOT to use relational is itself a real skill.

    ORMs and migrations (and their pitfalls) · Recommended

    An ORM (Object-Relational Mapper) is a library that maps database tables to objects or types in application code, allowing developers to query and manipulate data without writing raw SQL in most cases. Database migrations are versioned, incremental scripts that evolve the schema over time in a repeatable and trackable way, typically managed by tools such as Alembic, Flyway, Liquibase, or Prisma Migrate. Common ORM pitfalls include N+1 query problems, opaque generated SQL, and migration conflicts in parallel branches, all of which require understanding the underlying SQL to resolve.

    Why it matters · Teams evolve schema through migrations and reach data through ORMs; you also need to know when to drop down to raw SQL.
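The N+1 problem is easier to recognize once you count queries. This sketch reproduces the pattern with raw `sqlite3` calls (no ORM involved) and records each statement with a trace callback; the tables are invented. An ORM produces the same shape when a loop touches a lazily loaded relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO posts VALUES (1, 1, 'A'), (2, 2, 'B'), (3, 3, 'C');
""")

statements = []
conn.set_trace_callback(statements.append)  # record every statement SQLite runs

# N+1: one query for the posts, then one more per post to fetch its author.
posts = conn.execute("SELECT id, author_id, title FROM posts").fetchall()
for _, author_id, _ in posts:
    conn.execute("SELECT name FROM authors WHERE id = ?", (author_id,)).fetchone()
n_plus_one_count = len(statements)

statements.clear()
# The same data in a single round trip with a JOIN.
joined = conn.execute(
    "SELECT p.title, a.name FROM posts AS p JOIN authors AS a ON a.id = p.author_id "
    "ORDER BY p.id"
).fetchall()
join_count = len(statements)
```

With three posts the loop issues four statements and the join issues one; with ten thousand posts the difference becomes ten thousand and one round trips versus one.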

  5. Stage 05

    Stage 5, Authentication, Authorization & Security

    Secure your API correctly. Broken object-level authorization sits at the top of the OWASP API Security Top 10 (2023 edition), so this comes before scaling, not after.

    Authentication vs authorization, sessions, JWT, OAuth2/OIDC · Essential

    Authentication is the process of verifying who a user is (login), while authorization determines what that verified user is permitted to do. Sessions store a server-side record of a logged-in user identified by a cookie, whereas JWTs (JSON Web Tokens) are self-contained signed tokens that carry claims and can be verified without a database lookup. OAuth2 is an authorization delegation framework and OIDC (OpenID Connect) is an identity layer on top of it, together forming the industry standard for third-party login flows and API access delegation.

    Why it matters · Every non-trivial API needs login and access control; sessions or OAuth2/OIDC with JWTs are the industry baseline.
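To show why a JWT can be verified without a database lookup, here is a from-scratch sketch of HS256 signing and verification using only the standard library. It is for understanding the token format, not for production: it skips expiry and algorithm checks, the secret is a placeholder, and a maintained JWT library should be used in real services.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))  # restore padding


def sign(claims: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def verify(token: str, secret: bytes) -> Optional[dict]:
    """Return the claims if the signature matches, else None. Expiry is not checked."""
    try:
        header, payload, signature = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature)):
            return None
        return json.loads(b64url_decode(payload))
    except ValueError:
        return None  # malformed token


SECRET = b"demo-secret-not-for-production"   # placeholder value
token = sign({"sub": "user-42", "role": "member"}, SECRET)

# An attacker who edits the payload cannot produce a matching signature.
head, _, sig = token.split(".")
forged = ".".join([head, b64url(json.dumps({"sub": "user-42", "role": "admin"}).encode()), sig])
```

The server only needs the shared secret to trust the claims, which is what makes the token self-contained; the trade-off is that a signed token stays valid until it expires unless you add a revocation mechanism.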

    OWASP API Security Top 10 (especially broken object- and function-level authorization) · Essential

    The OWASP API Security Top 10 is a periodically updated list of the most critical security risks specific to APIs, maintained by OWASP (the Open Worldwide Application Security Project, formerly the Open Web Application Security Project). Broken Object Level Authorization (BOLA) occurs when an API exposes object identifiers and fails to verify whether the requesting user owns or can access the referenced object, while Broken Function Level Authorization occurs when sensitive operations (admin actions, elevated operations) are not properly restricted. Understanding this list guides secure-by-default API design and server-side access control patterns.

    Why it matters · Many API attacks come from authenticated users reaching data that isn't theirs; you must re-check authorization on every request, not just at login.
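The object-level check that prevents BOLA is small, which is part of why it gets forgotten. A minimal sketch, with a made-up `User` type and an in-memory `INVOICES` dictionary standing in for a database:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    is_admin: bool = False


# Hypothetical in-memory "table" of invoices keyed by ID.
INVOICES = {
    101: {"owner_id": 1, "amount_cents": 5000},
    102: {"owner_id": 2, "amount_cents": 7500},
}


def get_invoice(current_user: User, invoice_id: int) -> dict:
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        raise LookupError("invoice not found")   # would map to a 404 response
    # Object-level check: being logged in is not enough, the caller must own the object.
    if invoice["owner_id"] != current_user.id and not current_user.is_admin:
        raise PermissionError("not allowed")     # would map to a 403 response
    return invoice


alice = User(id=1)
own_invoice = get_invoice(alice, 101)
try:
    get_invoice(alice, 102)      # valid session, but someone else's invoice
    blocked = False
except PermissionError:
    blocked = True
```

The vulnerable version of this handler is the same function without the ownership comparison: it authenticates the caller and then returns whatever ID was requested.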

    Transport and data security (TLS/HTTPS, secrets management, password hashing, rate limiting) · Essential

    TLS (Transport Layer Security) encrypts data in transit between clients and servers, and HTTPS is HTTP carried over a TLS connection, ensuring confidentiality and integrity. Secrets management refers to storing API keys, database credentials, and certificates outside of source code, using tools such as environment variables, AWS Secrets Manager, or HashiCorp Vault. Password hashing with adaptive algorithms (bcrypt, Argon2) protects stored credentials, and rate limiting controls how many requests a client can make in a given window to reduce abuse and denial-of-service risk.

    Why it matters · Encrypting traffic, hashing passwords, keeping secrets out of source, and throttling abusive callers are table stakes for anything in production.
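A sketch of salted, deliberately slow password hashing. bcrypt and Argon2 require third-party packages, so this example substitutes PBKDF2-HMAC-SHA256 from Python's standard library, which follows the same principle of a tunable work factor. The iteration count is an assumed value; check current guidance and tune it for your hardware.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # work factor (assumed value); higher means slower brute force


def hash_password(password: str) -> tuple:
    salt = os.urandom(16)  # unique random salt per password defeats precomputed tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest    # store both alongside the user record


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison


salt, stored = hash_password("correct horse battery staple")
```

At login, `check_password` recomputes the hash with the stored salt; the plain-text password is never stored anywhere.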

  6. Checkpoint

    Don't wait: start applying

    You don't have to finish the path to begin. Early applications and interviews show you exactly what to learn next.

  7. Stage 06

    Stage 6, Caching, Queues & Async Processing

    Make systems fast and loosely coupled: cache hot data, push slow work off the request path, and let services communicate through events.

    Caching with Redis (cache-aside, write-through, TTLs, invalidation) · Essential

    Redis is an in-memory data structure store used as a cache, message broker, and short-lived data store, with sub-millisecond read and write latency. The cache-aside pattern has the application check Redis first and populate it on a miss; write-through keeps the cache in sync by writing to both cache and database on every mutation. TTLs (time-to-live) automatically expire stale entries, while cache invalidation strategies ensure that updated data is reflected promptly; invalidation is the hardest part of caching to get right.

    Why it matters · Caching is often the highest-leverage performance win, and Redis, with sub-millisecond reads, is the near-universal default.
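The cache-aside flow can be sketched without a Redis server. Below, a small `TTLCache` class stands in for Redis GET and SET-with-expiry, and `load_user_from_db` stands in for a slow query; both are hypothetical, and the injected clock exists only to make expiry deterministic.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for Redis GET / SET with an expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]   # expired entries behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


db_reads = 0


def load_user_from_db(user_id):
    global db_reads
    db_reads += 1                 # stands in for a slow SQL query
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                       # hit: skip the database
    user = load_user_from_db(user_id)       # miss: read the source of truth
    cache.set(key, user, ttl_seconds)       # populate for later readers
    return user


now = [0.0]                                 # fake clock controlled by the example
cache = TTLCache(clock=lambda: now[0])
get_user(cache, 7)
get_user(cache, 7)                          # served from the cache
reads_after_two_calls = db_reads
now[0] = 61.0                               # move past the 60-second TTL
get_user(cache, 7)                          # expired, so the database is read again
reads_after_expiry = db_reads
```

The TTL bounds how stale a cached value can get, but an update within that window still needs an explicit delete or overwrite of the key.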

    Message queues and event-driven architecture (Kafka, RabbitMQ; pub/sub, work queues) · Recommended

    Message queues are durable intermediaries that decouple the producers and consumers of work, allowing services to communicate asynchronously without tight runtime dependencies. Apache Kafka is a distributed event streaming platform designed for high-throughput, ordered, persistent log-based messaging, commonly used for event sourcing and stream processing. RabbitMQ is a broker implementing AMQP, well suited for task queues and routing patterns where individual messages are delivered to a specific consumer and acknowledged upon completion.

    Why it matters · Background jobs, service decoupling, and event streaming are core to scalable backends: Kafka for streams, RabbitMQ for task queues.
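The work-queue pattern can be sketched in-process with the standard library's `queue.Queue`, which here stands in for a broker such as RabbitMQ. It is not durable and does not cross process boundaries, so treat it as a model of the producer/consumer/acknowledge flow only.

```python
import queue
import threading

jobs = queue.Queue()
results = []
results_lock = threading.Lock()


def worker() -> None:
    while True:
        job = jobs.get()
        if job is None:              # sentinel value: no more work for this worker
            jobs.task_done()
            break
        with results_lock:
            results.append(job * job)   # stands in for slow work such as sending an email
        jobs.task_done()             # acknowledge the message


workers = [threading.Thread(target=worker) for _ in range(3)]
for t in workers:
    t.start()

for n in range(10):                  # the producer only enqueues; it does not wait for the work
    jobs.put(n)
for _ in workers:
    jobs.put(None)                   # one sentinel per worker

jobs.join()                          # block until every job has been acknowledged
for t in workers:
    t.join()
```

Each job is handled by exactly one of the three workers, and completion order is not guaranteed, which is why the results are collected under a lock rather than assumed to arrive in sequence.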

  8. Stage 07

    Stage 7, Containers, Cloud & CI/CD

    Package, ship, and run your service the way modern teams do. In 2026 containers and a cloud provider are baseline expectations, not advanced extras.

    Docker (images, Dockerfiles, Compose, multi-stage builds) · Essential

    Docker is a platform for packaging applications and their dependencies into portable, isolated units called containers, defined by Dockerfiles that describe the image build process layer by layer. Docker Compose is a tool for defining and running multi-container applications (for example, an API service alongside a database and a cache) using a single declarative YAML file. Multi-stage builds reduce final image size by separating the build environment from the runtime environment within a single Dockerfile.

    Why it matters · Containers are now the default unit of deployment; 'works on my machine' no longer holds up.

    CI/CD pipelines (automated test, build, deploy with GitHub Actions) · Essential

    CI/CD (Continuous Integration and Continuous Delivery) is the practice of automatically building, testing, and deploying code on every push or pull request, catching regressions before they reach production. GitHub Actions is a workflow automation platform integrated into GitHub that runs YAML-defined jobs in response to repository events such as pushes, pull request opens, or scheduled triggers. A typical pipeline runs linting, unit tests, and integration tests, then builds a Docker image and deploys it to a staging or production environment.

    Why it matters · Teams ship through pipelines; wiring up tests and deploys is expected even at the junior level.

    Cloud fundamentals on one provider (AWS most common: compute, storage, managed DB, networking, IAM) · Essential

    Cloud providers offer on-demand infrastructure as managed services, with AWS being the most widely adopted platform. Core AWS concepts include EC2 or Lambda for compute, S3 for object storage, RDS or Aurora for managed relational databases, VPCs and security groups for networking, and IAM (Identity and Access Management) for controlling access to cloud resources. Understanding these primitives and how they relate to each other is the foundation for designing and operating cloud-hosted backend systems.

    Why it matters · Modern backends run on the cloud; cloud-native thinking is now baseline, and AWS dominates job postings.

    Kubernetes (pods, deployments, services, scaling) · Recommended

    Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of containerized applications across a cluster of machines. The core abstractions are Pods (one or more containers sharing a network namespace), Deployments (declarative desired state for a set of replicated Pods), and Services (stable network endpoints that route traffic to Pods). Kubernetes handles health checking, automatic restarts, rolling updates, and horizontal scaling based on resource metrics.

    Why it matters · The standard for orchestrating services at scale; expected for senior and infra roles, valuable but not day-one for juniors.

    Infrastructure as Code (Terraform basics) · Optional

    Infrastructure as Code (IaC) is the practice of defining and provisioning cloud infrastructure through machine-readable configuration files rather than manual console interactions. Terraform is a widely used IaC tool from HashiCorp (source-available rather than open-source since its 2023 license change) that uses a declarative HCL syntax to describe resources across AWS, GCP, Azure, and other providers, then plans and applies changes to match the desired state. Storing infrastructure configuration in version control enables reproducible environments, peer review of infrastructure changes, and auditability.

    Why it matters · Teams provision cloud through code; useful to read and tweak, but rarely something entry-level backend roles must author.

  9. Stage 08

    Stage 8, Testing, Observability & Reliability

    Prove your code works and keep it healthy in production. This is what turns a coder into an engineer teams trust on call.

    Testing (unit, integration, end-to-end; mocking; a test-first habit) · Essential

    Unit tests verify individual functions or classes in isolation using mocks or stubs for external dependencies; integration tests verify that multiple components (for example, a handler and a real database) work correctly together; end-to-end tests exercise the full system through its public interface. Mocking substitutes real dependencies (databases, HTTP clients) with controlled fakes so unit tests remain fast and deterministic. A test-first habit means writing tests alongside or before production code, making tests a first-class deliverable rather than an afterthought.

    Why it matters · Untested backends fail in production; teams expect tests to be part of shipping, not an afterthought.
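A unit test with a mocked dependency might look like this sketch, which uses `unittest.mock` from the standard library. The `convert` function and the `rates_client.get_rate` interface are invented for the example.

```python
from unittest.mock import Mock


def convert(amount: float, currency: str, rates_client) -> float:
    """Convert a USD amount using a rate fetched from an external service."""
    rate = rates_client.get_rate("USD", currency)
    return round(amount * rate, 2)


def test_convert_uses_rate_from_client():
    client = Mock()                          # fake dependency: no network involved
    client.get_rate.return_value = 0.5
    assert convert(10.0, "EUR", client) == 5.0
    # The mock also records how it was called, so the interaction can be checked.
    client.get_rate.assert_called_once_with("USD", "EUR")


test_convert_uses_rate_from_client()
```

Passing the client in as a parameter is what makes the substitution possible; an integration test would call the same function with a real client pointed at a test environment.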

    Observability, logging, metrics, and tracing (OpenTelemetry, Prometheus, Grafana) · Essential

    Observability is the ability to understand the internal state of a running system from its external outputs, comprising three pillars: logs (discrete timestamped events), metrics (numeric measurements aggregated over time), and distributed traces (correlated records of a request's path across services). OpenTelemetry is a vendor-neutral SDK and wire protocol for instrumenting applications to emit all three signals. Prometheus collects and stores time-series metrics, and Grafana provides dashboards and alerting on top of Prometheus and other data sources.

    Why it matters · You can't fix what you can't see; structured logs, metrics, and traces are how production incidents actually get diagnosed.
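Of the three pillars, structured logging is the simplest to sketch with the standard library alone. The example below emits one JSON object per log line; the field names, the `orders-api` logger name, and the `request_id` value are assumptions. Metrics and traces would normally come from an instrumentation library and are not shown.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed through `extra=` become attributes on the record.
            "request_id": getattr(record, "request_id", None),
            "duration_ms": getattr(record, "duration_ms", None),
        }
        return json.dumps(entry)


stream = io.StringIO()                     # a real service would usually log to stdout
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders-api")   # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False                   # keep the example's output in one place

logger.info("order created", extra={"request_id": "req-123", "duration_ms": 42})
event = json.loads(stream.getvalue())
```

Because every line is machine-parseable and carries a request ID, a log backend can filter all events for one request instead of relying on text search.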

  10. Stage 09

    Stage 9, System Design & Distributed Systems (Job-Ready)

    Reason about scale, reliability, and trade-offs across services. This is the senior signal and the core of the backend system-design interview.

    Scalability fundamentals (load balancing, replication, sharding, horizontal scaling, CDNs) · Essential

    Scalability is a system's ability to handle increased load by adding resources without redesigning the architecture. Load balancers distribute incoming requests across multiple server instances; database replication copies data to read replicas to spread query load; sharding partitions data across multiple database instances to scale writes. Horizontal scaling adds more machines rather than upgrading a single machine, and CDNs (Content Delivery Networks) cache and serve static assets from edge nodes geographically close to users.

    Why it matters · Designing systems that survive growth and partial failure is the defining senior backend skill and a standard interview round.
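Two of these ideas fit in a few lines: hash-based sharding and round-robin load balancing. The shard count, key format, and server names below are invented, and simple modulo sharding is only one option; it moves most keys when the shard count changes, which is why schemes such as consistent hashing exist.

```python
import hashlib
from collections import Counter
from itertools import cycle


def shard_for(key: str, shard_count: int) -> int:
    # Use a stable hash: Python's built-in hash() is randomized per process for strings,
    # so it would route the same key to different shards after a restart.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


SHARDS = 4
placement = Counter(shard_for(f"user-{n}", SHARDS) for n in range(10_000))

# Round-robin load balancing over three hypothetical app servers.
servers = cycle(["app-1", "app-2", "app-3"])
first_four = [next(servers) for _ in range(4)]
```

`placement` shows how evenly the ten thousand keys spread across the four shards, and `first_four` shows requests rotating through the servers before wrapping around.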

    Microservices vs monolith and service communication (sync vs async, API gateways) · Recommended

    A monolith is a single deployable unit containing all application logic, while a microservices architecture decomposes the system into independently deployable services that each own a bounded domain. Services communicate either synchronously (via HTTP/REST or gRPC, where the caller waits for a response) or asynchronously (via message queues or event streams, where the caller continues without waiting). API gateways sit at the edge of a microservices cluster and handle cross-cutting concerns such as authentication, rate limiting, and request routing.

    Why it matters · You should choose architecture deliberately and understand the operational cost of microservices, not adopt them by reflex.

    Integrating LLM APIs with a vector store (RAG basics, embeddings, pgvector/Pinecone) · Recommended

    LLM APIs (such as those provided via OpenRouter or directly from model providers) accept text prompts and return generated text, enabling language understanding and generation within backend applications. Embeddings are dense numerical vector representations of text that capture semantic similarity, produced by embedding models and stored in a vector database such as pgvector (a PostgreSQL extension) or Pinecone. RAG (Retrieval-Augmented Generation) is the pattern of embedding a user query, retrieving semantically similar documents from the vector store, and injecting them as context into an LLM prompt to ground its response in specific data.

    Why it matters · A fast-growing share of 2026 backend postings ask for LLM-API integration plus a vector database; it is a strong differentiator, though not yet universal.
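The retrieval step of RAG reduces to nearest-neighbor search over vectors. In this sketch the three-dimensional "embeddings" are hand-written numbers, the dictionary stands in for a vector store, and no model is called; a real system would get vectors from an embedding model and send the final prompt to an LLM API.

```python
import math


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hand-made 3-dimensional vectors; real embeddings have hundreds of dimensions or more.
DOCS = {
    "Refunds are issued within 5 business days.": [0.9, 0.1, 0.0],
    "Our office is closed on public holidays.": [0.1, 0.9, 0.1],
    "Shipping takes 2 to 4 days in the EU.": [0.2, 0.1, 0.9],
}


def retrieve(query_vector: list, k: int = 1) -> list:
    # Rank every document by similarity to the query and keep the top k.
    ranked = sorted(DOCS, key=lambda text: cosine(query_vector, DOCS[text]), reverse=True)
    return ranked[:k]


def build_prompt(question: str, query_vector: list) -> str:
    context = "\n".join(retrieve(query_vector, k=1))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# The query vector is also made up; it is chosen to sit close to the refunds document.
prompt = build_prompt("How long do refunds take?", [0.8, 0.2, 0.1])
```

A vector database performs the same ranking with an index instead of a full sort, which is what keeps retrieval fast once the collection grows beyond a few thousand documents.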

    Capstone portfolio project (containerized API + DB + cache + auth + CI/CD, deployed) · Essential

    A capstone portfolio project is a self-directed, end-to-end application that demonstrates practical integration of the full backend stack: a containerized API service, a relational or document database, a Redis cache layer, authentication, and a CI/CD pipeline that deploys to a public cloud environment. Unlike tutorial exercises, it is a real deployed service with a public URL, source code in a repository, and a README documenting the architecture and design decisions. Building and maintaining such a project provides concrete evidence of the ability to independently assemble and operate a production-grade system.

    Why it matters · A deployed end-to-end project is the single strongest hireability signal; it proves you can integrate every stage above into something real.

