Nginx as a Reverse Proxy | Engineering Notes

Your app runs on port 3000.

node server.js
# listening on 0.0.0.0:3000

You open http://localhost:3000 and it works.

Then it has to go live. A real domain. HTTPS on port 443. The same app, now reachable from the internet.

The first instinct is to make the app listen on 443 directly. That works for about a day, until the app also needs to serve static files, handle TLS certificates, survive restarts without dropping traffic, limit how fast a single client can hit it, and eventually run as more than one process behind one address.

A single application process is not a good front door.

Table of contents

The Problem Nginx Is Solving
The Big Idea
How Nginx Is Built
How A Request Flows
Server Blocks
Location Blocks
proxy_pass And The Trailing Slash
Passing The Real Client Information
WebSockets Need An Upgrade
Buffering And Timeouts
HTTP/2 vs HTTP/1.1
Compressing Responses
Tuning Workers To The Server
Rate Limiting
Hiding Server Information
Restricting Who Can Embed Your Site
Multiple Backends With Upstream
Failure Modes
A Practical Config
Operating Nginx
Final Thoughts

The Problem Nginx Is Solving

An application process is good at running application logic. It is not built to be the thing the public internet talks to directly.

When the app is the only layer, it has to handle:

TLS termination and certificate renewal
serving static files efficiently
slow clients holding connections open
request size limits and timeouts
rate limiting and abuse control
routing different paths to different services
staying reachable while the app itself restarts

Every one of these is work that has nothing to do with the actual business logic.

Nginx takes that work and sits in front. The app moves behind it and goes back to doing one job.

The Big Idea

Nginx becomes the single front door.

The public talks to Nginx. Nginx talks to the app over a private connection.

client -> nginx (443, TLS) -> app (3000, plain http)

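The app no longer faces the internet. It listens on a local port that only Nginx reaches, which in practice means binding to 127.0.0.1:3000 rather than 0.0.0.0:3000, or firewalling the port. Nginx handles the parts that face the outside world: the certificate, the public port, the timeouts, the limits.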

This is what “reverse proxy” means here. A normal proxy sits in front of clients and talks to many servers on their behalf. A reverse proxy sits in front of servers and takes requests from many clients on their behalf.

How Nginx Is Built

Nginx runs as one master process and a set of worker processes.

master process
   ├── worker process
   ├── worker process
   └── worker process

The master process reads the config, binds the ports, and manages the workers. It does not handle a single request itself. When you reload the config, the master is what starts new workers and retires old ones.

The workers do the actual work. Each worker is a single process running an event loop.

The older model, used by servers like Apache in its default setup, gave each connection its own thread or process. A thousand idle clients meant a thousand threads sitting around, each using memory and forcing the kernel to switch between them.

A single Nginx worker takes a different approach. It keeps thousands of connections open at once and only touches a connection when something actually happens on it, using the kernel’s event notification (epoll on Linux). When a connection is waiting on the network, the worker is not blocked on it. It moves on and services another.

This is why a slow client holding a connection open is cheap for Nginx and expensive for a thread-per-connection app. It is also the reason Nginx is placed in front of the app: it absorbs slow and idle connections so the app only deals with complete, ready requests.
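The event-loop idea can be sketched in a few lines of Python. This is not how Nginx is implemented; it is a minimal illustration, using the standard selectors module (which picks epoll on Linux), of one process holding many open connections and only being told about the ones with data waiting. The function name ready_connections and the connection counts are made up for the example.

```python
import selectors
import socket


def ready_connections(active, total=100):
    """Hold `total` open connections and report which ones have data waiting."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    pairs = []
    try:
        for conn_id in range(total):
            client, server = socket.socketpair()
            server.setblocking(False)
            # Register interest once. An idle connection needs no attention after this.
            sel.register(server, selectors.EVENT_READ, data=conn_id)
            pairs.append((client, server))

        # Only a few clients actually send something.
        for conn_id in active:
            pairs[conn_id][0].sendall(b"GET / HTTP/1.1\r\n\r\n")

        # One call returns just the connections where something happened.
        events = sel.select(timeout=1)
        return sorted(key.data for key, _mask in events)
    finally:
        sel.close()
        for client, server in pairs:
            client.close()
            server.close()


if __name__ == "__main__":
    print(ready_connections(active=[3, 42, 77]))
```

The other 97 connections stay open the whole time and cost the loop nothing until they have something to say.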

How A Request Flows

A request to https://example.com/api/users goes through a few steps.

The client opens a TLS connection to Nginx on port 443.
Nginx terminates TLS and now has a plain HTTP request.
Nginx matches the request against its server and location rules.
Nginx opens a connection to the app on 127.0.0.1:3000.
The app responds to Nginx.
Nginx sends the response back to the client over the encrypted connection.

The app sees a plain HTTP request coming from Nginx on the local machine. It never sees the TLS handshake and never sees the client directly.

Server Blocks

A server block tells Nginx how to handle traffic for a given name and port.

server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}

listen 443 ssl means this block handles HTTPS.

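server_name decides which block answers when a request arrives. One Nginx instance can hold many server blocks for many domains on the same port, and the Host header picks the right one. If no server_name matches, the request goes to the default server for that port, which is the first block defined unless another is marked default_server.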

The ssl_certificate lines point at the certificate and private key. Nginx uses them to terminate TLS.

Location Blocks

A location block decides what happens for a path.

location /api/ {
    proxy_pass http://127.0.0.1:3000;
}

location /static/ {
    root /var/www/app;
}

Requests to /api/... go to the application.

Requests to /static/... are served as files straight from disk. With root, Nginx appends the full request path to the directory, so /static/logo.png is read from /var/www/app/static/logo.png. The application is never involved, which keeps the app process free for requests that actually need it.

This split is the practical reason to put Nginx in front. Static files go out fast from disk, and dynamic requests go to the app.
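A rough Python model of that split, assuming only the two prefix locations above. Real Nginx location matching also has exact (=), ^~, and regex locations with their own precedence rules, none of which are modeled here. The names LOCATIONS and route are hypothetical.

```python
import posixpath

# Hypothetical mirror of the two prefix locations in the config above.
LOCATIONS = {
    "/api/": ("proxy", "http://127.0.0.1:3000"),
    "/static/": ("root", "/var/www/app"),
}


def route(path):
    """Pick the longest matching prefix and describe what would happen."""
    matches = [prefix for prefix in LOCATIONS if path.startswith(prefix)]
    if not matches:
        return ("unmatched", None)
    prefix = max(matches, key=len)
    kind, target = LOCATIONS[prefix]
    if kind == "root":
        # root appends the full request path to the directory.
        return ("file", posixpath.join(target, path.lstrip("/")))
    # proxy_pass without a URI part forwards the path unchanged.
    return ("proxy", target + path)


if __name__ == "__main__":
    print(route("/static/css/site.css"))
    print(route("/api/users"))
```

The thing to notice is the file path: the /static/ prefix is kept, so the files have to live under /var/www/app/static/, not directly under /var/www/app/.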

proxy_pass And The Trailing Slash

proxy_pass is the line that forwards a request to the backend. It has one behavior that is easy to miss the first time.

The trailing slash changes how the path is rewritten.

location /api/ {
    proxy_pass http://127.0.0.1:3000/;
}

With the trailing slash on proxy_pass, a request to /api/users reaches the app as /users. Nginx strips the matched /api/ prefix.

location /api/ {
    proxy_pass http://127.0.0.1:3000;
}

Without it, the same request reaches the app as /api/users. The full path is passed through.
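The rule behind both cases is that a proxy_pass with a URI part (even a bare /) replaces the matched location prefix with that URI, while a proxy_pass without one passes the path through untouched. A small Python sketch of that rule for prefix locations, ignoring query strings and regex locations; upstream_path is a hypothetical helper, not an Nginx API.

```python
from urllib.parse import urlsplit


def upstream_path(location_prefix, proxy_pass, request_path):
    """Return the path the backend sees for a prefix location."""
    uri_part = urlsplit(proxy_pass).path  # "" when proxy_pass has no URI part
    if uri_part == "":
        # No URI on proxy_pass: the original path is forwarded as-is.
        return request_path
    # URI present: the matched prefix is swapped for the proxy_pass URI.
    return uri_part + request_path[len(location_prefix):]


if __name__ == "__main__":
    print(upstream_path("/api/", "http://127.0.0.1:3000/", "/api/users"))
    print(upstream_path("/api/", "http://127.0.0.1:3000", "/api/users"))
    print(upstream_path("/api/", "http://127.0.0.1:3000/v1/", "/api/users"))
```

The same rule explains a common surprise: with location /api (no trailing slash) and proxy_pass ending in /, the backend receives //users, because only /api is replaced.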

Passing The Real Client Information

The app sees the request coming from Nginx, so by default it thinks every client is 127.0.0.1.

That breaks logging, rate limiting, and anything that depends on the client address or the original protocol.

Nginx has to pass that information forward as headers.

location / {
    proxy_pass http://127.0.0.1:3000;

    proxy_set_header Host              $host;
    proxy_set_header X-Real-IP         $remote_addr;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}

Host carries the domain the client actually asked for. Without this line, the backend receives the proxy_pass target as its host, which is something like 127.0.0.1:3000. Any logic that depends on the requested domain breaks.

X-Real-IP carries the client address as a single value.

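X-Forwarded-For carries the chain of client addresses the request passed through. The value $proxy_add_x_forwarded_for takes any existing X-Forwarded-For header and appends the current client’s address to it. With one proxy this is just the client IP. With several proxies in front, each hop adds its caller, so the app can read the original client at the start of the list. That first entry is only trustworthy if the outermost proxy overwrites or validates the header, because a client can send its own X-Forwarded-For and Nginx will append to whatever it receives.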

X-Forwarded-Proto tells the app whether the original request was http or https. The app needs this because, after TLS termination, the request arriving at the app is plain HTTP. Without this header, an app trying to build absolute https:// URLs may build http:// ones instead.
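How the X-Forwarded-For chain builds up is easier to see with concrete addresses. The sketch below mimics the append behavior of $proxy_add_x_forwarded_for in Python. The IP addresses are documentation-range placeholders and through_proxies is a hypothetical helper for walking a request through several hops.

```python
def proxy_add_x_forwarded_for(incoming_header, remote_addr):
    """Append the caller's address to any X-Forwarded-For value already present."""
    if incoming_header:
        return f"{incoming_header}, {remote_addr}"
    return remote_addr


def through_proxies(client_ip, proxy_ips, client_supplied=None):
    """Walk a request through each proxy in order and return the final header."""
    header = client_supplied
    caller = client_ip
    for proxy_ip in proxy_ips:
        header = proxy_add_x_forwarded_for(header, caller)
        caller = proxy_ip  # the next hop sees this proxy as its client
    return header


if __name__ == "__main__":
    # client -> CDN -> Nginx -> app
    print(through_proxies("203.0.113.7", ["198.51.100.20", "127.0.0.1"]))
    # Same path, but the client sent a forged header of its own.
    print(through_proxies("203.0.113.7", ["198.51.100.20", "127.0.0.1"],
                          client_supplied="10.0.0.1"))
```

In the second call the forged 10.0.0.1 ends up first in the list, which is why an app should only trust entries added by proxies it controls.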

WebSockets Need An Upgrade

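A plain proxy_pass does not carry a WebSocket connection. The connection starts as HTTP and asks to upgrade, but Upgrade and Connection are hop-by-hop headers that Nginx does not forward to the backend by default, so the upgrade has to be passed along explicitly.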

location /ws/ {
    proxy_pass http://127.0.0.1:3000;

    proxy_http_version 1.1;
    proxy_set_header Upgrade    $http_upgrade;
    proxy_set_header Connection "upgrade";
}

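Without these lines, the WebSocket handshake fails and the connection keeps falling back or dropping. Even with them, an idle WebSocket is still subject to proxy_read_timeout, so a connection that sends nothing for 60 seconds (the default) is closed unless the timeout is raised or the backend sends periodic pings.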

Buffering And Timeouts

When the backend sends a response, Nginx does not have to pass it through byte by byte. By default it reads the whole response from the app into a buffer as fast as the app can produce it, then feeds it out to the client at whatever pace the client can take.

This matters because clients are often slow and apps are often not built to wait. Without buffering, a phone on a weak connection would hold an app worker busy for the entire download. With buffering, the app hands the response to Nginx quickly and moves on, and Nginx deals with the slow client.

location / {
    proxy_pass http://app;

    proxy_buffering on;
    proxy_buffers 8 16k;
    proxy_buffer_size 16k;
}

proxy_buffering is the master switch for this response-side behavior, and it is on by default. proxy_buffer_size sets the buffer for the first part of the response, which holds the headers. proxy_buffers sets how many buffers each connection gets for the rest, and how large each one is. If a response does not fit in the memory buffers, Nginx spills it to a temporary file on disk, which is slower, so the sizes are worth matching to typical response sizes.
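The effect on the app is simple arithmetic. The sketch below is a back-of-the-envelope model, not a measurement: it assumes the whole response fits in Nginx's buffers or temp file, and the sizes and speeds are invented for illustration.

```python
def app_busy_seconds(response_bytes, app_bytes_per_s, client_bytes_per_s, buffering):
    """Estimate how long the app is tied up producing one response."""
    produce = response_bytes / app_bytes_per_s     # time for the app to generate it
    deliver = response_bytes / client_bytes_per_s  # time for the client to receive it
    if buffering:
        # Nginx absorbs the response at the app's pace and drains it separately.
        return produce
    # Without buffering, the app can only go as fast as the slower side.
    return max(produce, deliver)


if __name__ == "__main__":
    size = 1_000_000        # a 1 MB response
    app_rate = 100_000_000  # app writes at roughly 100 MB/s over loopback
    phone_rate = 50_000     # client on a weak connection, roughly 50 KB/s
    print(app_busy_seconds(size, app_rate, phone_rate, buffering=True))
    print(app_busy_seconds(size, app_rate, phone_rate, buffering=False))
```

With these made-up numbers the app is busy for a hundredth of a second with buffering and for twenty seconds without it.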

Turning buffering off makes sense for streaming, where the client should receive bytes as they are produced rather than after the whole response is ready. Server-sent events and long-polling are the common cases.

location /events/ {
    proxy_pass http://app;
    proxy_buffering off;
}

Buffering also works in the other direction. By default Nginx reads the whole request body from the client before it opens the connection to the backend. This is proxy_request_buffering, and it is on by default too.

location /upload/ {
    proxy_pass http://app;
    proxy_request_buffering off;
}

With request buffering on, a slow upload is the proxy’s problem, not the app’s. Nginx collects the full body and hands the backend a complete request, so a backend worker is not tied up for the length of a slow upload. Turning it off streams the body to the backend as it arrives, which avoids buffering very large uploads in Nginx but holds a backend connection open for the whole transfer.

Timeouts decide how long Nginx waits at each stage. They come in two groups: timers for the connection to the backend, and timers for the connection to the client.

# backend side
proxy_connect_timeout 5s;
proxy_send_timeout    60s;
proxy_read_timeout    60s;

# client side
client_header_timeout 10s;
client_body_timeout   10s;
keepalive_timeout     65s;

proxy_connect_timeout is how long to wait to open the connection to the backend. A low value here fails fast when a backend is down instead of hanging.

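proxy_read_timeout is how long Nginx waits between two successive reads from the backend, including the wait for the first byte. It is not a cap on the total response time. This is the one behind most 504s: the backend went quiet for longer than the timeout, so Nginx gave up. proxy_send_timeout is the same idea in the other direction, the gap allowed between two successive writes of the request to the backend. Raising the read timeout buys time for genuinely slow work, but a request that needs a 120-second read timeout is usually a request that should be doing its work in the background instead.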
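Because the read timer restarts on every read, a response can take far longer than the timeout in total without tripping it. A small Python sketch of that rule, assuming chunk arrival times measured in seconds from when the request was sent; first_read_timeout is a hypothetical helper, and the model ignores everything except the gaps between reads.

```python
def first_read_timeout(chunk_times, read_timeout):
    """Return the moment the proxy would give up, or None if it never does.

    chunk_times: seconds since the request was sent at which the backend
    delivered each piece of the response, in increasing order.
    """
    last_activity = 0.0
    for arrival in chunk_times:
        if arrival - last_activity > read_timeout:
            # The backend was silent for too long before this chunk arrived.
            return last_activity + read_timeout
        last_activity = arrival  # any data resets the timer
    return None


if __name__ == "__main__":
    # 200 seconds in total, but never silent for more than 60: no timeout.
    print(first_read_timeout([50, 100, 150, 200], read_timeout=60))
    # Quick first byte, then an 80-second silence: gives up at 70 seconds.
    print(first_read_timeout([10, 90], read_timeout=60))
```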

client_header_timeout caps how long a client can take to send its complete request headers. client_body_timeout caps the gap between two successive reads of the body, not the whole upload. Together they cut off clients that open a connection and then stall, or dribble headers slowly, to hold it open, which is the shape of a slowloris attack.

keepalive_timeout is how long an idle client connection stays open for reuse before Nginx closes it.

These are the timers worth knowing first. There are more, and the exact rules for when each one resets matter once you start tuning, so read the proxy module and core module docs before changing them in production.

HTTP/2 vs HTTP/1.1

Under HTTP/1.1, a connection carries one request at a time. The next request on that connection waits for the previous response to finish. Browsers work around this by opening several connections to the same host, usually around six, and spreading requests across them. Each connection costs a TLS handshake and its own memory.

HTTP/2 changes the shape of the connection. One connection carries many requests at once as independent streams, so a page with dozens of small assets does not need a pile of parallel connections. It also compresses headers and uses a binary framing instead of plain text.

Turning it on in Nginx is one line.

server {
    listen 443 ssl;
    http2 on;
    server_name example.com;
}

The http2 directive needs Nginx 1.25.1 or newer. Older versions wrote this on the listen line as listen 443 ssl http2;. Either way, browsers only use HTTP/2 over TLS, so it goes hand in hand with HTTPS.

Nginx speaks HTTP/2 to the client, but the proxy module speaks HTTP/1.x to the backend app: HTTP/1.0 by default in most releases, or HTTP/1.1 when proxy_http_version 1.1 is set. The multiplexing benefit is on the public side of Nginx, between the browser and the proxy, not between Nginx and your app.

Compressing Responses

Text responses like HTML, CSS, JavaScript, and JSON compress well. Sending them compressed cuts transfer size, and transfer time is often the slowest part of a request for a user on a phone or a far-away network. Nginx can compress on the way out with gzip.

gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;
gzip_min_length 1024;
gzip_comp_level 5;

gzip_types lists what to compress. HTML is always included, so it does not appear in the list. Images and video in their normal formats are already compressed, so running gzip over them burns CPU for no gain.

gzip_min_length skips tiny responses, where the compression overhead is larger than the saving.

gzip_comp_level trades CPU for size. Higher levels compress a little more but cost more CPU per response, and the gain past the middle of the range is small. A level around 5 is a reasonable balance for most sites.

The cost is real: compression uses CPU on every response it touches. For a busy site serving large text payloads, that CPU is well spent. For static assets that never change, compressing them once ahead of time and serving the precompressed file avoids paying for it on every request.
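The reasoning behind gzip_types and gzip_min_length can be checked with Python's own gzip module. This is not Nginx's code path, only the same compression format applied to three made-up payloads: a repetitive JSON response, a tiny response, and random bytes standing in for already-compressed media.

```python
import gzip
import json
import os


def gzip_ratio(payload, level=5):
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(gzip.compress(payload, compresslevel=level)) / len(payload)


# Hypothetical payloads for illustration.
api_response = json.dumps(
    [{"id": i, "status": "active", "role": "member"} for i in range(500)]
).encode()
tiny_response = b'{"ok":true}'
already_compressed = os.urandom(4096)  # stand-in for JPEG or video bytes

if __name__ == "__main__":
    print("json :", round(gzip_ratio(api_response), 3))
    print("tiny :", round(gzip_ratio(tiny_response), 3))
    print("media:", round(gzip_ratio(already_compressed), 3))
```

The JSON shrinks to a fraction of its size. The tiny response and the random bytes both come out larger than they went in, which is the case gzip_min_length and a narrow gzip_types list are there to avoid.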

Tuning Workers To The Server

worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 16384;
}

worker_processes auto starts one worker per CPU core. Each core runs one event loop, so the workers run in parallel instead of fighting over the same core. Setting this far above the core count does not add capacity, it just adds processes competing for the same CPUs.

worker_connections is how many connections one worker can hold at once. Multiplying it by the worker count gives the number people quote as the limit:

max connections = worker_processes * worker_connections

That is an upper bound, not the real one. The actual ceiling is whichever resource runs out first: file descriptors, memory, CPU, or network. The formula only tells you the most connections Nginx will let itself track, not the most the machine can carry.

As a reverse proxy, Nginx usually keeps a connection to the client and a separate one to the backend, so its client capacity tends to be lower than that number. The relationship is not a clean one-to-one, though. A keep-alive connection serves many requests over its life, response buffering can release the backend connection early, a cached response may need no backend at all, and HTTP/2 carries many requests over a single client connection. The takeaway is that practical capacity sits below the theoretical maximum, not at a fixed fraction of it.

worker_rlimit_nofile is the part people forget. Every connection needs a file descriptor, and the operating system caps how many files a process can open. If worker_connections is high but the file descriptor limit is low, Nginx hits its real ceiling first and the error log fills with:

socket() failed (24: Too many open files)

The fix is to raise the descriptor limit in two places. worker_rlimit_nofile raises it inside Nginx, but the OS has the final say, so the system limits have to move too: ulimit -n for the shell, /etc/security/limits.conf for the user, and LimitNOFILE when Nginx runs under systemd. Setting worker_connections to a large number while the OS still allows 1024 open files does nothing except hide where the real limit is. A worker also needs descriptors for more than client sockets, since log files, upstream sockets, and temporary files each take one, so the limit should sit comfortably above worker_connections rather than exactly at it.

The tradeoff is plain: these numbers should match what the box can actually back with CPU, memory, and file descriptors. A huge worker_connections on a small server does not create capacity, it just moves the failure from one place to another.
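The interaction between worker_connections and the descriptor limit can be put into a small calculator. This is a rough sketch under stated assumptions: connection_ceiling is a hypothetical helper, and reserved_fds=64 is an arbitrary allowance for log files, listening sockets, and temp files, not a number Nginx defines.

```python
def connection_ceiling(worker_processes, worker_connections, nofile_limit,
                       reserved_fds=64):
    """Compare the configured maximum with what the descriptor limit allows."""
    configured_max = worker_processes * worker_connections
    # Descriptors left for connections once the assumed overhead is set aside.
    usable_fds = max(nofile_limit - reserved_fds, 0)
    per_worker = min(worker_connections, usable_fds)
    limited_by = (
        "worker_connections" if worker_connections <= usable_fds
        else "file descriptors"
    )
    return {
        "configured_max": configured_max,
        "per_worker": per_worker,
        "total": worker_processes * per_worker,
        "limited_by": limited_by,
    }


if __name__ == "__main__":
    # Large worker_connections, but the OS still allows only 1024 open files.
    print(connection_ceiling(4, 16384, nofile_limit=1024))
    # Same config after raising the descriptor limit.
    print(connection_ceiling(4, 16384, nofile_limit=65535))
```

Even the second result is still an upper bound on tracked connections. Proxied requests use a client connection and a backend connection each, and memory or CPU can run out well before the counter does.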

Rate Limiting

Without a limit, one client can send requests as fast as the network allows, and a login endpoint or a search route becomes easy to hammer.

Nginx rate limiting works in two parts: a zone that tracks clients, and a rule that applies a rate.

http {
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
}

server {
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://app;
    }
}

limit_req_zone defines the tracking. It keys on $binary_remote_addr, which is the client IP in a compact form, reserves 10 MB of shared memory to hold the counters, and sets a steady rate of 10 requests per second.

limit_req applies that zone to a location. burst=20 lets a client briefly go over the rate by queueing up to 20 extra requests, which covers normal bursts like a page firing several API calls at once. nodelay serves that burst immediately instead of spacing the requests out to fit the rate.

Over the limit, Nginx rejects the request. The default status is 503, which you can change to the more accurate 429:

limit_req_status 429;

There is one trap that matters in production. The key is the client IP, but if Nginx sits behind a CDN or another load balancer, $remote_addr is that upstream’s IP, not the user’s. Every request then looks like it comes from one client, and the limit applies to the whole world as a single bucket. To rate limit real users in that setup, Nginx has to recover the true client IP from X-Forwarded-For using the real_ip module before the limit is applied.
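The interplay of rate, burst, and nodelay is easier to follow as a simulation. The class below is a simplified Python model of the bucket for a single key, not Nginx's implementation, which tracks time in milliseconds and keeps its state in shared memory. The name LimitReq is made up, and the model covers only the nodelay case.

```python
class LimitReq:
    """Simplified single-key model of limit_req with burst and nodelay."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.burst = burst
        self.excess = 0.0  # how far this client is ahead of the steady rate
        self.last = None   # time of the last accepted request

    def allow(self, now):
        if self.last is None:
            # First request from this key: nothing to drain yet.
            self.last = now
            return True
        # Drain the bucket for the time elapsed, then add this request.
        excess = max(self.excess - self.rate * (now - self.last) + 1, 0)
        if excess > self.burst:
            return False  # over the limit: rejected, state left unchanged
        self.excess = excess
        self.last = now
        return True


if __name__ == "__main__":
    limiter = LimitReq(rate_per_s=10, burst=20)
    accepted_at_once = sum(limiter.allow(0.0) for _ in range(30))
    accepted_a_second_later = sum(limiter.allow(1.0) for _ in range(15))
    print(accepted_at_once, accepted_a_second_later)
```

In this model, 30 requests arriving at the same instant yield 21 accepted (one at the rate plus the burst of 20) and 9 rejected. One second later the bucket has drained by 10, so 10 more get through.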

Hiding Server Information

By default, Nginx announces its version in every response and on its error pages.

Server: nginx/1.25.3

That version number is a gift to anyone scanning for hosts running a release with a known vulnerability. Turning it off removes the version.

http {
    server_tokens off;
}

The response then says Server: nginx with no version.

Be precise about what this does. server_tokens off hides the version, not the fact that the server is Nginx. Fully removing or rewriting the Server header needs more than this directive.

Restricting Who Can Embed Your Site

If any other site can load your pages inside an iframe, it can overlay its own controls on top of yours and trick a logged-in user into clicking something they did not intend. Telling browsers who is allowed to frame your site shuts that down.

add_header X-Frame-Options "SAMEORIGIN" always;
add_header Content-Security-Policy "frame-ancestors 'self'" always;

X-Frame-Options: SAMEORIGIN is the older header, understood everywhere. frame-ancestors 'self' in a Content-Security-Policy is the modern replacement and takes priority in browsers that support it. Sending both covers old and new clients. Use 'none' or DENY if no one, including you, should ever frame the page.

Two Nginx details decide whether these headers actually show up.

The always parameter makes Nginx send the header on every response, including errors. Without it, add_header only applies to a set of success and redirect codes, so a 404 or 500 page would go out unprotected.

The sharper gotcha is inheritance. If you set add_header in the server block and then use add_header again inside a location, the location’s headers replace the inherited ones rather than adding to them. A location that sets its own header silently drops the security headers from the server block. The safe habit is to define these headers in one place, or repeat them where you override.
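The inheritance rule is all-or-nothing, which a two-line model makes plain. This sketch covers only server and location levels; effective_headers is a hypothetical helper and the Cache-Control header is just an example of a location-level add_header.

```python
def effective_headers(server_headers, location_headers):
    """add_header is inherited only when the location defines none of its own."""
    if location_headers:
        return dict(location_headers)  # the server-level set is dropped entirely
    return dict(server_headers)


SERVER_HEADERS = {
    "X-Frame-Options": "SAMEORIGIN",
    "Content-Security-Policy": "frame-ancestors 'self'",
}

if __name__ == "__main__":
    # A location with no add_header keeps the security headers.
    print(effective_headers(SERVER_HEADERS, {}))
    # A location that adds one header of its own loses both of them.
    print(effective_headers(SERVER_HEADERS, {"Cache-Control": "no-store"}))
```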

Multiple Backends With Upstream

One app process is a single point of failure and a single core’s worth of capacity. Running several and putting Nginx in front of them spreads the load.

An upstream block names a group of backends.

upstream app {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}

server {
    listen 443 ssl;
    http2 on;
    server_name example.com;

    location / {
        proxy_pass http://app;
    }
}

Nginx spreads requests across the three in round-robin order by default. Other methods exist for when round-robin is not a good fit: least_conn sends each request to the backend with the fewest active connections, and ip_hash pins a client to the same backend by its IP.

If one backend stops responding, Nginx marks it as failed for a while and sends traffic to the others. The site stays up on the remaining processes while one is restarting or deploying.

This is load balancing, but it is worth being precise about the limit: Nginx is balancing across processes it can reach directly, and it judges health mainly by whether the connection succeeds. It does not know whether the work inside a backend is healthy beyond that.
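A toy version of round-robin with passive failure marking, to show the shape of the behavior. It is a sketch, not Nginx's algorithm: weights, max_fails counting, and what Nginx does when every backend is marked failed are left out. The Upstream class is hypothetical. The 10-second fail_timeout mirrors the documented Nginx default.

```python
class Upstream:
    """Round-robin over backends, skipping ones recently marked as failed."""

    def __init__(self, servers, fail_timeout=10.0):
        self.servers = list(servers)
        self.fail_timeout = fail_timeout
        self.failed_until = {}  # server -> time it becomes eligible again
        self._next = 0

    def mark_failed(self, server, now):
        self.failed_until[server] = now + self.fail_timeout

    def pick(self, now):
        # Try each backend at most once per pick, starting where we left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if self.failed_until.get(server, 0.0) <= now:
                return server
        return None  # simplification: every backend is currently marked failed


if __name__ == "__main__":
    app = Upstream(["127.0.0.1:3000", "127.0.0.1:3001", "127.0.0.1:3002"])
    print([app.pick(now=0) for _ in range(3)])
    app.mark_failed("127.0.0.1:3001", now=0)  # a connection attempt failed
    print([app.pick(now=1) for _ in range(3)])
    print(app.pick(now=11))                   # back in rotation after the timeout
```

Note that the failure is only noticed when a real request fails to connect. Nothing here probes the backend in advance, which matches the passive health checking described above.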

Failure Modes

The errors Nginx returns point at where the failure is.

502 Bad Gateway means Nginx could not get a valid response from the backend. Either the connection was refused or reset, or what came back was not valid HTTP. Usually the app is down, crashed, or listening on a different port than the config expects.

curl -I http://127.0.0.1:3000

If that fails from the same host, Nginx will fail too.
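The same check can be scripted when curl is not available. The sketch below only tests whether something accepts TCP connections on the port, which is the first thing Nginx needs. It does not send an HTTP request. probe_backend is a hypothetical helper and the default port is the one used throughout this article.

```python
import socket


def probe_backend(host="127.0.0.1", port=3000, timeout=2.0):
    """Report whether anything accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "accepting connections"
    except ConnectionRefusedError:
        return "refused: nothing is listening on that port"
    except socket.timeout:
        return "timed out: host unreachable or port filtered"
    except OSError as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(probe_backend())
```

A refused connection here lines up with a 502 from Nginx. A backend that accepts the connection but then answers slowly is the 504 case.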

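504 Gateway Timeout means Nginx gave up waiting on the backend. Most often the backend accepted the connection but took too long to respond, so the app is alive but slow. A connection attempt that outlasts proxy_connect_timeout ends the same way.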

proxy_read_timeout 60s;

Raising the timeout hides the symptom. A 504 is usually a signal that a request is doing too much work, not that the timeout is too low.

413 Request Entity Too Large means the request body crossed Nginx’s limit before it ever reached the app. The default limit is 1 MB (client_max_body_size 1m), so file uploads hit this first.

client_max_body_size 25m;

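429 Too Many Requests comes from the rate limit when limit_req_status 429 is set. Without that directive a rate-limited request gets the default 503. Either way, the client is sending faster than the configured rate allows. An app can also return 429 on its own, so check the Nginx error log, where limit_req rejections are recorded.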

Too many open files in the error log means the connection load crossed the file descriptor limit, which ties straight back to worker_connections and worker_rlimit_nofile.

A Practical Config

A setup that pulls the pieces together: worker tuning, HTTP to HTTPS redirect, HTTP/2, gzip, rate limiting, security headers, static files from disk, and everything else proxied to the app.

worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 16384;
}

http {
    server_tokens off;

    gzip on;
    gzip_types text/css application/javascript application/json image/svg+xml;
    gzip_min_length 1024;

    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

    upstream app {
        server 127.0.0.1:3000;
    }

    server {
        listen 80;
        server_name example.com;
        return 301 https://$host$request_uri;
    }

    server {
        listen 443 ssl;
        http2 on;
        server_name example.com;

        ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

        client_max_body_size 25m;

        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header Content-Security-Policy "frame-ancestors 'self'" always;

        location /static/ {
            root /var/www/app;
            expires 30d;
        }

        location /api/ {
            limit_req zone=api burst=20 nodelay;
            limit_req_status 429;

            proxy_pass http://app;

            proxy_set_header Host              $host;
            proxy_set_header X-Real-IP         $remote_addr;
            proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            proxy_read_timeout 60s;
        }

        location / {
            proxy_pass http://app;

            proxy_set_header Host              $host;
            proxy_set_header X-Real-IP         $remote_addr;
            proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}

The port 80 block does nothing but redirect to 443. The port 443 block terminates TLS, speaks HTTP/2, serves static files with a cache header, rate limits the API, and proxies the rest to the app.

Operating Nginx

Two commands do most of the day-to-day work.

Check the config before applying it:

nginx -t

This catches syntax errors and bad paths. Running it before a reload avoids taking the site down with a broken config.

Apply a new config without dropping connections:

nginx -s reload

A reload starts new worker processes with the new config and lets the old workers finish their in-flight requests before exiting. Existing requests are not cut off.

When something is wrong, the logs say where to look.

tail -f /var/log/nginx/access.log
tail -f /var/log/nginx/error.log

The access log shows what came in and what status went out. The error log shows why a request failed inside Nginx, including the upstream address it tried to reach. For a 502, the error log usually names the upstream that refused the connection, which points straight at the app rather than at Nginx.

Final Thoughts

Nginx earns its place by separating two jobs that get tangled together when an app faces the internet alone.

The app runs application logic on a private port.

Nginx handles the public side: TLS, HTTP/2, static files, rate limits, security headers, and spreading traffic across backends. Its worker-and-event-loop design is what lets it hold a flood of slow connections cheaply, and the tuning knobs are just that design exposed as numbers you match to the machine.