<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Hugh’s Tech Blog]]></title><description><![CDATA[Bits and pieces from different technologies and the DevOps world.]]></description><link>https://techblog.hughtipping.com</link><image><url>https://substackcdn.com/image/fetch/$s_!YZMG!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd56886b9-ca36-46ca-b31a-cce5588b44a3_1024x1024.png</url><title>Hugh’s Tech Blog</title><link>https://techblog.hughtipping.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 27 May 2026 12:08:06 GMT</lastBuildDate><atom:link href="https://techblog.hughtipping.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Hugh]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[spackle0@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[spackle0@substack.com]]></itunes:email><itunes:name><![CDATA[Hugh Tipping]]></itunes:name></itunes:owner><itunes:author><![CDATA[Hugh Tipping]]></itunes:author><googleplay:owner><![CDATA[spackle0@substack.com]]></googleplay:owner><googleplay:email><![CDATA[spackle0@substack.com]]></googleplay:email><googleplay:author><![CDATA[Hugh Tipping]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Kubernetes Networking Deep Dive, Part 4]]></title><description><![CDATA[Encryption In-Flight]]></description><link>https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-bc9</link><guid isPermaLink="false">https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-bc9</guid><dc:creator><![CDATA[Hugh Tipping]]></dc:creator><pubDate>Fri, 
27 Feb 2026 18:30:18 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img 
src="https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="3840" height="2160" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2160,&quot;width&quot;:3840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;icon&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="icon" title="icon" srcset="https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, 
https://images.unsplash.com/photo-1667372283587-e1557c08aca4?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxlbmNyeXB0aW9ufGVufDB8fHx8MTc3MjE1NjI3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@growtika">Growtika</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p></p><p>This is the fourth and <em>final</em> post in my blog post series tracing packets&#8217; journey through a Kubernetes cluster. 
Parts 1-3 covered foundations, pod-to-pod (east-west) traffic, and north-south traffic through load balancers. This post gets to that scary and often dreaded topic: encryption. I will keep it simple and cover encryption without a service mesh.</p><p>Encryption in Kubernetes is not a one-size-fits-all setup. It requires decisions about where to terminate TLS, which traffic to encrypt, and how to manage all those certificates. Now that we have a better understanding of the path a packet takes, we can make these decisions.</p><h2>Encryption Boundaries</h2><p>Traffic flowing into, out of, and through a Kubernetes cluster crosses several <em>boundaries</em>:</p><ol><li><p>Client to Load Balancer: external network traffic</p></li><li><p>Load Balancer to Node: this may be external or internal traffic</p></li><li><p>Node to Ingress Controller Pod: this is the cluster network</p></li><li><p>Ingress Controller to a Backend Pod: also cluster network</p></li><li><p>Pod to Pod: also cluster network</p></li></ol><p>Each boundary that you cross is a possible TLS termination point. 
The questions to answer are which hops on this journey need encryption and where TLS should terminate.</p><h2>TLS Termination</h2><p>There are three general patterns for handling TLS in a Kubernetes cluster.</p><h3>Pattern 1: TLS Terminates at the Load Balancer</h3><p>The load balancer holds the TLS certificate and terminates the encryption. Traffic <em>within</em> the cluster then travels unencrypted.</p><pre><code><code>Client &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Load Balancer &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9658; Node &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9658; Pod
         HTTPS          &#9474;           HTTP            HTTP
                   TLS terminates
                   here
</code></code></pre><p>This is the simplest configuration pattern. The load balancer handles all that pesky certificate stuff, and applications within the cluster receive plain ol&#8217; HTTP.</p><p><strong>Configuration example for an AWS ALB (<a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html">Application Load Balancer</a>) Ingress:</strong></p><pre><code><code>apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:123456789:certificate/abc-123
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'  # ssl-redirect needs an HTTP listener to redirect from
    alb.ingress.kubernetes.io/ssl-redirect: '443'
spec:
  ingressClassName: alb
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 80
</code></code></pre><p><strong>Pros:</strong></p><ul><li><p>Simple certificate management: in this case AWS handles certificate renewal</p></li><li><p>No certificate configuration happens inside the cluster</p></li><li><p>Allows the load balancer to inspect traffic for things like rate limiting or logging</p></li><li><p>Lower CPU usage on application pods since they don&#8217;t have to spend compute resources on encryption and decryption</p></li></ul><p><strong>Cons:</strong></p><ul><li><p>Traffic flows unencrypted within the cluster network</p></li><li><p>Requires that you really trust the security of your internal network infrastructure</p></li><li><p>Not so good when you have compliance/regulatory requirements that mandate full end-to-end encryption</p></li></ul><p><strong>Use cases:</strong></p><ul><li><p>Internal applications where the cluster network is <em>trusted</em></p></li><li><p>When the L7 load balancer features (WAF, header inspection) are required</p></li><li><p>Environments that are not subject to stricter regulatory requirements</p></li></ul><h3>Pattern 2: TLS Passthrough to Ingress Controller</h3><p>In this case, the load balancer does not terminate the encryption; it forwards the encrypted traffic as is. TLS terminates at the Ingress Controller running inside the cluster.</p><pre><code><code>Client &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Load Balancer &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Node &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Ingress Pod &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9658; Backend Pod
         HTTPS          &#9474;           HTTPS          HTTPS            &#9474;         HTTP
                   L4 passthrough                              TLS terminates
                   (no decryption)                             here
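</code></code></pre><p>The load balancer operates at Layer 4 rather than Layer 7, forwarding the TCP connections without doing any payload inspection.</p><p><strong>Configuration example for an NGINX Ingress behind an L4 passthrough load balancer:</strong></p><pre><code><code>apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 80  # plain HTTP behind the controller, as in the diagram
</code></code></pre><p>Notice what is <em>not</em> here: the <code>nginx.ingress.kubernetes.io/ssl-passthrough</code> annotation. In ingress-nginx, that annotation makes the controller itself forward the encrypted stream to the backend pod (and the controller has to be started with <code>--enable-ssl-passthrough</code>), which would move TLS termination to the backend rather than the Ingress Controller. For this pattern, the passthrough happens at the L4 load balancer in front of the controller, and the <code>tls</code> block tells NGINX which certificate to terminate with. The TLS certificate is stored, in this case, in a Kubernetes Secret:</p><pre><code><code># Create TLS secret from certificate files (Bad idea. You should use a service)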
kubectl create secret tls app-tls-secret \
  --cert=tls.crt \
  --key=tls.key

# View the secret
kubectl get secret app-tls-secret -o yaml # Blech
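
# A sketch, not from the original walkthrough: decode the certificate stored in the
# secret and print its subject and expiry date. Assumes openssl is installed locally
# and reuses the app-tls-secret name from above.
kubectl get secret app-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -enddate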
</code></code></pre><p><strong>Pros:</strong></p><ul><li><p>TLS terminates within the cluster boundary</p></li><li><p>The load balancer does not need any certificate access</p></li><li><p>The traffic remains encrypted between the LB and the ingress controller</p></li></ul><p><strong>Cons:</strong></p><ul><li><p>The load balancer is unable to inspect traffic, preventing use of any L7 features the LB may have</p></li><li><p>You will need some sort of certificate management within the cluster: a service that updates the certificate from some source</p></li><li><p>Ingress controller to backend is still unencrypted (by default)</p></li></ul><p><strong>Use cases:</strong></p><ul><li><p>When TLS must terminate within the cluster</p></li><li><p>When the load balancer should not have access to certificates (but for a managed service, there should already be sufficient security built in)</p></li><li><p>When L4 load balancing is sufficient (it can be, but it limits your options)</p></li></ul><h3>Pattern 3: End-to-End Encryption with Re-encryption at Each Hop (Oh&#8230; fun)</h3><p>In this case, each hop uses its own TLS connection. Traffic is decrypted and re-encrypted at each hop.</p><pre><code><code>Client &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Load Balancer &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Ingress Pod &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658; Backend Pod
         HTTPS          &#9474;           HTTPS          &#9474;          HTTPS
                   TLS session 1              TLS session 2      TLS session 3
                   terminates                 terminates         terminates
                   re-encrypts                re-encrypts        here
</code></code></pre><p>This requires certificates for all hops. That&#8217;s a lot of certificates depending on the hops.</p><p><strong>Configuration example (NGINX Ingress with backend HTTPS):</strong></p><pre><code><code>apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    nginx.ingress.kubernetes.io/proxy-ssl-verify: "on"
    nginx.ingress.kubernetes.io/proxy-ssl-secret: "default/backend-ca-secret"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: ingress-tls-secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 8443  # Backend listens on HTTPS
</code></code></pre><p>The backend pod must serve TLS (note the mounted certs on volumes):</p><pre><code><code>apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest
    ports:
    - containerPort: 8443
    volumeMounts:
    - name: tls-certs
      mountPath: /etc/tls
      readOnly: true
  volumes:
  - name: tls-certs
    secret:
      secretName: backend-tls-secret
</code></code></pre><p><strong>Pros:</strong></p><ul><li><p>Traffic is encrypted at every segment</p></li><li><p>It is part of <a href="https://en.wikipedia.org/wiki/Defense_in_depth_(computing)">Defense in Depth</a></p></li><li><p>It can satisfy stricter requirements</p></li></ul><p><strong>Cons:</strong></p><ul><li><p>More complicated certificate management: you have lots of certs to manage and expiration of just one of them can break the chain</p></li><li><p>Increased latency for the extra time to do TLS handshakes</p></li><li><p>Higher CPU usage because of encryption/decryption at each hop</p></li><li><p>Just a bigger operational pain</p></li></ul><p><strong>Use cases:</strong></p><ul><li><p>Regulatory compliance requiring end-to-end encryption (government, financial institutions, health care)</p></li><li><p><a href="https://en.wikipedia.org/wiki/Zero_trust_architecture">Zero-trust</a> network architectures</p></li></ul><h2>CNI-Level Encryption</h2><p>You can instead encrypt traffic at the network layer using your CNI plugin. Several CNI plugins can encrypt all pod traffic transparently. Here are some examples, though I haven&#8217;t practiced these but I present them for your education. I encourage you to experiment.</p><h3>WireGuard Encryption</h3><p>WireGuard is a VPN protocol built into the Linux kernel (5.6+). It encrypts traffic at Layer 3 so the applications don&#8217;t have to care about it..</p><p><strong>How it works:</strong></p><ol><li><p>Every cluster node generates a WireGuard keypair</p></li><li><p>The CNI configures WireGuard tunnels between nodes</p></li><li><p>All pod-to-pod traffic between nodes is encrypted</p></li><li><p>Applications see plain TCP/UDP; encryption happens before it gets that far up the OSI stack.</p></li></ol><p>Both Calico and Cilium support this.</p><h3>IPsec Encryption</h3><p>IPsec is an older protocol for encryption. 
It is also Layer .</p><p>IPsec requires more configuration than WireGuard, including key exchange (IKE) setup. WireGuard is generally preferred for new deployments due to simpler configuration and better performance.</p><p>Calico supports IPSec.</p><h3>CNI Encryption</h3><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                         CNI-LEVEL ENCRYPTION                                   &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474;                                                                                &#9474;
&#9474;   Pod A (10.244.0.5)                      Pod B (10.244.1.3)                   &#9474;
&#9474;   &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;                    &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;                 &#9474;
&#9474;   &#9474;  Application     &#9474;                    &#9474;  Application     &#9474;                 &#9474;
&#9474;   &#9474;  sends HTTP      &#9474;                    &#9474;  receives HTTP   &#9474;                 &#9474;
&#9474;   &#9474;  (plaintext)     &#9474;                    &#9474;  (plaintext)     &#9474;                 &#9474;
&#9474;   &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;                    &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;                 &#9474;
&#9474;            &#9474;                                       &#9474;                           &#9474;
&#9474;            &#9660;                                       &#9474;                           &#9474;
&#9474;   &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;                    &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;                 &#9474;
&#9474;   &#9474;  Kernel TCP/IP   &#9474;                    &#9474;  Kernel TCP/IP   &#9474;                 &#9474;
&#9474;   &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;                    &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9650;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;                 &#9474;
&#9474;            &#9474;                                       &#9474;                           &#9474;
&#9474;            &#9660;                                       &#9474;                           &#9474;
&#9474;   &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;                    &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;                 &#9474;
&#9474;   &#9474;  WireGuard       &#9474;                    &#9474;  WireGuard       &#9474;                 &#9474;
&#9474;   &#9474;  encrypts        &#9474; &#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658;  &#9474;  decrypts        &#9474;                 &#9474;
&#9474;   &#9474;  (Layer 3)       &#9474;    encrypted       &#9474;  (Layer 3)       &#9474;                 &#9474;
&#9474;   &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;                    &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;                 &#9474;
&#9474;                                                                                &#9474;
&#9474;   - Application code unchanged                                                 &#9474;
&#9474;   - All pod traffic encrypted automatically                                    &#9474;
&#9474;   - Encryption/decryption in kernel (fast)                                     &#9474;
&#9474;   - No certificate management per application                                  &#9474;
&#9474;                                                                                &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
</code></code></pre><p><strong>Pros of CNI encryption:</strong></p><ul><li><p>Transparent to applications</p></li><li><p>Encrypts <em>all</em> pod traffic between nodes, not just HTTP/HTTPS</p></li><li><p>No per-application certificates to manage: the CNI handles the keys</p></li><li><p>Faster since it happens in the kernel</p></li><li><p>Simpler to enable across the whole cluster</p></li></ul><p><strong>Cons of CNI encryption:</strong></p><ul><li><p>No mutual authentication at the application level: the traffic is simply encrypted</p></li><li><p>No application-level identity: just an IP address</p></li><li><p>It encrypts cross-node traffic by default, but not traffic between pods on the same node (that has to be configured separately)</p></li></ul><p><strong>Use cases:</strong></p><ul><li><p>When you want to encrypt all cluster traffic without a lot of fuss</p></li><li><p>When you have to have encryption (but not application-level mTLS)</p></li><li><p>Defense in depth along with application TLS</p></li></ul><h3>Verifying CNI Encryption</h3><pre><code><code># Capture traffic between nodes (should be encrypted)
# On Node 1, capture traffic to Node 2
sudo tcpdump -i eth0 -nn host 192.168.1.11 and udp port 51820
# Output (WireGuard):
# 14:30:01.123 IP 192.168.1.10.51820 &gt; 192.168.1.11.51820: UDP, length 128

# The payload is encrypted - you won't see pod IPs or application data

# Capture on WireGuard interface (sees decrypted traffic)
sudo tcpdump -i wireguard.cali -nn host 10.244.1.3
# Output:
# 14:30:01.123 IP 10.244.0.5.45678 &gt; 10.244.1.3.8080: Flags [P.], seq 1:100

# Compare: without encryption, eth0 would show pod IPs directly
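
# Optional cross-check, a sketch that assumes the wireguard-tools package is installed
# on the node: list the WireGuard interfaces, their peers, and the latest handshake times
sudo wg show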
</code></code></pre><h2>Application-Level TLS (Without Service Mesh)</h2><p>For applications where you must have mutual TLS (<a href="https://en.wikipedia.org/wiki/Mutual_authentication">mTLS</a>) or certificate-based identity <em>without</em> a service mesh, you can do the TLS directly in the application or in a sidecar container in your pod.</p><h3>Application-based TLS</h3><p>The application handles the TLS itself and loads certificates from mounted Secrets. (Again, these Secrets are ideally managed by a service that keeps them up to date.)</p><pre><code><code>apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest
    ports:
    - containerPort: 8443
    env:
    - name: TLS_CERT_FILE
      value: /etc/tls/tls.crt
    - name: TLS_KEY_FILE
      value: /etc/tls/tls.key
    - name: TLS_CA_FILE
      value: /etc/tls/ca.crt
    volumeMounts:
    - name: tls-certs
      mountPath: /etc/tls
      readOnly: true
  volumes:
  - name: tls-certs
    secret:
      secretName: my-app-tls
</code></code></pre><p>Application code (Go example):</p><pre><code><code>// Load certificates
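// (needs imports: crypto/tls, crypto/x509, log, net/http, os)
cert, err := tls.LoadX509KeyPair("/etc/tls/tls.crt", "/etc/tls/tls.key")
if err != nil {
    log.Fatalf("loading server key pair: %v", err)
}
caCert, err := os.ReadFile("/etc/tls/ca.crt") // os.ReadFile replaces the deprecated ioutil.ReadFile
if err != nil {
    log.Fatalf("reading CA bundle: %v", err)
}
caCertPool := x509.NewCertPool()
if !caCertPool.AppendCertsFromPEM(caCert) {
    log.Fatal("no CA certificates found in /etc/tls/ca.crt")
}

// Configure TLS with mutual authentication
tlsConfig := &amp;tls.Config{
    Certificates: []tls.Certificate{cert},
    ClientCAs:    caCertPool,
    ClientAuth:   tls.RequireAndVerifyClientCert,
}

server := &amp;http.Server{
    Addr:      ":8443",
    TLSConfig: tlsConfig,
}
// The cert and key paths stay empty because TLSConfig.Certificates is already set
log.Fatal(server.ListenAndServeTLS("", ""))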
</code></code></pre><p>This does, however, place the burden of dealing with TLS on the application developers.</p><h3>Certificate Management with cert-manager</h3><p>cert-manager is a service that automates certificate issuance and renewal within Kubernetes.</p><p><strong>Installing cert-manager, example:</strong></p><pre><code><code>kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
</code></code></pre><p><strong>Create a Certificate Authority (self-signed for internal use):</strong></p><pre><code><code>apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager
spec:
  isCA: true
  commonName: internal-ca
  secretName: internal-ca-secret
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca-issuer
spec:
  ca:
    secretName: internal-ca-secret
</code></code></pre><p><strong>Issue certificates for applications:</strong></p><pre><code><code>apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-cert
  namespace: default
spec:
  secretName: my-app-tls
  duration: 2160h  # 90 days
  renewBefore: 360h  # 15 days before expiry
  subject:
    organizations:
    - my-company
  commonName: my-app.default.svc.cluster.local
  dnsNames:
  - my-app
  - my-app.default
  - my-app.default.svc
  - my-app.default.svc.cluster.local
  issuerRef:
    name: internal-ca-issuer
    kind: ClusterIssuer
</code></code></pre><pre><code><code># Verify certificate was issued
kubectl get certificate my-app-cert
# Output:
# NAME          READY   SECRET       AGE
# my-app-cert   True    my-app-tls   5m

# View certificate details
kubectl get secret my-app-tls -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -text -noout
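
# A sketch for when READY stays False: the events on the Certificate usually say why.
# certificaterequest is a cert-manager CRD; the names and namespace match the example above.
kubectl describe certificate my-app-cert
kubectl get certificaterequest -n default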
</code></code></pre><p><strong>For external certs, Let&#8217;s Encrypt is a time-tested option:</strong></p><pre><code><code>apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account
    solvers:
    - http01:
        ingress:
          class: nginx
</code></code></pre><pre><code><code>apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-example-com
  namespace: default
spec:
  secretName: app-example-com-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - app.example.com
</code></code></pre><h2>Network Policies</h2><p>Network Policies are a different level of security and are not encryption, but they are complementary. They determine which pods can communicate with which other pods at Layer 3/4. By default all pods can talk to all other pods and that is often undesirable.</p><h3>How Kubernetes Network Policies Work</h3><p>Network Policies are resources that define ingress and egress rules for pods. The CNI plugin will enforce these rules, for example using iptables rules.</p><pre><code><code>apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:  # Allow DNS
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP  # DNS falls back to TCP for large responses
      port: 53
</code></code></pre><p>What this policy does:</p><ul><li><p>It applies to pods with the label <code>app: backend</code></p></li><li><p>It allows ingress only <em>from</em> pods with label <code>app: frontend</code>, on TCP port 8080</p></li><li><p>Further, it allows egress only <em>to</em> pods with label <code>app: database</code>, on TCP port 5432</p></li><li><p>And it allows egress to kube-dns for DNS resolution, which is kinda important!</p></li></ul><h3>Default Deny Policy</h3><p>As mentioned earlier, by default pods will accept traffic from <em>any</em> source. You can set a default deny policy to block everything and then allow only the traffic you <em>explicitly</em> permit:</p><pre><code><code>apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: default
spec:
  podSelector: {}  # Applies to all pods in namespace
  policyTypes:
  - Ingress
  - Egress
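---
# Sketch, not part of the original setup: with egress denied by default, pods
# in the namespace can no longer resolve DNS. A companion policy along these
# lines re-allows lookups. The policy name is hypothetical, and the
# k8s-app: kube-dns label follows the earlier example; it may differ in your cluster.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53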
</code></code></pre><h3>Network Policy Limits</h3><ul><li><p>NetworkPolicy operates at L3/L4 (IP addresses and ports)</p></li><li><p>There is no application-layer (L7) filtering (HTTP paths, headers)</p></li><li><p>Pod identity is IP-based, not certificate-based</p></li><li><p>It does not encrypt traffic</p></li><li><p>It requires CNI support (not all CNIs implement NetworkPolicy, notably Flannel)</p></li></ul><h3>Verifying Network Policy Enforcement</h3><pre><code><code># Check that a NetworkPolicy-capable CNI is running
kubectl get pods -n kube-system -l k8s-app=calico-node
# or
kubectl get pods -n kube-system -l k8s-app=cilium

# Test connectivity from a pod without the app: frontend label (should be blocked by policy)
kubectl exec -it other-pod -- curl -m 5 http://backend:8080
# Output: curl: (28) Connection timed out

# Test allowed connectivity from a pod labeled app: frontend
kubectl exec -it frontend-pod -- curl -m 5 http://backend:8080
# Output: HTTP 200 OK

# View iptables rules created by NetworkPolicy (Calico)
sudo iptables -L cali-pi-xxxx -n -v
</code></code></pre><h2>Combining Encryption Layers</h2><p>To follow a true defense-in-depth approach, you should combine multiple  layers:</p><ol><li><p><strong>TLS at the edge</strong>: HTTPS from clients to load balancer</p></li><li><p><strong>CNI Encryption</strong>: WireGuard for all pod-to-pod traffic (it just does it)</p></li><li><p><strong>Network Policies</strong>: Restricting which pods can communicate to each other</p></li><li><p><strong>Application TLS</strong> (not always needed): mTLS for sensitive services and in higher regulatory environments</p></li></ol><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                    DEFENSE IN DEPTH: COMBINED APPROACH                         &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474;                                                                                &#9474;
&#9474;   Internet                          Cluster                                    &#9474;
&#9474;                                                                                &#9474;
&#9474;   &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;     &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;  &#9474;
&#9474;   &#9474; Client &#9474;     &#9474;    LB    &#9474;     &#9474;                                        &#9474;  &#9474;
&#9474;   &#9492;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9492;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;     &#9474;  &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;      &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;       &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;  &#9474; Ingress  &#9474;      &#9474; Backend  &#9474;       &#9474;  &#9474;
&#9474;       &#9474;   HTTPS       &#9474;  HTTPS    &#9474;  &#9474;   Pod    &#9474;      &#9474;   Pod    &#9474;       &#9474;  &#9474;
&#9474;       &#9474;&#9668;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658;&#9474;&#9668;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658;&#9474;&#9668;&#9552;&#9474;          &#9474;&#9668;&#9552;&#9552;&#9552;&#9552;&#9552;&#9474;          &#9474;       &#9474;  &#9474;
&#9474;       &#9474;   TLS 1.3     &#9474;  TLS 1.3  &#9474;  &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;      &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;       &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;       &#9474;                 &#9474;              &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;       &#9474;   WireGuard     &#9474;              &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;       &#9474;&#9668;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9552;&#9658;&#9474;              &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;       &#9474;   encrypted     &#9474;              &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;                                        &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;  NetworkPolicy: only frontend          &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;  can reach backend on 8080             &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9474;                                        &#9474;  &#9474;
&#9474;       &#9474;               &#9474;           &#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;  &#9474;
&#9474;                                                                                &#9474;
&#9474;   Layer:  Edge TLS      Ingress TLS    CNI Encryption    NetworkPolicy       &#9474;
&#9474;           (HTTPS)       (HTTPS)        (WireGuard)       (L3/L4 ACL)         &#9474;
&#9474;                                                                                &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
</code></code></pre><h2>Choosing Your Strategy</h2><h3>Decision Flow</h3><pre><code><code>Do you need to encrypt traffic inside the cluster?
&#9500;&#9472;&#9472; No &#8594; TLS termination at load balancer
&#9492;&#9472;&#9472; Yes
    &#9500;&#9472;&#9472; Is compliance satisfied by network-layer encryption?
    &#9474;   &#9500;&#9472;&#9472; Yes &#8594; CNI encryption (WireGuard/IPsec), Simple
    &#9474;   &#9492;&#9472;&#9472; No (need application-level identity)
    &#9474;       &#9500;&#9472;&#9472; Can you use a service mesh?
    &#9474;       &#9474;   &#9500;&#9472;&#9472; Yes &#8594; Istio/Linkerd mTLS (takes care of a lot of the mess for you)
    &#9474;       &#9474;   &#9492;&#9472;&#9472; No &#8594; Application TLS with cert-manager (blech)
    &#9474;       &#9492;&#9472;&#9472; Need E2E encryption?
    &#9474;           &#9492;&#9472;&#9472; Yes &#8594; Re-encryption at each hop
    &#9492;&#9472;&#9472; Do you need to restrict which pods can communicate?
        &#9492;&#9472;&#9472; Yes &#8594; Add Network Policies
</code></code></pre><h2>Troubleshooting Encryption</h2><h3>TLS Certificate Issues</h3><pre><code><code># Test TLS connection to a service
openssl s_client -connect app.example.com:443 -servername app.example.com

# View certificate details
echo | openssl s_client -connect app.example.com:443 2&gt;/dev/null | openssl x509 -text -noout

# Check certificate expiry
kubectl get secret my-app-tls -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -enddate -noout
# Output: notAfter=Mar 15 12:00:00 2024 GMT

# View cert-manager certificate status
kubectl describe certificate my-app-cert
# Look for Ready condition and any error messages

# View cert-manager logs
kubectl logs -n cert-manager -l app=cert-manager -f
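
# If the Certificate stays not Ready, the intermediate resources usually say why
# (a sketch; assumes the Certificate lives in the default namespace, and note that
# Orders and Challenges only exist for ACME issuers such as Let's Encrypt)
kubectl get certificaterequests,orders,challenges -n default
kubectl describe certificaterequest -n default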
</code></code></pre><h3>WireGuard Problems</h3><pre><code><code># Check WireGuard status on node
sudo wg show
# Verify peers are connected and traffic is flowing

# Check for WireGuard errors in CNI logs
# Calico:
kubectl logs -n calico-system -l k8s-app=calico-node | grep -i wireguard

# Cilium:
kubectl logs -n kube-system -l k8s-app=cilium | grep -i wireguard

# Verify kernel module is loaded
lsmod | grep wireguard
# Output: wireguard  81920  0

# Check if traffic is actually encrypted (Calico defaults to UDP 51820; Cilium uses UDP 51871)
sudo tcpdump -i eth0 -nn udp port 51820
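
# A further check (sketch): per-peer handshake times and byte counters.
# A recent handshake and growing counters suggest the tunnel is actually in use.
sudo wg show all latest-handshakes
sudo wg show all transfer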
</code></code></pre><h3>Network Policy Problems</h3><pre><code><code># List all network policies
kubectl get networkpolicies -A

# Describe a specific policy
kubectl describe networkpolicy backend-policy

# Test connectivity from a debug pod
kubectl run debug --rm -it --image=busybox -- wget -qO- --timeout=5 http://backend:8080
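
# A/B variant of the test above (a sketch; the pod name np-test is made up):
# the same request from a pod labeled app=frontend should succeed under
# backend-policy, assuming no other policy restricts that pod's egress,
# while the unlabeled debug pod above should time out
kubectl run np-test --rm -it --restart=Never --image=busybox --labels=app=frontend -- wget -qO- -T 5 http://backend:8080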

# Check CNI logs for policy enforcement
# Calico:
kubectl logs -n calico-system -l k8s-app=calico-node | grep -i policy

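# Cilium (agent commands; run them inside a Cilium agent pod, for example via
# kubectl -n kube-system exec ds/cilium -- ...; newer releases name the binary cilium-dbg):
cilium policy get
cilium monitor --type policy-verdict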
</code></code></pre><h2>Summary</h2><p>Encryption in Kubernetes involves decisions at several levels:</p><p><strong>Edge encryption</strong> (from client to cluster):</p><ul><li><p>TLS terminates at the load balancer: simplest, but traffic is unencrypted inside the cluster</p></li><li><p>TLS passthrough: TLS terminates at the Ingress Controller within the cluster</p></li><li><p>Re-encryption: TLS at every hop, the most secure but more to manage</p></li></ul><p><strong>Cluster encryption</strong> (from pod to pod):</p><ul><li><p>CNI encryption (WireGuard/IPsec): encrypts <em>all</em> cross-node traffic transparently</p></li><li><p>Application TLS: application controls certificates, enables mTLS</p></li><li><p>Service mesh (not covered here): automates mTLS with sidecars (lots of batteries included and more options for filtering traffic)</p></li></ul><p><strong>Network access control</strong>:</p><ul><li><p>Network Policies: L3/L4 rules restricting which pods can communicate</p></li><li><p>This is NOT encryption, but a complementary security layer to go along with encryption.</p></li></ul><p>Which combination you choose depends on your compliance requirements, your tolerance for operational complexity, and the threat model behind your security policies. In a lot of cases, TLS at the edge combined with CNI encryption is more than enough and easier to manage. 
For environments that require application-level identity and mTLS, application TLS with cert-manager or a service mesh is necessary.</p><h2>References</h2><h3>Kubernetes TLS and Ingress</h3><ul><li><p>Ingress TLS: https://kubernetes.io/docs/concepts/services-networking/ingress/#tls</p></li><li><p>Securing a Cluster: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/</p></li></ul><h3>cert-manager</h3><ul><li><p>cert-manager Documentation: https://cert-manager.io/docs/</p></li><li><p>Installation: https://cert-manager.io/docs/installation/</p></li><li><p>ACME Issuer: https://cert-manager.io/docs/configuration/acme/</p></li><li><p>CA Issuer: https://cert-manager.io/docs/configuration/ca/</p></li></ul><h3>Let&#8217;s Encrypt</h3><ul><li><p>Let&#8217;s Encrypt: https://letsencrypt.org/</p></li><li><p>ACME Protocol: https://datatracker.ietf.org/doc/html/rfc8555</p></li></ul><h3>CNI Encryption</h3><ul><li><p>Calico WireGuard: https://docs.tigera.io/calico/latest/network-policy/encrypt-cluster-pod-traffic</p></li><li><p>Cilium Encryption: https://docs.cilium.io/en/stable/security/network/encryption/</p></li><li><p>WireGuard: https://www.wireguard.com/</p></li></ul><h3>Network Policies</h3><ul><li><p>Network Policies: https://kubernetes.io/docs/concepts/services-networking/network-policies/</p></li><li><p>Network Policy Recipes: https://github.com/ahmetb/kubernetes-network-policy-recipes</p></li></ul><h3>TLS Best Practices</h3><ul><li><p>Mozilla SSL Configuration Generator: https://ssl-config.mozilla.org/</p></li><li><p>TLS 1.3 RFC: https://datatracker.ietf.org/doc/html/rfc8446</p></li></ul><h3>NGINX Ingress Controller</h3><ul><li><p>TLS/HTTPS: https://kubernetes.github.io/ingress-nginx/user-guide/tls/</p></li><li><p>Backend HTTPS: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#backend-protocol</p></li><li><p>SSL Passthrough: 
https://kubernetes.github.io/ingress-nginx/user-guide/tls/#ssl-passthrough</p></li></ul><h3>Security Frameworks</h3><ul><li><p>NIST Cryptographic Standards: https://csrc.nist.gov/projects/cryptographic-standards-and-guidelines</p></li><li><p>CIS Kubernetes Benchmark: https://www.cisecurity.org/benchmark/kubernetes</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Kubernetes Networking Deep Dive, Part 3]]></title><description><![CDATA[North-South Traffic]]></description><link>https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-8e0</link><guid isPermaLink="false">https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-8e0</guid><dc:creator><![CDATA[Hugh Tipping]]></dc:creator><pubDate>Fri, 20 Feb 2026 18:00:17 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="2871" height="2885" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2885,&quot;width&quot;:2871,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;black and white analog watch&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="black and white analog watch" title="black and white analog watch" srcset="https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1598944999410-e93772fc48a5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw5fHxjb21wYXNzfGVufDB8fHx8MTc3MTM3NTUxNHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@ray027">Sunil Ray</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>This is the third post in  my four-part series tracking packets as they flow through a Kubernetes cluster. In <a href="https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-f73">Part 2</a>, I went over pod-to-pod (east-west) traffic. Now let&#8217;s talk about traffic from an external user, through a LoadBalancer into the cluster, to a pod, and back again. Yep, all that.</p><p>Every packet destined for a Kubernetes Service has to pass through iptables rules that select a backend pod and modify packet headers. 
This is important for debugging connectivity problems, latency, and service configuration.</p><h2>The Service Abstraction</h2><p>Let&#8217;s dig a bit into the Service resource type in Kubernetes.</p><p>Pods are &#8220;ephemeral,&#8221; meaning they are temporary. Every time a pod gets created or restarted, it gets a new IP address. Trying to connect to a pod&#8217;s IP address directly is brittle since the IP can change at a moment&#8217;s notice. Instead, use a Service to provide a more stable endpoint that will route your traffic to the pods it exposes.</p><h3>Service Types</h3><p><strong>ClusterIP</strong> (default): This will allocate a virtual IP from the Service CIDR for the service itself. You can only get to this IP from inside the cluster. It appears only in iptables or IPVS rules, not on any kind of network interface.</p><p><strong>NodePort</strong>: This type of service opens a port (default range 30000-32767) directly on every node in the cluster. 
External traffic can reach the service via <code>&lt;node-ip&gt;:&lt;nodeport&gt;</code>.</p><p><strong>LoadBalancer</strong>: This provisions a load balancer outside of the cluster on whatever platform you&#8217;re using (a cloud provider, or MetalLB for physical servers). The load balancer obtains an external IP address and forwards traffic to the NodePort.</p><pre><code><code># View services and their types
kubectl get svc -o wide
# Output:
# NAME         TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
# kubernetes   ClusterIP      10.96.0.1     &lt;none&gt;          443/TCP        30d
# my-app       LoadBalancer   10.96.0.15    203.0.113.50    80:30080/TCP   5d
# internal-api ClusterIP      10.96.0.42    &lt;none&gt;          8080/TCP       5d
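
# To see which pod IPs sit behind a Service (a sketch, using the my-app Service
# from the listing above; Kubernetes sets this label on EndpointSlices itself)
kubectl get endpointslices -l kubernetes.io/service-name=my-app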
</code></code></pre><p>For the my-app service, the <code>80:30080/TCP</code> means: external port 80 coming into the LB maps to the NodePort listening on 30080.</p><h2>Scenario</h2><p>For example, let&#8217;s say we trace traffic to a LoadBalancer service with three backend pods:</p><ul><li><p>External client: 198.51.100.5 (public internet)</p></li><li><p>Load balancer external IP: 203.0.113.50 (provided by the LoadBalancer provisioner)</p></li><li><p>NodePort: 30080</p></li><li><p>Service ClusterIP: 10.96.0.15</p></li><li><p>Backend pods:</p><ul><li><p>Pod 1: 10.244.0.5 on Node 1 (192.168.1.10)</p></li><li><p>Pod 2: 10.244.1.3 on Node 2 (192.168.1.11)</p></li><li><p>Pod 3: 10.244.2.2 on Node 3 (192.168.1.12)</p></li></ul></li></ul><h2>The Ingress, Step by Step</h2><h3>Step 1: A Client Initiates Request (Layer 7/4/3)</h3><p>The client&#8217;s browser connects to http://203.0.113.50 (the load balancer). The client&#8217;s TCP/IP stack creates a packet:</p><ul><li><p>Source IP: 198.51.100.5 (the client)</p></li><li><p>Destination IP: 203.0.113.50 (the load balancer)</p></li><li><p>Source port: 54321 (ephemeral)</p></li><li><p>Destination port: 80 (what the load balancer listens on)</p></li></ul><h3>Step 2: Load Balancer Receives Traffic (Layer 4)</h3><p>The external load balancer receives the packet on its external IP. Then, the load balancer:</p><ol><li><p>Accepts the TCP connection (<a href="https://en.wikipedia.org/wiki/Handshake_(computing)#TCP_three-way_handshake">three-way handshake</a>)</p></li><li><p>Selects a healthy backend node from its pool (nodes with NodePort 30080)</p></li><li><p>Forwards the traffic to that node</p></li></ol><p>The load balancer performs health checks against the NodePort to determine if the node is ready to accept traffic.</p><pre><code><code># Example health check with netcat (roughly what the LB does internally)
# TCP connect to each node on port 30080 to ensure it's responding.
nc -zv 192.168.1.10 30080
nc -zv 192.168.1.11 30080
nc -zv 192.168.1.12 30080
</code></code></pre><p>Depending on how the load balancer is configured:</p><ul><li><p><strong>SNAT mode</strong>: LB changes the source IP to its own IP. This lets you restrict incoming traffic to the LB only. If the original client IP matters, an HTTP-aware load balancer can pass it along in an X-Forwarded-For header for the application to read.</p></li><li><p><strong>DSR/Transparent mode</strong>: LB preserves client source IP</p></li></ul><h3>Step 3: Packet Arrives at Node (Layer 3)</h3><p>The load balancer forwards the packet to Node 1 (192.168.1.10):</p><ul><li><p>Source IP: 198.51.100.5 (client, preserved)</p></li><li><p>Destination IP: 192.168.1.10 (node)</p></li><li><p>Destination port: 30080 (the NodePort)</p></li></ul><p>The packet arrives on the node&#8217;s physical interface (eth0).</p><h3>Step 4: iptables PREROUTING Chain (Layer 3)</h3><p>The packet first passes through the PREROUTING chain in the iptables nat table. This is where Kubernetes service routing starts.</p><pre><code><code>sudo iptables -t nat -L PREROUTING -n --line-numbers
# Output:
# Chain PREROUTING (policy ACCEPT)
# num  target     prot opt source               destination
# 1    KUBE-SERVICES  all  --  0.0.0.0/0        0.0.0.0/0
</code></code></pre><p>From the above, all traffic is sent to the KUBE-SERVICES chain.</p><h3>Step 5: KUBE-SERVICES Chain (Layer 3)</h3><p>The KUBE-SERVICES chain contains rules for all the Services in the cluster. Each rule matches on a destination IP:port combination.</p><pre><code><code>sudo iptables -t nat -L KUBE-SERVICES -n | head -20
# Output:
# Chain KUBE-SERVICES (2 references)
# target                     prot opt source       destination
# KUBE-SVC-XXXX1             tcp  --  0.0.0.0/0    10.96.0.15    /* default/my-app cluster IP */ tcp dpt:80
# KUBE-NODEPORTS             all  --  0.0.0.0/0    0.0.0.0/0     ADDRTYPE match dst-type LOCAL
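
# A shortcut (sketch): kube-proxy tags its rules with a namespace/name comment,
# so every nat rule belonging to one Service can be listed in a single pass
sudo iptables-save -t nat | grep 'default/my-app'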
</code></code></pre><p>For NodePort traffic, the destination is a local node IP, <em>not the ClusterIP</em>. The rule <code>ADDRTYPE match dst-type LOCAL</code> catches this and then goes to the chain KUBE-NODEPORTS. (Dizzy yet?)</p><h3>Step 6: KUBE-NODEPORTS Chain (Layer 3)</h3><p>This chain matches the actual NodePort numbers:</p><pre><code><code>sudo iptables -t nat -L KUBE-NODEPORTS -n
# Output:
# Chain KUBE-NODEPORTS (1 references)
# target                     prot opt source       destination
# KUBE-EXT-XXXX1             tcp  --  0.0.0.0/0    0.0.0.0/0    /* default/my-app */ tcp dpt:30080
</code></code></pre><p>Traffic to port 30080 then moves to the KUBE-EXT-XXXX1 chain, which handles external traffic for this particular service.</p><h3>Step 7: KUBE-EXT Chain and KUBE-SVC Chain (Layer 3)</h3><p>The KUBE-EXT chain handles external traffic policy and then jumps to the service chain:</p><pre><code><code>sudo iptables -t nat -L KUBE-EXT-XXXX1 -n
# Output (externalTrafficPolicy: Cluster):
# Chain KUBE-EXT-XXXX1 (1 references)
# target                     prot opt source       destination
# KUBE-MARK-MASQ             all  --  0.0.0.0/0    0.0.0.0/0
# KUBE-SVC-XXXX1             all  --  0.0.0.0/0    0.0.0.0/0
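
# To confirm which policy the Service uses (a sketch; Cluster is the default
# and is what the output above assumes)
kubectl get svc my-app -o jsonpath='{.spec.externalTrafficPolicy}'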
</code></code></pre><p>KUBE-MARK-MASQ marks the packet for source NAT (SNAT) later. This is necessary because the packet may be forwarded to a pod on a different node.</p><p>The KUBE-SVC chain does load balancing across endpoints within the cluster (the different available pods):</p><pre><code><code>sudo iptables -t nat -L KUBE-SVC-XXXX1 -n
# Output:
# Chain KUBE-SVC-XXXX1 (2 references)
# target                     prot opt source       destination
# KUBE-SEP-AAAA1             all  --  0.0.0.0/0    0.0.0.0/0    statistic mode random probability 0.33333333349
# KUBE-SEP-BBBB2             all  --  0.0.0.0/0    0.0.0.0/0    statistic mode random probability 0.50000000000
# KUBE-SEP-CCCC3             all  --  0.0.0.0/0    0.0.0.0/0
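
# Why those probabilities give an even split (a sketch, not kube-proxy code):
# each rule only sees traffic the earlier rules passed over, so rule i
# (counting from 0) of n matches with probability 1/(n-i); n = 3 reproduces
# the values above
awk 'BEGIN { n = 3; for (i = 0; i != n; i++) printf "rule %d: p=%.5f\n", i + 1, 1 / (n - i) }'
# rule 1: p=0.33333
# rule 2: p=0.50000
# rule 3: p=1.00000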
</code></code></pre><p>Now, probability rules implement <em>random</em> load balancing for picking which pod:</p><ul><li><p>First rule: 33.3% chance (1/3)</p></li><li><p>Second rule: 50% of remaining (1/2 of 2/3 = 1/3)</p></li><li><p>Third rule: 100% of remaining (1/3)</p></li></ul><p>Each endpoint gets <em>equal</em> probability.</p><h3>Step 8: KUBE-SEP Chain - DNAT (Layer 3)</h3><p>Assume the random selection chooses KUBE-SEP-BBBB2 (Pod 2 on Node 2):</p><pre><code><code>sudo iptables -t nat -L KUBE-SEP-BBBB2 -n
# Output:
# Chain KUBE-SEP-BBBB2 (1 references)
# target                     prot opt source       destination
# KUBE-MARK-MASQ             all  --  10.244.1.3   0.0.0.0/0
# DNAT                       tcp  --  0.0.0.0/0    0.0.0.0/0    tcp to:10.244.1.3:8080
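
# To see the resulting NAT mapping (a sketch; assumes conntrack-tools is installed):
# each entry lists the original tuple and the rewritten reply tuple
sudo conntrack -L -p tcp --dport 30080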
</code></code></pre><p>The DNAT rule rewrites the destination:</p><ul><li><p>Before: dst 192.168.1.10:30080 (the NodePort)</p></li><li><p>After: dst 10.244.1.3:8080 (the Pod&#8217;s actual IP address! We&#8217;re nearly there!)</p></li></ul><p>The packet now has:</p><ul><li><p>Source IP: 198.51.100.5 (client)</p></li><li><p>Destination IP: 10.244.1.3 (Pod 2)</p></li><li><p>Destination port: 8080</p></li></ul><h3>Step 9: Routing Decision (Layer 3)</h3><p>After PREROUTING, the kernel does some routing magic. The destination 10.244.1.3 is on Node 2, <em>not local to this node</em>. The packet must be <em>forwarded</em>.</p><pre><code><code>ip route get 10.244.1.3
# Output (VXLAN example):
# 10.244.1.3 via 10.244.1.0 dev flannel.1 src 10.244.0.0
</code></code></pre><p>The packet will head out the flannel.1 interface to get to Node 2.</p><h3>Step 10: iptables FORWARD Chain (Layer 3)</h3><p>The packet passes through the FORWARD chain in the filter table:</p><pre><code><code>sudo iptables -L FORWARD -n | head -10
# Output:
# Chain FORWARD (policy ACCEPT)
# target     prot opt source               destination
# KUBE-FORWARD  all  --  0.0.0.0/0        0.0.0.0/0
# KUBE-SERVICES  all  --  0.0.0.0/0       0.0.0.0/0   ctstate NEW
</code></code></pre><h3>Step 11: iptables POSTROUTING Chain - SNAT (Layer 3)</h3><p>Before the packet leaves the node, it passes through POSTROUTING in the nat table (don&#8217;t worry if not all of this is familiar to you):</p><pre><code><code>sudo iptables -t nat -L POSTROUTING -n
# Output:
# Chain POSTROUTING (policy ACCEPT)
# target                     prot opt source       destination
# KUBE-POSTROUTING           all  --  0.0.0.0/0    0.0.0.0/0
</code></code></pre><pre><code><code>sudo iptables -t nat -L KUBE-POSTROUTING -n
# Output:
# Chain KUBE-POSTROUTING (1 references)
# target     prot opt source               destination
# RETURN     all  --  0.0.0.0/0            0.0.0.0/0    mark match ! 0x4000/0x4000
# MARK       all  --  0.0.0.0/0            0.0.0.0/0    MARK xor 0x4000
# MASQUERADE all  --  0.0.0.0/0            0.0.0.0/0
</code></code></pre><p>The packet was marked by KUBE-MARK-MASQ earlier. MASQUERADE performs SNAT, changing the source IP to the node&#8217;s IP:</p><ul><li><p>Before: src 198.51.100.5</p></li><li><p>After: src 192.168.1.10</p></li></ul><p>The packet now has:</p><ul><li><p>Source IP: 192.168.1.10 (Node 1)</p></li><li><p>Destination IP: 10.244.1.3 (Pod 2)</p></li></ul><h3>Step 12: Cross-Node Forwarding (Layer 2/3)</h3><p>The packet is forwarded to Node 2 using the CNI&#8217;s <em>cross-node</em> mechanism (VXLAN, BGP, etc.) as described in <a href="https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-f73">Part 2</a>.</p><h3>Step 13: Packet Arrives at Pod (Layer 3/4/7)</h3><p>On Node 2, the packet is decapsulated (if overlay) and routed to Pod 2. The pod receives:</p><ul><li><p>Source IP: 192.168.1.10 (Node 1, because of SNAT)</p></li><li><p>Destination IP: 10.244.1.3 (Pod 2)</p></li><li><p>Destination port: 8080</p></li></ul><p>The application sees the request as coming from Node 1, <em>not the original client</em>. The client IP has been lost due to SNAT.</p><h2>The Return Path</h2><h3>Step 1: Application Responds (Layer 7/4/3)</h3><p>Once the app is done doing what it needs to do with the packet, Pod 2&#8217;s application sends a response:</p><ul><li><p>Source IP: 10.244.1.3</p></li><li><p>Destination IP: 192.168.1.10 (Node 1, from SNAT)</p></li><li><p>Source port: 8080</p></li><li><p>Destination port: 54321 (client&#8217;s original port, preserved in <a href="https://blog.cloudflare.com/conntrack-tales-one-thousand-and-one-flows/">conntrack</a>)</p></li></ul><h3>Step 2: Packet Routes to Node 1 (Layer 3)</h3><p>The destination 192.168.1.10 is Node 1. 
The packet is forwarded via the CNI.</p><h3>Step 3: Connection Tracking Reverses NAT (Layer 3)</h3><p>When the packet arrives at Node 1, the kernel&#8217;s connection tracking (conntrack) <em>recognizes it as a reply to an already established connection</em>:</p><pre><code><code>sudo conntrack -L | grep 10.244.1.3
# Output:
# tcp  6 117 TIME_WAIT src=198.51.100.5 dst=192.168.1.10 sport=54321 dport=30080 
#      src=10.244.1.3 dst=192.168.1.10 sport=8080 dport=54321 [ASSURED] mark=0 use=1
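
# Sketch: split one conntrack entry into its original and reply tuples.
# The sample line is the entry above joined onto one line; conntrack_tuples is
# a hypothetical helper, not part of conntrack-tools.
conntrack_tuples() {
  awk '{
    n = 0
    for (i = 1; i != NF + 1; i++)
      if ($i ~ /^(src|dst|sport|dport)=/) t[++n] = $i
    print "original:", t[1], t[2], t[3], t[4]
    print "reply:", t[5], t[6], t[7], t[8]
  }'
}
echo 'tcp 6 117 TIME_WAIT src=198.51.100.5 dst=192.168.1.10 sport=54321 dport=30080 src=10.244.1.3 dst=192.168.1.10 sport=8080 dport=54321 [ASSURED] mark=0 use=1' | conntrack_tuples
# original: src=198.51.100.5 dst=192.168.1.10 sport=54321 dport=30080
# reply: src=10.244.1.3 dst=192.168.1.10 sport=8080 dport=54321
# A packet matching the reply tuple is rewritten to the mirror of the original
# tuple: src 192.168.1.10:30080, dst 198.51.100.5:54321.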
</code></code></pre><p>The conntrack entry shows the <em>original connection</em> (client to NodePort) and the reply direction (pod to node). The Linux kernel automatically reverses the NAT (kinda neat, eh?):</p><ul><li><p><strong>Un-SNAT</strong>: Destination 192.168.1.10 becomes 198.51.100.5 (the original client)</p></li><li><p><strong>Un-DNAT</strong>: Source 10.244.1.3:8080 becomes 192.168.1.10:30080 (the NodePort, as the client expects)</p></li></ul><p>The packet is then sent back to the client:</p><ul><li><p>Source IP: 192.168.1.10 (Node 1)</p></li><li><p>Destination IP: 198.51.100.5 (client)</p></li><li><p>Source port: 30080</p></li></ul><h3>Step 4: Load Balancer and Client (Layer 3/4)</h3><p>The packet returns through the load balancer to the client. <em>The load balancer maintains its own connection state</em> and may perform additional translations to present the external IP (203.0.113.50) of the Load Balancer itself as the source (this hides your internal infrastructure).</p><p>The client receives the response from 203.0.113.50:80.</p><h2>externalTrafficPolicy</h2><p>As if that wasn&#8217;t enough (and it was a lot), here is some additional information about the behavior of packet routing.</p><p>The default behavior (externalTrafficPolicy: Cluster) is to use SNAT, which <em>loses</em> the client IP. 
But there is another option.</p><h3>externalTrafficPolicy: Cluster (Default)</h3><ul><li><p>Traffic can land on any node</p></li><li><p>If the selected pod is on a different node, traffic is forwarded</p></li><li><p>SNAT is applied to ensure return traffic comes back through the same node</p></li><li><p><em>Client IP is lost</em></p></li><li><p>Load is evenly distributed</p></li></ul><h3>externalTrafficPolicy: Local</h3><ul><li><p>Traffic only goes to pods on the node that received it</p></li><li><p>If there are no local pods, the node will fail the health checks and the load balancer will stop sending traffic to that node</p></li><li><p>No SNAT is needed because the traffic stays local to the node</p></li><li><p>The client IP is preserved</p></li><li><p>Load may not be evenly distributed: the load balancer spreads traffic across nodes rather than pods, so a pod that is alone on its node typically receives more traffic than pods that share a node</p></li></ul><pre><code><code>apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local  # Preserve client IP
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
</code></code></pre><p>With <code>externalTrafficPolicy: Local</code>, the iptables rules will also change:</p><pre><code><code>sudo iptables -t nat -L KUBE-EXT-XXXX1 -n
# Output (externalTrafficPolicy: Local):
# Chain KUBE-EXT-XXXX1 (1 references)
# target                     prot opt source       destination
# KUBE-SVC-XXXX1             all  --  0.0.0.0/0    0.0.0.0/0
</code></code></pre><p><em>Notice: no KUBE-MARK-MASQ, so no MASQUERADE (SNAT) will be applied.</em></p><p>The KUBE-SVC chain only contains endpoints local to the node:</p><pre><code><code># On Node 1, which has Pod 1 (10.244.0.5)
sudo iptables -t nat -L KUBE-SVC-XXXX1 -n
# Output:
# Chain KUBE-SVC-XXXX1 (1 references)
# target                     prot opt source       destination
# KUBE-SEP-AAAA1             all  --  0.0.0.0/0    0.0.0.0/0
</code></code></pre><p>In this example, only one endpoint (the local pod) is listed. Nodes without local pods have this:</p><pre><code><code># On Node 3, which has no pods for this service
sudo iptables -t nat -L KUBE-SVC-XXXX1 -n
# Output:
# Chain KUBE-SVC-XXXX1 (1 references)
# target                     prot opt source       destination
# KUBE-MARK-DROP             all  --  0.0.0.0/0    0.0.0.0/0
</code></code></pre><p>The KUBE-MARK-DROP rule causes the packet to be <em>dropped</em>. This will cause the health check to fail since the packet is essentially thrown away. The load balancer will see this and stop sending traffic to this particular node.</p><h3>Choosing Between Policies</h3><p>Use <strong>Cluster</strong> when:</p><ul><li><p>Client IP is not needed (or you are putting the IP in an X-Forwarded-For header at the LB level)</p></li><li><p>Even load distribution is wanted</p></li><li><p>All nodes should receive traffic regardless of where the destination pod runs</p></li></ul><p>Use <strong>Local</strong> when:</p><ul><li><p>Client IP must be preserved (logging, geolocation, rate limiting)</p></li><li><p>Application needs to see a real client IP address for whatever reason</p></li><li><p>It&#8217;s ok to have uneven load balancing</p></li></ul><h2>IPVS Mode Differences</h2><p>When kube-proxy runs in IPVS mode, the flow is similar to the long path above, but it is implemented differently.</p><h3>IPVS Virtual Servers</h3><p>Instead of iptables chains, IPVS creates <em>virtual servers</em> that you can inspect with the ipvsadm command:</p><pre><code><code>sudo ipvsadm -Ln
# Output:
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
#   -&gt; RemoteAddress:Port           Forward Weight ActiveConn InActConn
# TCP  10.96.0.15:80 rr
#   -&gt; 10.244.0.5:8080              Masq    1      2          0
#   -&gt; 10.244.1.3:8080              Masq    1      1          0
#   -&gt; 10.244.2.2:8080              Masq    1      3          0
# TCP  192.168.1.10:30080 rr
#   -&gt; 10.244.0.5:8080              Masq    1      2          0
#   -&gt; 10.244.1.3:8080              Masq    1      1          0
#   -&gt; 10.244.2.2:8080              Masq    1      3          0
</code></code></pre><p>IPVS handles both ClusterIP (10.96.0.15:80) and NodePort (192.168.1.10:30080) Service types as virtual servers.</p><h3>IPVS Still Uses iptables</h3><p>IPVS mode <em>still uses iptables</em> under the hood for a few cases:</p><ul><li><p>Masquerading (SNAT)</p></li><li><p><a href="https://www.cisco.com/c/en/us/tech/quality-of-service-qos/qos-packet-marking/index.html">Packet marking</a></p></li><li><p>NodePort handling on <em>all node IPs</em></p></li></ul><p>For some more torture, here you go:</p><pre><code><code>sudo iptables -t nat -L KUBE-POSTROUTING -n
# Output (IPVS mode):
# Chain KUBE-POSTROUTING (1 references)
# target     prot opt source               destination
# MASQUERADE all  --  0.0.0.0/0            0.0.0.0/0    match-set KUBE-LOOP-BACK dst,dst,src
</code></code></pre><h3>IPVS Load Balancing Algorithms</h3><p>IPVS supports multiple load-balancing algorithms (e.g., rr = round robin):</p><pre><code><code># View current scheduler
sudo ipvsadm -Ln | grep "TCP  10.96"
# Output shows the scheduler in use on each virtual server line (rr here); other IPVS schedulers include lc, dh, sh, sed, nq

# Let's see how it's configured in the kube-proxy config
kubectl get configmap kube-proxy -n kube-system -o yaml | grep scheduler
# Output: scheduler: "rr"
</code></code></pre><h2>Ingress Controllers</h2><p>Let&#8217;s move a little further out from the level of iptables and IPVS and examine the Ingress Controller. It adds another hop to the flow of traffic. Traffic flows:</p><p>Client &#8594; Load Balancer &#8594; NodePort &#8594; <strong>Ingress Controller Pod</strong> &#8594; Backend Pod</p><h3>How Ingress Works</h3><ol><li><p>An Ingress Controller (nginx, envoy, traefik, etc.) runs as pods in the cluster</p></li><li><p>Those pods are exposed via a LoadBalancer or NodePort service (so that outside traffic can reach the controller)</p></li><li><p>Ingresses let you route based upon things like host or path to the backend services (so that a specific host name or URL will route to a different running app)</p></li><li><p>The controller receives the traffic and <em>proxies this traffic</em> to backends based upon Ingress rules</p></li></ol><pre><code><code>kubectl get ingress
# Output:
# NAME      CLASS   HOSTS           ADDRESS         PORTS   AGE
# my-app    nginx   app.example.com 203.0.113.50    80      5d
</code></code></pre><h3>Ingress Traffic Path</h3><ol><li><p>A client browser resolves, e.g., app.example.com to 203.0.113.50 (the Ingress load balancer IP)</p></li><li><p>The traffic arrives at the load balancer</p></li><li><p>The load balancer forwards to the NodePort of the Ingress Controller <em>Service</em></p></li><li><p>iptables routes this traffic to an Ingress Controller pod (as we have already discussed)</p></li><li><p>The Ingress Controller examines the Host header and path</p></li><li><p>The controller then opens a new connection to the backend service (ClusterIP)</p></li><li><p>iptables routes to the backend pod as we&#8217;ve discussed before</p></li><li><p>The response returns through the controller to the client</p></li></ol><p>The Ingress Controller terminates the original connection and creates a new one, providing L7 routing capabilities.</p><pre><code><code># View Ingress Controller pods and their node placement
kubectl get pods -n ingress-nginx -o wide
# Output:
# NAME                                        READY   STATUS    IP           NODE
# ingress-nginx-controller-5c8d66c76d-abc12   1/1     Running   10.244.0.8   node-1
# ingress-nginx-controller-5c8d66c76d-def34   1/1     Running   10.244.1.9   node-2
</code></code></pre><h2>Observability for North-South Traffic</h2><h3>Viewing iptables <em>counters</em></h3><pre><code><code># Watch packet counts through service chains
sudo iptables -t nat -L KUBE-SVC-XXXX1 -n -v
# Output:
# Chain KUBE-SVC-XXXX1 (2 references)
#  pkts bytes target     prot opt in     out     source               destination
#   847  50K KUBE-SEP-AAAA1  all  --  *      *   0.0.0.0/0            0.0.0.0/0    statistic mode random probability 0.333
#   823  49K KUBE-SEP-BBBB2  all  --  *      *   0.0.0.0/0            0.0.0.0/0    statistic mode random probability 0.500
#   851  51K KUBE-SEP-CCCC3  all  --  *      *   0.0.0.0/0            0.0.0.0/0
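
# Sketch: turn the pkts column above into each endpoint's share of traffic.
# Assumption: the three counters are the sample values from this output, fed in
# by hand; endpoint_shares is a hypothetical helper name. In the nat table these
# counters reflect new connections, not every packet.
endpoint_shares() {
  awk '{ name[NR] = $1; pkts[NR] = $2; total += $2 }
       END { for (i = 1; i != NR + 1; i++) printf "%s %.1f%%\n", name[i], 100 * pkts[i] / total }'
}
printf '%s\n' 'KUBE-SEP-AAAA1 847' 'KUBE-SEP-BBBB2 823' 'KUBE-SEP-CCCC3 851' | endpoint_shares
# KUBE-SEP-AAAA1 33.6%
# KUBE-SEP-BBBB2 32.6%
# KUBE-SEP-CCCC3 33.8%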
</code></code></pre><h3>View the conntrack entries</h3><pre><code><code># Watch connection tracking for a specific service
sudo conntrack -E -p tcp --dport 30080
# Output (live events):
# [NEW] tcp      6 120 SYN_SENT src=198.51.100.5 dst=192.168.1.10 sport=54321 dport=30080
# [UPDATE] tcp   6 60 SYN_RECV src=198.51.100.5 dst=192.168.1.10 sport=54321 dport=30080
# [UPDATE] tcp   6 432000 ESTABLISHED src=198.51.100.5 dst=192.168.1.10 sport=54321 dport=30080
</code></code></pre><h3>Capture traffic at each hop</h3><p>tcpdump is your friend here:</p><pre><code><code># At the node's physical interface (incoming)
sudo tcpdump -i eth0 -nn port 30080

# At the bridge (after DNAT, before forwarding)
sudo tcpdump -i cni0 -nn port 8080

# At the VXLAN interface (cross-node traffic)
sudo tcpdump -i flannel.1 -nn port 8080

# Inside the pod
kubectl exec -it my-pod -- tcpdump -i eth0 -nn port 8080
</code></code></pre><h3>Trace the iptables processing</h3><pre><code><code># Enable iptables tracing (verbose, use sparingly)
sudo iptables -t raw -A PREROUTING -p tcp --dport 30080 -j TRACE
sudo iptables -t raw -A OUTPUT -p tcp --sport 8080 -j TRACE

# View trace in kernel log
sudo dmesg -w | grep TRACE

# Clean up when done
sudo iptables -t raw -D PREROUTING -p tcp --dport 30080 -j TRACE
sudo iptables -t raw -D OUTPUT -p tcp --sport 8080 -j TRACE
</code></code></pre><h2>Troubleshooting the North-South Traffic</h2><h3>Service not reachable from outside</h3><ol><li><p>Verify that LoadBalancer does, in fact, have an external IP address:</p></li></ol><pre><code><code>kubectl get svc my-app
# Check EXTERNAL-IP is not &lt;pending&gt;
</code></code></pre><ol start="2"><li><p>Verify that the NodePort is open:</p></li></ol><pre><code><code># From a node
ss -tlnp | grep 30080
# Output may show kube-proxy listening; some kube-proxy versions do not hold the port open, so an empty result is not conclusive
</code></code></pre><ol start="3"><li><p>Check that the endpoints exist (you probably won&#8217;t have to do this much if ever):</p></li></ol><pre><code><code>kubectl get endpoints my-app
# Output:
# NAME     ENDPOINTS                                         AGE
# my-app   10.244.0.5:8080,10.244.1.3:8080,10.244.2.2:8080   5d
</code></code></pre><ol start="4"><li><p>Verify iptables rules:</p></li></ol><pre><code><code>sudo iptables -t nat -L KUBE-SERVICES -n | grep my-app
</code></code></pre><h3>Client IP is not visible to application</h3><ol><li><p>Check externalTrafficPolicy:</p></li></ol><pre><code><code>kubectl get svc my-app -o jsonpath='{.spec.externalTrafficPolicy}'
# Output: Cluster (means SNAT is applied)
</code></code></pre><ol start="2"><li><p>Change to Local if client IP needed:</p></li></ol><pre><code><code>kubectl patch svc my-app -p '{"spec":{"externalTrafficPolicy":"Local"}}'
</code></code></pre><ol start="3"><li><p>Verify pods are running on nodes receiving traffic:</p></li></ol><pre><code><code>kubectl get pods -o wide -l app=my-app
</code></code></pre><h3>Connection timeouts</h3><ol><li><p>Check if SNAT is happening when it actually shouldn&#8217;t:</p></li></ol><pre><code><code>sudo conntrack -L -d &lt;pod-ip&gt; | head
# Is the source IP the client's or the node's?
</code></code></pre><ol start="2"><li><p>Verify that the CNI is forwarding cross-node traffic:</p></li></ol><pre><code><code># On source node
sudo tcpdump -i flannel.1 -nn host &lt;pod-ip&gt;
</code></code></pre><ol start="3"><li><p>Check that the pod is healthy:</p></li></ol><pre><code><code>kubectl describe pod &lt;pod-name&gt; | grep -A5 Conditions
</code></code></pre><h2>Summary</h2><p>North-south traffic through a LoadBalancer service follows this path:</p><ol><li><p>Client connects to external load balancer IP address</p></li><li><p>Load balancer forwards to the NodePort on a <em>healthy</em> node</p></li><li><p>iptables PREROUTING/KUBE-SERVICES chains intercept the packet</p></li><li><p>KUBE-SVC chain <em>randomly</em> selects a backend pod (this is the load balancing decision)</p></li><li><p>KUBE-SEP chain performs DNAT to the pod IP</p></li><li><p>If the pod is on a different node, SNAT is applied (externalTrafficPolicy: Cluster)</p></li><li><p>Packet is forwarded to the pod via CNI</p></li><li><p>Return traffic uses conntrack to <em>reverse</em> the NAT translations</p></li></ol><p>Two choices for the configuration:</p><ul><li><p><strong>externalTrafficPolicy: Cluster</strong>: Even load distribution, loses client IP</p></li><li><p><strong>externalTrafficPolicy: Local</strong>: Preserves client IP, may have uneven distribution</p></li></ul><p>Part 4 will cover encryption in flight: where TLS terminates, CNI-level encryption options, and how to achieve end-to-end encryption without a service mesh.</p><div><hr></div><h2>References</h2><h3>Official Kubernetes Documentation</h3><ul><li><p>Service: https://kubernetes.io/docs/concepts/services-networking/service/</p></li><li><p>Service Types: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types</p></li><li><p>External Traffic Policy: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip</p></li><li><p>Ingress: https://kubernetes.io/docs/concepts/services-networking/ingress/</p></li><li><p>Ingress Controllers: https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/</p></li></ul><h3>kube-proxy Documentation</h3><ul><li><p>kube-proxy Modes: 
https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/</p></li><li><p>IPVS Proxy Mode: https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs</p></li><li><p>Virtual IPs and Service Proxies: https://kubernetes.io/docs/reference/networking/virtual-ips/</p></li></ul><h3>Linux Networking</h3><ul><li><p>iptables: https://netfilter.org/documentation/</p></li><li><p>iptables-extensions (statistic module): https://man7.org/linux/man-pages/man8/iptables-extensions.8.html</p></li><li><p>conntrack: https://conntrack-tools.netfilter.org/</p></li><li><p>conntrack man page: https://man7.org/linux/man-pages/man8/conntrack.8.html</p></li><li><p>IPVS: http://www.linuxvirtualserver.org/software/ipvs.html</p></li><li><p>ipvsadm: https://man7.org/linux/man-pages/man8/ipvsadm.8.html</p></li></ul><h3>Cloud Load Balancers</h3><ul><li><p>AWS ELB: https://docs.aws.amazon.com/elasticloadbalancing/</p></li><li><p>GCP Load Balancing: https://cloud.google.com/load-balancing/docs</p></li><li><p>Azure Load Balancer: https://docs.microsoft.com/en-us/azure/load-balancer/</p></li></ul><h3>MetalLB (Bare Metal Load Balancer)</h3><ul><li><p>MetalLB: https://metallb.universe.tf/</p></li></ul><h3>Ingress Controllers</h3><ul><li><p>NGINX Ingress Controller: https://kubernetes.github.io/ingress-nginx/</p></li><li><p>Traefik: https://doc.traefik.io/traefik/providers/kubernetes-ingress/</p></li><li><p>Envoy/Contour: https://projectcontour.io/</p></li><li><p>HAProxy Ingress: https://haproxy-ingress.github.io/</p></li></ul>
]]></content:encoded></item><item><title><![CDATA[Kubernetes Networking Deep Dive: Part 2]]></title><description><![CDATA[Pod-to-Pod Communication]]></description><link>https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-f73</link><guid isPermaLink="false">https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part-f73</guid><dc:creator><![CDATA[Hugh Tipping]]></dc:creator><pubDate>Sun, 15 Feb 2026 18:18:27 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4032" height="2268" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2268,&quot;width&quot;:4032,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A wooden sign pointing to two different directions&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A wooden sign pointing to two different directions" title="A wooden sign pointing to two different directions" 
srcset="https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1718724588305-fbc4d628d0f9?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1fHxsZWZ0JTIwcmlnaHR8ZW58MHx8fHwxNzcwNTg2MTgxfDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" 
viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@reskp">Jametlene Reskp</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>This is the second post in a four-part series tracing packets through a Kubernetes cluster. In <a href="https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part">Part 1</a>, we covered the foundational concepts: network namespaces, veth pairs, CNI, and kube-proxy. Now we trace actual packets between pods.</p><p>Pod-to-pod communication is often called &#8220;<em>east-west traffic</em>&#8221; and is the most common network traffic in a cluster. A pod connects to a service, a microservice calls another, in-cluster databases receive queries from application pods, etc. This traffic stays <em>within</em> the cluster and does not go through a load balancer.</p>
<p>Let&#8217;s look at two scenarios: pods on the same node (communication through a bridge) and pods on different nodes (requiring overlay encapsulation or direct routing).</p><h2>Scenario Setup</h2><p>For upcoming examples, let&#8217;s use the following setup:</p><ul><li><p>Pod A: IP 10.244.0.5, running on Node 1 (192.168.1.10)</p></li><li><p>Pod B: IP 10.244.0.6, also running on Node 1</p></li><li><p>Pod C: IP 10.244.1.5, running on Node 2 (192.168.1.11)</p></li></ul><p>Pod A runs a client application. Pod B and Pod C each run a server listening on port 8080. Let&#8217;s trace what happens when Pod A makes an HTTP request to Pod B (on the same node) and then to Pod C (on a different node).</p><h2>Pod-to-Pod Communication on the Same Node</h2><p>When two pods are running on the same node, the network traffic never leaves the host. The packet travels via the Linux bridge (that acts as a virtual switch on each node)  connecting all the local pod veth endpoints.</p><h3>Packet Flow Steps</h3><p><strong>Step 1: Application makes connection (OSI Layer 7)</strong></p><p>The application in Pod A calls the <code>connect()</code> system call to establish a TCP connection to 10.244.0.6:8080. 
(A standard socket operation)</p><p><strong>Step 2: Kernel starts building the TCP/IP packet (OSI Layers 4 and 3)</strong></p><p>The kernel&#8217;s TCP/IP stack creates the packet:</p><ul><li><p>Source IP: 10.244.0.5</p></li><li><p>Destination IP: 10.244.0.6</p></li><li><p>Source port: ephemeral, chosen by the OS (e.g., 45678)</p></li><li><p>Destination port: 8080</p></li><li><p>Protocol: TCP</p></li></ul><p><strong>Step 3: Routing is decided in Pod A&#8217;s networking namespace (OSI Layer 3)</strong></p><p>The kernel looks at Pod A&#8217;s routing table to determine the outgoing interface:</p><pre><code><code># Inside Pod A
ip route

# Output:
# default via 10.244.0.1 dev eth0
# 10.244.0.0/24 dev eth0 proto kernel scope link src 10.244.0.5
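
# Sketch: why no gateway is needed for Pod B. For a /24, "same subnet" means the
# first three octets match (this shortcut only works for /24). next_hop is a
# hypothetical helper; the addresses come from the routing table above.
next_hop() {
  net=10.244.0.0
  if [ "${1%.*}" = "${net%.*}" ]; then
    echo "on-link via eth0"
  else
    echo "via default gateway 10.244.0.1"
  fi
}
next_hop 10.244.0.6
# on-link via eth0
next_hop 10.244.1.5
# via default gateway 10.244.0.1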
</code></code></pre><p>The destination 10.244.0.6 matches the <code>10.244.0.0/24</code> route, which says &#8220;send directly via eth0.&#8221; A gateway is not needed because the destination is on the same subnet.</p><p><strong>Step 4: ARP resolution (OSI Layer 2)</strong></p><p>Before sending the packet, the kernel needs the MAC address of 10.244.0.6. It checks the ARP** cache (a.k.a. the neighbor table) and, if there is no entry, sends an ARP request:</p><pre><code><code># Inside Pod A's namespace
ip neigh show

# Output:
# 10.244.0.6 dev eth0 lladdr 62:a1:b2:c3:d4:e6 REACHABLE
# 10.244.0.1 dev eth0 lladdr 8a:1b:2c:3d:4e:5f REACHABLE
</code></code></pre><p>The kernel builds an Ethernet frame with:</p><ul><li><p>Source MAC: Pod A&#8217;s eth0 MAC</p></li><li><p>Destination MAC: Pod B&#8217;s eth0 MAC (62:a1:b2:c3:d4:e6)</p></li></ul><p>**ARP = Address Resolution Protocol, used to discover the Layer 2 (MAC) address that belongs to an IP address.</p><p><strong>Step 5: Packet leaves Pod A via veth pair (OSI Layer 2)</strong></p><p>The frame exits through Pod A&#8217;s eth0, which is one end of a veth pair. The frame emerges from the other end (veth-pod-a) in the <em>node&#8217;s</em> namespace.</p><p><strong>Step 6: Bridge forwards the frame (OSI Layer 2)</strong></p><p>The host-side veth is attached to a bridge (commonly named <code>cni0</code>, <code>cbr0</code>, or <code>docker0</code> depending on the CNI). The bridge operates like a Layer 2 switch:</p><ol><li><p>It receives the frame on port veth-pod-a</p></li><li><p>It looks up the destination MAC in its forwarding table</p></li><li><p>It finds that MAC 62:a1:b2:c3:d4:e6 is reachable on port veth-pod-b</p></li><li><p>It forwards the frame out that port</p></li></ol><p>You can view the bridge&#8217;s MAC table:</p><pre><code><code># On the host (FDB = Forwarding Database)
bridge fdb show br cni0 | grep -i "62:a1"

# Output:
# 62:a1:b2:c3:d4:e6 dev veth-pod-b master cni0
</code></code></pre><p><strong>Step 7: Packet enters Pod B via veth pair (OSI Layer 2)</strong></p><p>The frame enters veth-pod-b in the host namespace and emerges from eth0 in Pod B&#8217;s namespace.</p><p><strong>Step 8: Kernel delivers to application (OSI Layers 3, 4, 7 &#8594; back up the OSI stack)</strong></p><p>The kernel, now working in Pod B&#8217;s network namespace (pods share the node&#8217;s kernel):</p><ol><li><p>Receives the frame</p></li><li><p>Strips the Ethernet header (no longer needed) and sees that it is an IP packet</p></li><li><p>Verifies that the destination IP matches its own (10.244.0.6)</p></li><li><p>Strips the IP header (no longer needed here) and sees that it is TCP, destined for port 8080</p></li><li><p>Delivers the payload to the application listening on that port</p></li></ol><p>The return traffic (Pod B&#8217;s response) follows the same path in reverse.</p><h3>Observing Same-Node Traffic</h3><p>You can observe this traffic at different points:</p><pre><code><code># Capture on the bridge (sees ALL local pod traffic)
sudo tcpdump -i cni0 -nn host 10.244.0.5 and host 10.244.0.6

# Output:
# 14:23:01.234567 IP 10.244.0.5.45678 &gt; 10.244.0.6.8080: Flags [S], seq 123456789
# 14:23:01.234789 IP 10.244.0.6.8080 &gt; 10.244.0.5.45678: Flags [S.], seq 987654321, ack 123456790

# Capture on a specific veth (sees only that pod's traffic, great for troubleshooting)
sudo tcpdump -i veth-pod-a -nn port 8080
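
# Reading the Flags field in the output above: a small sketch, not from the original post.
# tcp_flag_name is a hypothetical helper that expands tcpdump's flag shorthand.
tcp_flag_name() {
  case "$1" in
    "[S]")  echo "SYN" ;;
    "[S.]") echo "SYN-ACK" ;;
    "[.]")  echo "ACK" ;;
    "[P.]") echo "PSH-ACK" ;;
    "[F.]") echo "FIN-ACK" ;;
    "[R]")  echo "RST" ;;
    "[R.]") echo "RST-ACK" ;;
    *)      echo "other" ;;
  esac
}
tcp_flag_name "[S]"    # SYN: Pod A opens the connection
tcp_flag_name "[S.]"   # SYN-ACK: Pod B answers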
</code></code></pre><h3>iptables&#8217; Involvement</h3><p>For direct pod-to-pod communication (not through a Service), iptables is not heavily involved. The packet passes through the FORWARD chain in the filter table, but unless you have <a href="https://kubernetes.io/docs/concepts/services-networking/network-policies/">NetworkPolicies</a> configured, the default is to ACCEPT.</p><p>Another fun iptables command for you:</p><pre><code><code>sudo iptables -L FORWARD -n -v | head -5
# Output:
# Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
#  pkts bytes target     prot opt in     out     source               destination
#  1.2M  890M KUBE-FORWARD  all  --  *      *       0.0.0.0/0            0.0.0.0/0
#  1.2M  890M CNI-FORWARD   all  --  *      *       0.0.0.0/0            0.0.0.0/0
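
# Why bridged same-node traffic can show up in these counters at all: frames switched
# by a Linux bridge only traverse iptables when the br_netfilter module is loaded and
# net.bridge.bridge-nf-call-iptables is 1 (Kubernetes setups commonly enable this).
# A sketch only; bridge_nf_status is a hypothetical helper name.
bridge_nf_status() {
  f=/proc/sys/net/bridge/bridge-nf-call-iptables
  if [ -r "$f" ]; then cat "$f"; else echo "not-loaded"; fi
}
bridge_nf_status   # prints 1 or 0, or not-loaded if br_netfilter is absent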
</code></code></pre><p>The CNI-FORWARD chain generally contains rules for NetworkPolicy enforcement if your CNI supports it (note that not all do!).</p><h2>Cross-Node Pod-to-Pod Traffic</h2><p>When pods are on different nodes, the packet has to travel on the physical network connecting the two nodes. This is where the capabilities of the CNI plugin matter. Let&#8217;s take a high-level look at both overlay (VXLAN) and routed (BGP) approaches.</p><h3>The Central Problem</h3><p>Node 1 has a packet destined for 10.244.1.5 (Pod C on Node 2). The physical network between nodes has no information about pod IP addresses. It only knows how to route traffic between node IP addresses (192.168.1.10 and 192.168.1.11).</p><p>We have two solutions:</p><ol><li><p><strong>Overlay</strong>: <em>Encapsulate</em> the pod-to-pod packet inside a node-to-node packet</p></li><li><p><strong>Routed</strong>: Configure the physical network to route the pod CIDRs themselves</p></li></ol><h3>Cross-Node Communication: Overlay (VXLAN)</h3><p>VXLAN (Virtual Extensible LAN) creates a Layer 2 overlay on top of a Layer 3 network. Eek, what&#8217;s that?<br><br>Well, pod traffic is encapsulated in UDP packets sent between nodes.</p><h4>Step-by-Step Packet Flow</h4><p><strong>Step 1: Application initiates connection (OSI Layer 7)</strong></p><p>Pod A&#8217;s application connects to 10.244.1.5:8080 (Pod C). This is a standard socket connection, as above.</p><p><strong>Step 2: Kernel builds the TCP/IP packet (OSI Layers 4 and 3)</strong></p><ul><li><p>Source IP: 10.244.0.5</p></li><li><p>Destination IP: 10.244.1.5</p></li><li><p>Source port: 45678 (ephemeral)</p></li><li><p>Destination port: 8080</p></li></ul><p><strong>Step 3: Routing decision is made in Pod A&#8217;s namespace (OSI Layer 3)</strong></p><pre><code><code># Inside Pod A
ip route

# Output:
# default via 10.244.0.1 dev eth0
# 10.244.0.0/24 dev eth0 proto kernel scope link src 10.244.0.5
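
# Sketch (not from the original post): the "on-link or via the gateway" decision,
# redone with plain POSIX shell arithmetic. ip_to_int and in_subnet are hypothetical
# helper names; the real lookup is done by the kernel, not by a script like this.
ip_to_int() {
  old_ifs=$IFS; IFS=.; set -- $1; IFS=$old_ifs
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}
in_subnet() {  # usage: in_subnet IP NETWORK PREFIXLEN
  host_bits=$(( 32 - $3 )); divisor=1
  while [ "$host_bits" -gt 0 ]; do
    divisor=$(( divisor * 2 )); host_bits=$(( host_bits - 1 ))
  done
  # Dropping the host bits leaves the network part; equal network parts mean a match
  [ $(( $(ip_to_int "$1") / divisor )) -eq $(( $(ip_to_int "$2") / divisor )) ]
}
in_subnet 10.244.0.6 10.244.0.0 24; echo $?   # 0: matches, deliver directly via eth0
in_subnet 10.244.1.5 10.244.0.0 24; echo $?   # 1: no match, fall back to the default route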
</code></code></pre><p>The destination 10.244.1.5 <em>does not match</em> 10.244.0.0/24, so the packet will go to the default gateway (10.244.0.1).</p><p><strong>Step 4: Packet reaches host namespace via veth</strong></p><p>The packet exits eth0 in Pod A, and emerges from veth-pod-a in the host namespace.</p><p><strong>Step 5: Host&#8217;s routing table lookup (OSI Layer 3)</strong></p><p>The host&#8217;s kernel takes a look at its routing table:</p><pre><code><code># On Node 1
ip route

# Output (Flannel VXLAN example):
# default via 192.168.1.1 dev eth0
# 10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
# 10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
# 10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
</code></code></pre><p>The destination 10.244.1.5 matches <code>10.244.1.0/24 via flannel.1</code> (if Flannel is the CNI you&#8217;re using; note that Flannel does not have NetworkPolicy capabilities). The <code>flannel.1</code> interface is a VXLAN Tunnel Endpoint (VTEP).</p><p><strong>Step 6: VXLAN encapsulation (OSI Layer 2 over Layer 3)</strong></p><p>The flannel.1 interface:</p><ol><li><p>Looks up which node owns 10.244.1.0/24 (Node 2 in our case: 192.168.1.11)</p></li><li><p><em>Encapsulates</em> the original Ethernet frame within a VXLAN header</p></li><li><p>Wraps that in a UDP packet (destination port 4789, the IANA-assigned VXLAN port; Flannel itself defaults to 8472 on Linux unless configured otherwise)</p></li><li><p>Finally wraps that in an IP packet (192.168.1.10 to 192.168.1.11)</p></li></ol><p>You can see the VXLAN FDB (forwarding database):</p><pre><code><code># On Node 1
bridge fdb show dev flannel.1

# Output:
# 5a:2b:3c:4d:5e:6f dst 192.168.1.11 self permanent
# 7a:8b:9c:0d:1e:2f dst 192.168.1.12 self permanent
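
# Sketch: look up the node IP behind a VTEP MAC, assuming the "MAC dst IP ..." layout
# shown above. vtep_node_ip is a hypothetical helper; it reads FDB lines on stdin.
vtep_node_ip() {
  awk -v mac="$1" '$1 == mac { print $3 }'
}
# Usage on a node: bridge fdb show dev flannel.1 | vtep_node_ip 5a:2b:3c:4d:5e:6f
printf '%s\n' "5a:2b:3c:4d:5e:6f dst 192.168.1.11 self permanent" | vtep_node_ip 5a:2b:3c:4d:5e:6f   # 192.168.1.11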
</code></code></pre><p>This maps the VTEP MAC addresses to node IPs.</p><p><strong>Step 7: Outer packet sent to Node 2 (Layers 3 and 2)</strong></p><p>The encapsulated packet is then routed normally over the network like any other packet going between nodes to 192.168.1.11:</p><ul><li><p>Source IP: 192.168.1.10 (Node 1)</p></li><li><p>Destination IP: 192.168.1.11 (Node 2)</p></li><li><p>Protocol: UDP</p></li><li><p>Destination port: 4789 (VXLAN)</p></li><li><p>Payload: VXLAN header + original Ethernet frame</p></li></ul><p><strong>Step 8: Node 2 receives and removes the encapsulation (OSI Layer 3)</strong></p><p>Node 2&#8217;s kernel:</p><ol><li><p>Receives UDP packet on port 4789</p></li><li><p>Recognizes it as VXLAN traffic for the flannel.1 interface</p></li><li><p>Nixes the Ethernet, IP, UDP, and VXLAN headers (tear that envelope open!)</p></li><li><p>Extracts the original Ethernet frame</p></li></ol><p><strong>Step 9: Host routing to the destination local pod (Layer 3)</strong></p><p>The extracted packet has destination 10.244.1.5. Node 2&#8217;s routing table:</p><pre><code><code># On Node 2
ip route

# Output:
# 10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
</code></code></pre><p>A local route so the packet goes to the cni0 bridge.</p><p><strong>Step 10: Bridge forwards to Pod C (Layer 2)</strong></p><p>The bridge looks up the MAC address for 10.244.1.5 and forwards the frame to veth-pod-c.</p><p><strong>Step 11: Pod C receives packet (Layers 3, 4, 7)</strong></p><p>The packet enters Pod C&#8217;s namespace through eth0. The kernel delivers it to the application on port 8080.</p><h4>VXLAN Packet Structure</h4><p>At step 7, the packet on the wire looks something like this. (Yeah a bit in the weeds. I had help drawing this.)</p><pre><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                    VXLAN ENCAPSULATED PACKET                               &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474; Outer         &#9474; Outer IP       &#9474; UDP      &#9474; VXLAN  &#9474; Inner    &#9474; Inner IP  &#9474;
&#9474; Ethernet      &#9474; Header         &#9474; Header   &#9474; Header &#9474; Ethernet &#9474; Packet    &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474; dst: router   &#9474; src: 192.168.  &#9474; src:     &#9474; VNI:   &#9474; dst: Pod &#9474; src:      &#9474;
&#9474; MAC           &#9474; 1.10           &#9474; random   &#9474; 1      &#9474; C MAC    &#9474; 10.244.   &#9474;
&#9474;               &#9474; dst: 192.168.  &#9474; dst:     &#9474;        &#9474; src: Pod &#9474; 0.5       &#9474;
&#9474; src: Node1    &#9474; 1.11           &#9474; 4789     &#9474;        &#9474; A MAC    &#9474; dst:      &#9474;
&#9474; MAC           &#9474;                &#9474;          &#9474;        &#9474;          &#9474; 10.244.   &#9474;
&#9474;               &#9474;                &#9474;          &#9474;        &#9474;          &#9474; 1.5       &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474;    14 bytes   &#9474;    20 bytes    &#9474;  8 bytes &#9474;8 bytes &#9474; 14 bytes &#9474; 20+ bytes &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;
                &#9474;                                    &#9474;
                &#9474;&#9668;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472; 50 bytes overhead &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9658;&#9474;

Total overhead: 50 bytes (outer IP + UDP + VXLAN + inner Ethernet; the outer Ethernet header is not counted against the MTU)
This is why pod MTU is typically 1450 when node MTU is 1500.</code></pre><h4>Viewing VXLAN Traffic</h4><pre><code><code># On Node 1, capture VXLAN-encapsulated traffic
sudo tcpdump -i eth0 -nn udp port 4789

# Output:
# 14:30:01.123 IP 192.168.1.10.52341 &gt; 192.168.1.11.4789: VXLAN, flags [I] (0x08), vni 1
# IP 10.244.0.5.45678 &gt; 10.244.1.5.8080: Flags [S], seq 123456789

# Capture on the VXLAN interface (sees non-encapsulated traffic)
sudo tcpdump -i flannel.1 -nn host 10.244.1.5

# Output:
# 14:30:01.123 IP 10.244.0.5.45678 &gt; 10.244.1.5.8080: Flags [S], seq 123456789
</code></code></pre><h3>Cross-Node Communication: Routed (BGP)</h3><p>In routed (BGP***) mode, there is no encapsulation. Pod IPs are <em>advertised</em> via BGP (or even static routes) so the physical network knows how to route them.</p><h4>Prerequisites</h4><p>Routed mode requires one of:</p><ul><li><p>BGP peering </p><ul><li><p>between nodes and network routers</p></li><li><p>between nodes themselves</p></li></ul></li><li><p>Static routes configured on network infrastructure</p></li><li><p>Cloud VPC route table entries</p></li></ul><p>***BGP = Border Gateway Protocol: standardized gateway protocol to exchange routing and reachability information among autonomous systems</p><h4>Step-by-Step Packet Flow</h4><p><strong>Steps 1-4: Same as overlay</strong></p><p>Pod A builds a packet for 10.244.1.5, it exits via veth to the host namespace.</p><p><strong>Step 5: Host routing table lookup (Layer 3)</strong></p><p>Notice that the host&#8217;s routing table in BGP mode looks a bit different:</p><pre><code><code># On Node 1 (Calico BGP mode)
ip route

# Output:
# default via 192.168.1.1 dev eth0
# 10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
# 10.244.1.0/24 via 192.168.1.11 dev eth0 proto bird
# 10.244.2.0/24 via 192.168.1.12 dev eth0 proto bird
</code></code></pre><p>The route for 10.244.1.0/24 points <em>directly to Node 2&#8217;s IP</em> (192.168.1.11) <em>via the physical interface (eth0 - on the host, not to be confused with a pod&#8217;s eth0)</em>. The <code>proto bird</code> indicates these routes were installed by the BIRD BGP daemon used by Calico CNI (for this example).</p><p><strong>Step 6: Packet sent to Node 2 (OSI Layer 3)</strong></p><p>The packet is sent directly with:</p><ul><li><p>Source IP: 10.244.0.5 (Pod A, unchanged)</p></li><li><p>Destination IP: 10.244.1.5 (Pod C, unchanged)</p></li></ul><p>At Layer 2:</p><ul><li><p>Source MAC: Node 1&#8217;s eth0</p></li><li><p>Destination MAC: Next hop (a router or Node 2 if on same Layer 2 network segment. In this case it&#8217;s the same segment.)</p></li></ul><p>Remember that there is no encapsulation. The pod IPs are visible on the physical network.</p><p><strong>Step 7: Physical network routing</strong></p><p>The physical network needs to know how to route 10.244.1.0/24 to Node 2. This happens via:</p><ul><li><p>BGP: Nodes advertise their pod CIDRs. Routers learn the routes.</p></li><li><p>Static routes: Network admin configures routes on routers (I&#8217;m not fond of static routing. It makes maintenance tougher.)</p></li><li><p>Cloud VPC: Cloud provider handles the routing.</p></li></ul><p><strong>Step 8: Node 2 receives packet (OSI Layer 3)</strong></p><p>Node 2 receives a packet with destination 10.244.1.5. Its routing table:</p><pre><code><code># On Node 2
ip route

# Output:
# 10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
</code></code></pre><p>The destination is local, so the packet is routed to the cni0 bridge.</p><p><strong>Steps 9-10: Same as Steps 10-11 of the overlay flow</strong></p><p>Bridge forwards to Pod C. Application receives the packet.</p><h4>Observing BGP Routes</h4><p>Again, in the weeds.</p><pre><code><code># View BGP peer status (Calico with BIRD)
sudo calicoctl node status

# Output:
# IPv4 BGP status
# +--------------+-------------------+-------+----------+-------------+
# | PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
# +--------------+-------------------+-------+----------+-------------+
# | 192.168.1.11 | node-to-node mesh | up    | 10:23:45 | Established |
# | 192.168.1.12 | node-to-node mesh | up    | 10:23:47 | Established |
# +--------------+-------------------+-------+----------+-------------+

# View routes learned via BGP
ip route show proto bird

# Output:
# 10.244.1.0/24 via 192.168.1.11 dev eth0
# 10.244.2.0/24 via 192.168.1.12 dev eth0

# Capture unencapsulated pod traffic on physical interface
sudo tcpdump -i eth0 -nn host 10.244.1.5

# Output:
# 14:35:01.234 IP 10.244.0.5.45678 &gt; 10.244.1.5.8080: Flags [S], seq 123456789
</code></code></pre><h3>IPinIP: A Lighter Overlay</h3><p>Some CNIs (notably Calico) support IPinIP as a lighter-weight alternative to VXLAN. IPinIP encapsulates the original IP packet directly in another IP packet, without the UDP and VXLAN headers.</p><pre><code><code>&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
&#9474;                    IPINIP PACKET                               &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9516;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474; Outer         &#9474; Outer IP       &#9474; Inner IP Packet               &#9474;
&#9474; Ethernet      &#9474; Header         &#9474; (original pod-to-pod packet)  &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474; dst: router   &#9474; src: 192.168.  &#9474; src: 10.244.0.5               &#9474;
&#9474; MAC           &#9474; 1.10           &#9474; dst: 10.244.1.5               &#9474;
&#9474;               &#9474; dst: 192.168.  &#9474; + TCP header + payload        &#9474;
&#9474;               &#9474; 1.11           &#9474;                               &#9474;
&#9474;               &#9474; proto: 4 (IPIP)&#9474;                               &#9474;
&#9500;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9508;
&#9474;    14 bytes   &#9474;    20 bytes    &#9474;         20+ bytes             &#9474;
&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9524;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;

Overhead: 20 bytes (just the outer IP header)
Pod MTU can be 1480 instead of 1450.
</code></code></pre><p>Calico can use IPinIP for cross-subnet traffic and direct routing for same-subnet traffic (&#8220;CrossSubnet&#8221; mode).</p><h2>Comparing Approaches</h2><p>Here are some nerdy numbers I looked up to help you compare overhead for different approaches. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vWyG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vWyG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 424w, https://substackcdn.com/image/fetch/$s_!vWyG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 848w, https://substackcdn.com/image/fetch/$s_!vWyG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 1272w, https://substackcdn.com/image/fetch/$s_!vWyG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vWyG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png" width="1276" height="908" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:908,&quot;width&quot;:1276,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152101,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://techblog.hughtipping.com/i/187330795?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vWyG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 424w, https://substackcdn.com/image/fetch/$s_!vWyG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 848w, https://substackcdn.com/image/fetch/$s_!vWyG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 1272w, https://substackcdn.com/image/fetch/$s_!vWyG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb95d80c-3b45-4130-971b-1b8f304feba6_1276x908.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Connection Tracking</h2><p>Regardless of the approach, the Linux connection tracking system (conntrack) maintains state for TCP and UDP flows. This is important for return traffic (and <em>stateful</em> firewalling).</p><pre><code><code># View connection tracking entries
sudo conntrack -L | grep 10.244.0.5

# Output:
# tcp      6 117 TIME_WAIT src=10.244.0.5 dst=10.244.1.5 sport=45678 dport=8080 
#          src=10.244.1.5 dst=10.244.0.5 sport=8080 dport=45678 [ASSURED] use=1
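
# Sketch: for direct pod-to-pod traffic the reply tuple should mirror the original one;
# a mismatch would suggest address translation somewhere on the path.
# tuples_mirror is a hypothetical helper; it reads one single-line conntrack entry on stdin.
tuples_mirror() {
  # Keep only the four src=/dst= fields: original src, original dst, reply src, reply dst
  set -- $(tr ' ' '\n' | grep -E '^(src|dst)=')
  [ $# -eq 4 ] || return 2
  [ "${1#src=}" = "${4#dst=}" ] || return 1
  [ "${2#dst=}" = "${3#src=}" ]
}
# Usage on a node: sudo conntrack -L | grep 10.244.0.5 | head -1 | tuples_mirror; echo $?
echo "tcp 6 117 TIME_WAIT src=10.244.0.5 dst=10.244.1.5 sport=45678 dport=8080 src=10.244.1.5 dst=10.244.0.5 sport=8080 dport=45678 [ASSURED] use=1" | tuples_mirror; echo $?   # 0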
</code></code></pre><p>The conntrack entry shows both directions of the connection:</p><ul><li><p>src=10.244.0.5 dst=<strong>10.244.1.5</strong> sport=45678 dport=<strong>8080</strong></p></li><li><p>src=<strong>10.244.1.5</strong> dst=10.244.0.5 sport=<strong>8080</strong> dport=45678</p></li></ul><p>which lets the kernel match return packets to the original flow.</p><h2>Troubleshooting Pod-to-Pod Communication</h2><p>Here is a non-exhaustive list of commands to help you troubleshoot communication problems between pods.</p><h3>Check routing</h3><pre><code><code># From inside a pod, check route to destination
ip route get 10.244.1.5

# Output:
# 10.244.1.5 via 10.244.0.1 dev eth0 src 10.244.0.5

# On the host, check route to remote pod CIDR
ip route get 10.244.1.5

# Output (VXLAN):
# 10.244.1.5 via 10.244.1.0 dev flannel.1 src 10.244.0.0
# Output (BGP):
# 10.244.1.5 via 192.168.1.11 dev eth0 src 192.168.1.10
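
# Sketch: classify the host-side result by its outgoing device. route_mode is a
# hypothetical helper, and the device names are assumptions (flannel.1 for Flannel
# VXLAN, vxlan.calico and tunl0 for Calico VXLAN and IPinIP). Adjust for your CNI.
route_mode() {
  case "$1" in
    *"dev flannel."*|*"dev vxlan.calico"*) echo "overlay (VXLAN)" ;;
    *"dev tunl0"*)                          echo "overlay (IPinIP)" ;;
    *" via "*)                              echo "routed (no encapsulation)" ;;
    *)                                      echo "on-link (local pod CIDR)" ;;
  esac
}
# Usage on a node: route_mode "$(ip route get 10.244.1.5)"
route_mode "10.244.1.5 via 10.244.1.0 dev flannel.1 src 10.244.0.0"   # overlay (VXLAN)
route_mode "10.244.1.5 via 192.168.1.11 dev eth0 src 192.168.1.10"    # routed (no encapsulation)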
</code></code></pre><h3>Check bridge connectivity</h3><pre><code><code># Verify bridge exists and has interfaces
bridge link show

# Output:
# 8: veth12345@if2: &lt;BROADCAST,MULTICAST,UP&gt; mtu 1450 master cni0 state forwarding
# 9: veth67890@if2: &lt;BROADCAST,MULTICAST,UP&gt; mtu 1450 master cni0 state forwarding

# Check bridge MAC table
bridge fdb show br cni0 | head
</code></code></pre><h3>Check VXLAN state (overlay networks)</h3><pre><code><code># Verify VXLAN interface exists
ip -d link show flannel.1

# Output:
# 4: flannel.1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 qdisc noqueue state UNKNOWN
#     link/ether 5a:2b:3c:4d:5e:6f brd ff:ff:ff:ff:ff:ff promiscuity 0
#     vxlan id 1 local 192.168.1.10 dev eth0 srcport 0 0 dstport 4789 nolearning
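
# Sketch: sanity-check the mtu value above against the node MTU.
# pod_mtu is a hypothetical helper; the overheads are the header sizes from the packet diagrams.
pod_mtu() {  # usage: pod_mtu NODE_MTU vxlan|ipip|none
  case "$2" in
    vxlan) overhead=$(( 20 + 8 + 8 + 14 )) ;;  # outer IP + UDP + VXLAN + inner Ethernet
    ipip)  overhead=20 ;;                      # outer IP only
    none)  overhead=0 ;;                       # routed (BGP), no encapsulation
    *)     echo "unknown mode: $2"; return 1 ;;
  esac
  echo $(( $1 - overhead ))
}
pod_mtu 1500 vxlan   # 1450
pod_mtu 1500 ipip    # 1480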

# Check VXLAN FDB entries
bridge fdb show dev flannel.1
</code></code></pre><h3>Capture traffic at each hop</h3><pre><code><code># Inside source pod
tcpdump -i eth0 -nn host 10.244.1.5

# On source node bridge
sudo tcpdump -i cni0 -nn host 10.244.0.5

# On source node physical interface (shows encapsulated or raw traffic)
sudo tcpdump -i eth0 -nn host 192.168.1.11  # or host 10.244.1.5 for BGP

# On source node VXLAN interface (shows inner traffic)
sudo tcpdump -i flannel.1 -nn host 10.244.1.5
</code></code></pre><h2>Summary</h2><p>Pod-to-pod communication takes different paths depending on pod locality:</p><p><strong>Same-node</strong>: Packets traverse veth pairs and a Linux bridge. This is Layer 2 switching within the host. iptables plays little part: packets pass through the FORWARD chain, which accepts them unless NetworkPolicies are in place.</p><p><strong>Cross-node with overlay (VXLAN/IPinIP)</strong>: Packets are encapsulated with an outer header containing node IPs. The physical network only sees traffic between nodes. The inner pod IPs are hidden.</p><p><strong>Cross-node with BGP routing</strong>: Packets are sent directly with pod IPs intact. The physical network must have routes for pod CIDRs. No encapsulation overhead.</p><p>Part 3 will trace north-south traffic: a packet from an external user through a LoadBalancer service, into the cluster, and back. This is where iptables and kube-proxy become central to the packet&#8217;s journey.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://techblog.hughtipping.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Hugh&#8217;s Tech Blog! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Kubernetes Networking Deep Dive: Part 1]]></title><description><![CDATA[Foundations]]></description><link>https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part</link><guid isPermaLink="false">https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part</guid><dc:creator><![CDATA[Hugh Tipping]]></dc:creator><pubDate>Sat, 07 Feb 2026 05:50:11 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4000" height="2250" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2250,&quot;width&quot;:4000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;an aerial view of a building under construction&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="an aerial view of a building under construction" title="an aerial view of a building under construction" 
srcset="https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1685233503586-0cb11f46e99f?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMXx8Zm91bmRhdGlvbnN8ZW58MHx8fHwxNzcwNTgyMTU0fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" 
viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As promised in the <a href="https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive">Introduction</a>, here is the first post in a series following the life of a packet through a Kubernetes cluster. Before I start, let&#8217;s establish the foundational concepts: how Kubernetes allocates IP addresses to resources, how pods get their network interfaces, and which components in a cluster are responsible for routing traffic.</p><h2><strong>The Kubernetes Networking Model</strong></h2><p>Kubernetes has three fundamental requirements for any networking implementation:</p>
<ol><li><p>Pods can communicate with all other pods on any node without NAT</p></li><li><p>Nodes can communicate with all pods without NAT</p></li><li><p>The IP that a pod sees for itself is the same IP that other pods see for it</p></li></ol><p>These requirements are defined in the Kubernetes documentation, and any CNI plugin you install in your cluster must satisfy them. This also means every pod gets an IP address that is routable within the cluster.</p><p>This is different from, say, default Docker networking, where containers on different hosts cannot communicate without port mapping or overlay configuration (which I always found a bit annoying, and is part of why I like working in Kubernetes). Kubernetes abstracts this away: from an app perspective, all pods are directly reachable by IP, though Services are the preferred way to connect.</p><p><strong>Reference</strong>: Kubernetes Networking Model documentation at: <a href="https://kubernetes.io/docs/concepts/services-networking/">https://kubernetes.io/docs/concepts/services-networking/</a></p><h2><strong>IP Address Allocation</strong></h2><p>A Kubernetes cluster uses two separate IP ranges, both defined when the cluster is first created.</p><h3><strong>Pod CIDR</strong></h3><p>The &#8220;Pod CIDR&#8221; is the IP range from which all pod IP addresses are allocated.</p><p>For example: the CIDR block 10.244.0.0/16 provides 65,536 addresses. The cluster divides this range among its nodes.
Each node gets a chunk, say a /24 subnet (providing 256 addresses for the node), and from that chunk Kubernetes assigns IPs to the pods that get scheduled on that node.</p><p>For example, in a 3-node cluster:</p><ul><li><p>Cluster pod CIDR: 10.244.0.0/16</p></li><li><p>Node 1 range: 10.244.0.0/24 (pods get 10.244.0.2, 10.244.0.3, etc.)</p></li><li><p>Node 2 range: 10.244.1.0/24 (pods get 10.244.1.2, 10.244.1.3, etc.)</p></li><li><p>Node 3 range: 10.244.2.0/24 (pods get 10.244.2.2, 10.244.2.3, etc.)</p></li></ul><p>The node&#8217;s kubelet, along with the CNI plugin, handles the IP assignment when a pod starts up.</p><p><strong>NOTE:</strong></p><p>Remember from Kubernetes basics that Kubernetes itself <em>does not provide any networking functionality</em> per se. It relies on the Container Network Interface (CNI), which lets you install whatever networking plugin you wish, and the kubelet doesn&#8217;t have to care which. I&#8217;ll talk more about that later.</p><h3><strong>Service CIDR</strong></h3><p>The Service CIDR is a different range used for ClusterIP services. For example, let&#8217;s use 10.96.0.0/12. Unlike pod IPs, service IPs are actually virtual. They DO NOT get assigned to any pod&#8217;s network interface. They exist only as entries in <a href="https://www.netfilter.org/projects/iptables/index.html">iptables</a> (or <a href="https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/">IPVS rules</a>) that redirect network traffic to pod endpoints. Neither is exactly fun to manage, which is part of the beauty of Kubernetes: it takes care of all that for you.</p><p>When a Service gets created, Kubernetes grabs an IP from this range and updates the routing rules on EVERY node to handle traffic destined for that specific Service IP.</p><h3><strong>Showing the Cluster CIDR Configuration</strong></h3><p>You can check out the CIDR ranges for your cluster with kubectl commands like the ones below:</p><pre><code># View the pod CIDR, for example
kubectl cluster-info dump | grep -m 1 cluster-cidr
# The output would contain something like: cluster-cidr=10.244.0.0/16

# View the service CIDR, for example
kubectl cluster-info dump | grep -m 1 service-cluster-ip-range
# The output would contain something like: service-cluster-ip-range=10.96.0.0/12
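
# Sanity-check the /16 and /24 arithmetic from the Pod CIDR example (a sketch;
# subnets_in is a hypothetical helper, not a kubectl feature, and needs no cluster)
subnets_in() { awk -v old="$1" -v new="$2" 'BEGIN { print 2 ^ (new - old) }'; }
subnets_in 16 24   # how many /24 node ranges fit in the /16 pod CIDR
subnets_in 24 32   # how many addresses each /24 node range holds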

# View the CIDR allocated to a specific node
kubectl get node node-1 -o jsonpath='{.spec.podCIDR}'
# Output would be something like: 10.244.0.0/24</code></pre><p><strong>Reference</strong>: Cluster Networking documentation at: <a href="https://kubernetes.io/docs/concepts/cluster-administration/networking/">https://kubernetes.io/docs/concepts/cluster-administration/networking/</a></p><h2><strong>Network Namespaces and the Pod Sandbox</strong></h2><p>As I was researching this topic, I dug into some interesting things about how Kubernetes actually grabs IP addresses from the Pod CIDR and handles pod creation and isolation.</p><p>Every pod has its own <em><a href="https://en.wikipedia.org/wiki/Linux_namespaces">Linux network namespace</a></em>. A network namespace (<em>not</em> to be confused with a Kubernetes Namespace) provides isolated networking components: its own interfaces, routing tables, iptables rules, etc. Processes running in one network namespace can neither see nor interact with network resources in another namespace unless the two are <em>explicitly</em> connected. This helps give pods their own isolated environments for their containers.</p><h3><strong>The Pause Container</strong></h3><p>Whenever Kubernetes creates a new pod, the container runtime (e.g. containerd, cri-o) first creates what is called a &#8220;pause&#8221; container, also known as a &#8220;sandbox&#8221; container. This pause container doesn&#8217;t actually do anything. It just waits forever, but it holds the network namespace for the pod&#8217;s upcoming workload containers.</p><p>Then these workload containers (e.g. nginx, your app, a logging sidecar) join this existing network namespace instead of creating their own. This shared namespace is how containers <em>within the same pod</em> share the <em>same</em> IP address and can communicate with each other over localhost. This eases the <a href="https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/">sidecar paradigm</a>.
Kinda neat.</p><p>You can check out pause containers on a node (though you&#8217;ll seldom have to do so, it&#8217;s still interesting to see how things work):</p><pre><code># On a node that has containerd running, you use the ctr command
sudo ctr -n k8s.io containers list | grep pause
# Output example:
# a1b2c3d4e5f6    registry.k8s.io/pause:3.9    io.containerd.runc.v2

# Or with crictl, which works with any CRI runtime (containerd, cri-o)
sudo crictl ps -a | grep pause
# Output example:
# 7f8e9d0c1b2a   3 hours ago   Running   pause   0   abc123def456   nginx-pod</code></pre><p>Here&#8217;s more info on <a href="https://kubernetes.io/docs/setup/production-environment/container-runtimes/">container runtimes</a>.</p><h3><strong>Examining Network Namespaces</strong></h3><p>With root access on a node, you can inspect a pod&#8217;s networking namespace. Note that in many cases you may not have direct access to a node, especially in a production environment, so you may be able to do it via a pod with the right <a href="https://kubernetes.io/docs/tasks/configure-pod-container/security-context/">security context</a>.</p><pre><code># List network namespaces (requires root on the node or a pod must have CAP_SYS_ADMIN)
sudo lsns -t net

# Output example:
#         NS TYPE NPROCS   PID USER    NETNSID NSFS                           COMMAND
# 4026531840 net     145     1 root unassigned                               /sbin/init
# 4026532509 net       2  1842 65535          0 /run/netns/cni-a1b2c3d4-e5f6   /pause
# 4026532592 net       3  2156 65535          1 /run/netns/cni-f7g8h9i0-j1k2   /pause

# You can also &#8220;enter&#8221; a pod&#8217;s network namespace and inspect it
POD_PID=$(sudo crictl inspect &lt;container-id&gt; | jq .info.pid)
sudo nsenter -t $POD_PID -n ip addr

# Output example:
# 1: lo: &lt;LOOPBACK,UP,LOWER_UP&gt; mtu 65536 qdisc noqueue state UNKNOWN
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
# 3: eth0@if12: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 qdisc noqueue state UP
#     link/ether 62:a1:b2:c3:d4:e5 brd ff:ff:ff:ff:ff:ff link-netnsid 0
#     inet 10.244.0.5/24 brd 10.244.0.255 scope global eth0
#        valid_lft forever preferred_lft forever</code></pre><p>That funny-looking eth0@if12 name indicates that this is one end of a veth pair (next section), with the other end being interface index 12 on the node itself.</p><p><strong>Reference</strong>: Linux network namespaces documentation in man 7 network_namespaces and Kubernetes Pod documentation at: <a href="https://kubernetes.io/docs/concepts/workloads/pods/">https://kubernetes.io/docs/concepts/workloads/pods/</a></p><h2><strong>Virtual Ethernet Pairs</strong></h2><p>To understand a veth (virtual ethernet) pair, think of a networking cable (virtual, of course) connecting two different <em>network namespaces</em>. This functionality is part of Linux. Packets sent into one end of the pair come out of the other end. Standard Kubernetes networking uses veth pairs to join pod namespaces to the host namespace, which is how packets travel into and out of a pod. Without a veth pair (or something similar, such as macvlan), the pod's network namespace would be isolated, unable to communicate with anything else.</p><p>When a pod is created:</p><ol><li><p>The CNI plugin creates the veth pair for the pod.</p></li><li><p>One end of the pair is placed in the pod namespace (usually called something like eth0)</p></li><li><p>The other end stays in the host namespace (named vethXXXXXX or something like that)</p></li><li><p>The host end is attached to a bridge or configured with routes. More on that further below.</p></li></ol><h3><strong>Taking a look at the veth Pairs</strong></h3><pre><code># On the node, list veth interfaces
ip link show type veth

# Output something like:
# 12: veth9f8e7d6c@if3: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 qdisc noqueue master cni0 state UP
#     link/ether 8a:1b:2c:3d:4e:5f brd ff:ff:ff:ff:ff:ff link-netns cni-a1b2c3d4-e5f6

# The &#8220;master cni0&#8221; means this interface is attached to a bridge named cni0
# The &#8220;link-netns&#8221; shows which network namespace the other end is located in
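
# To find the host-side peer of a pod interface such as eth0@if12, look up
# that index on the node (a sketch; link_name is a hypothetical helper and the
# index 12 comes from the example output earlier, so yours will differ)
link_name() { awk -F': ' -v idx="$1" '$1 == idx { print $2 }'; }
ip -o link show | link_name 12
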
# View the bridge and its attached interfaces
bridge link show

# Output:
# 12: veth9f8e7d6c@if3: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 master cni0 state forwarding</code></pre><p><strong>Reference</strong>: <a href="https://man7.org/linux/man-pages/man4/veth.4.html">veth documentation</a> in man 4 veth</p><h2><strong>Container Network Interface (CNI)</strong></h2><p>CNI is a specification (a CNCF project that Kubernetes adopts rather than owns) and a set of libraries for configuring network interfaces in Linux containers. As mentioned above, Kubernetes does not implement networking <em>directly</em>. Instead, it delegates the functionality to CNI plugins that do the actual work; this &#8220;loose coupling&#8221; provides the flexibility to use different types of networking, including cloud networking.</p><h3><strong>What does the CNI actually </strong><em><strong>do</strong></em><strong>?</strong></h3><p>When a kubelet needs to set up the networking for a new pod:</p><ol><li><p>The <strong>kubelet</strong> calls whichever container runtime is installed (containerd, cri-o)</p></li><li><p>The <strong>runtime</strong> creates the pod&#8217;s pause container with a new network namespace</p></li><li><p>The <strong>runtime</strong> invokes the CNI plugin specified in the node&#8217;s CNI configuration</p></li><li><p>The <strong>CNI plugin</strong> then configures the pod&#8217;s network namespace: it creates interfaces, assigns IPs, and sets up routes in the pod&#8217;s routing table</p></li><li><p>The <strong>CNI plugin</strong> then returns the configuration to the runtime, which reports it back to the kubelet</p></li></ol><p>The three CNI operations you will see most often are:</p><ul><li><p>ADD (configure networking for a new container)</p></li><li><p>DEL (clean up networking when a container stops)</p></li><li><p>CHECK (verify the configuration is correct)</p></li></ul><p>Are you still with me? Good... let&#8217;s keep going! &#128517;</p><h3><strong>CNI Configuration</strong></h3><p>CNI configuration lives in /etc/cni/net.d/ on each node.
For example, using a <a href="https://github.com/flannel-io/flannel">flannel</a> CNI plugin the below is a standard config. I won&#8217;t go into what the configuration options are.</p><pre><code>cat /etc/cni/net.d/10-flannel.conflist

# Output:
# {
#   &quot;name&quot;: &quot;cbr0&quot;,
#   &quot;cniVersion&quot;: &quot;0.3.1&quot;,
#   &quot;plugins&quot;: [
#     {
#       &quot;type&quot;: &quot;flannel&quot;,
#       &quot;delegate&quot;: {
#         &quot;hairpinMode&quot;: true,
#         &quot;isDefaultGateway&quot;: true
#       }
#     },
#     {
#       &quot;type&quot;: &quot;portmap&quot;,
#       &quot;capabilities&quot;: {
#         &quot;portMappings&quot;: true
#       }
#     }
#   ]
# }</code></pre><h3><strong>Overlay vs Routed Networking</strong></h3><p>Here&#8217;s where it gets even more interesting...</p><p>CNI plugins fall into two large categories depending on how they handle cross-node traffic:</p><p><strong>Overlay Networks</strong></p><ul><li><p>Encapsulate pod traffic in an outer packet with node IPs</p></li><li><p>Function on <em>any</em> kind of network infrastructure</p></li><li><p>Add overhead: extra headers reduce the <em>effective</em> MTU* , plus encap/decap adds CPU overhead</p></li><li><p>Examples: Flannel (VXLAN mode), Calico (VXLAN or IPinIP mode), Cilium (VXLAN mode)</p></li></ul><p>*MTU = Maximum Transmission Unit, the largest packet size a network link can carry, usually 1500 bytes for Ethernet. With the overhead of encapsulating with more headers, you can transmit less data.</p><p><strong>Routed Networks</strong> (BGP, host routing)</p><ul><li><p>Pod IPs are routed directly on the <em>physical network</em> connecting nodes</p></li><li><p>Require network infrastructure configuration: BGP peering or static routes</p></li><li><p>No encapsulation overhead, full MTU available</p></li><li><p>Pod IPs visible in network flow logs and to firewalls</p></li><li><p>Examples: Calico (BGP mode), Cilium (native routing)</p></li></ul><p><strong>Oof that&#8217;s a lot!</strong> I&#8217;ll go into both in detail in Part 2 when tracing cross-node pod-to-pod communication.</p><p><strong>Reference</strong>: CNI specification at <a href="https://www.cni.dev/docs/spec/">https://www.cni.dev/docs/spec/</a> and Kubernetes Network Plugins documentation at: <a href="https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/">https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/</a></p><h2><strong>kube-proxy and Service Routing</strong></h2><p>The kube-proxy is a Kubernetes component that runs on <em>every node in a cluster</em> and maintains all the network rules defined to 
route traffic to Services. Despite its name, in the modes covered here it does <em>not</em> proxy any traffic itself. Instead, it configures the kernel&#8217;s packet filtering and NAT facilities.</p><h3><strong>What kube-proxy actually does</strong></h3><p>kube-proxy watches the Kubernetes API for <a href="https://kubernetes.io/docs/concepts/services-networking/service/">Service</a> and <a href="https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/">EndpointSlice</a> objects (the latter you generally won&#8217;t have to deal with directly). Whenever these change, it updates the node&#8217;s packet processing (e.g. iptables) rules to:</p><ol><li><p>Intercept traffic whose destination is a Service&#8217;s ClusterIP</p></li><li><p>Select a backend pod (load balancing)</p></li><li><p>Redirect the traffic to that pod&#8217;s IP (DNAT: rewriting the packet&#8217;s destination from the Service IP and port to the pod&#8217;s IP and port)</p></li><li><p>Handle return traffic back to the source</p></li></ol><h3><strong>What kube-proxy Does </strong><em><strong>Not</strong></em><strong> Do</strong></h3><p>kube-proxy is not involved in pod-to-pod communication that does not go through a Service. When Pod A communicates directly with Pod B&#8217;s IP address, the traffic is handled by the CNI networking layer. kube-proxy&#8217;s rules only come into play when the destination is a Service IP or <a href="https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport">NodePort</a>.</p><h3><strong>kube-proxy Modes: iptables vs IPVS</strong></h3><p>kube-proxy can operate in two main modes (a third, <a href="https://kubernetes.io/blog/2025/02/28/nftables-kube-proxy/">nftables</a>, is newer and I won&#8217;t be covering it).
The mode affects how Service routing rules are implemented.</p><h4><strong>iptables Mode</strong></h4><p>In iptables mode, kube-proxy creates iptables rules (mostly NAT rules) for each Service and endpoint.</p><p>Advantages:</p><ul><li><p>iptables has been around for a while so it&#8217;s mature and well-understood (as well as one can understand that dark art!)</p></li><li><p>No additional kernel modules are required</p></li><li><p>Works on any Linux distribution (well, I&#8217;d say <em>most</em>)</p></li></ul><p>Disadvantages:</p><ul><li><p>Rule count grows with every Service and endpoint you add</p></li><li><p>Rule updates require rewriting entire chains, which adds latency for a large set of rules</p></li><li><p>Sequential rule evaluation can add latency (if you have a large list and the matching rule is towards the end)</p></li></ul><p>Performance characteristics:</p><ul><li><p>Works well up to approximately 1,000 Services, after which things start to slow down because of the sequential processing of rules</p></li><li><p>Rule-update latency increases beyond 5,000 Services, and updates are <em>not incremental</em> (i.e. a single change can mean rewriting the whole rule set)</p></li><li><p>Memory usage for rules can become significant as you scale up</p></li></ul><p>Viewing iptables rules (if you like to torture yourself):</p><pre><code># List Service-related NAT rules
sudo iptables -t nat -L KUBE-SERVICES -n | head -20

# Output example:
# Chain KUBE-SERVICES (2 references)
# target     prot opt source               destination
# KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  0.0.0.0/0    10.96.0.1     /* default/kubernetes:https cluster IP */ tcp dpt:443
# KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  0.0.0.0/0    10.96.0.10    /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
# KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  0.0.0.0/0    10.96.0.10    /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
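
# Rough count of Service ports with iptables rules: each one normally gets its
# own KUBE-SVC-* chain, declared on a line starting with ":" in iptables-save output
# (a sketch; count_svc_chains is a hypothetical helper, not part of kube-proxy)
count_svc_chains() { awk '/^:KUBE-SVC-/ { n++ } END { print n + 0 }'; }
sudo iptables-save -t nat | count_svc_chains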

# Rough size of the whole NAT table (counts every listed line, including chain headers, not just Service rules)
sudo iptables -t nat -L -n | wc -l
# Output example: 847</code></pre><h4><strong>IPVS Mode</strong></h4><p>In IPVS mode, kube-proxy uses the kernel&#8217;s IPVS (IP Virtual Server) subsystem. IPVS is good at load balancing and uses <em><a href="https://en.wikipedia.org/wiki/Hash_table">hash tables</a></em> for instant lookup regardless of the number of services so that speeds things up.</p><p>Advantages:</p><ul><li><p>Rule matching via hash tables (fast!)</p></li><li><p>Supports multiple load balancing algorithms</p></li><li><p>Lower latency rule updates</p></li><li><p>Better performance at scale (10,000+ Services or so)</p></li></ul><p>Disadvantages:</p><ul><li><p>Requires IPVS kernel modules</p></li><li><p>More complex debugging (<strong>ipvsadm</strong> command)</p></li><li><p>Still uses iptables for some functions (masquerading (SNAT), NodePort handling)</p></li></ul><p>Viewing IPVS rules (here&#8217;s some more fun!):</p><pre><code># Check if IPVS mode is active
sudo ipvsadm -Ln | head -10

# Output:
# IP Virtual Server version 1.2.1 (size=4096)
# Prot LocalAddress:Port Scheduler Flags
#   -&gt; RemoteAddress:Port           Forward Weight ActiveConn InActConn
# TCP  10.96.0.1:443 rr
#   -&gt; 192.168.1.10:6443            Masq    1      3          0
# TCP  10.96.0.10:53 rr
#   -&gt; 10.244.0.2:53                Masq    1      0          0
#   -&gt; 10.244.1.3:53                Masq    1      0          0
# UDP  10.96.0.10:53 rr
#   -&gt; 10.244.0.2:53                Masq    1      0          12
#   -&gt; 10.244.1.3:53                Masq    1      0          8

# View IPVS connection tracking
sudo ipvsadm -Lnc | head -10

# Output:
# IPVS connection entries
# pro expire state       source             virtual            destination
# TCP 14:56  ESTABLISHED 10.244.0.5:48892   10.96.0.1:443      192.168.1.10:6443</code></pre><h4><strong>Choosing Between Modes (The right mode for the right job)</strong></h4><p>Use iptables mode when things are smaller:</p><ul><li><p>Running smaller clusters (under 1,000 Services)</p></li><li><p>You want to keep it simple</p></li><li><p>You don&#8217;t/can&#8217;t have IPVS kernel modules installed</p></li></ul><p>Use IPVS mode when things scale up:</p><ul><li><p>Running larger clusters (1,000+ Services)</p></li><li><p>You need specific load balancing algorithms</p></li><li><p>Service creation/update latency reduction is important</p></li></ul><p>To configure the mode, set --proxy-mode in kube-proxy&#8217;s configuration:</p><pre><code># Check current mode
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode

# Output: mode: &#8220;ipvs&#8221; or &#8220;iptables&#8221; or empty (defaults to iptables)</code></pre><p><strong>Reference</strong>: kube-proxy documentation at: <a href="https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/">https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/</a>  and IPVS-based proxying at <a href="https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs">https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-ipvs</a></p><h2><strong>OSI Model Context</strong></h2><p>As we trace packets through the cluster in later posts, we will reference <a href="https://www.cloudflare.com/learning/ddos/glossary/open-systems-interconnection-model-osi/">OSI layers</a> to clarify where processing occurs:</p><ul><li><p><strong>Layer 7 (Application)</strong>: HTTP, gRPC, DNS queries. This is where your application code operates.</p></li><li><p><strong>Layer 4 (Transport)</strong>: TCP/UDP. Ports, connections, and load balancing decisions happen here.</p></li><li><p><strong>Layer 3 (Network)</strong>: IP addresses and routing. CNI plugins, iptables, NAT, and IPVS operate at this layer.</p></li><li><p><strong>Layer 2 (Data Link)</strong>: MAC addresses and switching. 
Bridges like cni0 and VXLAN encapsulation operate here.</p></li></ul><p>Kubernetes networking operates mostly at Layers 3 and 4, with Layer 2 involvement for local bridging and overlay encapsulation.</p><h2><strong>Summary</strong></h2><p>This post covered the foundational concepts required to understand Kubernetes networking:</p><ul><li><p>The flat network model guarantees pod-to-pod communication without NAT</p></li><li><p>Pod CIDR provides addresses for pods; Service CIDR provides virtual IPs for Services</p></li><li><p>Each pod runs in an isolated network namespace</p></li><li><p>Veth pairs connect pod namespaces to the host network</p></li><li><p>CNI plugins handle the actual network interface configuration</p></li><li><p>kube-proxy handles Service routing rules using iptables or IPVS</p></li><li><p>Overlay and routed networking are the two main approaches for cross-node traffic</p></li></ul><p>Part 2 will trace packets through pod-to-pod communication, covering both same-node traffic (through the bridge) and cross-node traffic (using VXLAN overlay and BGP routing).</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://techblog.hughtipping.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Hugh&#8217;s Tech Blog! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Kubernetes Networking Deep Dive: Intro]]></title><description><![CDATA[A Four-Part Series on the Life of a Packet within Kubernetes]]></description><link>https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive</link><guid isPermaLink="false">https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive</guid><dc:creator><![CDATA[Hugh Tipping]]></dc:creator><pubDate>Sat, 31 Jan 2026 11:36:44 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, 
https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4507" height="3004" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3004,&quot;width&quot;:4507,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;a bunch of blue wires connected to each other&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a bunch of blue wires connected to each other" title="a bunch of blue wires connected to each other" 
srcset="https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1683322499436-f4383dd59f5a?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxN3x8bmV0d29ya3xlbnwwfHx8fDE3NzA1ODIxMTB8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" 
fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@scottrodgerson">Scott Rodgerson</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>I was having a fairly random tech conversation with a friend of mine when, out of curiosity, he asked me some questions about Kubernetes networking, knowing I had spent time in that universe. As I explained the high-level basics, he dug deeper (as I am happy he always does) with detailed questions about the movement of a packet into, out of, and through a cluster. I soon realized I had some gaps in my knowledge.</p><p>I went back home and started looking up answers. I found my efforts snowballing into a lot of information. Then it hit me: why not start off my first Substack posts with a <em>series</em> on Kubernetes networking, so that others could benefit from this work?</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://techblog.hughtipping.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Hugh&#8217;s Tech Blog! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>I decided to do this as a &#8220;Life of a Packet&#8221; tour through a Kubernetes cluster. In this series of posts, I will trace the path of a packet from an external user through a load balancer, into the cluster, to a container in a pod, and back out again. I will then cover pod-to-pod communication within the cluster and touch upon encryption.</p><p>My goal is to illustrate what happens not only within Kubernetes constructs, but also at the network level, below the abstractions of Kubernetes. I&#8217;ll take a look at iptables rules, network namespaces, veth pairs, and CNI plugins. I will map a packet&#8217;s journey across the OSI layers: Layer 2 switching, Layer 3 routing, and Layer 4 load balancing.</p><p>I purposefully kept this cloud-service agnostic, and I do not cover service meshes, since they warrant posts of their own. 
This will be standard, simple Kubernetes networking infrastructure, providing foundational knowledge to anyone growing their Kubernetes skills.</p><p><em>Here&#8217;s a breakdown of the series:</em></p><h3>Part 1: Foundations</h3><p>This is where I will go over core concepts:</p><ul><li><p>The Kubernetes networking model</p></li><li><p>Pod and service CIDR allocation</p></li><li><p>Linux network namespaces</p></li><li><p>veth pairs</p></li><li><p>The Container Network Interface (CNI)</p></li></ul><p>I will also cover kube-proxy&#8217;s role and compare iptables vs. IPVS at a high level.</p><h3>Part 2: Pod-to-Pod Communication (East-West Traffic)</h3><p>I&#8217;ll then trace a packet from one pod to another, otherwise known as &#8220;East-West&#8221; traffic.</p><p>This part covers two scenarios:</p><ul><li><p>same-node communication (through a bridge)</p></li><li><p>cross-node communication</p></li></ul><p>For cross-node traffic, I will explain overlay networking (VXLAN encapsulation) and routed networking (BGP). Don&#8217;t get too scared. I&#8217;ll have diagrams to show the packet structure at each hop, including what gets encapsulated and where. Of course, if it&#8217;s too in-the-weeds, you can always skip this bit. Weeds can be fun, though.</p><h3>Part 3: External Traffic to Pods (North-South Traffic)</h3><p>Once I&#8217;m finished going over traffic within a cluster, I&#8217;ll pull back a bit and discuss traffic into and out of a cluster, called &#8220;North-South&#8221; traffic. The packet will begin its journey at an external user, go through a Load Balancer service, then land on a pod.</p><p>I will dig into iptables: PREROUTING, the KUBE-SERVICES chain, service endpoint selection, DNAT, and SNAT. (Wow. Say that 10 times fast.)</p><p>I&#8217;ll also show the return path and how connection tracking enables it. 
And I&#8217;ll touch upon externalTrafficPolicy and its tradeoffs for client IP preservation.</p><h3>Part 4: Encryption In-Flight</h3><p>Finally, I get into encryption. Oh, yeah, this is always fun: &#8220;TLS and Certificates and Keys... oh my&#8221;.</p><p>I will cover:</p><ul><li><p>TLS termination at the load balancer</p></li><li><p>TLS passthrough to ingress controllers</p></li><li><p>backend TLS to pods</p></li><li><p>CNI-level encryption for pod-to-pod traffic</p></li></ul><p>I will also touch upon the tradeoffs among the different termination points and briefly cover network policies as a complementary security layer.</p><h2><strong>What You Will Need</strong></h2><p>If you haven&#8217;t been scared off already, note that for these posts I will assume some familiarity with basic networking concepts (TCP/IP) and core Kubernetes concepts (pods, services, nodes, the kubelet). It couldn&#8217;t hurt to review the OSI networking model, but deep expertise is NOT required. I will include some command examples you can try out for yourself.</p><p>Next post: <strong><a href="https://techblog.hughtipping.com/p/kubernetes-networking-deep-dive-part">Kubernetes Networking Deep Dive, Part 1: Foundations</a></strong></p>]]></content:encoded></item></channel></rss>