Troubleshooting NGINX Ingress Controller Issues in Kubernetes

Anju · Oct 1, 2024

Problem Statement:

In this post, we will walk through troubleshooting and resolving an issue with the NGINX ingress controller in a Kubernetes cluster. The symptoms included the inability to access a service through the ingress route, despite the service and pods being in a seemingly healthy state. We will go step by step, from initial investigation to final resolution.

In this case, the browser returned an ERR_EMPTY_RESPONSE error when opening the site. This error means the server (or something in front of it) closed the connection without sending any data back; common causes include connectivity problems, a misbehaving proxy or load balancer, or an unhealthy backend. We can start by confirming DNS resolution and reproducing the failure with curl.

Verify DNS Resolution


nslookup abc.example.com
Server: fe80::1%15
Address: fe80::1%15#53

Non-authoritative answer:
Name: abc.example.com
Address: 3.126.45.22
Name: abc.example.com
Address: 35.156.157.122
Name: abc.example.com
Address: 3.121.102.222


curl -v http://abc.example.com/v1/sys/health
* Host abc.example.com:80 was resolved.
* IPv6: (none)
* IPv4: 3.126.45.22, 35.156.157.122, 3.121.102.222
* Trying 3.126.45.22:80...
* Connected to abc.example.com (3.126.45.22) port 80
> GET /v1/sys/health HTTP/1.1
> Host: abc.example.com
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
* Empty reply from server
* Closing connection
curl: (52) Empty reply from server

DNS resolves correctly, yet the server closes the connection without returning any data, which matches the ERR_EMPTY_RESPONSE seen in the browser. The next step is to check whether the backend pods themselves are healthy.

Verify the Pods:

k get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
example-app-0 1/1 Running 0 37m 10.0.13.137 ip-10-0-4-171.eu-central-1.compute.internal <none> <none>
example-app-1 1/1 Running 0 36m 10.0.131.24 ip-10-0-137-40.eu-central-1.compute.internal <none> <none>
example-app-2 1/1 Running 0 36m 10.0.80.81 ip-10-0-86-5.eu-central-1.compute.internal <none> <none>
example-app-agent-injector-d998867b5-pbp26 1/1 Running 0 28d 10.0.76.61 ip-10-0-86-5.eu-central-1.compute.internal <none> <none>
# Get the application's port from the Service configuration and curl the pod directly from inside the cluster
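A quick way to read the port mapping straight from the Service is a jsonpath query (a minimal sketch, assuming the Service is named example-app and lives in the current namespace):

kubectl get svc example-app -o jsonpath='{range .spec.ports[*]}{.name}{": port="}{.port}{" -> targetPort="}{.targetPort}{"\n"}{end}'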

kubectl run curl-test --rm -i --tty --image=curlimages/curl:7.87.0 -- /bin/sh -c "curl -v http://10.0.13.137:8200"

If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/curl-test, falling back to streaming logs: unable to upgrade connection: container curl-test not found in pod curl-test
* Trying 10.0.13.137:8200...
* Connected to 10.0.13.137 (10.0.13.137) port 8200 (#0)
> GET / HTTP/1.1
> Host: 10.0.13.137:8200
> User-Agent: curl/7.87.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 307 Temporary Redirect
< Cache-Control: no-store
< Content-Type: text/html; charset=utf-8
< Location: /ui/
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:31:15 GMT
< Content-Length: 40
<
<a href="/ui/">Temporary Redirect</a>.

* Connection #0 to host 10.0.13.137 left intact
pod "curl-test" deleted

The output indicates that the example-app server running on 10.0.13.137:8200 is responding with a 307 Temporary Redirect to /ui/. This is expected behavior when you hit the example-app HTTP API root (/) as it redirects to the UI endpoint.

To avoid the redirect and check the example-app status via the API, you can use the /v1/sys/health endpoint.

Check the health status of the app:

kubectl run curl-test --rm -i --tty --image=curlimages/curl:7.87.0 -- /bin/sh -c "curl -v http://10.0.13.137:8200/v1/sys/health"

If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/curl-test, falling back to streaming logs: Internal error occurred: error attaching to container: container is in CONTAINER_EXITED state
* Trying 10.0.13.137:8200...
* Connected to 10.0.13.137 (10.0.13.137) port 8200 (#0)
> GET /v1/sys/health HTTP/1.1
> Host: 10.0.13.137:8200
> User-Agent: curl/7.87.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Cache-Control: no-store
< Content-Type: application/json
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:34:10 GMT
< Content-Length: 295
<
{"initialized":true,"sealed":false,"standby":false,"performance_standby":false,"replication_performance_mode":"disabled","replication_dr_mode":"disabled","server_time_utc":17277250,"version":"1.14.0","cluster_name":"example-app-2a8cb73a","cluster_id":"cfd1ddad-423b-1abc-66f8-793f2d32f4ec"}
* Connection #0 to host 10.0.13.137 left intact
pod "curl-test" deleted

The output shows that the example-app server on 10.0.13.137:8200 responds with 200 OK, indicating that example-app is initialized, unsealed, and running normally.
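To spot-check all three replicas in one go, the same health call can be looped over the pod IPs from the listing above (a sketch using the IPs shown earlier; standby replicas may return a non-200 code such as 429, which is expected):

kubectl run curl-test --rm -i --restart=Never --image=curlimages/curl:7.87.0 -- /bin/sh -c '
  for ip in 10.0.13.137 10.0.131.24 10.0.80.81; do
    echo -n "$ip -> "
    curl -s -o /dev/null -w "%{http_code}\n" "http://$ip:8200/v1/sys/health"
  done'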

Verify the Service:

k get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
example-app ClusterIP 172.20.107.169 <none> 8200/TCP,8201/TCP 656d
example-app-active ClusterIP 172.20.171.183 <none> 8200/TCP,8201/TCP 656d
example-app-agent-injector-svc ClusterIP 172.20.252.59 <none> 443/TCP 656d
example-app-internal ClusterIP None <none> 8200/TCP,8201/TCP 656d
example-app-standby ClusterIP 172.20.79.111 <none> 8200/TCP,8201/TCP 656d
example-app-ui ClusterIP 172.20.65.57 <none> 8200/TCP 656d
kubectl run curl-test --rm -i --tty --image=curlimages/curl:7.87.0 -- /bin/sh -c "curl -v http://172.20.171.183:8200/v1/sys/health"

If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/curl-test, falling back to streaming logs: Internal error occurred: error attaching to container: container is in CONTAINER_EXITED state
E0930 19:08:34.912672 78036 v3.go:79] EOF
E0930 19:08:34.914639 78036 v3.go:79] EOF
* Trying 172.20.171.183:8200...
* Connected to 172.20.171.183 (172.20.171.183) port 8200 (#0)
> GET /v1/sys/health HTTP/1.1
> Host: 172.20.171.183:8200
> User-Agent: curl/7.87.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Cache-Control: no-store
< Content-Type: application/json
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:38:34 GMT
< Content-Length: 295
<
{"initialized":true,"sealed":false,"standby":false,"performance_standby":false,"replication_performance_mode":"disabled","replication_dr_mode":"disabled","server_time_utc":17277250,"version":"1.14.0","cluster_name":"example-app-2a8cb73a","cluster_id":"cfd1ddad-423b-1abc-66f8-793f2d32f4ec"}
* Connection #0 to host 172.20.171.183 left intact
pod "curl-test" deleted



kubectl run curl-test --rm -i --tty --image=curlimages/curl:7.87.0 -- /bin/sh -c "curl -v http://172.20.79.111:8200/v1/sys/health"

If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/curl-test, falling back to streaming logs: unable to upgrade connection: container curl-test not found in pod curl-test
* Trying 172.20.79.111:8200...
* Connected to 172.20.79.111 (172.20.79.111) port 8200 (#0)
> GET /v1/sys/health HTTP/1.1
> Host: 172.20.79.111:8200
> User-Agent: curl/7.87.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 429 Too Many Requests
< Cache-Control: no-store
< Content-Type: application/json
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:38:46 GMT
< Content-Length: 294
<
{"initialized":true,"sealed":false,"standby":false,"performance_standby":false,"replication_performance_mode":"disabled","replication_dr_mode":"disabled","server_time_utc":17277250,"version":"1.14.0","cluster_name":"example-app-2a8cb73a","cluster_id":"cfd1ddad-423b-1abc-66f8-793f2d32f4ec"}
* Connection #0 to host 172.20.79.111 left intact
pod "curl-test" deleted


kubectl run curl-test --rm -i --tty --image=curlimages/curl:7.87.0 -- /bin/sh -c "curl -v http://172.20.65.57:8200"

If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/curl-test, falling back to streaming logs: unable to upgrade connection: container curl-test not found in pod curl-test
* Trying 172.20.65.57:8200...
* Connected to 172.20.65.57 (172.20.65.57) port 8200 (#0)
> GET / HTTP/1.1
> Host: 172.20.65.57:8200
> User-Agent: curl/7.87.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 307 Temporary Redirect
< Cache-Control: no-store
< Content-Type: text/html; charset=utf-8
< Location: /ui/
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:39:14 GMT
< Content-Length: 40
<
<a href="/ui/">Temporary Redirect</a>.

* Connection #0 to host 172.20.65.57 left intact
pod "curl-test" deleted



kubectl run curl-test --rm -i --tty --image=curlimages/curl:7.87.0 -- /bin/sh -c "curl -v http://172.20.107.169:8200/v1/sys/health"

If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/curl-test, falling back to streaming logs: unable to upgrade connection: container curl-test not found in pod curl-test
* Trying 172.20.107.169:8200...
* Connected to 172.20.107.169 (172.20.107.169) port 8200 (#0)
> GET /v1/sys/health HTTP/1.1
> Host: 172.20.107.169:8200
> User-Agent: curl/7.87.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Cache-Control: no-store
< Content-Type: application/json
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:39:26 GMT
< Content-Length: 295
<
{"initialized":true,"sealed":false,"standby":false,"performance_standby":false,"replication_performance_mode":"disabled","replication_dr_mode":"disabled","server_time_utc":17277250,"version":"1.14.0","cluster_name":"example-app-2a8cb73a","cluster_id":"cfd1ddad-423b-1abc-66f8-793f2d32f4ec"}
* Connection #0 to host 172.20.107.169 left intact
pod "curl-test" deleted

All the tests against the various example-app services were successful, using curl and their corresponding ClusterIP addresses:

  1. Active example-app Service (172.20.171.183): Responded with HTTP 200 OK.
  2. Standby example-app Service (172.20.79.111): Responded with HTTP 429 Too Many Requests (which is expected for standby nodes).
  3. example-app UI Service (172.20.65.57): Redirected to /ui/, which indicates that the service is functioning properly and the UI is accessible.
  4. Default example-app Service (172.20.107.169): Responded with HTTP 200 OK, showing that the service is healthy.

Now we have confirmation that both the active and standby example-app instances are working correctly, and the example-app UI is accessible via its ClusterIP endpoint.
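Another quick sanity check at this stage is to confirm that the Services are actually selecting the healthy pods, i.e. that their Endpoints contain the pod IPs listed earlier (a sketch, assuming the Services are in the current namespace):

kubectl get endpoints example-app example-app-active example-app-standby -o wide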

Verify and Test the Ingress Endpoint:

k get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
example-ingress nginx abc.example.com 80, 443 57m

k describe ingress example-ingress
.......
........
........
..........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 58m nginx-ingress-controller Scheduled for sync
Normal CreateCertificate 58m cert-manager-ingress-shim Successfully created Certificate "example-app-tls"
Normal Sync 58m nginx-ingress-controller Scheduled for sync
Normal Sync 58m nginx-ingress-controller Scheduled for sync

k get cert
NAME READY SECRET AGE
example-app-tls True example-app-tls 59m
kubectl run curl-test --rm -i --tty --image=curlimages/curl -- /bin/sh -c "curl -v http://example-ingress:8200/v1/sys/health"

If you don't see a command prompt, try pressing enter.
* Host example-ingress:8200 was resolved.
* IPv6: (none)
* IPv4: 172.20.107.169
* Trying 172.20.107.169:8200...
* Connected to example-ingress (172.20.107.169) port 8200
* using HTTP/1.x
> GET /v1/sys/health HTTP/1.1
> Host: example-ingress:8200
> User-Agent: curl/8.10.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< Cache-Control: no-store
< Content-Type: application/json
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Date: Mon, 30 Sep 2024 13:44:47 GMT
< Content-Length: 295
<
{"initialized":true,"sealed":false,"standby":false,"performance_standby":false,"replication_performance_mode":"disabled","replication_dr_mode":"disabled","server_time_utc":17277250,"version":"1.14.0","cluster_name":"example-app-2a8cb73a","cluster_id":"cfd1ddad-423b-1abc-66f8-793f2d32f4ec"}
* Connection #0 to host example-ingress left intact
Session ended, resume using 'kubectl attach curl-test -c curl-test -i -t' command when the pod is running
pod "curl-test" deleted

The successful curl command output confirms that the example-app service is reachable and responding correctly on port 8200 within the cluster.

Since it works within the cluster, the issue likely lies in the ingress setup or the connection between the ingress controller and the example-app service.
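At this point it can also help to tail the ingress controller's logs while reproducing the request from outside, to see whether the request reaches the controller at all (a sketch; the controller's namespace and pod names depend on the installation, so look them up first):

# Find the controller pods (namespace and names vary by installation)
kubectl get pods --all-namespaces | grep -i ingress
# Follow one controller pod's logs and filter for our host while re-running the curl
kubectl logs <controller-pod> -f | grep abc.example.com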

Inspect the example-app ingress manifest:

k get ingress example-ingress -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
.
.
.
spec:
  ingressClassName: nginx
  rules:
  - host: abc.example.com
    http:
      paths:
      - backend:
          service:
            name: example-app
            port:
              number: 8200
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - abc.example.com
    secretName: example-app-tls
status:
  loadBalancer: {}

The status.loadBalancer field is empty ({}), meaning the ingress controller has not published an address for this Ingress; this matches the blank ADDRESS column in the kubectl get ingress output above. The Ingress spec itself looks correct, so make sure the ingress controller is provisioned correctly and exposes an external IP.
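A quick way to check whether the controller's own Service has an external IP or hostname is to list Services across namespaces (a sketch; the controller may live in its own namespace):

kubectl get svc --all-namespaces | grep -i ingress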

Check the Ingress Controller:

k get pods                                                              
NAME READY STATUS RESTARTS AGE
ingress-nginx-ingress-controller-698c6795d5-9dshl 0/1 Running 2 (7d6h ago) 11d
ingress-nginx-ingress-controller-698c6795d5-fmb4j 0/1 Running 3 (27h ago) 11d
ingress-nginx-ingress-controller-698c6795d5-jnjfz 0/1 Running 1 (7d7h ago) 14d
ingress-nginx-ingress-controller-default-backend-5bmn2s 1/1 Running 0 14d
k describe pod ingress-nginx-ingress-controller-698c6795d5-9dshl
.
.
.
.
.

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 4m34s (x71269 over 8d) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Warning RELOAD 4m28s (x144351 over 7d6h) nginx-ingress-controller (combined from similar events): Error reloading NGINX: exit status 1
2024/09/30 13:56:50 [notice] 288745#288745: signal process started
2024/09/30 13:56:50 [alert] 288745#288745: kill(23, 1) failed (3: No such process)
nginx: [alert] kill(23, 1) failed (3: No such process)
kubectl logs ingress-nginx-ingress-controller-698c6795d5-9dshl
.
.
.
.
Error reloading NGINX: exit status 1
2024/09/30 13:12:11 [alert] 287507#287507: kill(23, 1) failed (3: No such process)
nginx: [alert] kill(23, 1) failed (3: No such process)
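Because the reload keeps failing, the new server block for abc.example.com may never have been applied. We can check whether it is present in the configuration the controller rendered (a sketch, using one of the pod names above; /etc/nginx/nginx.conf is the rendered config in the standard ingress-nginx image):

kubectl exec ingress-nginx-ingress-controller-698c6795d5-9dshl -- cat /etc/nginx/nginx.conf | grep -n abc.example.com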

Both the failing readiness probes and the repeated NGINX reload errors show that the ingress controller pods are unhealthy (0/1 Ready). As a first remediation, we can restart them:

k delete pod ingress-nginx-ingress-controller-698c6795d5-9dshl
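Since all the controller replicas belong to the same ReplicaSet, it is often cleaner to restart the whole Deployment rather than deleting pods one by one (a sketch; the Deployment name is inferred from the pod names above and may differ in your installation):

kubectl rollout restart deployment ingress-nginx-ingress-controller
kubectl rollout status deployment ingress-nginx-ingress-controller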

Verify the application now:

curl -s -o /dev/null -w "%{http_code}" https://abc.example.com/v1/sys/health

200
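As a final check, the controller pods should now report 1/1 Ready, and the Ingress will normally show a published address once the controller has synced:

kubectl get pods | grep ingress-nginx
kubectl get ingress example-ingress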

Conclusion

In this troubleshooting exercise, we:

  • Verified that the application pods were running and healthy.
  • Ensured the service was correctly configured and reachable within the cluster.
  • Checked the ingress configuration to verify that it was set up correctly.
  • Investigated and resolved the issue by restarting the NGINX ingress controller pods.

This step-by-step process highlights the importance of checking all aspects of the Kubernetes stack, from pod health to service and ingress configurations. When encountering similar issues, reviewing logs and verifying the configuration should be the first step toward resolution.
