
1 Troubleshooting Kubernetes Kubernetes Master Class Series
#Rancherk8s April 16, 2019

2 Sebastiaan van Steenis, Support Engineer
Matthew Scheer, Marketing Manager
#RancherK8s

3 Rancher Master Class Series:
Trying to keep this to minutes
Questions are always welcome
Use the Questions tab to write your questions
We may respond to all, so mark your question as private if needed.
#Rancherk8s

4 This session is being recorded!
#Rancherk8s

5 Join the conversation on Slack
#masterclass #RancherK8s

6 Upcoming Classes http://rancher.com/kubernetes-master-class/
#RancherK8s

7 Who am I
Sebastiaan van Steenis, Support Engineer
https://github.com/superseb

8 Agenda
Kubernetes basics
Troubleshooting techniques
Kubernetes components
Kubernetes resources
Note: all examples are based on a Kubernetes cluster built using RKE v0.2.1, but they can be used on any Kubernetes distribution with minor adjustments.

9 Kubernetes basics
etcd: stores the state of the cluster
control plane (master):
  kube-apiserver: front end to your cluster, interacts with the state in etcd
  kube-controller-manager: controllers that ensure the cluster matches the desired state
  kube-scheduler: schedules your workloads according to requirements
worker:
  kubelet: cluster node agent
  kube-proxy: provides networking logic for workloads
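Since all examples in this class come from an RKE-built cluster, each of these components runs as a Docker container on the node that hosts it. A minimal sketch to confirm they are up (the container names match RKE's defaults; the grep pattern is only illustrative):
# On any cluster node: list the core component containers and their status
docker ps --format '{{.Names}}: {{.Status}}' | egrep '^(etcd|kube-apiserver|kube-controller-manager|kube-scheduler|kubelet|kube-proxy):'
# Tail the logs of a component that looks unhealthy
docker logs --tail 100 -f kube-apiserver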

10 Agenda

11 etcd
Distributed key-value store; all Kubernetes data is stored in etcd
Needs a majority of nodes (quorum): an etcd cluster of 2 is worse than a cluster of 1
Sensitive to disk write latency
Space quota and history of the keyspace: compaction & defrag

12 etcd
Nodes with etcd role   Majority   Failure Tolerance
1                      1          0
2                      2          0
3                      2          1
4                      3          1
5                      3          2
6                      4          2
7                      4          3
8                      5          3
9                      5          4

13 kube-apiserver
The entry point for your cluster
Uses etcd to maintain state
Active-active round-robin possible

14 kube-controller-manager
Controller loops for getting the cluster into the desired state
Commonly known controllers: Node, Replication, Endpoints, Service Account
Talks to kube-apiserver
One active per cluster, using leader election

15 kube-scheduler
Schedules pods according to requirements (resources/labels/taints etc.)
Talks to kube-apiserver
One active per cluster, using leader election

16 Troubleshooting techniques
Collect
Collect the current state
If you need to recover as soon as possible and processes seem to hang, collect goroutines using kill -s SIGABRT $PID (see the sketch after this list)
Build a timeline of events
Retrieve data from tools interacting with the cluster and from the underlying infrastructure
Monitoring/trending data
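A minimal sketch of that goroutine collection step, assuming the hanging process is the kubelet and that its output ends up in the container log (both are assumptions; adjust to the component you are debugging):
# Find the PID of the process that appears to hang (kubelet is just an example)
PID=$(pidof kubelet)
# SIGABRT makes a Go binary dump the stack traces of all goroutines before it exits
kill -s SIGABRT "$PID"
# Capture the dump from the container log before the component is restarted
docker logs --tail 500 kubelet > kubelet-goroutines-$(date +%s).log 2>&1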

17 Troubleshooting techniques
Assess
What error/logging is shown
Search existing issues
Relate timestamps from logging to events on the timeline
What triggered this situation / what changed
Does it affect everything or only a certain part (hardware/datacenter/region/availability zone)
How does this differ from your baseline/tests

18 Troubleshooting techniques
Fix
Compare working and non-working components/resources
Isolate the issue
Test components directly
(Temporarily) remove the affected node from the pool
Stop the component to make sure you are talking to the right component (IP change or duplicate IP)
Apply a workaround/fix found in an existing issue
Reproduce in a test environment

19 Troubleshooting techniques
Follow up
Second most important step
What was the root cause
What situation was caught or not caught
What could be recovered/replaced automatically
What took the most time

20 Troubleshooting etcd
Check etcd members (this is not a live view):
# docker exec etcd etcdctl member list
2e40f74444dd07db, started, etcd ..., ...
51f47c0d9a3a4102, started, etcd ..., ...
...d29ae2be, started, etcd ..., ...
Check endpoint health (this is a live view):
# docker exec etcd etcdctl endpoint health --endpoints=$(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ','")
... is healthy: successfully committed proposal: took = ...ms
... is healthy: successfully committed proposal: took = ...ms
... is healthy: successfully committed proposal: took = ...ms
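The endpoint list in that output is environment specific; a small sketch of the same checks with the endpoint list collected into a variable first (same etcdctl calls as above, assuming the RKE etcd container name):
# Client URLs of all members (field 5 of the member list output)
ENDPOINTS=$(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ','")
# Live health check against every member
docker exec etcd etcdctl endpoint health --endpoints="$ENDPOINTS"
# DB size, leader, raft term/index for every member (see the next slide)
docker exec etcd etcdctl endpoint status --endpoints="$ENDPOINTS" --write-out table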

21 Troubleshooting etcd
Check etcd endpoint status:
# docker exec etcd etcdctl endpoint status --endpoints=$(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ','") --write-out table
| ENDPOINT | ID               | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
| ...      | faf643f068b0647  | ...     | 672 MB  | true      | 3         | 9706       |
| ...      | ...bc893abbb     | ...     | 670 MB  | false     | 3         | 9706       |
| ...      | db21b4df1f0854de | ...     | 646 MB  | false     | 3         | ...        |

22 Troubleshooting etcd
Check etcd logging (docker logs etcd):
health check for peer xxx could not connect: dial tcp IP:2380: getsockopt: connection refused
A connection to the address shown on port 2380 cannot be established. Check if the etcd container is running on the host with the address shown.
xxx is starting a new election at term x
The etcd cluster has lost its quorum and is trying to establish a new leader. This can happen when the majority of the nodes running etcd go down or become unreachable.
connection error: desc = "transport: Error while dialing dial tcp ...:2379: i/o timeout"; Reconnecting to {...}
The host firewall is preventing network communication.
rafthttp: request cluster ID mismatch
The node with the etcd instance logging rafthttp: request cluster ID mismatch is trying to join a cluster that has already been formed with another peer. The node should be removed from the cluster and re-added.
rafthttp: failed to find member
The cluster state (/var/lib/etcd) contains wrong information to join the cluster. The node should be removed from the cluster, the state directory should be cleaned, and the node should be re-added.
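A quick way to scan the etcd container log for the messages listed above (a sketch; the patterns simply match the log lines described on this slide):
# Scan recent etcd logging for the common failure patterns described above
docker logs --tail 1000 etcd 2>&1 | egrep -i 'health check for peer|starting a new election|i/o timeout|request cluster ID mismatch|failed to find member|database space exceeded'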

23 etcd compaction and defrag
Check logging:
mvcc: database space exceeded
Check alarm status:
# docker exec etcd etcdctl alarm list
memberID:x alarm:NOSPACE
memberID:x alarm:NOSPACE
memberID:x alarm:NOSPACE
Compact:
rev=$(docker exec etcd etcdctl endpoint status --write-out json | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
docker exec etcd etcdctl compact "$rev"
compacted revision xxx

24 etcd compaction and defrag
Defrag:
docker exec etcd etcdctl defrag --endpoints=$(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ','")
Finished defragmenting etcd member[...]
Disarm alarm:
docker exec etcd etcdctl alarm disarm
docker exec etcd etcdctl alarm list
<empty>
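Putting the last two slides together, a sketch of the full recovery sequence for a NOSPACE alarm (same commands as above, run from a node with the etcd container):
# Current revision, taken from the local member (the keyspace revision is cluster-wide)
rev=$(docker exec etcd etcdctl endpoint status --write-out json | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
# Compact the keyspace history up to that revision
docker exec etcd etcdctl compact "$rev"
# Defragment every member to actually release the space on disk
ENDPOINTS=$(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ','")
docker exec etcd etcdctl defrag --endpoints="$ENDPOINTS"
# Clear the NOSPACE alarm and verify it is gone
docker exec etcd etcdctl alarm disarm
docker exec etcd etcdctl alarm list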

25 etcd log level
Configure the log level to DEBUG:
# curl -XPUT -d '{"Level":"DEBUG"}' --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) ...
... etcdserver/api/etcdhttp: globalLogLevel set to "DEBUG"
Restore the log level to INFO:
# curl -XPUT -d '{"Level":"INFO"}' --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) ...
... etcdserver/api/etcdhttp: globalLogLevel set to "INFO"
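A sketch of what the full call might look like, assuming the log level endpoint is served on the local client URL at /config/local/log (the URL and path are assumptions; on the etcd versions current for this class that endpoint accepted a PUT with a level):
# Set the local etcd member's log level to DEBUG (endpoint path is an assumption)
curl -XPUT -d '{"Level":"DEBUG"}' --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) https://localhost:2379/config/local/log
# Restore it to INFO once done; debug logging is very verbose
curl -XPUT -d '{"Level":"INFO"}' --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) https://localhost:2379/config/local/log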

26 etcd metrics
Available at https://127.0.0.1:2379/metrics
Slow disk:
wal_fsync_duration_seconds (99% under 10 ms): a wal_fsync is called when etcd persists its log entries to disk before applying them.
backend_commit_duration_seconds (99% under 25 ms): a backend_commit is called when etcd commits an incremental snapshot of its most recent changes to disk.
Leader changes
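A sketch that pulls exactly these three signals from the metrics endpoint in one call, using the certificate environment variables from the etcd container as in the following slides:
# wal fsync / backend commit latency histograms plus the leader-change counter
curl -s --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) https://127.0.0.1:2379/metrics | egrep 'wal_fsync_duration_seconds|backend_commit_duration_seconds|etcd_server_leader_changes_seen_total'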

27 etcd
# curl -s --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) https://127.0.0.1:2379/metrics | grep wal_fsync_duration_seconds
# HELP etcd_disk_wal_fsync_duration_seconds The latency distributions of fsync called by wal.
# TYPE etcd_disk_wal_fsync_duration_seconds histogram
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 3873
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 4287
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 4409
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 4478
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 4495
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 4496
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 4496
etcd_disk_wal_fsync_duration_seconds_sum ...
etcd_disk_wal_fsync_duration_seconds_count 4496

28 etcd
# curl -s --cacert $(docker exec etcd printenv ETCDCTL_CACERT) --cert $(docker exec etcd printenv ETCDCTL_CERT) --key $(docker exec etcd printenv ETCDCTL_KEY) https://127.0.0.1:2379/metrics | grep ^etcd_server_leader_changes_seen_total
etcd_server_leader_changes_seen_total 1

29 kube-apiserver
Check etcd connectivity/responsiveness
kube-apiserver -> each etcd server
Configured as --etcd-servers in the kube-apiserver parameters

30 kube-apiserver
# for etcdserver in $(docker inspect kube-apiserver --format='{{range .Args}}{{.}}{{"\n"}}{{end}}' | grep etcd-servers | awk -F= '{ print $2 }' | tr ',' '\n'); do SSLDIR=$(docker inspect kube-apiserver --format '{{ range .Mounts }}{{ if eq .Destination "/etc/kubernetes" }}{{ .Source }}{{ end }}{{ end }}'); echo "Validating connection to ${etcdserver}/health"; curl -w '\nConnect:%{time_connect}\nStart Transfer: %{time_starttransfer}\nTotal: %{time_total}\nResponse code: %{http_code}\n' --cacert $SSLDIR/ssl/kube-ca.pem --cert $SSLDIR/ssl/kube-apiserver.pem --key $SSLDIR/ssl/kube-apiserver-key.pem "${etcdserver}/health"; done
Validating connection to .../health
{"health": "true"}
Connect: 0.001
Start Transfer: 0.104
Total: 0.104
Response code: 200
Validating connection to .../health
Start Transfer: 0.103
Total: 0.103
Validating connection to .../health
Start Transfer: 0.143
Total: 0.143

31 Check kube-apiserver responsiveness
Can be executed from a workstation or any node to test the network between the node and kube-apiserver (non-controlplane nodes use nginx-proxy to connect to kube-apiserver)
# for cip in $(kubectl get nodes -l "node-role.kubernetes.io/controlplane=true" -o ...); do kubectl --kubeconfig kube_config_cluster.yml --server ... get nodes -v6 2>&1 | grep round_trippers; done
I... round_trippers.go:438] GET ... OK in 69 milliseconds
I... round_trippers.go:438] GET ... OK in 74 milliseconds
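A sketch of such a loop, assuming the controlplane nodes' InternalIP addresses and the default kube-apiserver port 6443 (both are assumptions):
# Time a simple request against every kube-apiserver instance directly
for cip in $(kubectl get nodes -l "node-role.kubernetes.io/controlplane=true" -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'); do
  kubectl --kubeconfig kube_config_cluster.yml --server "https://${cip}:6443" get nodes -v6 2>&1 | grep round_trippers
done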

32 kube-controller-manager
Find the current leader:
# kubectl -n kube-system get endpoints kube-controller-manager -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'
{"holderIdentity":"seb-doctl-ubuntu-5_96fb83ba...","leaseDurationSeconds":15,"acquireTime":"...T08:42:57Z","renewTime":"...T10:36:25Z","leaderTransitions":1}

33 kube-scheduler
Find the current leader:
# kubectl -n kube-system get endpoints kube-scheduler -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'
{"holderIdentity":"seb-doctl-ubuntu-5_1ceba...","leaseDurationSeconds":15,"acquireTime":"...T08:43:00Z","renewTime":"...T10:37:09Z","leaderTransitions":1}

34 kubelet
Check kubelet logging
As this is the node agent, it will contain the most information regarding operations it is executing based on scheduling requests
Check kubelet stats
For example, nodefs or imagefs stats regarding DiskPressure
# curl -sLk --cacert /etc/kubernetes/ssl/kube-ca.pem --cert /etc/kubernetes/ssl/kube-node.pem --key /etc/kubernetes/ssl/kube-node-key.pem ...
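A sketch of such a stats check against the kubelet summary API, which includes nodefs and imagefs usage (the /stats/summary path and port 10250 are assumptions about the original command):
# Query the local kubelet summary API for filesystem stats (nodefs/imagefs)
curl -sLk --cacert /etc/kubernetes/ssl/kube-ca.pem --cert /etc/kubernetes/ssl/kube-node.pem --key /etc/kubernetes/ssl/kube-node-key.pem https://localhost:10250/stats/summary | egrep -A 8 '"(fs|imageFs)"'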

35 Generic
Get pods
Is the state Running, and how many restarts?
Check Liveness/Readiness probes
Logs: logging usually shows (depending on the app) why it can't start properly, or why it is not able to respond to the Liveness/Readiness probes
Describe pods: events shown in human-readable format (commands sketched below)
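A few of the corresponding commands (a sketch; $pod stands for the pod you are investigating):
# State, restarts and node placement of all pods
kubectl get pods --all-namespaces -o wide
# Logs of a failing pod; --previous shows the log of the last crashed container
kubectl logs $pod
kubectl logs $pod --previous
# Probes, volumes and events in human-readable form
kubectl describe pod $pod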

36 Generic
Describe $resource (human-readable output of the parameters of a resource)
kubectl describe pod $pod
Volumes:
  task-pv-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  task-pv-claim
    ReadOnly:   false
  default-token-zzpnj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zzpnj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  13m (x134 over 104m)  default-scheduler  pod has unbound immediate PersistentVolumeClaims (repeated 4 times)
  Warning  FailedScheduling  57s (x16 over 10m)    default-scheduler  pod has unbound immediate PersistentVolumeClaims (repeated 4 times)

37 Generic
Get events with a filter:
$ kubectl get events --field-selector involvedObject.kind=Pod -w
LAST SEEN   TYPE      REASON      KIND   MESSAGE
7m26s       Normal    Scheduled   Pod    Successfully assigned default/liveness-exec to ...
3m42s       Normal    Pulling     Pod    pulling image "k8s.gcr.io/busybox"
4m57s       Normal    Pulled      Pod    Successfully pulled image "k8s.gcr.io/busybox"
4m57s       Normal    Created     Pod    Created container
4m57s       Normal    Started     Pod    Started container
113s        Warning   Unhealthy   Pod    Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
3m42s       Normal    Killing     Pod    Killing container with id docker://liveness: Container failed liveness probe. Container will be killed and recreated.

38 Generic
Check Pending pods:
$ kubectl get pods --all-namespaces -o go-template='{{range .items}}{{if eq .status.phase "Pending"}}{{.spec.nodeName}}{{" "}}{{.metadata.name}}{{" "}}{{.metadata.namespace}}{{" "}}{{range .status.conditions}}{{.message}}{{";"}}{{end}}{{"\n"}}{{end}}{{end}}'
<no value> task-pv-pod default pod has unbound immediate PersistentVolumeClaims (repeated 4 times);
... canal-f44ms kube-system <no value>;

39 Nodes
Check for differences between nodes:
kubectl get nodes -o custom-columns=NAME:.metadata.name,OS:.status.nodeInfo.osImage,KERNEL:.status.nodeInfo.kernelVersion,RUNTIME:.status.nodeInfo.containerRuntimeVersion,KUBELET:.status.nodeInfo.kubeletVersion,KUBEPROXY:.status.nodeInfo.kubeProxyVersion
NAME   OS               KERNEL        RUNTIME       KUBELET   KUBEPROXY
...    Ubuntu ... LTS   ...-generic   docker://...  v1.13.5   v1.13.5
...    Ubuntu ... LTS   ...-generic   docker://...  v1.13.5   v1.13.5
...    Ubuntu ... LTS   ...-generic   docker://...  v1.13.5   v1.13.5
...

40 Nodes
Check taints:
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
NAME   TAINTS
...    <none>
...    <none>
...    [map[effect:NoSchedule key:node-role.kubernetes.io/controlplane value:true]]
...    <none>
...    <none>
...    [map[effect:NoExecute key:node-role.kubernetes.io/etcd value:true]]
...    [map[effect:NoExecute key:node-role.kubernetes.io/etcd value:true]]
...    [map[effect:NoExecute key:node-role.kubernetes.io/etcd value:true]]
...    <none>
...    [map[effect:NoSchedule key:node-role.kubernetes.io/controlplane value:true]]

41 Nodes
Show node conditions:
$ kubectl get nodes -o go-template='{{range .items}}{{$node := .}}{{range .status.conditions}}{{$node.metadata.name}}{{": "}}{{.type}}{{":"}}{{.status}}{{"\n"}}{{end}}{{end}}'
...: MemoryPressure:False
...: DiskPressure:False
...: PIDPressure:False
...: Ready:True
...: MemoryPressure:False
...: DiskPressure:False
...: PIDPressure:False
...: Ready:True

42 Nodes
Show node conditions that could cause issues:
$ kubectl get nodes -o go-template='{{range .items}}{{$node := .}}{{range .status.conditions}}{{if ne .type "Ready"}}{{if eq .status "True"}}{{$node.metadata.name}}{{": "}}{{.type}}{{":"}}{{.status}}{{"\n"}}{{end}}{{else}}{{if ne .status "True"}}{{$node.metadata.name}}{{": "}}{{.type}}{{":"}}{{.status}}{{"\n"}}{{end}}{{end}}{{end}}{{end}}'
...: DiskPressure:True
...: Ready:Unknown

43 Nodes
Watch node events:
# kubectl get events --field-selector involvedObject.kind=Node -w
LAST SEEN   TYPE     REASON         KIND   MESSAGE
0s          Normal   NodeNotReady   Node   Node status is now: NodeNotReady

44 Networking
Ports opened in the (host) firewall
Overlay networking is usually UDP
Keep MTU in mind (especially across peering/tunnels)
Simple overlay network check available in the Troubleshooting guide (a sketch of the idea follows below)
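The idea of the overlay check is to run a pod on every node and verify that each pod can reach the others over the overlay network. A simplified sketch of that idea (the DaemonSet name, the busybox image and its tag are assumptions):
# Run one test pod per schedulable node
kubectl create -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: overlaytest
spec:
  selector:
    matchLabels:
      name: overlaytest
  template:
    metadata:
      labels:
        name: overlaytest
    spec:
      containers:
      - name: overlaytest
        image: busybox:1.28
        command: ["sleep", "3600"]
EOF
# Ping every overlaytest pod IP from every overlaytest pod
IPS=$(kubectl get pods -l name=overlaytest -o jsonpath='{range .items[*]}{.status.podIP}{" "}{end}')
for pod in $(kubectl get pods -l name=overlaytest -o jsonpath='{range .items[*]}{.metadata.name}{" "}{end}'); do
  echo "=> Testing from ${pod}"
  for ip in $IPS; do
    kubectl exec $pod -- ping -c 1 -w 2 $ip > /dev/null && echo "   ${ip} reachable" || echo "   ${ip} NOT reachable"
  done
done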

45 DNS
Check if an internal cluster name resolves:
kubectl run -it --rm --restart=Never busybox --image=busybox:... nslookup kubernetes.default
Check if an external name resolves:
kubectl run -it --rm --restart=Never busybox --image=busybox:... nslookup ...
Simple DNS check available in the Troubleshooting guide (a sketch follows below)
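A sketch with assumed values: busybox:1.28 is an assumption, chosen because newer busybox nslookup is known to give misleading results against cluster DNS, and www.google.com is just an example external name:
# Internal cluster name
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default
# External name
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup www.google.com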

46 Check upstream DNS nameservers
$ kubectl -n kube-system get pods -l k8s-app=kube-dns --no-headers -o custom-columns=NAME:.metadata.name,HOSTIP:.status.hostIP | while read pod host; do echo "Pod ${pod} on host ${host}"; kubectl -n kube-system exec $pod -c kubedns cat /etc/resolv.conf; done
Pod kube-dns-58bd5b8dd7-6rv6n on host ...
nameserver ...
nameserver ...
Pod kube-dns-58bd5b8dd7-frdnd on host ...
Pod kube-dns-58bd5b8dd7-thxxp on host ...

47 Ingress controller (NGINX by default in RKE)
Traffic flow: Internet -> Load balancer -> Node(s) with NGINX ingress controller -> Pods

48 Check responsiveness of Ingress Controller
$ kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName,IP:.status.podIP --no-headers | while read ingresspod nodename podip; do echo "=> Testing from ${ingresspod} on ${nodename} (${podip})"; curl -o /dev/null --connect-timeout 5 -s -w 'Connect: %{time_connect}\nStart Transfer: %{time_starttransfer}\nTotal: %{time_total}\nResponse code: %{http_code}\n' -k ...; done
=> Testing from nginx-ingress-controller-2qznb on ... (...)
Connect: ... Start Transfer: ... Total: ... Response code: 000
=> Testing from nginx-ingress-controller-962mb on ... (...)
Connect: ... Start Transfer: ... Total: ... Response code: 200
=> Testing from nginx-ingress-controller-b6mbm on ... (...)
Connect: ... Start Transfer: ... Total: ...
=> Testing from nginx-ingress-controller-rwc8c on ... (...)
Connect: ... Start Transfer: ... Total: ...
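One way to run a similar per-pod check is to hit each controller's own health endpoint (port 10254 and /healthz are the nginx-ingress-controller defaults, but using them as the target here is an assumption). Run this from a cluster node, since pod IPs are normally only reachable from inside the cluster:
kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName,IP:.status.podIP --no-headers | \
while read ingresspod nodename podip; do
  echo "=> Testing from ${ingresspod} on ${nodename} (${podip})"
  # Time the connection and report the HTTP response code per controller pod
  curl -o /dev/null --connect-timeout 5 -s -w 'Connect: %{time_connect}\nTotal: %{time_total}\nResponse code: %{http_code}\n' "http://${podip}:10254/healthz"
done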

49 Check responsiveness Ingress controller -> pods
=> Testing from nginx-ingress-controller-226mt on ... (...)
==> Found host foo.bar.com with service nginx in default
==> Connecting to ... on ...
Connect: ... Start Transfer: ... Total: ... Response code: 200 OK
=> Testing from nginx-ingress-controller-9krxk on ... (...)
Connect: ... Start Transfer: ... Total: ...
=> Testing from nginx-ingress-controller-b6c7g on ... (...)
Connect: ... Start Transfer: ... Total: ...

50 Ingress
Check static NGINX config:
for pod in $(kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=NAME:.metadata.name --no-headers); do kubectl -n ingress-nginx exec $pod -- cat /etc/nginx/nginx.conf; done
Output is hard to diff, use a checksum to find differences:
for pod in $(kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=NAME:.metadata.name --no-headers); do echo $pod; kubectl -n ingress-nginx exec $pod -- cat /etc/nginx/nginx.conf | md5; done
Different checksum? Eliminate instance-specific and randomized lines:
for pod in $(kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=NAME:.metadata.name --no-headers); do echo $pod; kubectl -n ingress-nginx exec $pod -- cat /etc/nginx/nginx.conf | grep -v nameservers | grep -v resolver | grep -v "PEM sha" | md5; done

51 Ingress
Check dynamic NGINX config:
for pod in $(kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=NAME:.metadata.name --no-headers); do echo $pod; kubectl -n ingress-nginx exec $pod -- curl -s ...; done
Output is hard to diff, use a checksum to find differences:
for pod in $(kubectl -n ingress-nginx get pods -l app=ingress-nginx -o custom-columns=NAME:.metadata.name --no-headers); do echo $pod; kubectl -n ingress-nginx exec $pod -- curl -s ... | md5; done
Note: this changed to a local socket (/tmp/nginx-status-server.sock) in later versions, and a kubectl plugin is available.

52 Ingress
View logging:
kubectl -n ingress-nginx logs -l app=ingress-nginx
Enable debug logging:
kubectl -n ingress-nginx edit ds nginx-ingress-controller
Add --v=2 up to --v=5, depending on how verbose you need the logging to be

53 Recommendations
Break your (test) environment often
Master your tools
Don't assume, check
Make it easy for yourself: use labels/naming conventions (for example, region or availability zone)
Get comfortable with debug and recovery processes
Make sure you have environment data to use as a reference (baseline)
Make sure you have centralized logging/metrics (baseline)
Improve your process after each occurrence

54 Resources
https://rancher.com/docs/rancher/v2.x/en/troubleshooting/

55 Get started in two easy steps
Step 1: Prepare a Linux host
Rancher requires a single host installed with either Ubuntu (kernel v3.10+) or RHEL/CentOS 7.3, as well as at least 2GB of memory, 20GB of local disk and a supported version of Docker.
Step 2: Start the server
To install and run Rancher server, execute the following Docker command on your host:
$ sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:latest

56 Rancher is an Enterprise Container Management Platform
Self-Service Kubernetes Environments, Unified Cluster Operations
Central IT / DevOps: User Interface, Monitoring, Service Catalog, Logging, Alerting, CI/CD, Provisioning, Security, Auth/RBAC, Capacity, Policy, Cost
Runs on RKE, EKS, GKE, AKS, any infrastructure
#RancherK8s

57 Rancher Quick Start Guide

58 Rancher, RancherOS, RKE are in GitHub

59 Thank you @Rancher_Labs · #Rancherk8s
Rancher.com/kubernetes-master-class

