Exploring etcd snapshots with koff
We have looked at etcd
performance and how critical it is to cluster health, but what if we want to see what’s in etcd
?
If you have existing etcd
backups / snapshots or want to take one now, there is a lot you can do and look at!
If you deleted someone important and want to get it back, you can get it from a backup and recreate it in your cluster.
If your etcd
database is huge, you can parse through a fresh snapshot and see where the issues are without touching production.
Accessing and parsing your etcd
data isn’t a daily occurance, but when the time comes, it’s incredibly useful.
In this module we will utilize the koff
tool to showcase get and inspect commnads on both standard Kubernetes
objects types as well as traditionally stored etcd
keys.
koff use
command
-
To get started you want to verify that
koff
is installed in your path and working -
Once you have confirmed this, you want to call the
koff use
command to tellkoff
theetcd.db
to utilize for review -
Then run the
koff get -h
to view the available flags.
cd ~/Module10/
$ koff version
koff version: v1.0.1
hash: 01cddf8
https://github.com/gmeghnag/koff
$ koff use module10-etcd_snapshot_2024.db
$ koff get -h
Usage:
koff get [flags]
Flags:
-A, --all-namespaces Show resources across all namespaces.
-h, --help help for get
-n, --namespace string Namespace.
--no-headers Hide headers.
-o, --output string Output format. One of: json|yaml|wide
-K, --show-kind Show kind.
--show-managed-fields Show managedFields when output is one of: json, yaml.
-N, --show-namespace Show namespace.
Viewing resources in etcd
The koff get
command allows you to get any resource inside the etcd
database like you were using oc
View all of the cluster namespaces with koff get namespaces
:
$ koff get namespaces
NAME STATUS AGE
namespace/aap Active 1y
namespace/aap25 Active 102d
namespace/agilleyaap Active 329d
namespace/anztam Active 136d
namespace/camel-test Active 1y
namespace/cert-manager Active 135d
namespace/cert-manager-operator Active 135d
namespace/cnpg-system Active 1y
Get all of the pods running inside a namespace with koff get pods -n <namespace>
$ koff get pods -n app
NAME READY STATUS RESTARTS AGE
ansible-lightspeed-operator-controller-manager-bccffff68-wqj5h 2/2 Running 91 102d
automation-controller-operator-controller-manager-85b5464dldtnw 2/2 Running 89 102d
automation-hub-operator-controller-manager-847659f4bf-ll4jm 2/2 Running 97 102d
eda-server-operator-controller-manager-6df5c675c8-dxqcv 2/2 Running 56 75d
example-pah-api-6dd5bb466-g5kv5 0/1 Pending 0 83d
Continue to explore the etcd
database with some addition koff get commands, just like you would use oc get
against a running cluster.
$ koff get service -A
$ koff get endpoints -A
$ koff get configmaps -A
$ koff get daemonset -A
$ koff get deployment -A
Inspecting the etcd database
The koff etcd inspect
command allows you to inspect the etcd
databse in it’s true key/value pair structure
View all of the object inside the etcd
database:
$ koff etcd inspect module10-etcd_snapshot_2024.db
/kubernetes.io/apiregistration.k8s.io/apiservices/v1.apiextensions.k8s.io
/kubernetes.io/prioritylevelconfigurations/system
/kubernetes.io/apiregistration.k8s.io/apiservices/v1.apps
/kubernetes.io/apiregistration.k8s.io/apiservices/v1.
/kubernetes.io/apiregistration.k8s.io/apiservices/v1.authentication.k8s.io
Inspect a specifc resource, in this case a node (actually called a minion in kubernetes), to see it’s full configuration:
$ koff etcd inspect module10-etcd_snapshot_2024.db /kubernetes.io/minions/master-1.ocp413shared.tamlab.brq2.redhat.com
{
"apiVersion": "v1",
"kind": "Node",
"metadata": {
"annotations": {
"machineconfiguration.openshift.io/controlPlaneTopology": "HighlyAvailable",
"machineconfiguration.openshift.io/currentConfig": "rendered-master-151f24e2087c2b5521a7b9af2c4d5b16",
"machineconfiguration.openshift.io/desiredConfig": "rendered-master-151f24e2087c2b5521a7b9af2c4d5b16",
"machineconfiguration.openshift.io/desiredDrain": "uncordon-rendered-master-151f24e2087c2b5521a7b9af2c4d5b16",
"machineconfiguration.openshift.io/lastAppliedDrain": "uncordon-rendered-master-151f24e2087c2b5521a7b9af2c4d5b16",
"machineconfiguration.openshift.io/lastSyncedControllerConfigResourceVersion": "635663717",
"machineconfiguration.openshift.io/reason": "",
"machineconfiguration.openshift.io/state": "Done",
"nfd.node.kubernetes.io/master.version": "undefined",
"volumes.kubernetes.io/controller-managed-attach-detach": "true"
},
"creationTimestamp": "2023-08-18T21:08:47Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "master-1.ocp413shared.tamlab.brq2.redhat.com",
"kubernetes.io/os": "linux",
"node-role.kubernetes.io/control-plane": "",
"node-role.kubernetes.io/master": "",
"node.openshift.io/os_id": "rhcos"
},
Maybe you deleted an important ConfigMap
or Secret
and needed to get it back from a backup!
$ koff etcd inspect module10-etcd_snapshot_2024.db /kubernetes.io/configmaps/firewall-rule/httpd-ex-1-sys-config
{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": {
"creationTimestamp": "2024-05-02T04:11:08Z",
"name": "httpd-ex-1-sys-config",
"namespace": "firewall-rule",
$ koff etcd inspect module10-etcd_snapshot_2024.db /kubernetes.io/secrets/quay/tamlab-quay-config-secret-98gh285gcd
{
"apiVersion": "v1",
"data": {
"config.yaml": ""
},
"kind": "Secret",