YAML has become a very popular choice for configuration files. It takes what many developers are used to, JSON, and increases readability. This is mainly done by removing unnecessary characters like brackets and quotes, as well as using an indentation approach, which is more intuitive for humans to parse.
You might think that the structure of YAML closely resembles that of JSON, and that’s no coincidence: YAML is a superset of JSON. Being a superset means that any .json file can be parsed by a YAML parser. Yes, this means you can write all your YAML in JSON instead, but with how popular YAML is, and how widespread its use is in guides and tutorials, it’s a good idea to learn how it works. This is especially true in Kubernetes, where all resources are typically defined as .yaml files.
There are many possibilities within YAML, but at its core you only need to know two concepts: lists and maps. Although a .yaml file can seem complex, it all boils down to these two concepts. A map is a collection of keys and values, an example of which can be seen at the top of almost any Kubernetes configuration file. Take a look at the first two lines defining a Pod:
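As a minimal sketch of those two lines, assuming the core v1 API group that Pods belong to, the map has two keys, each with a plain string value:

```yaml
# Two key-value pairs in a map; neither value needs quotes
apiVersion: v1
kind: Pod
```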
Notice the lack of quotes. YAML infers the data type from the form of the value itself. Here both values will be parsed as strings, whereas a value of 1 will be parsed as an integer, and a value of true will be parsed as a boolean. Values in a map are not limited to simple data types. A value can also be another map, as is the case for the metadata field in a Kubernetes configuration file:
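A short sketch of such a nested map follows. The name and label values are illustrative assumptions, not taken from a particular cluster:

```yaml
# The value of metadata is itself a map, indented by two spaces
metadata:
  name: nginx
  # labels is a further nested map inside metadata
  labels:
    app: nginx
```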
You can think of maps as objects in JSON in that they are key-value objects with the possibility of nesting objects.
The second concept you need to know is the list. You can find lists in JSON as arrays. Here’s how they look in YAML:
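As a sketch with illustrative keys and values, the first list below holds simple values, while the second holds maps:

```yaml
# A list of simple values: every item starts with a dash
args:
  - sleep
  - "3600"
# A list of maps: every dash starts a new map
containers:
  - name: nginx
    image: nginx
  - name: sidecar
    image: busybox
```

The quotes around "3600" keep that item a string. Without them, it would be parsed as an integer.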
To make a list, you start each item on a new line with a dash followed by a space. Lists can be of any length, and their items can be strings, integers, booleans, and even maps.
Now that you know the two core concepts of any .yaml file, there’s one last important thing to know: indentation. Because YAML aims to be human-readable, it relies on indentation rather than brackets to express structure. In the nested map example, the nesting was denoted by indenting the inner map by two spaces. It’s very important that you keep your indentation in order, as a misplaced space can change the meaning of a file or make it invalid, and such errors can be tough to troubleshoot.
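To illustrate why this matters, here is a small sketch with illustrative names. The two documents differ only in how far labels is indented, yet they describe different structures:

```yaml
# labels is indented under metadata, so it is part of the metadata map
metadata:
  name: nginx
  labels:
    app: nginx
---
# labels starts at the first column, so it becomes a top-level key next to metadata
metadata:
  name: nginx
labels:
  app: nginx
```

Both documents are valid YAML, which is exactly why this kind of mistake is easy to miss.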
Representing Kubernetes objects with YAML (deployment example)
You may be used to creating Pods, Deployments, Services, etc. in Kubernetes via the kubectl create command. This way of creating objects is valid and great for learning purposes. However, when running Kubernetes in production you often want to have all your objects defined as .yaml files. This makes it easier for others to know what’s running in the cluster, and allows your deployments to be version controlled.
Kubernetes makes it easy to see how any existing object is defined in YAML. When you run kubectl get, add the flag -o yaml. This will output the objects in YAML, rather than the typical list view. Try running kubectl create deployment nginx --image=nginx. This will create a deployment that you can now view by running kubectl get deployment nginx -o yaml. You’ll see a lot of lines being printed, displaying everything there is to know about the nginx deployment. Thankfully, when writing your own .yaml files you don’t need to write all the lines you see printed in your terminal, because kubectl get also shows the fields that Kubernetes generates automatically. When you remove those, you get the following configuration file:
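A sketch of roughly what remains for the nginx deployment created above, assuming the defaults that kubectl create deployment applies (an app: nginx label and a single replica); the exact fields you keep may differ:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  # The selector must match the labels on the Pod template below
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # containers is a list of maps, one per container
      containers:
      - name: nginx
        image: nginx
```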
While this can seem like a lot if you’re used to only running kubectl create commands, it’s recommended that you start looking into configuration definitions. This will not only make you more knowledgeable about Kubernetes in general, but will also help you keep your deployments defined as code.
If you don’t want to go through the process of creating an object, viewing it, and trimming it down, there’s another option built into kubectl. For example, if you want to know what a Pod definition looks like, you can run kubectl run nginx --image=nginx --dry-run=client -o yaml. The --dry-run=client part is typically used to validate a create command, since it builds the object without sending it to the cluster. The -o yaml part makes kubectl print the object as YAML, giving you the following output:
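The output looks approximately like the sketch below. Treat it as indicative rather than exact, since some fields (for example creationTimestamp) vary between kubectl versions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
```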
Comparing these two definitions, you can see that they share some fields, like kind. Such shared fields are required, whereas other fields are optional. Which fields are required depends on what type of object you are creating.
Required and important optional fields
As noted before, some required fields have to be set in all configuration files: apiVersion, kind, metadata, and spec all have to be set. You can read more about them in the Kubernetes documentation.
Other than these, some fields are generally accepted as best practice to have. A common field to set is metadata. When you’re going to create a new object, take a look at some examples online, as these important optional fields typically vary depending on the type of object.
One of the most commonly used optional fields that affects how your service runs is annotations, which are set under metadata. Annotations are read by other tools, such as operators and agents. One example is the Datadog Agent, which is used for logging and comprehensive Kubernetes monitoring. The Datadog Agent looks for specific annotations to figure out which deployments it should scan for logs.
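As a sketch of where annotations live, the snippet below uses a hypothetical key, example.com/collect-logs. It is a placeholder for illustration and not an annotation that the Datadog Agent actually reads, so check the documentation of your tool for the real keys:

```yaml
metadata:
  name: nginx
  annotations:
    # Hypothetical key for illustration only.
    # Annotation values must be strings, so true is quoted here;
    # unquoted, YAML would parse it as a boolean.
    example.com/collect-logs: "true"
```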
YAML best practices
Now that you know the basics of YAML, you are ready to start writing your own configurations. However, as with anything in software, there are some best practices you should follow.
First, use spaces rather than tabs. Tabs versus spaces is a hot topic among developers, but in YAML it’s not a matter of opinion: the YAML specification does not allow tab characters for indentation, so parsers will fail on a file indented with tabs.
Another best practice has more to do with making your life easier as an engineer. Use a monospaced font when viewing and editing .yaml files. This makes it a lot easier to spot any errors in indentation. Taking it a step further, you can configure your text editor or IDE, or install a plugin for it, to visually show spaces and tabs.
Finally, use as little indentation as possible. While keys and values on the same level must be indented the same amount, YAML isn’t too picky about how much they’re indented. These two files are both equally valid:
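As a sketch with illustrative keys, the first example below uses two spaces per level and the second uses six. Both parse to the same structure:

```yaml
# First example: two spaces per level
metadata:
  name: nginx
  labels:
    app: nginx
---
# Second example: six spaces per level, equally valid but harder to scan
metadata:
      name: nginx
      labels:
            app: nginx
```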
With a small example like this, indentation may seem insignificant, but once you start nesting many maps and lists, it can become tough to manage. Keeping your indentation to a minimum in width can help quite a bit with readability. If you do choose to indent according to the second example, make sure you’re consistent. While different indentation levels can be mixed in a single file, it will quickly become very hard to read if it’s not at least consistent.
You now know a bit more about how YAML plays into Kubernetes as a whole, and you can start writing your own .yaml files. You are now able to define your configurations more comprehensively, share them with others, and version control them. This also helps make them more replicable, as many tools exist that let you build on configuration files, like Kustomize and Helm.
Time to go ahead and take a look at how you can best implement YAML into your workflow. Remember to use spaces for indentation, and remember: less is more.