Deploying Ansible AWX (or any other app, probably!) on Kubernetes From Day 0

I’ve been looking recently at learning Ansible (bit late to the party, I know!) after I attended a one-day training course. During the course I learned about the core Ansible Engine and then spent some time on Ansible Tower. Ansible AWX is the upstream open source project of Red Hat’s Tower, so I figured I’d start out by deploying an instance of AWX in my lab.

Introduction

I’m pretty much a complete noob when it comes to Kubernetes! I’ve had a little exposure during work on my contributions to the VMware Event Broker Appliance (VEBA) project. I also had a go at setting up Tanzu in my lab fairly recently, but I have yet to deploy a full K8s stack and application of my own. So when I read the AWX install guide and saw that the current preferred installation method is to use a Kubernetes Operator, I figured I should finally get involved with K8s properly!

Note - I'm deploying AWX in this guide but the principles will apply to any Kubernetes application deployed using an operator/helm/etc. I've intentionally left all my mistakes and learnings in, and I'll guide you through how I identified and worked past each issue. Hopefully this will help you get off the ground with Kubernetes too. Let me know in the comments if this was helpful!

First things first, I decided that for some of the things I’d like to automate with Ansible in my lab (a full lab shutdown, for example) I really needed my Kubernetes cluster to be a single, lightweight node that would run on my Mac Mini ESXi host or even just a Raspberry Pi. The AWX install guide suggested using minikube, which sounded ideal. However, I fairly quickly learned that minikube is aimed at development environments running on a user workstation. As a result, exposing the AWX UI over my network wasn’t easy, and I’d have had to hack together a systemd unit to launch minikube after a reboot.

After a bit of research I decided that a single node K8s cluster built using kubeadm was probably the best option. I’ve seen Kubernetes used in this way in a number of VMware appliances (including VEBA), so I started by digging into the VEBA source code to figure out the steps needed to get my single node K8s stack up.

So here are the steps I took to get AWX running. It wasn’t plain sailing for me but I learned a lot along the way, so I’ll explain what I learned as we go through the build below.

Deploy the VM

First, let’s deploy a Linux VM – this guide should work using most popular flavours of Linux but I chose to use Photon OS in the end as it’s ideally suited to running containerised workloads, and there’s a downloadable OVA template ready to go too. I built my VM with 2 vCPUs and 4GB of RAM (I may have to increase this in the future but it’s fine for now at least). Importantly I also added an extra 20 GB disk for local storage for AWX (more on this later).

VM settings

Once you have your VM up and running, let’s sort out the normal ‘new VM’ stuff and deal with some prerequisites (I’ve included links/details that work for Photon OS – other OSes may vary!):

  • Change the default root password (‘changeme’) using the VM console
  • Configure a static IP or DHCP reservation – I chose to use a DHCP reservation as I tend to prefer working that way in my lab. A static address can be set in Photon OS by following this guide
  • Set the hostname with hostnamectl set-hostname awx.lab.fqdn
  • Configure NTP:
    • vi /etc/systemd/timesyncd.conf
    • Add NTP server addresses under the [Time] section
    • systemctl restart systemd-timesyncd
  • Partition and mount storage. As I added a dedicated disk for local storage above, I simply need to partition the entire device:
    • fdisk -l
    • fdisk /dev/sdb
      • g (create a GPT partition table)
      • n (create a new partition – hit enter 3 times to accept all the defaults)
      • p (to print the new disk layout for you to check)
      • w (to write the partition table and exit)
    • mkfs -t ext3 /dev/sdb1 (format the new partition as ext3)
    • mkdir /mnt/local-storage (create a mount point)
    • echo "/dev/sdb1 /mnt/local-storage ext3 defaults 0 0" >> /etc/fstab (add the mount definition)
    • mount -a (mount the volume)
    • df -h (confirm the mount and size is 20G)
  • Apply all the latest OS updates:
    • tdnf update
  • Install kubeadm plus some handy utilities:
    • tdnf install kubernetes-kubeadm wget less
  • Disable the firewall (or create suitable rules – I’m not covering that here though)
    • systemctl stop iptables
    • systemctl disable iptables

At this point we should be ready to start the AWX install, so give your machine a reboot to ensure everything comes back cleanly then let’s get started.

Deploy the Kubernetes Stack

First up we need to enable and start the docker daemon:

[root@awx ~]# systemctl enable docker
[root@awx ~]# systemctl start docker

Then we can initialize our Kubernetes Control Plane ready for use. Thankfully there’s a handy tool for this job called kubeadm. You can just run kubeadm init and you’ll get a Kubernetes Control Plane built, but we might like a little more control than that, such as defining which version of Kubernetes to install and choosing the IP subnet to use for the internal Kubernetes network (which is an overlay network so sits on top of your standard network).

Like it or not, in Kubernetes all config like this is done using YAML files. To initialise the Control Plane we’ll need a YAML file called kubeconfig.yml that looks something like this:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
networking:
  podSubnet: 10.16.0.0/16
  serviceSubnet: 10.96.0.0/12

This config file tells kubeadm to build us a control plane using Kubernetes version 1.19.0, to use the 10.16.0.0/16 subnet for pods to communicate on, and 10.96.0.0/12 for services (more on this later). You need to make sure that these subnets don’t overlap with any you already use on your network (I use 192.168.0.0/16 for all my home and lab networking).

We can then run kubeadm init to set up our control plane like so:

[root@awx ~]# kubeadm init --ignore-preflight-errors SystemVerification --skip-token-print --config kubeconfig.yml

This will take a few minutes on a new system while kubeadm downloads the images needed and gets everything set up for you. Once it is complete you’ll need to run the following to be able to connect to and manage your new Kubernetes stack:

[root@awx ~]# mkdir -p $HOME/.kube
[root@awx ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@awx ~]# chown $(id -u):$(id -g) $HOME/.kube/config

Now run the following and check the status of all the pods. After a short while you should end up with 7 pods, all of them in the Running status except the two ‘coredns’ pods:

[user@computer ~]$ kubectl get pods -A
NAMESPACE     NAME                                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-f9fd979d6-7xshl                              0/1     Pending   0          4d22h
kube-system   coredns-f9fd979d6-k7qtl                              0/1     Pending   0          4d22h
kube-system   etcd-awx01.lab.core.pilue.co.uk                      1/1     Running   0          4d22h
kube-system   kube-apiserver-awx01.lab.core.pilue.co.uk            1/1     Running   0          4d22h
kube-system   kube-controller-manager-awx01.lab.core.pilue.co.uk   1/1     Running   0          4d22h
kube-system   kube-proxy-sc5fs                                     1/1     Running   0          4d22h
kube-system   kube-scheduler-awx01.lab.core.pilue.co.uk            1/1     Running   0          4d22h

Debugging Pending Pods

The ‘coredns’ pods however are stuck in the Pending status. When this happens it is usually due to a scheduling constraint of some kind (insufficient CPU or memory, taints, and so on) preventing the scheduler from placing the pod. We can find out more about the reason by running the following:

[root@awx ~]# kubectl describe pod coredns-f9fd979d6-7xshl -n kube-system
...
Events:
  Type     Reason            Age                      From               Message
  ----     ------            ----                     ----               -------
  Warning  FailedScheduling  102s (x4737 over 4d22h)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

This returns a lot of information about the pod (most of which I have omitted above) including, right at the very end of the output, the reason the pod is stuck in the Pending state. It seems that our single node isn’t “ready”, and it mentions a “taint”, so let’s look at the node to find out more:

[root@awx ~]# kubectl describe nodes

Again, loads of information, and this time nothing untoward in the ‘Events’ section at the end of the output. If you scroll up to somewhere near the top however you’ll find the ‘Taints’ section and just below that the ‘Conditions’ section.

...
Taints:             node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
...
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Tue, 04 May 2021 19:42:27 +0000   Tue, 04 May 2021 19:42:02 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 04 May 2021 19:42:27 +0000   Tue, 04 May 2021 19:42:02 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Tue, 04 May 2021 19:42:27 +0000   Tue, 04 May 2021 19:42:02 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Tue, 04 May 2021 19:42:27 +0000   Tue, 04 May 2021 19:42:02 +0000   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
...

Here we see two taints, one of which is the one referred to in our coredns pod above (“node.kubernetes.io/not-ready”), and the other is “node-role.kubernetes.io/master:NoSchedule”. This tells the scheduler that this is the Kubernetes master node and that no regular workloads should be run on it. But in our single node stack we need all our pods to run on this one node, so we can fix this bit by simply removing the taint:

[root@awx ~]# kubectl taint nodes --all node-role.kubernetes.io/master-

All very well and good, but what about the “not-ready” bit, I hear you ask? The answer to that is in the Conditions section. This details resource “pressure” on the node – all report OK in this case, except the last line, which tells us that the “network plugin is not ready: cni config uninitialized”.

Container Networking and CNI

CNI stands for Container Network Interface, a specification for configuring networking for containers, which Kubernetes uses for its pod networking. We must install a suitable CNI plugin in order for our pods to be able to communicate with each other. There are a number of options here with varying features, however after a messy learning curve I settled on the Antrea plugin from VMware. To install the CNI plugin we need a YAML config file again, but this time we can download it from GitHub:

[root@awx ~]# wget https://github.com/vmware-tanzu/antrea/releases/download/v1.0.0/antrea.yml -O /root/antrea.yml
[root@awx ~]# kubectl apply -f /root/antrea.yml

This bit takes quite a while (nearly 5 minutes in my lab) to pull the required images and create the pods. If you keep an eye on the pod status you’ll eventually see the 2 Antrea pods, and shortly afterwards the 2 coredns pods, move to the Running status:

[root@awx ~]# kubectl get pods -A
NAMESPACE     NAME                                                 READY   STATUS    RESTARTS   AGE
kube-system   antrea-agent-rq5vd                                   2/2     Running   0          7m4s
kube-system   antrea-controller-7747fbf7ff-m84cq                   1/1     Running   0          7m5s
kube-system   coredns-f9fd979d6-4vzmm                              1/1     Running   0          48m
kube-system   coredns-f9fd979d6-h5fg5                              1/1     Running   0          48m
kube-system   etcd-awx01.lab.core.pilue.co.uk                      1/1     Running   0          48m
kube-system   kube-apiserver-awx01.lab.core.pilue.co.uk            1/1     Running   0          48m
kube-system   kube-controller-manager-awx01.lab.core.pilue.co.uk   1/1     Running   0          48m
kube-system   kube-proxy-qfbfw                                     1/1     Running   0          48m
kube-system   kube-scheduler-awx01.lab.core.pilue.co.uk            1/1     Running   0          48m

Storage

Some basics first. For AWX to keep its database we need to define some persistent storage. In Kubernetes this is provided by a Persistent Volume (PV), which is typically created in response to a Persistent Volume Claim (PVC) – a request for storage that meets certain specifications. How and where that storage gets provisioned is defined by a Storage Class (SC).
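
To make that a bit more concrete, here’s a minimal illustrative PVC – the name and size are made up for the example and aren’t part of the AWX deployment – showing how a claim asks for storage from a particular Storage Class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc            # illustrative name only
spec:
  accessModes:
    - ReadWriteOnce            # a single node can mount the volume read-write
  storageClassName: local-path # the Storage Class we create below
  resources:
    requests:
      storage: 8Gi             # how much space we're asking for

We won’t apply this ourselves – the AWX Operator creates its own PVC for its postgres database later on – but it’s worth recognising the shape of one.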

Persistent data in a traditional Kubernetes multi-node cluster needs to be held on shared storage of some kind so that pods can run on any node. But for our single node Kubernetes stack we can use a local partition to hold our data. Rancher Labs have a local path provisioner YAML we can use for this. So let’s grab it and modify the YAML to point at the mount point we created earlier:

[root@awx ~]# wget https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.19/deploy/local-path-storage.yaml -O /root/local-path-storage.yaml
[root@awx ~]# sed -i "s#/opt/local-path-provisioner#/mnt/local-storage#g" /root/local-path-storage.yaml
[root@awx ~]# kubectl apply -f /root/local-path-storage.yaml

Now, if we run kubectl get sc we should see our new Storage Class, which the AWX application should be able to use for the Persistent Volume Claim it will create.

[root@awx ~]# kubectl get sc
NAME         PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  50s

The AWX Operator

At the time of writing the current preferred way to install AWX is using a Kubernetes Operator, which is (roughly speaking) a fancy name for an application-specific installer and lifecycle manager! First we have to deploy the operator itself, which can be done with a single line:

[root@awx ~]# kubectl apply -f https://raw.githubusercontent.com/ansible/awx-operator/0.9.0/deploy/awx-operator.yaml

Then to deploy AWX we need to create a basic YAML file called awx.yml to define the config of our deployment:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx

…then apply this config file and monitor the pods getting created:

[root@awx ~]# kubectl apply -f awx.yml && watch kubectl get pods

After a couple of minutes you should see the AWX operator, postgres and awx pods created:

NAME                          READY   STATUS    RESTARTS   AGE
awx-b5f6cf4d4-fpjwk           0/4     Pending   0          34s
awx-operator-f768499d-bqxx5   1/1     Running   0          13m
awx-postgres-0                0/1     Pending   0          46s

However, once again we have two pods stuck in the Pending state. Let’s look at the postgres pod to start with as it’s the first created:

[root@awx ~]# kubectl describe pods awx-postgres-0
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  9h    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

This time the pod isn’t getting scheduled because it “has unbound immediate PersistentVolumeClaims”. What does this mean? Let’s take a look at the Persistent Volume Claim to find out:

[root@awx ~]# kubectl describe pvc
Name:          postgres-awx-postgres-0
Namespace:     default
StorageClass:
Status:        Pending
Volume:
Labels:        app.kubernetes.io/component=database
               app.kubernetes.io/managed-by=awx-operator
               app.kubernetes.io/name=awx-postgres
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:    awx-postgres-0
Events:
  Type    Reason         Age                    From                         Message
  ----    ------         ----                   ----                         -------
  Normal  FailedBinding  4m38s (x2301 over 9h)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

It says it couldn’t find an existing Persistent Volume to fulfil the claim (which is a request for a volume) and that no Storage Class is set. If you check higher up in the output the StorageClass entry is indeed empty. But wait a moment – didn’t we create a storage class above? Well, yes we did, however after some research I found that if an application doesn’t request a specific Storage Class then the Kubernetes cluster must have a ‘default’ storage class, which will be used for dynamic provisioning instead.

Setting the Default Storage Class

So let’s set our ‘local-path’ Storage Class as the default:

[root@awx ~]# kubectl patch sc local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
[root@awx ~]# kubectl get sc
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  9h

Now, it turns out that if we’d done this to start with, our Persistent Volume Claim would have picked up the default Storage Class automatically. However, once a PVC has been created it won’t pick up the new default retrospectively. So we’ll need to export the current PVC’s config, delete the PVC, then recreate it:

[root@awx ~]# kubectl get pvc postgres-awx-postgres-0 -o=yaml > awx-pvc.yml
[root@awx ~]# kubectl delete pvc postgres-awx-postgres-0
[root@awx ~]# kubectl apply -f awx-pvc.yml

This time, if you look at the PVC you will see it has a storage class of local-path and a status of Bound. Equally, looking at our pods, we should find that the postgres pod is now running:

[root@awx ~]# kubectl get pvc
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
postgres-awx-postgres-0   Bound    pvc-7cefae87-e379-492f-854c-8cb393f85b65   8Gi        RWO            local-path     7s
root@awx01 [ ~ ]# kubectl get po
NAME                          READY   STATUS    RESTARTS   AGE
awx-b5f6cf4d4-fpjwk           0/4     Pending   0          20m
awx-operator-f768499d-bqxx5   1/1     Running   0          43h
awx-postgres-0                1/1     Running   0          20m

So, our final challenge is to figure out why the awx pod is stuck Pending – by now you know what to do:

[root@awx ~]# kubectl describe po awx-b5f6cf4d4-fpjwk
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  143m  default-scheduler  0/1 nodes are available: 1 Insufficient cpu.

Apparently we’ve run out of CPU! Or rather, this pod is requesting more CPU than our cluster can guarantee. Each container in a pod can have a CPU request and a CPU limit set. The request defines how much CPU Kubernetes should reserve for the container (i.e. it is a guarantee of a minimum amount of available CPU). The limit defines the maximum amount of CPU the container is allowed to use.

Resource Management

In Kubernetes, CPU requests and limits are defined in terms of numbers of CPU cores or fractions thereof. The total amount of CPU available is the number of cores (vCPUs, in the case of our VM) across the cluster’s nodes. So a request value of 1 would reserve an entire CPU core, a request of ‘500m’ means half a core (think 500 milli-cores!), and a request of ‘100m’ is 0.1 (or 10%) of a core.
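
As an illustration (this pod isn’t part of the AWX deployment – the name and image are placeholders), requests and limits are declared per container in a pod spec like this:

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo          # placeholder name
spec:
  containers:
    - name: app
      image: nginx             # placeholder image
      resources:
        requests:
          cpu: 500m            # reserve half a core for this container
          memory: 256Mi
        limits:
          cpu: "1"             # allow bursting up to one full core
          memory: 512Mi

The scheduler adds up the requests of everything it has already placed on a node and refuses to schedule any pod whose requests would push that total beyond the node’s capacity – which is exactly what’s happening to our awx pod.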

If we look at the node again we can see the total CPU requests it has currently allocated:

[root@awx ~]# kubectl describe node
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                1350m (67%)  0 (0%)
  memory             140Mi (3%)   340Mi (8%)
  ephemeral-storage  0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)
Events:              <none>

Just above the section I’ve pasted, the output breaks the requests down pod by pod so you can see if any could be reduced. Potentially we could reduce the requests of the antrea-agent pod (it has a CPU request of 400m), but I’m loath to adjust the core bits and pieces that make up the cluster.

So what else can we do? Well, we could just give the node more CPU cores in our VM – that would definitely fix it, but I started out wanting this VM to be lightweight enough to run on my Mac Mini ESXi host (which only has a dual-core i5 CPU), so that’s not really an option for me. Instead, let’s see what the awx pod has set in terms of CPU requests:

[user@computer ~]$ kubectl describe po awx-b5f6cf4d4-fpjwk | grep cpu
      cpu:     1
      cpu:     500m

So it looks like two of the containers in the pod have CPU requests set, which together come to 1.5 CPU cores (or ‘1500m’). We already saw above that the current requests on the node total 1350m, and 1500m + 1350m is definitely more than the 2000m (2 CPU cores) our node has!

Given that these numbers are guaranteed minimums, not how much CPU our lightweight cluster will actually need, I’m prepared to drop the requested CPU amounts for the awx pod. But how? Well, we could manually edit the Kubernetes objects, but according to the documentation for the AWX Operator we can set the resource requests in our awx.yml file, so let’s do that. Open up the YAML file in your text editor and modify it to look like the below:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  tower_web_resource_requirements:
    requests:
      cpu: 400m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  tower_task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi

Now we can simply apply the YAML again and let Kubernetes do its thing:

[root@awx ~]# kubectl apply -f awx.yml && watch kubectl get po

After a few minutes you’ll see that a new awx pod is created and started, and the Pending one is deleted.

[root@awx ~]# kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
awx-c489b5569-k22fl           4/4     Running   0          25s
awx-operator-f768499d-bqxx5   1/1     Running   0          2d15h
awx-postgres-0                1/1     Running   0          14m

So, finally it looks like everything is running. Time to test it out by browsing to the IP address of my VM from my laptop:

Oh! Well that’s not what I expected… more digging to do!

Ingress

After some reading I learned that Kubernetes needs to “expose” an application in order for it to be accessible externally. This is done using a Kubernetes Service, of which there are several types: ClusterIP, NodePort, and LoadBalancer (there’s also the separate Ingress resource, which we’ll come to shortly). ClusterIP is the simplest and is the default, but it only exposes the application internally within the cluster. This is the default state with AWX, as we can see if we take a look at our AWX services:

[root@awx ~]# kubectl get services
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
awx-operator-metrics   ClusterIP   10.107.215.113   <none>        8383/TCP,8686/TCP   6d21h
awx-postgres           ClusterIP   None             <none>        5432/TCP            24h
awx-service            ClusterIP   10.98.101.248    <none>        80/TCP              24h
kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP             6d23h

If we look at the awx-service line, it tells us that the service is of type ClusterIP, that the internal IP for the service is 10.98.101.248, and that it’s available on port 80 internally. For our single node cluster perhaps the easiest way to get access externally is to change the service to the NodePort type. The AWX Operator gives us a way to do this, so change your awx.yml file to match the below and reapply it:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  tower_ingress_type: NodePort
  tower_web_resource_requirements:
    requests:
      cpu: 400m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  tower_task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi

[root@awx ~]# kubectl apply -f awx.yml && watch kubectl get svc

After a short period, the awx-service should change to type NodePort and gain an extra port number:

[root@awx ~]# kubectl get svc
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
awx-operator-metrics   ClusterIP   10.107.215.113   <none>        8383/TCP,8686/TCP   6d22h
awx-postgres           ClusterIP   None             <none>        5432/TCP            24h
awx-service            NodePort    10.98.101.248    <none>        80:32441/TCP        24h
kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP             7d

It now says that the service is available on port 80 internally, and port 32441 externally. If we navigate to http://<vm ip>:32441 you should get the AWX login screen:

Hurrah! But it’d be nicer if it was on port 80, wouldn’t it? We can do this in a number of ways too, but for our single node cluster the simplest I found to get working was an Ingress Controller, and specifically I had the most success with the Contour Ingress Controller. More YAML and kubectl:

[root@awx ~]# wget https://projectcontour.io/quickstart/contour.yaml -O /root/contour.yaml
[root@awx ~]# kubectl apply -f contour.yaml

Again, after a short wait you should see 4 more pods of which 3 will move to the Running status and one will show as Completed (which is expected!):

[root@awx ~]# kubectl get pods -n projectcontour
NAMESPACE        NAME                                                 READY   STATUS      RESTARTS   AGE
projectcontour   contour-8d5c9679b-c8hzz                              1/1     Running     0          2m1s
projectcontour   contour-8d5c9679b-kg4cp                              1/1     Running     0          2m1s
projectcontour   contour-certgen-v1.15.0-fmfxn                        0/1     Completed   0          2m1s
projectcontour   envoy-5jc5d                                          2/2     Running     0          2m1s

Now let’s tweak our awx.yml to use an Ingress and reapply it.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  tower_ingress_type: Ingress
  tower_hostname: awx.lab.core.pilue.co.uk
  tower_web_resource_requirements:
    requests:
      cpu: 400m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  tower_task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi

[root@awx ~]# kubectl apply -f awx.yml && watch kubectl get po,svc,ing

After another wait of a minute or so you should get an Ingress configured and displayed in the output. Finally things should look something like the below, and you should be able to access the AWX UI on the standard port 80 in your browser!

[root@awx ~]# kubectl get po,svc,ing
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
NAME                                READY   STATUS    RESTARTS   AGE
pod/awx-847cc945d4-4h98q            4/4     Running   0          21m
pod/awx-operator-5595d6fc57-qz8cx   1/1     Running   0          2d5h
pod/awx-postgres-0                  1/1     Running   0          28m

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/awx-operator-metrics   ClusterIP   10.107.215.113   <none>        8383/TCP,8686/TCP   7d22h
service/awx-postgres           ClusterIP   None             <none>        5432/TCP            28m
service/awx-service            NodePort    10.96.0.118      <none>        80:30374/TCP        27m
service/kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP             8d

NAME                             CLASS    HOSTS                        ADDRESS   PORTS   AGE
ingress.extensions/awx-ingress   <none>   awx.lab.core.pilue.co.uk             80      21m
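
For reference, an Ingress resource is essentially a mapping from a hostname to a backend service. We don’t need to write one ourselves – the operator created it for us – but a hand-written equivalent using the newer networking.k8s.io/v1 API (the warning above shows ours was created via the older, deprecated extensions/v1beta1 API) would look something like this; the exact manifest the operator generates may differ:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: awx-ingress
spec:
  rules:
    - host: awx.lab.core.pilue.co.uk     # requests for this hostname...
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: awx-service        # ...are routed to the AWX service
                port:
                  number: 80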

HTTPS and Secrets

Just port 80 isn’t really good enough these days though, is it!? So how do we get HTTPS configured? Well, there are two parts to this. The documentation for the AWX Operator says that to configure TLS we need to define the tower_ingress_tls_secret variable in our awx.yml. But what do we define as its value? This is where we need the name of a “secret” that is available in our Kubernetes cluster. In Kubernetes, a “secret” is a way of securely storing confidential information, such as a password, or in this case the certificate and its private key.
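
For context, a TLS secret is just another Kubernetes object, with the certificate and key held as base64-encoded data. Under the hood, the secret we’re about to create will look roughly like this (the angle-bracket values are placeholders, not literal content):

apiVersion: v1
kind: Secret
metadata:
  name: awx-ingress-tls
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>   # placeholder
  tls.key: <base64-encoded private key>   # placeholder

Rather than writing this by hand, we’ll let kubectl build it for us from the certificate files.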

So first we need to make a certificate. Here I generate a self-signed certificate for simplicity, but this will work just as well with a CA-signed certificate.

[root@awx ~]# openssl req -x509 -nodes -days 365 -newkey rsa:2048 -out ingress-tls.crt -keyout ingress-tls.key -subj "/CN=awx.lab.core.pilue.co.uk/O=awx-ingress-tls"

Now we need to store the certificate and private key in a Kubernetes secret:

[root@awx ~]# kubectl create secret tls awx-ingress-tls --key ingress-tls.key --cert ingress-tls.crt

Now we can modify our awx.yml to include the above variable and secret name, then apply it:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  tower_ingress_type: Ingress
  tower_hostname: awx.lab.core.pilue.co.uk
  tower_ingress_tls_secret: awx-ingress-tls
  tower_web_resource_requirements:
    requests:
      cpu: 400m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  tower_task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi

[root@awx ~]# kubectl apply -f awx.yml

After a minute or so the ingress will include port 443, and you should be able to get at your AWX login screen using HTTPS!

[root@awx ~]# kubectl get ing,po,svc
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
NAME                             CLASS    HOSTS                        ADDRESS   PORTS     AGE
ingress.extensions/awx-ingress   <none>   awx01.lab.core.pilue.co.uk             80, 443   25h

NAME                                READY   STATUS    RESTARTS   AGE
pod/awx-847cc945d4-9djmm            4/4     Running   0          4m3s
pod/awx-operator-5595d6fc57-qz8cx   1/1     Running   0          3d6h
pod/awx-postgres-0                  1/1     Running   0          25h

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/awx-operator-metrics   ClusterIP   10.107.215.113   <none>        8383/TCP,8686/TCP   8d
service/awx-postgres           ClusterIP   None             <none>        5432/TCP            25h
service/awx-service            NodePort    10.96.0.118      <none>        80:30374/TCP        25h
service/kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP             9d

And Finally – Logging In!

Last but not least, we need to know how to log in to our new AWX install. Not surprisingly this information is held in a secret, but it is encoded as base64 so needs decoding to be of use. Here’s how to quickly get the admin password out of the secret:

[root@awx ~]# kubectl get secret awx-admin-password -o jsonpath='{.data.password}' | base64 --decode
jOK0ccKmCoNqeLt6iwpq5FS0Igenkd6u

Once you’ve logged in, you can change the password to one of your choosing. This will be held in the postgres database as normal, so it will be persisted from now on.

And that’s it. I hope you’ve found the layout of this post helpful in starting to learn the ins and outs of Kubernetes and how to deploy an app successfully. And I hope I got everything right too! I’m far from an expert in K8s and am right at the beginning of my journey with it so there’s a good chance there are inaccuracies above. Do let me know in the comments if you spot anything that should be corrected!

3 thoughts on “Deploying Ansible AWX (or any other app, probably!) on Kubernetes From Day 0”

    1. Thanks for your comment – glad this post was helpful!

      External DB config is covered in the official docs by the look of it (https://github.com/ansible/awx-operator#external-postgresql-service) – assuming you have access to a PostgreSQL DB instance then you should be able to populate the example yaml file linked above and apply it (although I’ve not tried this so there may be other steps!)

      As for data migration, looking at the docs link you copied, it looks similar in that you create 2 Kubernetes secrets using the templates in the documentation for access to the previous AWX instance, then add refs to those secrets in to your awx.yaml

      I’ll see if I can update this post with an example in the coming days.

  1. Oh my god, thank you for this. You encountered all of the same hiccups I did in my lab k8s cluster. I learned a ton from this article. Bless you 😀
