Using MetalLB with Kind (Linux)

Preamble

When using MetalLB with kind we are going to deploy it in layer 2 mode. This means that we need to be able to reach the IP addresses of the node subnet. If you are using Linux to host the kind cluster, no extra network plumbing is needed: the kind node IP addresses live on a docker bridge that is directly attached to the host.

Install Kind

Grab a release from https://github.com/kubernetes-sigs/kind/releases:

curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64
chmod +x ./kind
mv ./kind /some-dir-in-your-PATH/kind

And grab a matching kubectl:

curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.19.0/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
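
A quick check that both tools landed on your PATH:

❯ kind version
❯ kubectl version --client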

Problem Statement

Kubernetes on bare metal doesn’t come with a built-in implementation for services of type LoadBalancer.

This mechanism is used to expose services inside the cluster through an external load balancer that understands how to route traffic down to the pods defined by that service.

Most implementations of this are relatively naive. They place all of the available nodes behind the load balancer and use TCP health checks against the service’s NodePort to determine whether a node is “healthy” enough to forward traffic to.

You can define an externalTrafficPolicy on a service of type LoadBalancer, and this can help get the behaviour that you want. From the docs:

$ kubectl explain service.spec.externalTrafficPolicy
KIND:     Service
VERSION:  v1

FIELD:    externalTrafficPolicy <string>

DESCRIPTION:
     externalTrafficPolicy denotes if this Service desires to route external
     traffic to node-local or cluster-wide endpoints. "Local" preserves the
     client source IP and avoids a second hop for LoadBalancer and Nodeport type
     services, but risks potentially imbalanced traffic spreading. "Cluster"
     obscures the client source IP and may cause a second hop to another node,
     but should have good overall load-spreading.
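
For reference, here is a minimal sketch of a Service that sets this field; the name, selector, and port are placeholders for illustration:

apiVersion: v1
kind: Service
metadata:
  name: echo                    # placeholder name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local  # or "Cluster", the default
  selector:
    app: echo                   # placeholder selector
  ports:
  - port: 8080
    targetPort: 8080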
     

And MetalLB has a decent write-up on what it does when you configure these policies:

https://metallb.universe.tf/usage/#traffic-policies

With MetalLB there is a different set of assumptions.

MetalLB can operate in two distinct modes.

A layer 2 mode, in which one elected node answers ARP requests (NDP for IPv6) for the external IP, or VIP, on the LAN. This means that all traffic for the service will be attracted to only one node and dispersed across the pods defined by the service from there.

A BGP mode, in which, with externalTrafficPolicy: Local, MetalLB will announce the external IP or VIP from all of the nodes where at least one pod for the service is running.

The BGP mode relies on ECMP to balance traffic back to the pods. ECMP is a great solution for this problem and I HIGHLY recommend you use this model if you can.
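
For comparison, a BGP-mode config for MetalLB v0.9 looks roughly like this; the peer address and ASNs below are made-up placeholders:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.0.0.1   # placeholder: your upstream BGP router
      peer-asn: 64501          # placeholder router ASN
      my-asn: 64500            # placeholder MetalLB ASN
    address-pools:
    - name: default
      protocol: bgp
      addresses:
      - 192.168.10.0/24        # placeholder pool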

That said, I haven’t created a BGP router for my kind cluster, so we will use the layer 2 mode for this experiment.

Let’s do this thing!

First let’s bring up a two-node kind cluster with the following config.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker

Save that as config and create the cluster:

❯ kind create cluster --config config
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.19.1) 🖼 
 ✓ Preparing nodes 📦 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
 ✓ Joining worker nodes 🚜 
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Have a nice day! 👋

Then we need to check that we can ping the nodes’ IP addresses from the host.

❯ kubectl config get-contexts
CURRENT   NAME                                                  CLUSTER                                               AUTHINFO                                              NAMESPACE 
*         kind-kind                                             kind-kind                                             kind-kind                                   

❯ kubectl get nodes
NAME                 STATUS   ROLES    AGE   VERSION
kind-control-plane   Ready    master   13m   v1.19.1
kind-worker          Ready    <none>   12m   v1.19.1  

❯ kubectl get nodes -o wide
NAME                 STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                     KERNEL-VERSION       CONTAINER-RUNTIME
kind-control-plane   Ready    master   14m   v1.19.1   172.18.0.2    <none>        Ubuntu Groovy Gorilla (development branch)   4.15.0-117-generic   containerd://1.4.0
kind-worker          Ready    <none>   13m   v1.19.1   172.18.0.3    <none>        Ubuntu Groovy Gorilla (development branch)   4.15.0-117-generic   containerd://1.4.0

❯ ping 172.18.0.2
PING 172.18.0.2 (172.18.0.2) 56(84) bytes of data.
64 bytes from 172.18.0.2: icmp_seq=1 ttl=64 time=0.053 ms
64 bytes from 172.18.0.2: icmp_seq=2 ttl=64 time=0.021 ms
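
The reason this works on Linux is that kind puts its nodes on a docker bridge, so the node subnet is a directly attached route on the host. You can see it in the routing table; output will look something like this, though the bridge interface name will differ per machine:

❯ ip route | grep 172.18
172.18.0.0/16 dev br-... proto kernel scope link src 172.18.0.1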

At this point we need to determine the subnet being used for the node IP pool. Recent kind releases attach the nodes to a dedicated docker network named “kind” (older releases used the default “bridge” network), so we can inspect that directly. Whichever network your nodes are on, its subnet should contain the INTERNAL-IP addresses we saw above.

I am using a pretty neat tool called jid here, which is a REPL for JSON.

❯ docker network inspect kind | jid

[Filter]> .[0].IPAM
{
  "Config": [
    {
      "Gateway": "172.18.0.1",
      "Subnet": "172.18.0.0/16"
    }
  ],
  "Driver": "default",
  "Options": null
}

So we can see that there is an allocated network of 172.18.0.0/16 in my case, matching the node INTERNAL-IPs.
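
If you don’t have jid handy, jq gets at the same field:

❯ docker network inspect kind | jq '.[0].IPAM.Config'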

Let’s swipe the top slice of that allocation (172.18.255.1 through 172.18.255.250, addresses kind is unlikely to hand out to nodes) and use it for the MetalLB configuration.

Now we are going to deploy a service!

First let’s create a service of type LoadBalancer and see what happens before we install MetalLB.

I am going to use the echo server for this. I prefer the one built by inanimate. Here is the source and image: inanimate/echo-server

❯ kubectl create deployment echo --image inanimate/echo-server --replicas=3 --port=8080
deployment.apps/echo created

❯ kubectl expose deployment echo --type=LoadBalancer
service/echo exposed

❯ kubectl get svc
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
echo         LoadBalancer   10.96.242.156   <pending>     8080:31987/TCP   43s
kubernetes   ClusterIP      10.96.0.1       <none>        443/TCP          71m

We can see that the EXTERNAL-IP field is pending. This is because there is nothing in the cluster that knows how to fulfil this type of service yet.
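
If you want to double-check that nothing is acting on the service, kubectl describe should show no load balancer events:

❯ kubectl describe svc echo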

Now on to the MetalLB part!

First read the docs https://metallb.universe.tf/installation/

Then we can get started on installing this to our cluster. Note that the strictARP change below is only required when kube-proxy is running in IPVS mode, but it is harmless to apply otherwise:

# see what changes would be made, returns nonzero returncode if different
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl diff -f - -n kube-system

# actually apply the changes, returns nonzero returncode on errors only
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl apply -f - -n kube-system

❯ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
namespace/metallb-system created

❯ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
podsecuritypolicy.policy/controller configured
podsecuritypolicy.policy/speaker configured
serviceaccount/controller created
serviceaccount/speaker created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller unchanged
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker unchanged
role.rbac.authorization.k8s.io/config-watcher created
role.rbac.authorization.k8s.io/pod-lister created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller unchanged
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker unchanged
rolebinding.rbac.authorization.k8s.io/config-watcher created
rolebinding.rbac.authorization.k8s.io/pod-lister created
daemonset.apps/speaker created
deployment.apps/controller created
   
❯ kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
secret/memberlist created

❯ kubectl get pods -n metallb-system -w
NAME                        READY   STATUS    RESTARTS   AGE
controller-fb659dc8-26w57   1/1     Running   0          119s
speaker-8mx4z               1/1     Running   0          119s
speaker-gvqnb               1/1     Running   0          119s
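
The speakers run as a DaemonSet, one per node, so with our two-node cluster we expect two of them:

❯ kubectl get daemonset -n metallb-system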

We can see that MetalLB is now installed, but we aren’t done yet!

Now we need to add a configuration that will use a few of the unused IP addresses from the node network (172.18.0.0/16).

If we look at our existing service, the EXTERNAL-IP is still pending.

This is because we haven’t yet applied the config for MetalLB; without an address pool it has nothing to allocate from.

Here is the config:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 172.18.255.1-172.18.255.250

You can apply this to your cluster with:

❯ kubectl apply -f km-config.yaml
configmap/config created

❯ kubectl get svc
NAME   TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
echo   LoadBalancer   10.96.242.156   172.18.255.1   8080:31987/TCP   24m
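
As a final sanity check, curl the external IP from the host; the echo server listens on 8080 and should reply with its diagnostic output:

❯ curl http://172.18.255.1:8080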