Using MetalLB with Kind (Linux)
Preamble
When using metallb with kind we are going to deploy it in L2 mode.
This means that we need to be able to connect to the IP addresses of the
node subnet. If you are using Linux to host a kind cluster, no extra routing
setup is needed, because the kind node IP addresses are directly attached
and reachable straight from the host.
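As a quick sanity check (a minimal sketch, assuming the default docker setup on a Linux host), you can confirm that the docker networks show up as directly attached routes:
❯ ip route | grep -E 'docker0|br-'
# expect something like: 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
# kind's dedicated network (if it has one) appears as a br-<id> device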
Install Kind
You can find kind releases [here](https://github.com/kubernetes-sigs/kind/releases):
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64
chmod +x ./kind
mv ./kind /some-dir-in-your-PATH/kind
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.19.0/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
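To confirm both binaries are installed and on your PATH (the versions you see will depend on what you downloaded):
❯ kind version
❯ kubectl version --client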
Problem Statement
Kubernetes on bare metal doesn’t come with an easy integration for things like services of type LoadBalancer.
This mechanism is used to expose services inside the cluster via an external load balancer that understands how to route traffic down to the pods defined by that service.
Most implementations of this are relatively naive. They place all of the available nodes behind the load balancer and use TCP health checks to determine whether a node is “healthy” enough to forward traffic to.
You can define an externalTrafficPolicy on a service of type
LoadBalancer and this can help get the behaviour that you want. From
the docs:
$ kubectl explain service.spec.externalTrafficPolicy
KIND: Service
VERSION: v1
FIELD: externalTrafficPolicy <string>
DESCRIPTION:
externalTrafficPolicy denotes if this Service desires to route external
traffic to node-local or cluster-wide endpoints. "Local" preserves the
client source IP and avoids a second hop for LoadBalancer and Nodeport type
services, but risks potentially imbalanced traffic spreading. "Cluster"
obscures the client source IP and may cause a second hop to another node,
but should have good overall load-spreading.
And Metallb has a decent write up on what they do when you configure this stuff:
https://metallb.universe.tf/usage/#traffic-policies
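For reference, here is roughly what that policy looks like on a Service manifest. This is just a sketch; the echo name, label, and port match the deployment we create later in this post, and are otherwise assumptions:
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  type: LoadBalancer
  # route only to nodes with a local endpoint and preserve the client source IP
  externalTrafficPolicy: Local
  selector:
    app: echo
  ports:
  - port: 8080
    targetPort: 8080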
With MetalLB there is a different set of assumptions.
MetalLB can operate in two distinct modes.
A layer 2 mode, where one node wins a leader election and answers ARP requests for the external IP or VIP on the LAN. This means that all traffic for the service is attracted to that one node and dispersed across the pods defined by the service from there.
A BGP mode, where (with externalTrafficPolicy: Local) MetalLB will announce the
external IP or VIP from all of the nodes where at least one pod for the service is
running.
The BGP mode relies on ECMP to balance traffic back to the pods. ECMP is a great solution for this problem and I HIGHLY recommend you use this mode if you can.
That said, I haven’t created a BGP router for my kind cluster, so we will use the L2 mode for this experiment.
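For reference, a BGP-mode config for MetalLB v0.9.x looks roughly like this. The peer address and ASNs below are placeholders, not values from this cluster:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.0.0.1   # placeholder: your upstream BGP router
      peer-asn: 64501          # placeholder: the router's ASN
      my-asn: 64500            # placeholder: the ASN metallb announces from
    address-pools:
    - name: default
      protocol: bgp
      addresses:
      - 192.0.2.0/24           # placeholder pool to advertise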
Let’s do this thing!
First let’s bring up a 2 node kind cluster with the following config.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
❯ kind create cluster --config config
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.19.1) 🖼
✓ Preparing nodes 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Have a nice day! 👋
Then we need to see if we can ping the node IPs themselves.
❯ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* kind-kind kind-kind kind-kind
❯ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready master 13m v1.19.1
kind-worker Ready <none> 12m v1.19.1
❯ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kind-control-plane Ready master 14m v1.19.1 172.18.0.2 <none> Ubuntu Groovy Gorilla (development branch) 4.15.0-117-generic containerd://1.4.0
kind-worker Ready <none> 13m v1.19.1 172.18.0.3 <none> Ubuntu Groovy Gorilla (development branch) 4.15.0-117-generic containerd://1.4.0
❯ ping 172.18.0.2
PING 172.18.0.2 (172.18.0.2) 56(84) bytes of data.
64 bytes from 172.18.0.2: icmp_seq=1 ttl=64 time=0.053 ms
64 bytes from 172.18.0.2: icmp_seq=2 ttl=64 time=0.021 ms
At this point we need to determine the network that is being used for the node IP pool. Since kind nodes are attached to a docker network, we can inspect that directly; here that is the default network named “bridge”. (Newer kind releases create a dedicated docker network named “kind” instead, so inspect whichever network your node INTERNAL-IPs fall into and adjust the address range used below accordingly.)
I am using a pretty neat tool called
jid here, which is a REPL for JSON.
❯ docker network inspect bridge | jid
[Filter]> .[0].IPAM
{
  "Config": [
    {
      "Gateway": "172.17.0.1",
      "Subnet": "172.17.0.0/16"
    }
  ],
  "Driver": "default",
  "Options": null
}
So we can see that there is an allocated network of 172.17.0.0/16 in
my case.
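If you don’t have jid handy, the same subnet can be pulled with docker’s built-in Go templating instead:
❯ docker network inspect bridge -f '{{(index .IPAM.Config 0).Subnet}}'
172.17.0.0/16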
Let’s swipe a block of IP addresses from the top of that allocation (172.17.255.1 through 172.17.255.250) and use them for the metallb configuration.
Now we are going to deploy a service!
First let’s create a service of type loadbalancer and see what happens before we install metallb.
I am going to use the echo server for this. I prefer the one built by
inanimate. Here is the
source and image:
inanimate/echo-server
❯ kubectl create deployment echo --image inanimate/echo-server --replicas=3 --port=8080
deployment.apps/echo created
❯ kubectl expose deployment echo --type=LoadBalancer
service/echo exposed
❯ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
echo LoadBalancer 10.96.242.156 <pending> 8080:31987/TCP 43s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 71m
We can see that the EXTERNAL-IP field is pending. This is because
there is nothing available in the cluster to manage this type of
service.
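Even while the EXTERNAL-IP is pending, the service is still reachable via its NodePort. A quick check, using the node IP and NodePort from the outputs above (yours will differ):
❯ curl http://172.18.0.3:31987
# the echo server replies with details about the request it received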
Now on to the metallb part!
First read the docs https://metallb.universe.tf/installation/
Then we can get started on installing this to our cluster. Note that the strictARP change below is only needed if kube-proxy is running in IPVS mode; it has no effect in the default iptables mode.
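If you aren’t sure which mode your kube-proxy is running in, you can peek at its config map first (an empty mode means the default, iptables):
❯ kubectl get configmap kube-proxy -n kube-system -o yaml | grep 'mode:'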
# see what changes would be made, returns nonzero returncode if different
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl diff -f - -n kube-system
# actually apply the changes, returns nonzero returncode on errors only
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl apply -f - -n kube-system
❯ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
namespace/metallb-system created
❯ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
podsecuritypolicy.policy/controller configured
podsecuritypolicy.policy/speaker configured
serviceaccount/controller created
serviceaccount/speaker created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller unchanged
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker unchanged
role.rbac.authorization.k8s.io/config-watcher created
role.rbac.authorization.k8s.io/pod-lister created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller unchanged
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker unchanged
rolebinding.rbac.authorization.k8s.io/config-watcher created
rolebinding.rbac.authorization.k8s.io/pod-lister created
daemonset.apps/speaker created
deployment.apps/controller created
❯ kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
secret/memberlist created
❯ kubectl get pods -n metallb-system -w
NAME READY STATUS RESTARTS AGE
controller-fb659dc8-26w57 1/1 Running 0 119s
speaker-8mx4z 1/1 Running 0 119s
speaker-gvqnb 1/1 Running 0 119s
We can see that metallb is now installed but we aren’t done yet!
Now we need to add a configuration that will use a few of the unused IP
addresses from the node IP pool (172.17.0.0/16).
If we look at our existing service we can see that the EXTERNAL-IP
is still pending.
This is because we haven’t yet applied the config for metallb.
Here is the config:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 172.17.255.1-172.17.255.250
You can apply this to your cluster with
❯ kubectl apply -f km-config.yaml
configmap/config created
❯ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
echo LoadBalancer 10.96.242.156 172.17.255.1 8080:31987/TCP 24m
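And now the service is reachable from the Linux host on its external IP. The address below is the one assigned in my run; use whatever EXTERNAL-IP your service got:
❯ curl http://172.17.255.1:8080
# the echo server replies with details about the request and the pod that served it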