Solved

Kasten changing workload nodePort svc on import jobs


Hey all - We’re trialing Kasten as a way to back up our clusters and also to have a ‘hot’ fail-over cluster in certain environments.

Part of this deployment means I have ‘Cluster A’ back up its workloads, and ‘Cluster B’ import them and restore them to its cluster.

The problem we are seeing is when cluster B imports and restores the workload, all our nodePort services are changing to random ports.

I know I could go add each service ‘manually’ to cluster B to mitigate, but I’d like to avoid this if possible.

I’ve also looked at transforms during import / restore, but I am unable to get the cluster to accept my replacement nodePort.

Is there a way to turn off this random nodePort generation so that we get a 1-to-1 replication of our workloads?

Cheers!
 

Best answer by michael-courcy


14 comments

  • Comes here often
  • 11 comments
  • December 13, 2021

Hi JT 

 

The issue is probably not Kasten but rather your cluster, which controls the open ports on the nodes.

 

To test this, you can back up the definition of your svc in cluster A in YAML format:

kubectl get svc <my-svc> -n <my-ns> -o yaml > svc.yaml

and restore it in cluster B with:

kubectl apply -f svc.yaml

The nodePort will be changed; this process does not involve Kasten.

 

Let us know.

Regards


  • Author
  • New Here
  • 6 comments
  • December 13, 2021

Hey @michael-courcy , Thanks for the reply.

I should have specified... we have nodePort values defined in our service yamls, so that external load balancers always have the same ports to target. We do not let the cluster choose a nodePort value for us.

Doing the above applies the correct port to the nodePort service on cluster B, as expected. This is just the same as me deploying our workload to the cluster manually.

I still think this is an issue with Kasten when a backup is imported.


  • Comes here often
  • 11 comments
  • December 13, 2021

OK, could you send me one of your svc.yaml files so that I can test it out on my side?


  • Author
  • New Here
  • 6 comments
  • December 13, 2021

Try this:
It’s part of a larger deployment file for this workload, but it should work fine for testing.
 

---
apiVersion: v1
kind: Namespace
metadata:
  name: str-home
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  creationTimestamp: null
  labels:
    io.kompose.service: str-home
  name: str-home-svc-np
  namespace: str-home
spec:
  externalTrafficPolicy: Cluster
  ports:
  - name: "str-home-svc-np"
    nodePort: 30901
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    io.kompose.service: str-home
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

 


  • Comes here often
  • 11 comments
  • December 13, 2021

OK, I was able to repro it:

oc create -f str-home.yaml 
namespace/str-home created
service/str-home-svc-np created
oc get svc -n str-home
NAME              TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
str-home-svc-np   NodePort   172.30.196.86   <none>        8080:30901/TCP   15s
# Run a backup with kasten
# Delete in order to restore
oc delete -f str-home.yaml 
namespace "str-home" deleted
service "str-home-svc-np" deleted
# Run a restore 
sysmic@MBP-de-Michael str-home % oc get svc -n str-home    
NAME              TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
str-home-svc-np   NodePort   172.30.29.70   <none>        8080:30119/TCP   23s

I’ve opened an issue. Thanks for catching that.


  • Comes here often
  • 11 comments
  • December 13, 2021

After discussion with the engineering team: this is on purpose, to avoid nodePort conflicts. Many customers duplicate a namespace by restoring it into another namespace on the same cluster, and in that case keeping the original nodePort would run into a conflict:

 

The Service "str-home-svc-np" is invalid: spec.ports[0].nodePort: Invalid value: 30901: provided port is already allocated


  • Author
  • New Here
  • 6 comments
  • December 13, 2021
michael-courcy wrote:

After discussion with the engineering team: this is on purpose, to avoid nodePort conflicts. Many customers duplicate a namespace by restoring it into another namespace on the same cluster, and in that case keeping the original nodePort would run into a conflict:

 

The Service "str-home-svc-np" is invalid: spec.ports[0].nodePort: Invalid value: 30901: provided port is already allocated

Understood - Is there something we can add in here to detect if it’s being restored as an import job? Or perhaps the transform function could be looked at?

 


  • Comes here often
  • 11 comments
  • December 14, 2021

You could try a transform function using an add operation with the path /spec/ports/0/nodePort and the value 30901. Let me know.
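
For reference, a rough sketch of how that operation might be expressed as a K10 restore transform (the surrounding spec shape here is an assumption based on the documented transform format; the key part is the JSON-patch operation, with the value written as an unquoted integer):

transforms:
  - subject:
      resource: services
      name: str-home-svc-np
    name: setNodePort
    json:
      # JSON patch applied to the Service spec on restore
      - op: add
        path: /spec/ports/0/nodePort
        value: 30901   # must be an integer, not the string "30901"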


  • Author
  • New Here
  • 6 comments
  • December 14, 2021
michael-courcy wrote:

You could try a transform function using an add operation with the path /spec/ports/0/nodePort and the value 30901. Let me know.

I’ve turned on the transform again to capture the error I see.

Note that the error does not occur without the quotation marks around the port value; however, the nodePort still does not get changed.

cause:
  cause:
    cause:
      message: 'v1.Service.Spec: v1.ServiceSpec.Ports: []v1.ServicePort:
        v1.ServicePort.NodePort: readUint32: unexpected character: �, error
        found in #10 byte of ...|odePort":"30901","po|..., bigger context
        ...|k","ports":[{"name":"str-home-svc-np","nodePort":"30901","port":8080,"protocol":"TCP","targetPort":8|...'
    function: kasten.io/k10/kio/kube.wrapError
    linenumber: 105
    message: Failed to decode spec into object
  fields:
    - name: instance
      value: RestoreSpecsForResources
  function: kasten.io/k10/kio/exec/phases/phase.(*restoreK8sSpecsPhase).Run
  linenumber: 68
  message: Failed to restore spec artifacts
message: Job failed to be executed
fields: []


Here’s the transform:

 

 


  • Comes here often
  • 11 comments
  • December 15, 2021

You’re right, it looks like the removal of the port happens after the transformation.

 

We can’t go further, I’m afraid.


  • Author
  • New Here
  • 6 comments
  • December 15, 2021

So other than manually doing it for heaps and heaps of workloads, there’s no way at all to add the port I actually want?


  • Comes here often
  • 11 comments
  • Answer
  • December 15, 2021

At the moment no.

 

But the fact is that needing to control the value of a nodePort for a lot of workloads is quite uncommon, at least in my experience. And I doubt it scales in the long term, because it creates a situation where you are expected to define each of your load balancers and control the availability of the ports you define.

 

There are usually two approaches to solve this issue. The first is to create a service of type LoadBalancer, which in turn manages the load balancer for you; but this approach requires a connector between Kubernetes and your load balancer API, and that is not so easy to implement on premises (even if solutions like MetalLB make it easier). The second and more pragmatic approach is to create an ingress controller (backed by a single load balancer) and define Ingress resources in your different namespaces.
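
For illustration, minimal sketches of both approaches, reusing the str-home names from earlier in the thread (the Ingress hostname is hypothetical):

# Approach 1: a LoadBalancer Service - the cloud or MetalLB controller
# assigns the external address, so no nodePort needs to be pinned.
apiVersion: v1
kind: Service
metadata:
  name: str-home-svc-lb
  namespace: str-home
spec:
  type: LoadBalancer
  selector:
    io.kompose.service: str-home
  ports:
    - name: http
      port: 8080
      targetPort: 8080
---
# Approach 2: an Ingress routed through a single ingress controller,
# which itself sits behind one stable load balancer.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: str-home
  namespace: str-home
spec:
  rules:
    - host: str-home.example.com   # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: str-home-svc-np
                port:
                  number: 8080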


Geoff Burke
  • Veeam Legend, Veeam Vanguard
  • 1317 comments
  • December 20, 2021

Hi Guys,

 

I would also think that moving away from NodePorts is a good idea. I have used them for testing, but they become a headache after a while. MetalLB is my first choice, very easy to install and get working. Then you just have to open the range you give it.
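
For example, a minimal MetalLB layer-2 address pool in the ConfigMap format used by MetalLB releases current at the time of this thread (the address range is illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
      - name: default
        protocol: layer2
        addresses:
          # the range you "open" for MetalLB to hand out to LoadBalancer Services
          - 192.168.10.240-192.168.10.250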

 

cheers


  • Author
  • New Here
  • 6 comments
  • December 20, 2021
Geoff Burke wrote:

Hi Guys,

 

I would also think that moving away from NodePorts is a good idea. I have used them for testing, but they become a headache after a while. MetalLB is my first choice, very easy to install and get working. Then you just have to open the range you give it.

 

cheers

Yes, we use both nodePorts and MetalLB, for differing circumstances. We do not use ingress controllers as they don’t suit our deployments.

The original goal here was to have Kasten work in our favor for setting up fail-over clusters should the worst happen. We control the fail-over at a higher level than our k8s clusters.

nodePorts are considerably more flexible with our deployment of external LBs and the mixture of public and private cloud nodes. We don’t always have network control, so nodePort often wins over MetalLB.

Because the nodePort changes in restore jobs, we would have to automate updating all our external LBs with the right endpoint.

Using MetalLB works fine for restore jobs, but then we run into issues such as IP conflicts, as our failover clusters would be using the same IPs as live.

