Solved

Key collision errors after K10 cluster restore


rust84
  • Not a newbie anymore
  • 2 comments

I recently did an “unscheduled rebuild” of my cluster and restored K10 using the disaster recovery process outlined in the docs.

 

I did run into an issue with a couple of the restore pods hanging while waiting for a volume to be created, so I worked around this by manually creating the PVC, which allowed the restore to complete successfully.
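For anyone hitting the same hang: the manual fix was simply applying a bare claim matching what the restore pod expected. A minimal sketch (using the jellyfin claim shared further down as the example) looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jellyfin-config-v1      # the claim the restore pod was waiting on
  namespace: media
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  storageClassName: rook-ceph-block
  volumeMode: Filesystem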

 

I mention this because I suspect it might have something to do with the current issue I'm facing: the ‘media’ namespace backup is failing nightly with the following error.

 

- cause:
    cause:
      cause:
        cause:
          cause:
            message: "[PUT /artifacts/{itemId}][409] updateArtifactConflict  &{Code:409
              Message:Conflict. Cause: Key collision. UserMessages:[]}"
          fields:
            - name: artID
              value: 38c0aa5e-c56f-11ec-98a8-02e5dc1cf223
          file: kasten.io/k10/kio/rest/clients/catalogclient.go:363
          function: kasten.io/k10/kio/rest/clients.UpdateArtifact
          linenumber: 363
          message: Unable to update artifact
        file: kasten.io/k10/kio/repository/utils.go:96
        function: kasten.io/k10/kio/repository.CreateOrUpdateRepositoryArtifact
        linenumber: 96
        message: Failed to update Repository artifact
      file: kasten.io/k10/kio/collections/kopia/manager.go:153
      function: kasten.io/k10/kio/collections/kopia.(*KopiaManager).Export
      linenumber: 153
      message: Failed to add repository artifact for collections
    file: kasten.io/k10/kio/exec/phases/phase/migrate.go:146
    function: kasten.io/k10/kio/exec/phases/phase.(*migrateSendPhase).Run
    linenumber: 146
    message: Failed to export collection
  message: Job failed to be executed
(The same error block appears two more times in the output.)


Here is one of the problem PVCs from the restore:

 

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-02-04T02:19:58Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    kasten.io/backup-volume: enabled
    kustomize.toolkit.fluxcd.io/name: apps-media-jellyfin
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: jellyfin-config-v1
  namespace: media
  resourceVersion: "19044618"
  uid: 3a2355f7-2197-43fa-8853-2043bef564af
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  storageClassName: rook-ceph-block
  volumeMode: Filesystem
  volumeName: pvc-3a2355f7-2197-43fa-8853-2043bef564af
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 30Gi
  phase: Bound

 

I have tried deleting ALL of the exported backups and snapshots, and even went so far as recreating the policy, but the error comes back. How can I resolve this and get the backups back to a working state? My next step will probably be to get drastic and do a complete reinstall of K10, but I am hoping to avoid that if possible.
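For context, the policy is essentially a nightly backup plus export of the media namespace. A rough sketch of that shape is below; the field names follow K10's Policy CRD as documented, but the policy and profile names are placeholders rather than my actual objects, so treat it as illustrative only:

apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: media-backup                   # placeholder policy name
  namespace: kasten-io
spec:
  frequency: '@daily'                  # nightly run
  retention:
    daily: 7
  selector:
    matchExpressions:
      - key: k10.kasten.io/appNamespace
        operator: In
        values:
          - media
  actions:
    - action: backup
    - action: export                   # the export action is the part that fails
      exportParameters:
        frequency: '@daily'
        profile:
          name: nfs-location-profile   # placeholder Location Profile name
          namespace: kasten-io
        exportData:
          enabled: true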

 

Thanks in advance for your input.


5 comments

Madi.Cristil
  • Community Manager
  • 617 comments
  • February 27, 2023

jaiganeshjk
  • Experienced User
  • 275 comments
  • February 27, 2023

@rust84 I am not exactly sure about this error. However, the message suggests that there is an artifact in the catalog with the same key/value as the resource you are trying to back up.

 

There are a few logs and outputs we would like to look at, and this would be better handled and tracked through a case (as it might involve looking through the saved artifacts from your namespace).

 

Would you be able to open a case with us from my.veeam.com, selecting Veeam Kasten by K10 Trial as the product, and attach the debug logs?


rust84
  • Author
  • Not a newbie anymore
  • 2 comments
  • March 1, 2023

Thanks @jaiganeshjk, I have now raised a case with the requested logs.


rust84
  • Author
  • Not a newbie anymore
  • 2 comments
  • Answer
  • March 14, 2023

I’ve resolved my issue for now with the help of the support team. In case somebody comes across this issue in their own install and needs a quick workaround: creating a new profile pointed at the same NFS directory, with a different profile name, does the trick.

This generated a new repositoryArtifact, removing the conflict, and allowed the export to complete successfully. Thanks to @jaiganeshjk for the excellent support.
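For anyone who prefers YAML over the dashboard, the new profile is just another NFS location profile under a fresh name. A rough sketch is below; the profile and claim names are placeholders, and the exact Profile CRD fields may differ between K10 versions, so a profile created through the dashboard is the authoritative reference:

apiVersion: config.kio.kasten.io/v1alpha1
kind: Profile
metadata:
  name: nfs-location-profile-v2   # any name different from the old profile
  namespace: kasten-io
spec:
  type: Location
  locationSpec:
    type: FileStore
    fileStore:
      claimName: k10-nfs-pvc      # placeholder: PVC in kasten-io mounting the same NFS export
    credential:
      secretType: ""              # NFS needs no credentials, so this stays empty
      secret:
        apiVersion: ""
        kind: ""
        name: ""
        namespace: ""

The nightly policy's export action then just needs to point at the new profile name before the next run.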


jaiganeshjk
  • Experienced User
  • 275 comments
  • March 16, 2023

@rust84 Thank you for your support and patience thus far.

We found the root cause of the issue in this case and we are currently working on enhancing the product to avoid such a situation in the future.

We don’t expect/support importing restore points into the same K10 installation they were originally exported from.

Below is the gist of what caused this issue.

 

You seem to have run an import for the media namespace after the catalog was restored using K10 DR.
Prior to 5.5.4, we didn't have the concept of imported repositories.
We introduced a change in 5.5.4 for the repositoryArtifact to track and distinguish between imported and exported repositories (imported repositories are set to read-only).

With this particular timing, you happened to DR-restore data to a catalog state where the export-side repo artifact didn't yet have API keys (prior to 5.5.4). Then, before running another export to that same repo on 5.5.4, you happened to run an import into the same K10 instance first. Due to the hashing divergence, the catalog happily added the import-side repo artifact (API keys included), since it didn't clash with any existing API keys.

Then, when you started running exports again, they conflicted with the new import-side artifact, which already had the expected keys.

Two conflicting artifacts were created on two different versions of K10:

Created in April 2022:
"path": "media"
"repoPath": "media/kopia"

Created in February 2023:
"path": "media/media"
"repoPath": "kopia"
