Automating Large-Scale Dataset Migrations with Background Coding Agents

Overview

Migrating thousands of datasets across downstream consumer services is a daunting task: error-prone, time-consuming, and a drag on development velocity. At Spotify, we tackled this challenge by combining three powerful tools: Honk (our background coding agent), Backstage (the developer portal), and Fleet Management (our service orchestration layer). This tutorial walks through how you can build a similar system to automate dataset migrations with confidence and minimal manual overhead.

By the end of this guide, you’ll understand how to design background agents that interpret migration rules, apply schema transformations, and coordinate updates across a fleet of services, all while maintaining observability and rollback capabilities.

Prerequisites

Before diving in, ensure you have the following:

- A running Backstage instance with the scaffolder plugin enabled
- A background coding agent framework (we use Honk; any agent that can read catalog entities and execute plans will work)
- A fleet management or rollout service that exposes an HTTP API
- Python and Node.js toolchains for the agent and scaffolder code in this guide

Step-by-Step Instructions

1. Model Your Migration as a Backstage Entity

Backstage serves as the central catalog for all your services, datasets, and migrations. Create a new entity type called Migration with the following YAML template:

# catalog/migration-template.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: ${{ name }}
  annotations:
    honk/agent: honk-migration-agent
    fleet-manager/target-group: ${{ fleet_group }}
spec:
  type: dataset-migration
  lifecycle: experimental
  owner: ${{ owner }}
  dependsOn:
    - component:default/${{ source_dataset }}
    - component:default/${{ target_dataset }}

This entity links the migration to a Honk agent (via annotation) and defines its dependencies. Register this template in Backstage so developers can create migration requests with standard fields like source_dataset, target_dataset, and migration_strategy.
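If you prefer to script registration instead of clicking through the catalog UI, Backstage’s catalog API accepts new locations over HTTP. Here is a minimal sketch in Python; the Backstage host, template URL, and token handling are placeholders you’d swap for your own deployment:

# scripts/register_migration_template.py
import os

import requests

BACKSTAGE_URL = 'https://backstage.internal'  # placeholder host
TEMPLATE_URL = 'https://github.com/your-org/infra/blob/main/catalog/migration-template.yaml'

def register_template():
    # POST /api/catalog/locations asks Backstage to ingest the file at TEMPLATE_URL.
    response = requests.post(
        f'{BACKSTAGE_URL}/api/catalog/locations',
        json={'type': 'url', 'target': TEMPLATE_URL},
        headers={'Authorization': f"Bearer {os.environ['BACKSTAGE_TOKEN']}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    print(register_template())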

2. Define Migration Rules in Honk

Honk agents are triggered by Backstage entity lifecycle events. Create a migration agent script that reads the entity spec and generates a plan:

# honk/agents/migration_agent.py
from honk import Agent, Plan, Step

class DatasetMigrationAgent(Agent):
    def generate_plan(self, entity):
        spec = entity['spec']
        source = spec['source_dataset']
        target = spec['target_dataset']
        strategy = spec['migration_strategy']

        plan = Plan(name=f"migrate-{source}-to-{target}")
        
        if strategy == 'schema-transform':
            plan.add_step(Step(
                name='validate-schema',
                action='validate',
                params={'source': source, 'target': target}
            ))
            plan.add_step(Step(
                name='transform-data',
                action='transform',
                params={'mapping_file': spec['mapping_file']}
            ))
        elif strategy == 'bulk-copy':
            plan.add_step(Step(
                name='copy-dataset',
                action='copy',
                params={'source': source, 'target': target}
            ))
        # ... additional strategies
        return plan

Honk will store this plan and execute it in a background thread, reporting status back to Backstage.
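What that execution loop looks like depends on your Honk setup, but a minimal sketch makes the contract concrete. Here the worker runs each step through a caller-supplied dispatcher of action handlers and posts per-step status to a hypothetical Backstage backend route; both the actions mapping and the /api/honk/status endpoint are assumptions, not stock Backstage:

# honk/worker.py
import requests

from honk import Plan

BACKSTAGE_URL = 'https://backstage.internal'  # placeholder host

def execute_plan(plan: Plan, entity_ref: str, actions: dict):
    # `actions` maps an action name ('validate', 'transform', ...) to a
    # callable that accepts the step's params dict.
    for step in plan.steps:
        try:
            actions[step.action](step.params)
            step.status = 'completed'
        except Exception:
            step.status = 'failed'
            report_status(entity_ref, step)
            raise  # stop the plan; later steps may depend on this one
        report_status(entity_ref, step)

def report_status(entity_ref: str, step):
    # Hypothetical status route served by a custom Backstage backend plugin.
    requests.post(
        f'{BACKSTAGE_URL}/api/honk/status',
        json={'entityRef': entity_ref, 'step': step.name, 'status': step.status},
        timeout=10,
    )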

3. Orchestrate with Fleet Management

Once Honk generates the plan, it needs to coordinate with Fleet Management to safely roll out changes across services. Use Fleet Management’s rollout API:

# fleet_manager/rollout.py
import requests

def trigger_rollout(service_name, migration_id):
    payload = {
        'service': service_name,
        'migration_id': migration_id,
        'rollout_strategy': 'canary',  # or 'blue-green'
        'max_instances': 5
    }
    response = requests.post(
        'https://fleet-manager.internal/rollouts',
        json=payload,
        timeout=30,  # avoid hanging the agent if Fleet Management is slow
    )
    response.raise_for_status()
    return response.json()['rollout_id']

Integrate this call into the Honk agent after each migration step that modifies consumer code or data schemas. For example, after a schema transform, the agent can instruct Fleet Management to gradually update services that read the old schema.
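One way to wire that in is a step-completion hook that fires a canary rollout for every consumer of the dataset whenever a schema-changing step finishes. In this sketch, the on_step_complete hook and the consumers field in the entity spec are both assumptions about your setup; note how it records the rollout ID that the rollback code in step 5 relies on:

# honk/agents/migration_agent.py (continued)
from fleet_manager.rollout import trigger_rollout

# Actions that change what consumers read, and therefore need a fleet rollout.
ROLLOUT_ACTIONS = {'transform', 'copy'}

def on_step_complete(entity, step):
    # Hypothetical hook; adapt to however your agent framework surfaces step events.
    if step.action in ROLLOUT_ACTIONS and step.status == 'completed':
        # `consumers` is an assumed spec field listing services that read the dataset.
        for service in entity['spec'].get('consumers', []):
            rollout_id = trigger_rollout(service, entity['metadata']['name'])
            # Record the rollout ID so a later rollback can find it.
            step.params['rollout_id'] = rollout_id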

4. Hook Everything Together via Backstage Actions

Backstage provides a scaffolder plugin that can trigger Honk plans directly. Create a custom action:

# backstage-plugin/scaffolder/actions/migrateDataset.ts
import { createTemplateAction } from '@backstage/plugin-scaffolder-node';

export const migrateDatasetAction = createTemplateAction({
  id: 'honk:migrate-dataset',
  async handler(ctx) {
    const { entityRef, source, target } = ctx.input;
    // Resolve the Honk service URL via Backstage's discovery API
    const honkUrl = await ctx.discovery.getUrl('honk');
    const response = await fetch(`${honkUrl}/agents/migrate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ entityRef, source, target }),
    });
    const { id } = await response.json();
    ctx.output('migrationId', id);
  },
});

Now, when a developer creates a new migration entity in Backstage, the scaffolder can automatically call this action—starting Honk and notifying Fleet Management—all from a simple form.

5. Monitor and Rollback

Add observability via Backstage’s TechDocs and custom plugins. Honk can log every step to a structured log stream, which feeds into a monitoring dashboard.
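A simple way to produce that stream is one JSON object per step, emitted on stdout for your log shipper to pick up; the field names here are illustrative:

# honk/logging.py
import json
import logging
import sys

logger = logging.getLogger('honk')
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.INFO)

def log_step(plan_name: str, step) -> None:
    # One JSON line per step, ready for a structured log pipeline.
    logger.info(json.dumps({
        'plan': plan_name,
        'step': step.name,
        'action': step.action,
        'status': step.status,
    }))

Next, create a rollback routine in Honk that reverses completed plan actions: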

# honk/rollback.py
import fleet_manager  # assumed to expose a rollback(rollout_id) helper
from honk import Plan

# Map each forward action to the callable that undoes it, e.g. {'transform': undo_transform}.
reverse_mapping = {}

def rollback(plan: Plan):
    # Undo completed steps in reverse order of execution.
    for step in reversed(plan.steps):
        if step.status == 'completed':
            reverse_mapping[step.action](step.params)
            # Notify Fleet Management to roll back services (rollout steps only)
            if 'rollout_id' in step.params:
                fleet_manager.rollback(step.params['rollout_id'])

Include this rollback as a manual trigger in Backstage, so operators can click a “Rollback” button that restores the previous state.
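On the Honk side, that button only needs a small HTTP endpoint to call. A minimal sketch using Flask, with an in-memory plan store standing in for real persistence (the route shape and module paths are assumptions):

# honk/api.py
from flask import Flask, jsonify

from honk.rollback import rollback  # the rollback helper shown above

app = Flask(__name__)
plans = {}  # in-memory plan store; swap in your real persistence layer

@app.post('/plans/<plan_id>/rollback')
def rollback_plan(plan_id: str):
    plan = plans.get(plan_id)
    if plan is None:
        return jsonify({'error': 'unknown plan'}), 404
    rollback(plan)
    return jsonify({'status': 'rolled-back', 'plan': plan.name})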

Common Mistakes

- Skipping the validate-schema step and discovering schema incompatibilities mid-rollout
- Rolling changes out to every instance at once instead of using a canary or blue-green strategy
- Failing to record rollout IDs on plan steps, which leaves the rollback path with nothing to act on
- Omitting dependsOn relationships in the migration entity, so affected consumers are invisible to the agent

Summary

By combining Honk’s background coding agents with Backstage’s catalog and Fleet Management’s orchestration, you can automate even the most complex dataset migrations. The key takeaways are: model migrations as entities, let agents generate and execute plans, coordinate with fleet operations, and always build in observability and rollback. This approach reduces manual toil, increases safety, and scales to thousands of datasets.
