# How src executes a Batch Spec

<p className="subtitle">
	This document helps you debug and troubleshoot the writing and execution of
	Batch Specs with the [Sourcegraph CLI `src`](/cli/) command.
</p>

Here, you will learn what happens when a user applies or previews a Batch Spec by running `src batch apply` or `src batch preview` commands.

## Overview

`src batch apply` and `src batch preview` execute a batch spec the same way by following these steps:

-   [How src executes a Batch Spec](#how-src-executes-a-batch-spec)
    -   [Overview](#overview)
    -   [Parse the batch spec](#parse-the-batch-spec)
    -   [Resolving namespace](#resolving-namespace)
    -   [Preparing container images](#preparing-container-images)
    -   [Resolving repositories](#resolving-repositories)
    -   [Executing steps](#executing-steps)
        -   [Download archive and prepare](#download-archive-and-prepare)
        -   [Run the steps](#run-the-steps)
        -   [Create final diff](#create-final-diff)
        -   [Saving a changeset spec](#saving-a-changeset-spec)
    -   [Importing changesets](#importing-changesets)
    -   [Sending changeset specs](#sending-changeset-specs)
    -   [Sending the batch spec](#sending-the-batch-spec)
    -   [Preview or apply the batch spec](#preview-or-apply-the-batch-spec)
        -   [Halt execution](#halt-execution)

The only difference is the last step, i.e., Preview or apply the batch spec. Here, the `src batch apply` command applies the batch spec, and the `src batch preview` prints a URL that gives you a preview of what would change if you applied the batch spec.

Let's learn about each step in more detail.

## Parse the batch spec

`src` reads in, parses, and validates the batch spec YAML specified with the `-f` flag. It validates the batch spec against its [schema](https://github.com/sourcegraph/src-cli/blob/main/schema/batch_spec.schema.json). It then performs some semantic checks to make sure that, for example, `changesetTemplate` is specified if `steps` are specified or that no feature is used that's not supported by the Sourcegraph instance.

## Resolving namespace

`src` resolves the given namespace to apply/preview the batch spec by sending a GraphQL request to the Sourcegraph instance to fetch the ID for the given namespace name.

If no namespace is specified with `-namespace` (or `-n`) then the currently authenticated user is used as the namespace. Learn more about how to [Connect to Sourcegraph](/cli/quickstart#connect-to-sourcegraph) in the CLI docs for details on how to authenticate.

## Preparing container images

If the batch spec contains `steps`, then for each step, `src` checks its `container` image to see whether it's locally available.

To do so, it runs `docker image inspect --format {{.Id}} -- <container-image-name>` to get the specific image ID for the container image.

If that fails with a **No such image** error, `src` tries to pull the image by running `docker image pull <container-image-name>` and then running `docker image inspect --format {{.Id}} -- <container-image-name>` again.

## Resolving repositories

`src` resolves each entry in the batch spec's `on` property to produce a **unique list of repositories (!)** in which to execute the batch spec's `steps`. With an `on` property like this:

```yaml
on:
    - repositoriesMatchingQuery: lang:go fmt.Sprintf("%d", :[v]) patterntype:structural -file:vendor
    - repositoriesMatchingQuery: repohasfile:README
    - repository: github.com/sourcegraph/sourcegraph
    - repository: github.com/sourcegraph/automation-testing
      branch: thorstens-test-branch
```

`src` will do the following:

-   For each `repositoriesMatchingQuery`, it will:

    -   Send a request to the Sourcegraph API to execute the search query
    -   Collect each result's repository: The ID, name, default branch, and the current revision of the default branch. If the search result is a **repository result**, i.e., a search query of `type:repo` only produces repositories, that's used. If it's a file match, then the file match's repository is used
    -   (Optional): If the results are file matches, then their path in the repository is also saved so that they can be used in the `steps` with [templating](/batch-changes/batch-spec-templating)

-   For each `repository` **without** a `branch`, it will:

    -   Send a request to the Sourcegraph API to get the repository's ID, name, default branch, and the current revision of the default branch

-   For each `repository` **with** a `branch`, it will:
    -   Send a request to the Sourcegraph API to get the repository's ID, name, and the current revision of the specified `branch`
    -   It then creates a unique list of all repositories yielded by the previous three steps by going through all repositories and comparing them, skipping those where no current revision of a branch could be resolved, and checking whether they're on a supported code host. If they are on unsupported code hosts and no `-allow-unsupported` flag is given, then a warning is printed, and the repositories are not added to the list

## Executing steps

If a batch spec contains `steps`, then `src` executes the steps **locally** on the machine on which `src` is run for each repository yielded by the previous [Resolving repositories](#resolving-repositories) step.

If `-clear-cache` is not set and it previously executed the same `steps` for the same repository at the same revision of the base branch, it will try to use cached results instead of re-executing the steps.

`src` does the following for each repository:

### Download archive and prepare

Downloads the archive of a repository. It's equivalent to:

```bash
curl -L -v -X GET -H 'Accept: application/zip' \
-H 'Authorization: token <SRC_ACCESS_TOKEN>' \
'http://sourcegraph.example.com/github.com/my-org/my-repo@refs/heads/master/-/raw' \
--output ~/tmp/my-repo.zip
```

-   Unzip the archive into the workspace. Where the workspace lives depends on the workspace mode, which can be controlled by the `-workspace` flag. The two modes are:
    -   **Bind mount mode** (the default everywhere except Intel macOS), this will be somewhere on the filesystem, e.g., `~/.cache/sourcegraph/batch-changes` (see `src batch preview -h` for the default value of cache directory, overwrite with `-cache`)
    -   **Volume mount mode** (the default on Intel macOS), a Docker volume will be created using `docker volume create` and attached to all running containers, then removed before `src` exits
-   `cd` into the workspace, which now contains the unzipped archive
-   In the workspace, create a Git repository. Configure `git` to not use local configuration (see [the code for explanations on what each variable does](https://github.com/sourcegraph/src-cli/blob/54fedaf3bfcf21ad3a8d89d9d2d361c8c6da6441/internal/batches/git.go#L13-L26)):

```bash
export GIT_CONFIG_NOSYSTEM=1 \
GIT_CONFIG=/dev/null \
GIT_AUTHOR_NAME=Sourcegraph \
GIT_AUTHOR_EMAIL=batch-changes@sourcegraph.com \
GIT_COMMITTER_NAME=Sourcegraph \
GIT_COMMITTER_EMAIL=batch-changes@sourcegraph.com
```

-   Next, use the following:
    -   Run `git init`
    -   Run `git config --local user.name Sourcegraph`
    -   Run `git config --local user.email batch-changes@sourcegraph.com`
    -   Run `git add --force --all`
    -   Run `git commit --quiet --all -m sourcegraph-batch-changes`

### Run the steps

For each step in the batch spec `steps`:

-   Probe container image (the `container` property of the step) to see whether it has `/bin/sh` or `/bin/bash`
-   Write the step's `run` command to a temp file on the host, e.g., `/tmp-script`
-   Run `chmod 644 /tmp-script`
-   Run the Docker container. The exact command will depend on the workspace mode:
    -   For the **Bind mount** mode:

```bash
docker run --rm --init --workdir /work \
--mount type=bind,source=/unzipped-archive-locally,target=/work \
--mount type=bind,source=/tmp-script,target=/tmp-file-in-container \
--entrypoint /bin/bash -- <IMAGE> /tmp-file-in-container
```

-   For the **Volume mount** mode:

```bash
docker run --rm --init --workdir /work \
--mount type=volume,source=temporary-docker-volume-id,target=/work \
--mount type=bind,source=/tmp-script,target=/tmp-file-in-container \
--entrypoint /bin/bash -- <IMAGE> /tmp-file-in-container
```

-   Finally, add all produced changes to the git index via `git add --all`

### Create final diff

In the workspace, create a diff by running: `git diff --cached --no-prefix --binary`

### Saving a changeset spec

`src` adds the produced diff to the local cache so that re-executing the same steps in the same repository can be skipped if the base branch has not changed.

`src` then creates a changeset spec from:

-   The diff
-   The `changesetTemplate`, and
-   Information about the repository in which the changes have been made (the name and ID of the repository, the revision of its base branch)

A changeset spec is a description of what the changeset should look like.

## Importing changesets

If the batch spec contains [`importChangesets`](/batch-changes/batch-spec-yaml-reference#importchangesets), then `src` goes through the list of `importChangesets`, and for each entry it:

-   Resolves the repository name, trying to get an ID, base branch, and revision for the given repository name
-   Parses the `externalIDs`, checking that they're valid strings or numbers
-   For each external ID, it saves a changeset spec that describes that a changeset with the given external ID in the given repository should be imported and tracked in the batch change

## Sending changeset specs

The previous two steps, [Executing steps](#executing-steps) and [Importing changesets](#importing-changesets), can produce changeset specs, each one describing either a changeset to create or to import. These changeset specs are now uploaded to the connected Sourcegraph instance, with one request per changeset spec.

Each request yields an ID that uniquely identifies the changeset spec on the Sourcegraph instance. These IDs are used for the next step.

## Sending the batch spec

The IDs of the changeset specs that were created in the previous step, [Sending changeset specs](#sending-changeset-specs), are collected into a list and used for the next request with which `src` uploads the batch spec to the connected Sourcegraph instance.

`src` creates the batch spec on the Sourcegraph instance, together with the changeset spec IDs, so that the batch spec fully describes the desired state of a batch change: its name, its description, and which changesets should be created or imported from which repository on which code host.

That request yields an ID uniquely identifying this expanded batch spec version.

## Preview or apply the batch spec

If `src batch apply` was used, then the ID of the batch change is used to send another request to the Sourcegraph instance to apply the batch spec.

If `src batch preview` was used to execute and create the batch spec, then a URL is printed, pointing to a preview page on the Sourcegraph instance, on which you can see what would happen if you were to apply the batch spec.

### Halt execution

If you encounter an error while running the `src batch preview` command, there is a fail-fast mode that halts the batch change execution immediately. To implement this, you can add a `-fail-fast` flag to the `src batch preview` and `src batch apply` commands. Once added, your execution should immediately halt on the first error instead of continuing with other repositories. This streamlines the iteration loop for users building a batch change who want to identify errors quickly.
