# Search Jobs

<TierCallout>
	Supported on [Enterprise](/pricing/plans/enterprise) plans.
</TierCallout>

Use Search Jobs to search code at scale in large organizations.

Search Jobs lets you run search queries across your organization's entire codebase (all repositories, branches, and revisions) at scale. It extends Sourcegraph's existing search capabilities so that you can run searches without query timeouts or incomplete results.

With Search Jobs, you can start a search, let it run in the background, and then download the results from the Search Jobs UI when it's done. Site administrators can **enable** or **disable** the Search Jobs feature. When enabled, it is accessible to all users on the Sourcegraph instance.

## Using Search Jobs

To use Search Jobs, you need to:

-   Run a search query from your Sourcegraph instance
-   Click the result menu below the search bar to see if your query is supported by Search Jobs

![run-query-for-search-jobs](https://storage.googleapis.com/sourcegraph-assets/Docs/search-jobs/search-jobs-create.png)

-   If your query is valid, click **Create a search job** to initiate the search job
-   You will be redirected to the "Search Jobs UI" page at `/search-jobs`, where you can view all your created search jobs. If you're a site admin, you can also view search jobs from other users on the instance
-   For each search job, you can download the results by clicking the **Download** button. The results are formatted as JSON lines. See [Search results format](#search-results-format) for more details.

![view-search-jobs](https://storage.googleapis.com/sourcegraph-assets/Docs/search-jobs/search-jobs-manage-with-frame.png)

## Search results format

The downloaded results are formatted as [JSON lines](https://jsonlines.org).
The JSON objects have the same format as the (event-type) matches served by the [Stream API](/api/stream-api#event-types).

Here is an example of a JSON lines results file for a search job that ran the query `r:github.com/sourcegraph/sourcegraph handlePanic`.
The search returned 4 results in 2 files:

```json lines
{"type":"content","path":"cmd/symbols/shared/main.go","repositoryID":399,"repository":"github.com/sourcegraph/sourcegraph","branches":["HEAD"],"commit":"9b47c6430cc3c9cb16695b4fa0a1a277cbbdb327","hunks":null,"chunkMatches":[{"content":"func handlePanic(logger log.Logger, next http.Handler) http.Handler {","contentStart":{"offset":4526,"line":129,"column":0},"ranges":[{"start":{"offset":4531,"line":129,"column":5},"end":{"offset":4542,"line":129,"column":16}}]},{"content":"\thandler = handlePanic(logger, handler)","contentStart":{"offset":3570,"line":98,"column":0},"ranges":[{"start":{"offset":3581,"line":98,"column":11},"end":{"offset":3592,"line":98,"column":22}}]}],"language":"Go"}
{"type":"content","path":"cmd/embeddings/shared/main.go","repositoryID":399,"repository":"github.com/sourcegraph/sourcegraph","branches":["HEAD"],"commit":"9b47c6430cc3c9cb16695b4fa0a1a277cbbdb327","hunks":null,"chunkMatches":[{"content":"func handlePanic(logger log.Logger, next http.Handler) http.Handler {","contentStart":{"offset":6585,"line":190,"column":0},"ranges":[{"start":{"offset":6590,"line":190,"column":5},"end":{"offset":6601,"line":190,"column":16}}]},{"content":"\thandler = handlePanic(logger, handler)","contentStart":{"offset":2844,"line":81,"column":0},"ranges":[{"start":{"offset":2855,"line":81,"column":11},"end":{"offset":2866,"line":81,"column":22}}]}],"language":"Go"}
```
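
To work with a downloaded results file, you can parse it one line at a time. The following is a minimal Python sketch, not an official client: it assumes `content` records shaped like the example above, and the `count_matches_per_file` helper and the trimmed `sample` records are made up for illustration.

```python
import json
from collections import Counter


def count_matches_per_file(lines):
    """Count matched ranges per (repository, path) in JSON lines results."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        match = json.loads(line)
        if match.get("type") != "content":
            continue  # this sketch only handles content matches
        key = (match["repository"], match["path"])
        # "chunkMatches" may be null, so fall back to an empty list.
        for chunk in match.get("chunkMatches") or []:
            counts[key] += len(chunk.get("ranges") or [])
    return counts


# Trimmed, hypothetical records that keep only the fields used above.
sample = [
    '{"type":"content","path":"cmd/symbols/shared/main.go",'
    '"repository":"github.com/sourcegraph/sourcegraph",'
    '"chunkMatches":[{"ranges":[{"start":{"line":129},"end":{"line":129}}]},'
    '{"ranges":[{"start":{"line":98},"end":{"line":98}}]}]}',
    '{"type":"content","path":"cmd/embeddings/shared/main.go",'
    '"repository":"github.com/sourcegraph/sourcegraph",'
    '"chunkMatches":[{"ranges":[{"start":{"line":190},"end":{"line":190}}]},'
    '{"ranges":[{"start":{"line":81},"end":{"line":81}}]}]}',
]

counts = count_matches_per_file(sample)
for (repo, path), n in sorted(counts.items()):
    print(f"{repo} {path}: {n}")
```

For a real download, pass an open file object instead of `sample`, since iterating a file yields one line at a time. Other record types may carry different fields, so check the Stream API event types before extending the sketch.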

## Configure Search Jobs

Search Jobs requires object storage to store the results of your search jobs.
By default, Search Jobs stores results in the bundled `blobstore` service.
If the `blobstore` service is deployed and you want to use it to store results from Search Jobs, you don't need to configure anything.

To use a third-party managed object storage service, see the instructions in [externalizing object storage](/self-hosted/external-services/object-storage#search-job-results).

### Environment Variables

You can configure Search Jobs behavior using the following environment variables on the worker service:

| Environment Variable                     | Default Value | Description                                                                                                                                                                                                                                                                                                                                  |
| ---------------------------------------- | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `SRC_SEARCH_JOB_WORKER_INTERVAL`         | `1s`          | Controls how frequently the system checks for new search jobs to process. You probably don't need to configure this.                                                                                                                                                                                                                         |
| `SRC_SEARCH_JOB_MAXIMUM_RUNTIME_PER_JOB` | `5h`          | Sets a maximum time limit for how long a search job can run per repository-revision pair. Note that a search job is run as a collection of many smaller searches, each targeting a single revision of a repository. This variable limits the runtime of each individual search, not of the search job as a whole. |
| `SRC_SEARCH_JOB_NUM_HANDLERS`            | `5`           | Adjusts how many searches can run in parallel. A value of 5 (default) means that 5 searches, each targeting a different repository-revision pair, can run in parallel. More handlers put more pressure on the backend, particularly on Searcher and Zoekt. Make sure you have sufficient resources before increasing the number of handlers. |
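
As a rough mental model of how the number of handlers bounds parallelism, the Python sketch below runs a list of repository-revision tasks through a fixed-size pool. It is a conceptual illustration only and does not reflect how the worker service is implemented. The `run_task` and `run_job` functions are hypothetical stand-ins, and per-task time limits are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_HANDLERS = 5  # mirrors the default of SRC_SEARCH_JOB_NUM_HANDLERS


def run_task(task):
    """Hypothetical stand-in for one search of a single repository-revision pair."""
    repo, rev = task
    return f"searched {repo}@{rev}"


def run_job(tasks, num_handlers=NUM_HANDLERS):
    """Run all tasks with at most num_handlers in flight at once."""
    with ThreadPoolExecutor(max_workers=num_handlers) as pool:
        # map() preserves the input order of the tasks in its results.
        return list(pool.map(run_task, tasks))


# Twelve made-up repository-revision pairs.
tasks = [(f"repo-{i}", "HEAD") for i in range(12)]
results = run_job(tasks)
print(len(results), "tasks completed")
```

In this model, raising `num_handlers` shortens the total runtime only while the backend can absorb the extra concurrent searches, which is why the table advises checking resources first.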

## Supported result types

Search Jobs supports the following result types:

-   `file`
-   `path`
-   `diff`
-   `commit`

The following result types are not supported:

-   `symbol`
-   `repo`

## Limitations

The following elements of our query language are not supported:

-   File predicates, such as `file:has.content`, `file:has.owner`, `file:has.contributor`, `file:contains.content`
-   Catch-all `.*` regexp search
-   Multiple `rev` filters
-   Queries with the `index:` filter
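
If you generate queries programmatically, a rough pre-check can flag some of these cases before you submit a job. The Python sketch below is a heuristic based only on the lists on this page, not the validation Sourcegraph performs. It assumes result types are selected with the `type:` filter, the `unsupported_reasons` helper is hypothetical, and it does not detect catch-all `.*` searches.

```python
import re

# Heuristic patterns derived from the lists on this page; not the real validator.
FILE_PREDICATE = re.compile(r"\bfile:(?:has|contains)\.\w+")
REV_FILTER = re.compile(r"(?:^|\s)rev:\S")
INDEX_FILTER = re.compile(r"(?:^|\s)index:\S")
UNSUPPORTED_TYPE = re.compile(r"(?:^|\s)type:(symbol|repo)\b")


def unsupported_reasons(query):
    """Return a list of reasons why a query is likely unsupported."""
    reasons = []
    if FILE_PREDICATE.search(query):
        reasons.append("file predicate")
    if len(REV_FILTER.findall(query)) > 1:
        reasons.append("multiple rev filters")
    if INDEX_FILTER.search(query):
        reasons.append("index: filter")
    type_match = UNSUPPORTED_TYPE.search(query)
    if type_match:
        reasons.append(f"unsupported result type: {type_match.group(1)}")
    return reasons


print(unsupported_reasons("rev:main rev:dev foo"))
```

An empty list only means that none of these heuristics fired. The result menu below the search bar remains the authoritative check for whether a query is supported.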

<Callout type="note">
	Alternatively, the search bar supports the `count:all` operator which
	increases result limits and timeouts. This works well if the search
	completes within a few minutes and the number of results is less than the
	configured display limit. For longer running searches and searches with huge
	result sets, Search Jobs is the better choice.
</Callout>

## Disable Search Jobs

To disable Search Jobs, set `DISABLE_SEARCH_JOBS=true` in your frontend and worker services.

## FAQ

### How can I run a search job against all branches and all repositories?

You can use a combination of the `repo:` and `rev:` filters to search across all branches and repositories.
For example, `r: rev:*/refs/heads/* foo` will search for `foo` across all branches and all repositories.
See the [search syntax documentation](/code-search/queries#repository-revisions) for more details.

### Is there a time limit for a search job?

We break down a search job into smaller tasks and run them in parallel.
Each task is limited to a single revision and a single repository.
This allows us to run searches across large codebases.
The chance of hitting timeouts is very low, but it's not impossible.
If you do hit a timeout, please reach out to support and we will help you troubleshoot the issue. As an immediate measure, you can break down your search query into smaller queries and run them separately.
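
One way to break a query down is to scope it to one repository at a time. The Python sketch below builds one query per repository with an anchored `repo:` filter. The `split_by_repo` helper and the repository names are illustrative assumptions, and you still submit each resulting query as its own search job.

```python
import re


def split_by_repo(pattern, repos):
    """Build one query per repository, each anchored to an exact repository name."""
    # re.escape keeps characters such as "." from being read as regexp operators.
    return [f"repo:^{re.escape(repo)}$ {pattern}" for repo in repos]


queries = split_by_repo(
    "handlePanic",
    ["github.com/sourcegraph/sourcegraph", "github.com/sourcegraph/zoekt"],
)
for query in queries:
    print(query)
```

Splitting by repository keeps each job small. If a single repository is still too large, narrowing the query with additional filters is another option.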

### A search job failed, how can I find out what went wrong?

We mark a search job as failed if any of the tasks fail.
You can access the log of a search job from the Search Jobs UI.
The log contains one line per task. If a task fails, the log will contain the error message.