Skip to content

Chapter 4: The search Command

Before we can install anything, we need to know what's available. In this chapter we'll meet repodata, channels, and the rattler Gateway by building a search command.

Design

shot search <query> will search for packages matching a name pattern and print the results:

$ shot search lua
lua          5.4.7   The Lua programming language
luarocks     3.11.1  The Lua package manager
lua-cjson    2.1.0   Fast JSON encoding/decoding for Lua

The command will accept a --channel flag (defaulting to conda-forge) and search both the current platform and noarch.

Concepts

What is repodata?

Every conda channel serves a file called repodata.json for each supported platform. It lists every available package with its metadata:

{
  "packages.conda": {
    "lua-5.4.7-h5eee18b_0.conda": {
      "build": "h5eee18b_0",
      "build_number": 0,
      "depends": ["libgcc-ng >=12"],
      "name": "lua",
      "sha256": "abc123...",
      "size": 312449,
      "subdir": "linux-64",
      "version": "5.4.7"
    },
    ...
  }
}

For a large channel like conda-forge, this file can be hundreds of megabytes. You really don't want to download all of that for every command.

Think of repodata as a catalog of the packages that exist, which a client can choose from. How this catalog is designed determines:

  • Download speed: how much data the client must fetch
  • Caching: how much can be reused between runs
  • Startup cost: how much work the client does before it can even start solving

Channels

A Channel is more than just a URL string. It knows whether you gave it a named channel ("conda-forge") or an explicit URL, and it can construct the sub-URLs for each platform.

ChannelConfig provides the base URL for named channels (by default https://conda.anaconda.org/). You can override it to point at a local mirror.

MatchSpecs: describing what you want

We've already seen MatchSpecs in the version constraint table from Chapter 3. A few extra things worth knowing about the CEP-29 syntax:

  • pkg =1.8 (with =) is fuzzy -- it matches any 1.8.* release.
  • pkg 1.8 (with a space) is exact -- it matches only version 1.8.
  • Bracket syntax lets you filter on any record field: lua[build_number='>0'].
  • You can pin a channel: conda-forge::lua >=5.4.
  • It even supports regexes: if a value starts with ^ and ends with $ it's treated as a regular expression, e.g. lua[build='^h5ee.*$'].

rattler parses a MatchSpec into a typed struct:

use rattler_conda_types::{MatchSpec, ParseMatchSpecOptions};

let opts = ParseMatchSpecOptions::default();
let spec: MatchSpec = MatchSpec::from_str("lua >=5.4", opts)?;

The Gateway

The Gateway from rattler_repodata_gateway is the main entry point for fetching repodata. It manages the on-disk cache, the HTTP client, and per-channel configuration.

Why querying is fast: sharded repodata

The naive approach would be to fetch all of repodata.json and load it into RAM. For conda-forge that's over 350 MB. That's painfully slow on first run and wasteful when you only care about packages starting with lua.

CEP-16 (sharded repodata) replaces the monolithic file with a content-addressed, per-package scheme:

  1. The server publishes a compact shard index (repodata_shards.msgpack.zst, ~670 KB for conda-forge linux-64). It maps each package name to the SHA-256 hash of that package's shard file.
  2. When you ask for lua >=5.4, the client fetches only the shard for lua at <shards_base_url>/<sha256>.msgpack.zst.
  3. Each shard contains the full metadata (versions, builds, dependencies) for that one package name, encoded as zstd-compressed msgpack.
  4. The client reads the shard's dependency lists, discovers transitive dependencies (like libgcc-ng), and fetches those shards in turn.

Because shard URLs are derived from content hashes, the server can serve them with Cache-Control: immutable. An unchanged package keeps the same URL, so the client never re-downloads shards it already has.

Setting sharded_enabled: true on the Gateway tells it to prefer the sharded format when available. Both prefix.dev and anaconda.org already serve sharded repodata for conda-forge.

The cache directory

rattler caches repodata on disk so you don't have to re-download on every run.

rattler::default_cache_dir() returns the OS-appropriate location:

  • Linux: ~/.cache/rattler/cache
  • macOS: ~/Library/Caches/rattler/cache
  • Windows: %LOCALAPPDATA%\rattler\cache

By sharing this cache with pixi and rattler-build, packages are downloaded only once across all tools.

The HTTP client

rattler uses reqwest for HTTP. We'll build a client with authentication and OCI support.

Implementation

src/client.rs: shared HTTP client

Several commands will need an HTTP client with auth and OCI support, so we put this setup in its own module.

file: src/client.rs
use std::sync::Arc;

use miette::{Context, IntoDiagnostic};
use rattler_networking::AuthenticationMiddleware;

/// Build an HTTP client with authentication and OCI middleware.
///
/// The returned client handles:
/// - Token/keyring authentication for private channels
/// - `oci://` URL translation for container-registry channels
/// - Disabled automatic gzip (repodata ships pre-compressed)
pub fn build_authenticated_client() -> miette::Result<reqwest_middleware::ClientWithMiddleware> {
    let raw_client = reqwest::Client::builder()
        .no_gzip()
        .build()
        .expect("failed to build HTTP client");

    let client = reqwest_middleware::ClientBuilder::new(raw_client.clone())
        .with_arc(Arc::new(
            AuthenticationMiddleware::from_env_and_defaults()
                .into_diagnostic()
                .context("setting up auth middleware")?,
        ))
        .with(rattler_networking::OciMiddleware::new(raw_client))
        .build();

    Ok(client)
}

The .no_gzip() call disables reqwest's automatic gzip decompression. Rattler downloads .json.zst (zstd-compressed) files and handles decompression itself. Disabling reqwest's automatic gzip prevents double-decompression, where reqwest would try to gunzip a response that rattler then also tries to decompress.

reqwest_middleware wraps reqwest::Client to allow pluggable middleware. Each middleware intercepts every request and response:

  • AuthenticationMiddleware: injects tokens from the rattler keyring or .condarc
  • OciMiddleware: translates oci:// URLs to the OCI registry API so you can use container registries as conda channels

src/commands/search.rs

Let's create a new file for the search command:

We pull in the networking and repodata crates:

#search-imports
use std::collections::HashMap;
use std::env;

use clap::Parser;
use miette::{Context, IntoDiagnostic};
use rattler::package_cache::PackageCache;
use rattler_cache::{PACKAGE_CACHE_DIR, REPODATA_CACHE_DIR};
use rattler_conda_types::{Channel, ChannelConfig, MatchSpec, ParseMatchSpecOptions, Platform};
use rattler_repodata_gateway::{Gateway, RepoData, SourceConfig};

use crate::client::build_authenticated_client;
use crate::progress::with_spinner;

The command takes a query string and optional channel flags:

#search-args
#[derive(Debug, Parser)]
pub struct Args {
    /// Package name (or prefix) to search for.
    pub query: String,

    /// Channel to search. Defaults to conda-forge.
    #[clap(short, long, default_value = "conda-forge")]
    pub channel: Vec<String>,
}

The execute function walks through the networking setup we'll reuse in Chapter 6: parse channels, build an HTTP client, configure the Gateway, and query repodata.

#search-execute
pub async fn execute(args: Args) -> miette::Result<()> {
<<search-parse-channels>>
<<search-http-client>>
<<search-gateway>>
<<search-query>>
<<search-results>> [1, 2]
}

We convert the --channel strings into rattler Channel objects and parse the query into a MatchSpec.

#search-parse-channels
    let channel_config =
        ChannelConfig::default_with_root_dir(env::current_dir().into_diagnostic()?);

    let channels: Vec<Channel> = args
        .channel
        .iter()
        .map(|s| Channel::from_str(s, &channel_config))
        .collect::<Result<_, _>>()
        .into_diagnostic()
        .context("parsing channels")?;

    let spec = MatchSpec::from_str(&args.query, ParseMatchSpecOptions::default())
        .into_diagnostic()
        .with_context(|| format!("parsing search query `{}`", args.query))?;

    let cache_dir = rattler::default_cache_dir()
        .map_err(|e| miette::miette!("could not determine cache directory: {e}"))?;
    rattler_cache::ensure_cache_dir(&cache_dir)
        .map_err(|e| miette::miette!("could not create cache directory: {e}"))?;

We call the shared helper from src/client.rs to build an authenticated HTTP client. See the section above for the full implementation.

#search-http-client
    let client = build_authenticated_client()?;

The Gateway builder takes the cache directory, HTTP client, and channel configuration. Setting sharded_enabled: true tells it to prefer the fast sharded format when a channel supports it.

#search-gateway
    let platform = Platform::current();

    let gateway = Gateway::builder()
        .with_cache_dir(cache_dir.join(REPODATA_CACHE_DIR))
        .with_package_cache(PackageCache::new(cache_dir.join(PACKAGE_CACHE_DIR)))
        .with_client(client)
        .with_channel_config(rattler_repodata_gateway::ChannelConfig {
            default: SourceConfig {
                sharded_enabled: true,
                ..SourceConfig::default()
            },
            per_channel: HashMap::new(),
        })
        .finish();

gateway.query(...) fetches repodata for the requested packages. When recursive is true, the gateway also fetches repodata for transitive dependencies. For search we only need direct matches, so we set it to false. We query both the current platform and NoArch to cover pure-Lua packages.

#search-query
    let repo_data: Vec<RepoData> = with_spinner(
        "Fetching repodata",
        gateway
            .query(channels, [platform, Platform::NoArch], vec![spec])
            .recursive(false),
    )
    .await
    .into_diagnostic()
    .context("fetching repodata")?;

The query returns a Vec<RepoData>, one per channel/platform combination. We flatten the records, deduplicate by (name, version), and bail early when the query matched nothing.

#search-results [1]
    // Collect and deduplicate results by (name, version), keeping the latest.
    let mut seen: HashMap<(String, String), String> = HashMap::new();
    for repo in &repo_data {
        for record in repo.iter() {
            let name = record.package_record.name.as_normalized().to_string();
            let version = record.package_record.version.to_string();
            let key = (name.clone(), version.clone());
            seen.entry(key).or_insert_with(|| name);
        }
    }

    if seen.is_empty() {
        println!("No packages found matching `{}`.", args.query);
        return Ok(());
    }

With duplicates collapsed, we sort the results alphabetically and print one line per package, showing only the latest version. This mimics how tools like apt search or cargo search present results.

We parse the version strings into Version values for correct ordering. Plain string comparison would sort 5.9.0 after 5.10.0.

#search-results [2]
    // Sort by name, then by version descending.
    let mut results: Vec<(String, String)> = seen.into_keys().collect();
    results.sort_by(|a, b| {
        a.0.cmp(&b.0).then_with(|| {
            let va = a.1.parse::<rattler_conda_types::Version>();
            let vb = b.1.parse::<rattler_conda_types::Version>();
            match (va, vb) {
                (Ok(va), Ok(vb)) => vb.cmp(&va),
                _ => b.1.cmp(&a.1),
            }
        })
    });

    // Deduplicate by name (show only latest version per package).
    let mut last_name = String::new();
    let mut count = 0usize;
    for (name, version) in &results {
        if *name == last_name {
            continue;
        }
        last_name.clone_from(name);
        println!("{:<30} {}", console::style(name).cyan(), version);
        count += 1;
    }

    println!("\n{} package(s) found.", count);
    Ok(())

src/progress.rs

The progress module provides spinner wrappers we'll reuse in search and install. A simple spinner does the job for us.

file: src/progress.rs
use std::borrow::Cow;
use std::future::IntoFuture;
use std::time::Duration;

use indicatif::{ProgressBar, ProgressStyle};

/// Spinner style shared across the codebase.
pub fn spinner_style() -> ProgressStyle {
    ProgressStyle::with_template("{spinner:.green} {msg}")
        .unwrap()
        // braille dots feel snappy even at 10 fps
        .tick_strings(&["⠋", "⠙", "⠸", "⠴", "⠦", "⠇", "⠋"])
}
<<with-spinner>>
<<with-spinner-sync>>

The async spinner wraps any IntoFuture:

#with-spinner
pub async fn with_spinner<T, F>(msg: impl Into<Cow<'static, str>>, fut: F) -> T
where
    F: IntoFuture<Output = T>,
{
    let pb = ProgressBar::new_spinner();
    pb.enable_steady_tick(Duration::from_millis(80));
    pb.set_style(spinner_style());
    pb.set_message(msg);
    let result = fut.into_future().await;
    pb.finish_and_clear();
    result
}

Updates to src/commands/mod.rs and src/main.rs

The search module needs to be registered. We add pub mod search; to src/commands/mod.rs (the full file is shown in Chapter 2) and a Search variant to the Command enum in src/main.rs (shown in Chapter 2).

$ shot search lua
⠋ Fetching repodata
lua                            5.4.7
luarocks                       3.11.1
lua-cjson                      2.1.0
luafilesystem                  1.8.0

4 package(s) found.

Summary

  • Repodata is a channel's package catalog, and it can be huge.
  • MatchSpecs describe what packages you're looking for.
  • The Gateway fetches repodata, preferring the sharded format (CEP-16) when available.
  • For search we query with .recursive(false) since we only need direct matches.

Exercises

Show All Versions with Build Strings

Currently shot search deduplicates results to show only the latest version per package name. Add a --all-versions flag that displays every version found in the repodata. For each version, show the build string from PackageRecord, giving users visibility into how packages are built.

Acceptance criteria
  • shot search lua --all-versions shows multiple versions (e.g., 5.4.7, 5.4.6, 5.3.5) each with their build string
  • Default behavior (without flag) is unchanged
  • Output format: lua 5.4.7 h5505292_0

Display Package Dependencies from Repodata

Add a --deps flag to shot search that prints the dependency list for each matching package. Access PackageRecord::depends (a Vec<String> of dependency specs) and display each dependency on its own indented line. Parse each dependency back through MatchSpec::from_str to validate it and show the structured name + version constraint.

Acceptance criteria
  • shot search lua --deps shows the latest version of lua with its dependencies listed below
  • Each dependency is indented and shows name + constraint (e.g., libgcc-ng >=12)
  • All dependency strings parse through MatchSpec::from_str without error
  • Packages with no dependencies show (no dependencies)

Compare Package Versions

Implement shot search <package> --diff <version1> <version2> that compares two versions of the same package side by side. Query the gateway for both versions, then diff their PackageRecord fields: dependencies added/removed/changed, build string, size, and timestamp.

Acceptance criteria
  • shot search lua --diff 5.4.6 5.4.7 shows differences between the two versions
  • Dependencies diff shows added (+), removed (-), and changed (~) entries
  • Build string, size, and timestamp differences are displayed
  • If either version is not found, a clear error is shown

In the next chapter we'll implement shot add, which will let you edit the manifest. Then in Chapter 6 we build shot lock, which adds solving to the repodata pipeline and records the result.