Deep Dive: The resolvo SAT Solver¶
In Chapter 6 we ran shot lock and the solver found a
compatible set of packages in under a second. We also saw what happens when it
fails: requesting both lua and luajit produced a conflict explanation
tracing the incompatibility back to your manifest.
How does the solver explore thousands of package versions so quickly? And how does it produce those targeted error messages instead of just "no solution"? This chapter walks through the algorithms, from the connection to SAT to the specific optimizations that make resolvo fast in practice.
Dependency solving as SAT¶
SAT (Boolean satisfiability) is the problem of deciding whether a set of true/false variables can be assigned values so that every constraint is met.
Dependency solving reduces to SAT by encoding each package version as a boolean variable:
Each constraint becomes a clause:
-
"Install exactly one version of lua":
lua_5_4_7 XOR lua_5_4_6 XOR ... -
"If
luarocks_3_11is installed,lua >= 5.1must be installed":NOT luarocks_3_11 OR lua_5_4_7 OR lua_5_4_6 OR lua_5_3_6 OR ... -
"The user requested
lua >= 5.4":lua_5_4_7 OR lua_5_4_6(but not 5.3 or older)
A SAT solver finds an assignment that satisfies all clauses, or proves no solution exists.
DPLL and CDCL: the solver algorithms¶
Modern SAT solvers use CDCL (Conflict-Driven Clause Learning), an evolution of the older DPLL algorithm.
DPLL in brief¶
DPLL works by:
- Unit propagation: if a clause has only one unset variable, that variable must be set to make the clause true. This is cheap and often resolves many variables immediately.
- Choose: pick an unset variable and assume a value (e.g., set
lua_5_4_7 = true). - Recurse: run unit propagation with the new assumption.
- Backtrack: if a conflict is found, undo the last assumption and try the opposite.
CDCL: learning from conflicts¶
CDCL improves DPLL by analyzing why a conflict occurred and adding a new clause so the solver never hits the same dead end again.
Conflict: tried lua 5.4.7, failed because luarocks needs json-lib <2.0
but lua 5.4.7 doesn't need json-lib. The conflict comes from
legacy-plugin requiring json-lib <2.0 AND json-lib 2.1 being
required by web-server 2.0.
Learned clause: NOT (web-server_2_0 AND legacy-plugin)
The learned clause is added to the formula and persists for the rest of the search. This can prune large sections of the search space.
The following diagram shows the CDCL solver loop:
graph TD
A[Pick unassigned variable] --> B[Unit propagation]
B --> C{Conflict?}
C -->|No| D{All assigned?}
D -->|No| A
D -->|Yes| E[SAT: return solution]
C -->|Yes| F[Analyze conflict]
F --> G[Learn new clause]
G --> H[Backjump]
H --> B
resolvo's design¶
resolvo is a pure-Rust CDCL solver designed specifically for package dependency solving. It was developed by the prefix.dev team for use in conda-style dependency resolution.
Key design decisions:
Lazy candidate generation¶
In traditional SAT, all clauses are known upfront. In package solving, the "clauses" are the dependency constraints of each package, and you might not know those constraints until you decide to install a package.
resolvo's DependencyProvider trait allows lazy fetching:
pub trait DependencyProvider {
// Fetch what versions of a package are available
fn get_candidates(&self, name: NameId) -> Option<Candidates>;
// Fetch the dependencies of a specific version
fn get_dependencies(&self, solvable: SolvableId) -> Dependencies;
}
The solver calls get_dependencies only for the packages it's considering.
In a large channel like conda-forge, this means you never load dependency data
for the thousands of packages you don't need.
Arena allocators¶
resolvo uses arena allocators for performance. Records are packed into a
single large array rather than allocated individually on the heap (which would
involve many small malloc calls and pointer chasing):
pub struct Arena<T> {
data: Vec<T>,
}
impl<T> Arena<T> {
pub fn alloc(&mut self, item: T) -> Id<T> {
let id = self.data.len();
self.data.push(item);
Id(id as u32)
}
pub fn get(&self, id: Id<T>) -> &T {
&self.data[id.0 as usize]
}
}
Id<T> is a u32 index, not a pointer. This keeps items adjacent in memory
(cache-friendly), uses half the space of a pointer, and avoids the need for
lifetime annotations.
The watch list optimization¶
A key unit propagation optimization is the watch list (also used in the famous MiniSat solver).
The idea works in three steps:
- For each clause
(A OR B OR C OR ...), watch two of its literals. - When a watched literal becomes false, scan the clause for a replacement literal to watch.
- If no replacement exists (all other literals are also false), the clause is unit and the remaining watched literal must be true.
This avoids scanning all clauses on every assignment change. Instead, the solver maintains a list of clauses watching each literal, and only processes those clauses when that literal changes.
Conda-specific heuristics¶
General SAT solvers don't know about package manager conventions. resolvo adds conda-specific scoring:
Preferring higher versions¶
When choosing which variable to set next, the solver prefers:
- The version directly requested by the user (highest version that satisfies the spec).
- Currently-installed versions (avoid unnecessary changes).
- Higher versions over lower ones.
- Earlier channels over later channels.
These preferences are encoded as "decisions" that the solver makes before backtracking. If a preference leads to a conflict, the solver backtracks and tries the next preference.
Explaining conflicts¶
When the solver can't find a solution, it generates a human-readable explanation. This is one of resolvo's distinctive features:
The following packages are incompatible:
* luarocks 3.11 requires json-lib >=2.0
* legacy-plugin 1.0 requires json-lib <2.0
Therefore json-lib cannot be installed.
And because you requested both luarocks and legacy-plugin,
no solution exists.
Building this explanation takes real work. The solver must trace back through its conflict graph to find the minimal set of incompatibilities.
The conda scoring model¶
rattler's interface to resolvo (rattler_solve::resolvo, provided by rattler_solve) translates the conda
preference model into resolvo's scoring system.
The key insight is that conda uses a multi-objective ordering:
1. Maximize version of user-requested packages
2. Minimize number of changes from locked packages
3. Maximize version of unlocked packages
4. Maximize build number (for the same version)
5. Prefer packages from earlier channels
These objectives combine into one strict ranking of candidates. The resolvo solver uses this ranking to guide its search.
Practical performance¶
For typical conda environments (10-50 packages), resolvo solves in milliseconds. For large environments (hundreds of packages), it can take a few seconds, which is why we show a spinner.
Repodata loading is usually the bottleneck, rather than the SAT solving itself: fetching and parsing package metadata. This is why the Gateway's sharded repodata format (CEP-16) is so important. It avoids loading millions of records for the vast majority of conda-forge packages that aren't relevant to your request.
libsolv: the alternative backend¶
rattler also ships a binding to libsolv, a C library used by older conda tools (conda, mamba). You can select it via feature flags:
resolvo is the default and recommended backend. It's faster, produces better
error messages, and doesn't require a C compiler.
The two backends share the same SolverImpl trait, so switching is a
one-line change.
Using resolvo directly¶
Moonshot uses resolvo through the rattler_solve wrapper, which implements
DependencyProvider for conda packages. Resolvo itself is not tied to conda. You
can solve any dependency problem by implementing that trait yourself.
Implementing this requires two traits: Interner (for mapping between names,
version sets, and internal IDs) and DependencyProvider itself. The provider
has four core methods:
get_candidates(name)-- return the available versions for a package name.sort_candidates(solver, solvables)-- order candidates by preference (typically highest version first).filter_candidates(candidates, version_set, inverse)-- narrow candidates to those matching (or not matching) a version constraint.get_dependencies(solvable)-- return the dependencies of a specific version.
Once you have a provider, solving looks like this:
let provider = MyProvider::new(packages);
let mut solver = Solver::new(provider);
let reqs = /* build requirements from version sets */;
let problem = Problem::new().requirements(reqs);
let solution = solver.solve(problem)?;
Implementing the full trait takes roughly a hundred lines because Interner
requires you to map between string names, version sets, and their internal
integer IDs. The resolvo repository includes BundleBoxProvider, a complete test
implementation of DependencyProvider that uses in-memory package definitions.
It is the best starting point if you want to build a custom solver backend.
Summary¶
- Dependency solving reduces to SAT.
- CDCL improves on DPLL by learning clauses from conflicts.
- resolvo uses lazy dependency loading, arena allocators, and watch lists for efficiency.
- Conda-specific heuristics encode version preferences and minimize changes.
- Conflict explanations trace the incompatibility graph for human-readable errors.
- resolvo is faster than libsolv and produces better diagnostics.