Caching

By default, mdbook-rustdoc-link spawns a fresh rust-analyzer process every time it is run. rust-analyzer then reindexes your entire project before resolving links.

This significantly impacts the responsiveness of mdbook serve: it is as if you had to reopen your editor for every live reload, and it gets worse the more dependencies your project has.

To mitigate this, there is an experimental caching feature, disabled by default.

Enabling caching

In your book.toml, in the [preprocessor.rustdoc-link] table, set cache-dir to the relative path of a directory of your choice (other than your book's build-dir), for example:

[preprocessor.rustdoc-link]
cache-dir = "cache"
# You could also point to an arbitrary directory in target/

Now, when mdbook rebuilds your book during build or serve, the preprocessor reuses the previous resolution and skips rust-analyzer entirely, as long as your edit does not change the set of Rust items to be linked; that is, as long as it introduces no items unseen in the previous build.

important

If you use a directory under your book's root directory, make sure you also have a .gitignore in the book root to exclude it from source control (for example, one containing the single line cache/); otherwise, the cache file itself could trigger additional reloads. See Specify exclude patterns in the mdBook documentation.

Do not use your book's build-dir as the cache-dir: mdbook clears the output directory on every build, making this setup useless.

How it works

note

The following are implementation details. See rustdoc_link/cache.rs.

The effectiveness of this mechanism is based on the following assumptions:

  • Most of the changes made during authoring don't actually involve item links.
  • Assuming the environment is unchanged, the same set of items should resolve to the same set of links.

The cache keeps the following information in a cache.json (a rough sketch of its shape follows the list):

  • The set of items to be resolved, and their resolved links
  • The environment, as a checksum over the contents of:
    • Your crate's Cargo.toml
    • If you are using a workspace, the workspace's Cargo.toml
    • The entrypoint (lib.rs or main.rs)
    • For each item that is defined within your crate or workspace, its source file
    • (Note that Cargo.lock is currently not considered, nor are dependencies or std)
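
As a rough illustration only, the cached data could be modeled along these lines. This is a simplified, hypothetical sketch with made-up field names; the actual types live in rustdoc_link/cache.rs and may look quite different. It assumes serde with the derive feature:

use std::collections::HashMap;

use serde::{Deserialize, Serialize};

// Hypothetical model of the cache file; names are illustrative only.
#[derive(Serialize, Deserialize)]
struct Cache {
    // Items requested in the previous build, mapped to their resolved links.
    resolved: HashMap<String, String>,
    // Checksum over the Cargo.toml files, the entrypoint, and the source
    // files of items defined in the crate or workspace.
    environment: String,
}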

If a subsequent run has the same set of items (or a subset) and the same checksum (meaning you did not update your code), then the preprocessor simply reuses the previous results.
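
Reusing the hypothetical Cache type sketched above, the reuse check boils down to something like this (a sketch of the idea, not the actual implementation):

// Hypothetical check: items and checksum describe the current build.
fn can_reuse(cache: &Cache, items: &[String], checksum: &str) -> bool {
    // Every requested item must already have a cached link...
    let all_known = items.iter().all(|item| cache.resolved.contains_key(item));
    // ...and the environment checksum must be unchanged.
    all_known && cache.environment == checksum
}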

tip

Items that fail to resolve are not included in the cache.

If you keep such broken links in your Markdown source, the cache will permanently miss, and rust-analyzer will run on every edit.

Help wanted 🙌

The cache feature, as it currently stands, is a workaround at best. If you have insights on how performance could be further improved, please open an issue!

Cache priming and progress tracking

The preprocessor spawns rust-analyzer with cache priming enabled; cache priming accounts for the majority of the build time.

Furthermore, the preprocessor relies on LSP Work Done Progress notifications to know when rust-analyzer has finished cache priming before it actually sends out external docs requests. This requires parsing unstructured log messages that rust-analyzer emits, plus some debouncing/throttling logic, which is not ideal; see client.rs.

If the preprocessor does not wait for indexing to finish and sends requests too early, rust-analyzer responds with empty results.
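
In spirit, the waiting logic amounts to debouncing: treat rust-analyzer as ready once no indexing-related progress has been seen for a quiet period. Below is a much-simplified sketch of that idea; the real logic in client.rs works with actual LSP progress notifications and is more involved:

use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

// Illustrative stand-in for progress messages received from rust-analyzer.
enum Progress {
    Indexing, // e.g. cache priming / indexing work-done progress
    Other,    // anything unrelated
}

// Treat rust-analyzer as ready once no indexing activity has been observed
// for the given quiet period. Each indexing message pushes the deadline back.
fn wait_until_ready(progress: &Receiver<Progress>, quiet: Duration) {
    let mut deadline = Instant::now() + quiet;
    loop {
        let now = Instant::now();
        if now >= deadline {
            return;
        }
        match progress.recv_timeout(deadline - now) {
            Ok(Progress::Indexing) => deadline = Instant::now() + quiet,
            Ok(Progress::Other) => {}
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => return,
        }
    }
}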

Questions:

  • Is it possible to do it without cache priming?
  • Is there a better way to track rust-analyzer's "readiness" without resorting to arbitrary sleeps?

Using ra-multiplex

ra-multiplex "allows multiple LSP clients (editor windows) to share a single rust-analyzer instance per cargo workspace."

In theory, in an IDE setting (e.g. with VS Code), one could set up the IDE and mdbook-rustdoc-link to both connect to the same ra-multiplex server. The preprocessor would then not need to wait for cache priming (the cache is already warm from IDE use). Changes in the workspace would also be reflected in subsequent builds without the preprocessor having to be aware of them (because the IDE does the synchronizing).

In reality, with the current version, connecting the preprocessor to ra-multiplex seems to result in buggy builds: the initial build emits many warnings even though all items eventually resolve, and subsequent builds hang until they time out.

Question:

  • Is it possible to use ra-multiplex here?

Postscript

mdbook encourages a stateless architecture for preprocessors. Preprocessors are expected to work like pure functions over the entire book, even under mdbook serve. They are not told whether they are invoked as part of mdbook build (where a fresh start is preferable) or mdbook serve (where maintaining state between runs would help).
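
For context, this is roughly the contract a preprocessor implements on the mdbook side: it is handed the entire book and must return the processed book, with no hook for carrying state from one invocation to the next. Here is a minimal example using mdbook's standard Preprocessor trait (a generic illustration, not mdbook-rustdoc-link's actual code):

use mdbook::book::Book;
use mdbook::errors::Result;
use mdbook::preprocess::{Preprocessor, PreprocessorContext};

struct Nop;

impl Preprocessor for Nop {
    fn name(&self) -> &str {
        "nop"
    }

    // Called with the whole book on every build (and on every rebuild under
    // mdbook serve); whatever this returns is what gets rendered. Any state,
    // such as the cache described earlier, has to be persisted externally.
    fn run(&self, _ctx: &PreprocessorContext, book: Book) -> Result<Book> {
        Ok(book)
    }
}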

rust-analyzer, meanwhile, has a stateful architecture that also doesn't yet have persistent caching¹. It is designed to take in a ground state (your project initially) and then evolve that state (your project as you edit it) entirely in memory.

So rust-analyzer has an extremely incremental architecture, perfect for a complex language like Rust, while mdbook has an explicitly non-incremental architecture, perfect for rendering Markdown. This makes it somewhat challenging to get the two to work well together in a live-reload scenario.


  1. It was mentioned that the recently updated, salsa-ified rust-analyzer (version 2025-03-17) will unblock work on persistent caching, among many other things, so hopefully bigger changes are coming!