Guide/src/dev_guide/snapshot_format.md
# Snapshot Format

This page documents the on-disk format used by OpenVMM snapshots, intended
for developers working on the save/restore subsystem.

## Directory layout

A snapshot is stored as a directory containing three files:

```text
snapshot-dir/
├── manifest.bin    # Protobuf-encoded SnapshotManifest
├── state.bin       # Protobuf-encoded device saved state
└── memory.bin      # Hard link to the guest memory backing file
```

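As a quick illustration of the layout above, the following sketch reports which of the three expected files are absent from a directory. The helper name `missing_snapshot_files` and the `SNAPSHOT_FILES` constant are hypothetical and are not part of OpenVMM; only the three file names come from this page.

```rust
use std::path::Path;

/// File names taken from the directory layout documented above.
const SNAPSHOT_FILES: [&str; 3] = ["manifest.bin", "state.bin", "memory.bin"];

/// Returns the expected snapshot files that are missing from `dir`.
/// Hypothetical helper for illustration; not an OpenVMM API.
fn missing_snapshot_files(dir: &Path) -> Vec<&'static str> {
    SNAPSHOT_FILES
        .iter()
        .copied()
        // A hard link is a regular file, so `is_file` also accepts `memory.bin`.
        .filter(|name| !dir.join(name).is_file())
        .collect()
}
```

An empty result means the directory has the shape of a snapshot; it says nothing about whether the contents decode correctly.
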
## Manifest format

The manifest is a protobuf message defined as
[`SnapshotManifest`](https://openvmm.dev/rustdoc/linux/openvmm_helpers/snapshot/struct.SnapshotManifest.html)
in `openvmm/openvmm_helpers/src/snapshot.rs`, encoded using the `mesh`
crate's protobuf encoding.

## Device state (`state.bin`)

The device state contains every device's saved state, collected via the
`SaveRestore` trait and encoded as a `mesh` protobuf message. The
[Save State](contrib/save-state.md) compatibility rules (mesh tag stability,
default values, forward/backward compatibility) apply.

## Memory (`memory.bin`)

`memory.bin` is a hard link to the file-backed guest RAM file. During a save,
`write_snapshot()` creates this hard link using `std::fs::hard_link`.

```admonish note
The hard-link approach means the memory backing file and snapshot directory
must reside on the same filesystem. If they are on different filesystems,
`write_snapshot` returns an error with a suggestion to place the backing
file inside the snapshot directory.
```

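The hard-link step can be sketched with the standard library alone. This is a minimal illustration and not the actual `write_snapshot()` implementation: the helper name `link_memory_file` and the error wording are assumptions, and the hint is appended to every failure because the sketch does not try to identify a cross-filesystem error specifically.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hard-links `backing_file` to `<snapshot_dir>/memory.bin`.
/// Hypothetical helper; the error text is illustrative only.
fn link_memory_file(backing_file: &Path, snapshot_dir: &Path) -> io::Result<()> {
    let target = snapshot_dir.join("memory.bin");
    fs::hard_link(backing_file, &target).map_err(|err| {
        // Keep the original error kind and add a hint for the caller.
        io::Error::new(
            err.kind(),
            format!(
                "failed to hard-link {} to {}: {}; if the two paths are on different \
                 filesystems, place the backing file inside the snapshot directory",
                backing_file.display(),
                target.display(),
                err
            ),
        )
    })
}
```

Because a hard link shares the underlying data with the backing file, no guest memory is copied in this step.
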
### Same-file detection

If the user passes `--memory-backing-file <snapshot_dir>/memory.bin`, the
source and target of the hard link are the same file. The code detects this
by canonicalizing both paths and comparing them. When they match, the
hard-link step is skipped.

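A minimal sketch of this check, using `std::fs::canonicalize`, might look as follows. The function name `is_same_file` is hypothetical, and treating a target that does not exist yet as "different" is an assumption of this sketch (canonicalization fails on nonexistent paths), not a statement about the OpenVMM code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns `true` when `source` and `target` resolve to the same path.
/// Hypothetical helper for illustration; not an OpenVMM API.
fn is_same_file(source: &Path, target: &Path) -> io::Result<bool> {
    // Assumption: a target that does not exist yet cannot be the source.
    if !target.exists() {
        return Ok(false);
    }
    // Canonicalization resolves `.`/`..` components and symlinks before comparing.
    Ok(fs::canonicalize(source)? == fs::canonicalize(target)?)
}
```

A caller would skip the hard-link step when this returns `Ok(true)`.
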
## Code references

- Manifest type and I/O: `openvmm/openvmm_helpers/src/snapshot.rs`
- Restore entry point: `prepare_snapshot_restore()` in
  `openvmm/openvmm_entry/src/lib.rs`
- File-backed memory: `SharedMemoryFd` type alias in
  `openvmm/openvmm_defs/src/worker.rs`

## Device state architecture

Each VM component that participates in save/restore is registered as a
"state unit" with a unique string name via `StateUnits::add("name")`.
During save, every state unit receives a `StateRequest::Save`. Units that
have state return `Ok(Some(blob))`; units with no persistent state (e.g.
the input distributor) return `Ok(None)` and are omitted from `state.bin`.

The resulting `state.bin` contains a `Vec<SavedStateUnit>`, where each
entry pairs a unit name with its opaque protobuf-encoded state blob.

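The save flow described above can be modelled in a few lines. The types below are simplified stand-ins that only borrow the names `SavedStateUnit` and `SaveError` from this page; their fields, the `SaveResult` alias, and the `collect_saved_state` function are assumptions made for illustration and do not mirror the real definitions.

```rust
/// Simplified stand-in for an entry in `state.bin`: a unit name plus an opaque blob.
#[derive(Debug, PartialEq)]
struct SavedStateUnit {
    name: String,
    state: Vec<u8>,
}

/// Simplified stand-in error type.
#[derive(Debug, PartialEq)]
enum SaveError {
    NotSupported,
}

/// What one unit returns from a save request in this sketch.
type SaveResult = Result<Option<Vec<u8>>, SaveError>;

/// Collects the responses of all units into the list that would be written out.
/// Hypothetical function for illustration; not an OpenVMM API.
fn collect_saved_state(
    units: Vec<(&str, SaveResult)>,
) -> Result<Vec<SavedStateUnit>, SaveError> {
    let mut saved = Vec::new();
    for (name, result) in units {
        // Any unit error aborts the whole save; `Ok(None)` units are omitted.
        if let Some(state) = result? {
            saved.push(SavedStateUnit {
                name: name.to_string(),
                state,
            });
        }
    }
    Ok(saved)
}
```
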
### Restore matching rules

During restore, `StateUnits::restore()` matches saved-state entries to
currently registered units **by name**:

| Scenario | Result |
|---|---|
| Names match exactly | State is dispatched to the unit |
| Saved entry has no matching unit | **Error**: `unknown unit name` |
| Unit exists with no saved entry | Unit is skipped (keeps default state) |
| Duplicate name in saved state | **Error**: `duplicate unit name` |

This means that restore fails if a device is removed between save and
restore, but adding a new device is allowed (it initializes to its
power-on defaults).

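The matching rules in the table can be expressed as a small function. This is a sketch under stated assumptions: `RestoreError`, `match_saved_state`, and the `(String, Vec<u8>)` entry shape are hypothetical and do not reflect the real `StateUnits::restore()` signature.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical error type mirroring the two error rows of the table.
#[derive(Debug, PartialEq)]
enum RestoreError {
    UnknownUnitName(String),
    DuplicateUnitName(String),
}

/// Matches saved entries to registered unit names and returns the blob for
/// each matched unit. Registered units without a saved entry are simply
/// absent from the result, so they keep their default state.
fn match_saved_state(
    registered: &[&str],
    saved: Vec<(String, Vec<u8>)>,
) -> Result<HashMap<String, Vec<u8>>, RestoreError> {
    let known: HashSet<&str> = registered.iter().copied().collect();
    let mut matched = HashMap::new();
    for (name, state) in saved {
        if !known.contains(name.as_str()) {
            return Err(RestoreError::UnknownUnitName(name));
        }
        if matched.contains_key(&name) {
            return Err(RestoreError::DuplicateUnitName(name));
        }
        matched.insert(name, state);
    }
    Ok(matched)
}
```
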
### Unit naming conventions

- **Chipset devices**: registered via `arc_mutex_device("name")` in
  `vmotherboard`, e.g. `"pit"`, `"rtc"`, `"uefi"`, `"ide"`.
- **VMBus devices**: named `"{interface_name}:{instance_id}"`, e.g.
  `"StorageVsp:ba6163d9-..."`. The instance GUID makes each offer
  unique.
- **Infrastructure units**: `"vmtime"`, `"input"`, `"vmbus"`.

### Devices that do not support save/restore

Not all devices implement save/restore. Devices signal this in one of
two ways:

1. **`SaveError::NotSupported`**: the `save()` method returns this error.
   If any state unit does this, the entire save operation fails.
2. **`supports_save_restore() -> false`** (virtio) or
   `supports_save_restore() -> None` (VMBus): a transport-level check
   that causes the transport's `save()` to return
   `SaveError::NotSupported`.

Key unsupported categories:

- **PCIe**: `GenericPcieRootComplex` and `GenericPcieSwitch` return
  `SaveError::NotSupported`.
- **NVMe**: `NvmeController` returns `SaveError::NotSupported`.
- **Pass-through PCI**: `AssignedPciDevice`, `RelayedVpciDevice`.
- **VGA / GDMA**: marked `todo!()` (will panic on save).
- **Virtio devices**: the `VirtioDevice` trait defaults
  `supports_save_restore()` to `false`. Only `virtio-blk`,
  `virtio-net`, `virtio-pmem`, and `virtio-rng` override it to `true`.
  Devices with host-side session state (`virtio-9p`, `virtiofs`,
  `virtio-console`) intentionally leave it `false`.
- **Some VMBus devices**: `GuestCrashDevice`, `GuestEmulationDevice`,
  `VmbusSerialHost`, and `Vmbfs` return `None` from
  `supports_save_restore()`.

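The opt-in pattern used by virtio devices can be sketched with a trait default method. Everything below is a simplified stand-in: the trait shares only its name and the `supports_save_restore()` default with the description above, while `save_device`, `transport_save`, `ExampleBlk`, and `ExampleConsole` are hypothetical names introduced for illustration.

```rust
/// Simplified stand-in error type.
#[derive(Debug, PartialEq)]
enum SaveError {
    NotSupported,
}

/// Simplified stand-in for the virtio device trait.
trait VirtioDevice {
    /// Defaults to `false`, so a device must opt in explicitly.
    fn supports_save_restore(&self) -> bool {
        false
    }

    /// Hypothetical method returning the device's state blob.
    fn save_device(&self) -> Vec<u8> {
        Vec::new()
    }
}

/// Example device that opts in to save/restore.
struct ExampleBlk;

impl VirtioDevice for ExampleBlk {
    fn supports_save_restore(&self) -> bool {
        true
    }

    fn save_device(&self) -> Vec<u8> {
        vec![0xB1]
    }
}

/// Example device that keeps the default and therefore cannot be saved.
struct ExampleConsole;

impl VirtioDevice for ExampleConsole {}

/// Transport-level save: refuses devices that have not opted in.
fn transport_save(device: &dyn VirtioDevice) -> Result<Vec<u8>, SaveError> {
    if !device.supports_save_restore() {
        return Err(SaveError::NotSupported);
    }
    Ok(device.save_device())
}
```

Defaulting to `false` means a newly added device fails the save loudly instead of silently losing its state.
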
## Extending the format

When adding new fields to `SnapshotManifest`, use the next available mesh
tag number. The protobuf encoding is forward-compatible: older readers will
ignore unknown fields. However, removing or reordering existing fields is a
breaking change. See [Save State](contrib/save-state.md) for the full set of
compatibility rules.

```admonish warning
Changing the mesh tag numbers of existing fields will break compatibility
with previously saved snapshots.
```