microsoft/openvmm

Public

mirrored fromhttps://github.com/microsoft/openvmmAvailable

CodeCommitsIssuesPull requestsActionsInsightsSecurity
f04cd4aa22bccbc47d27ddc5a95197bf88d3b54a

Branches

Tags

  • No tags available.
0Branches0Tags
Go to file
Add file
Code

Clone

HTTPS

Download ZIP

Guide/src/dev_guide/contrib/code.md

436lines · modecode

1# Coding conventions
2
3One of our major goals with OpenVMM is to provide a high quality coding
4experience for contributors, starting first-and-foremost by having a consistent
5set of coding conventions in the project.
6
7[Do your part](https://youtu.be/-_7FaWnlhS4?t=10) and keep OpenVMM clean!
8
9## `rustfmt`
10
11_Checked Automatically:_ **Yes** (via [`cargo xtask fmt rustfmt`](../dev_tools/xtask.md))
12
13OpenVMM source must be formatted using
14[rustfmt](https://github.com/rust-lang/rustfmt), which automatically and
15mechanically applies standard formatting to all the code. This eliminates time
16spent discussing or reviewing stylistic issues in pull requests.
17
18The CI will run `rustfmt --check` to enforce consistent formatting, and will
19fail if it notices any discrepancies.
20
21Unfortunately, `rustfmt` isn't infinitely customizable, and there are several
22rules that are must be manually enforced:
23
24- All lines must end with LF, not CRLF.
25- Top-level `use` imports should be non-nested.
26
27> NOTE: Some of these manually-enforced conventions were introduced late into
28> OpenVMM's development, and there may be chunks of the codebase that do not
29> adhere to these conventions.
30>
31> If you're working in a file and notice that it isn't following a certain
32> convention, please take a moment to fix it!
33
34Assuming you've followed the [suggested dev env setup](../getting_started/suggested_dev_env.md)
35and set up `rust-analyzer` to format-on-save, you should rarely have to think
36about formatting in `.rs` files.
37
38## House Rules
39
40_Checked Automatically:_ **Yes** (via [`cargo xtask fmt house-rules`](../dev_tools/xtask.md))
41
42"House-rules" are a set of misc code lints that are specific to OpenVMM, which
43are enforced using a custom in-house tool:
44
45- enforce the presence of the standard Microsoft copyright header
46- enforce in-repo crate names don't use '-' in their name (use '_' instead)
47- enforce Cargo.toml files don't include autogenerated "see more keys" comments
48- enforce Cargo.toml files don't contain author or version fields
49- enforce files end with a single trailing newline
50- deny usage of `#[repr(packed)]` (you want `#[repr(C, packed)]`)
51- justify usage of `cfg(target_arch = ...)` (use `guest_arch` instead!)
52- justify usage of `allow(unsafe_code)` with an UNSAFETY comment
53
54Some of these lints are self explanatory, whereas others are described in more
55detail elsewhere on this page.
56
57## Unused `Cargo.toml` Dependencies
58
59_Checked Automatically:_ **Yes** (via [`cargo xtask fmt unused-deps`](../dev_tools/xtask.md))
60
61We have an in-repo fork of
62[`cargo-machete`](https://github.com/bnjbvr/cargo-machete) that ensures
63`Cargo.toml` files only include dependencies that are actually being used.
64
65Avoiding unused dependencies makes it easier to reason about what a crate is
66doing just by looking at its dependencies, and also helps cut-down on
67incremental compile times.
68
69## Formatting (`Cargo.toml`)
70
71_Checked Automatically:_ **No**
72
73When defining dependencies in `Cargo.toml` files, please organize dependencies
74into the following groups:
75
76- crate-specific "subcrates"
77- crates under vm/
78- crates under vm/vmcore/
79- crates under support/
80- external dependencies
81
82The rationale here is that crates should be grouped according to how related to
83how _widely applicable_ they are. i.e: crates from crates.io and support are
84widely applicable outside of OpenVMM, whereas crates under vmcore/ only make
85sense within the context of OpenVMM, and crate-specific subcrates are - by
86definition - only applicable to the crate they are being imported from.
87
88Additionally, we make use of the
89[workspace dependencies](https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-dependencies-table)
90feature to ensure that all our dependencies stay in sync. This requires defining
91dependencies in your crate's `Cargo.toml` file and in the project's root `Cargo.toml`.
92
93So, for example:
94
95```toml
96[package]
97name = "openvmm"
98
99[dependencies]
100# crate-specific subcrates
101openvmm_core.workspace = true
102...
103
104# /vmcore
105vmcore.workspace = true
106...
107
108# /vm/devices
109firmware_uefi_custom_vars.workspace = true
110storvsp.workspace = true
111...
112
113# /support
114guid.workspace = true
115inspect.workspace = true
116inspect_proto.workspace = true
117...
118
119# external dependencies
120anyhow.workspace = true
121cfg-if.workspace = true
122clap.workspace = true
123...
124```
125
126## Linting (via `clippy`)
127
128_Checked Automatically:_ **Yes**
129
130OpenVMM uses [`cargo clippy`](https://github.com/rust-lang/rust-clippy) to
131supplement rustc's built-in lints.
132
133Assuming you've followed the guide and set up `rust-analyzer` to
134[use clippy](../getting_started/suggested_dev_env.md#enabling-clippy), you should see clippy lints
135appear inline when working on Rust code.
136
137The CI runs `cargo clippy` on every crate in the repo prior to building the
138project, and will fast-fail if it catches any warnings / errors.
139
140### Suppressing Lints
141
142In general, lints should be fixed by modifying the code to satisfy the lint.
143However, there are cases where a lint may need to be `allow`'d inline.
144
145In these cases, you _must_ provide a inline comment providing reasonable
146justification for the suppressed lint.
147
148e.g:
149
150```rust
151// x86_64-unknown-linux-musl targets have a different type defn for
152// `libc::cmsghdr`, hence why these lints are being suppressed.
153#[allow(clippy::needless_update, clippy::useless_conversion)]
154libc::cmsghdr {
155 cmsg_level: libc::SOL_SOCKET,
156 cmsg_type: libc::SCM_RIGHTS,
157 cmsg_len: (size_of::<libc::cmsghdr>() + size_of_val(fds))
158 .try_into()
159 .unwrap(),
160 ..std::mem::zeroed()
161}
162```
163
164### OpenVMM's `clippy` Configuration
165
166We stick fairly close to the default set of rustc / clippy lints, though there
167are some default lints that we've decided to disabled project wide, and other
168non-default lints which we've explicitly opted into.
169
170See the `[workspace.lints]` sections of OpenVMM's root
171[`Cargo.toml`](https://github.com/microsoft/openvmm?path=/Cargo.toml)
172for a list of globally enabled/disabled lints, along with justification as to
173why certain lints have been enabled/disabled.
174
175## Unsafe Code Policy
176
177> _Preface:_ When possible, try to avoid introducing new `unsafe` code!
178>
179> Before rolling your own `unsafe` code, check to see if a safe abstraction
180> already exists, either in-tree, on crates.io\*, or in the standard library.
181>
182> \*subject to an unsafe-code audit
183
184Rather than synthesizing our own unsafe code conventions, we follow the
185guidelines outlined in the following two resources:
186
187- [The Rust `unsafe` docs](https://doc.rust-lang.org/std/keyword.unsafe.html)
188- [The Fuchsia "Unsafe code in Rust" Guidelines](https://fuchsia.googlesource.com/fuchsia/+/refs/heads/main/docs/development/languages/rust/unsafe.md)
189
190In a nutshell:
191
192- `unsafe` functions are required to include `/// # Safety` documentation
193 describing any preconditions the caller must uphold when calling the function.
194- `unsafe {}` blocks are required to include a `// SAFETY:` comment describing
195 how the preconditions for calling the `unsafe` function(s) within the block
196 are being satisfied.
197- `allow(unsafe_code)` annotations are required to include an `// UNSAFETY:`
198 comment justifying why the code in question needs to use `unsafe`. This
199 annotation must be placed at the module or crate level.
200
201These requirements are enforced by CI, and will cause the build to fail if
202required documentation is missing.
203
204Editing a file containing unsafe code will trigger CI to automatically add the
205OpenVMM Unsafe Approvers group to your PR. This is to ensure that all unsafe code
206is audited for correctness by area experts.
207
208## Uses of `cfg(target_arch = ...)` must be justified
209
210_Checked Automatically:_ **Yes** (via [`cargo xtask fmt house-rules`](../dev_tools/xtask.md))
211
212Unless you're working on something that's genuinely tied to the host's CPU
213architecture, you should use `cfg(guest_arch = ...)` instead of `cfg(target_arch = ...)`.
214
215OpenVMM is a multi-architecture VMM framework, capable of running running x64
216guests on a x64 host, as well as Aarch64 guests on Aarch64 hosts (with the
217potential of adding additional platforms in the future).
218
219At the moment, OpenVMM requires that the host architecture and guest
220architecture match. That said, it's possible that at some point in the future,
221OpenVMM may also support _mismatched_ guest/host architectures, via an emulated
222CPU virtualization backend (akin to QEMU).
223
224Having an emulated CPU backend would enable OpenVMM to support such useful
225scenarios as:
226
227- running Arch64 guests on x86 machines
228- running x86 guests on ARM
229- running something exotic (e.g: RISC-V) on x86/ARM machines
230 - ...assuming we had the bandwidth to implement + maintain something like that
231- running OpenVMM on systems without hardware-accelerated CPU virtualization enabled
232
233With these scenarios in mind, it would be short-sighted to rely entirely on
234`cfg(target_arch = ...)` to gate guest-facing arch-specific functionality, as it
235would inexorably tie the guest's arch to the host's arch, making any future
236initiatives to pry the two apart significantly more difficult!
237
238As such, the `OpenVMM` repo includes infrastructure to specify a custom,
239OpenVMM-specific `cfg(guest_arch = ...)` cfg parameter.
240
241By default, `cfg(guest_arch = ...)` will act the same way as `cfg(target_arch =
242...)`, but it can be swapped to a different architecture by setting the
243`OPENVMM_GUEST_TARGET` env var at compile-time.
244
245There are very few reasons to use `cfg(target_arch = ...)` within the OpenVMM
246repo, and to enforce this rule, we have an in-house `xtask fmt house-rules`
247check that lints each use of `cfg(target_arch = ...)` to include a
248"justification" for why it's being used.
249
250e.g: `cfg(target_arch = ...)` would be applicable when feature-gating a CPU
251intrinsic (such as CPUID, or a SIMD instruction), or when implementing a
252`*-sys` crate where the underlying C API/ABI varies between architectures.
253
254...otherwise, use `cfg(guest_arch = ...)`!
255
256## Avoid `Default` when using `zerocopy::FromZeroes`
257
258_Checked Automatically:_ **No**
259
260The rule:
261
262- A type can `derive(Default)` **XOR** `derive(FromZeroes)`.
263- A type that is `FromZeroes` can also `impl Default`, but it must be a
264 conscious, explicit choice, with justification (read: inline comment) as to
265 why that particular default value was chosen.
266
267The why:
268
269- The all-zero type is often not a semantically valid `Default` value for a type
270- There are plenty of types that don't have a `Default` value, but _do_ have a
271 valid all-zero repr
272
273### Additional context
274
275As per the Rust docs for [`Default::default`](https://doc.rust-lang.org/std/default/trait.Default.html#tymethod.default):
276
277> `fn default() -> Self`
278>
279> Returns the “default value” for a type.
280>
281> Default values are often some kind of initial value, identity value, or anything else that may make sense as a default.
282
283Notably, **default should _not_ be used as some shorthand to "zero initialize"
284values!** For most non-trivial structs, the all-zero representation is _not_ a
285semantically valid `Default`!
286
287This is true in many contexts... but one that's particularly relevant in OpenVMM
288is that of FFI via C-style APIs and ABIs.
289
290In C, it's very common for types to undergo multi-stage instantiation, where
291they are initially allocated as all-zero, and then get "filled in" by some
292secondary init code. Notably: it's quite rare for that initial "all-zero" struct
293to be a valid instance of the type!
294
295#### Example: C FFI
296
297For example, a common pattern in C libraries might look something like:
298
299```c
300struct Handle {
301 uint16_t opaque_handle;
302}
303
304struct Handle handle = {0};
305init_handle(&handle);
306
307update_handle(handle, options);
308do_thing(handle);
309```
310
311In this example: it would be an error to invoke `update_handle` or `do_thing`
312with an all-zero handle, as the type hadn't finished being fully initialized.
313
314If we wanted to use this library from Rust, a "naive" approach would be to do
315something like:
316
317```rust
318#[repr(C)]
319#[derive(Default)]
320struct Handle {
321 opaque_handle: u16,
322}
323
324let mut handle = Handle::default(); // BAD!
325unsafe { init_handle(&handle) };
326
327unsafe { update_handle(handle, options) };
328unsafe { do_thing(handle) };
329```
330
331While this _technically_ works... using `Default` here is kinda bogus!
332
333After all - `Handle::default()` doesn't actually call `init_handle`, which means
334the value returned by `default()` doesn't match the "promise" of the trait!
335Namely: the returned value is not semantically valid yet!
336
337In many other Rust codebase, this "overloading" of `Default` to represent both
338semantically valid value _and_ all-zero values (in FFI) is par for the course,
339as once structs get more complicated, having a `derive` that is able to fully
340init a "uninitialized" struct in-memory is quite handy...
341
342In OpenVMM, we don't do this. Instead, we use a separate trait to init all-zero
343structs.
344
345**In OpenVMM, we use `FromZeroes` and `FromZeroes::new_zeroed()` to work with types
346that have valid all-zero representations, _without_ implying that those types
347also have valid all-zero _default_ values!**
348
349So, for the example above:
350
351```rust
352#[repr(C)]
353#[derive(zerocopy::FromZeroes)]
354struct Handle {
355 opaque_handle: u16
356}
357
358let mut handle = Handle::new_zeroed(); // GOOD!
359unsafe { init_handle(&handle) };
360```
361
362Now, it's impossible for code elsewhere to obtain a `Handle` via
363`Handle::default`, and mistakenly forget to invoke `init_handle` on it.
364
365...but if it so happens that we _do_ want a `Default` impl for `Handle`, we can
366do so by _manually_ implementing `derive(Default)` ourselves:
367
368```rust
369// Default + FromZeroes: `default` returns fully initialized handle
370impl Default for Handle {
371 fn default() -> Handle {
372 let mut handle = Handle::new_zeroed();
373 unsafe { init_handle(&handle) };
374 handle
375 }
376}
377```
378
379## Avoid Requiring `Debug` on Traits
380
381_Checked Automatically:_ **No**
382
383**TL;DR:** Don't do this:
384
385```rust
386trait MyTrait: std::fmt::Debug
387```
388
389Implementations of the standard library's `Debug` trait can be surprisingly large,
390and the final binary size of OpenHCL and related binaries is a major concern
391for us. Unused implementations of this trait are usually removed during the
392optimization process (like all dead code), making this a non-issue. However when
393traits, and more specifically trait objects, are involved, the compiler has a
394much more difficult time proving that implementations are unused. This can
395result in large amounts of functionally dead code ending up in the final
396binaries.
397
398If you need to implement `Debug` for a struct containing such a trait object, you
399will need to do so manually, so that you can skip over that field.
400
401Moreover, it's usually not good form to leave tracing statements that log a
402struct's `Debug` representation in production. Prefer tracing just the fields
403you're interested in, and/or connecting objects to the `Inspect` graph.
404
405## Crate Naming
406
407_Checked Automatically:_ **Yes** (via [`cargo xtask fmt house-rules`](../dev_tools/xtask.md))
408
409**Crates must be named with underscores, not dashes and underscores used in
410folder names.**
411
412- Bad: `my-cool-crate`
413- Good: `my_cool_crate`
414
415Rust does not allow dashes in imports, with any dashes getting replaced with
416underscores when used in the code. Avoiding dashes altogether makes it easier to
417grep for crate names, and makes things more consistent across the repo.
418
419This convention is enforced by CI
420
421**Do not name crates with the words "base, util, common" or other terms that are
422overly general.**
423
424For example, consider a crate that provides a common data structure used by
425multiple devices:
426
427- Bad: `devices_common`
428- Good: `range_map`
429
430Libraries that contain the following eventually become a mishmash of unrelated
431functionality that is located there for convenience. This
432[blog post](https://dave.cheney.net/2019/01/08/avoid-package-names-like-base-util-or-common)
433goes more in-depth as to why.
434
435Instead, name things based on what they logically provide, like functionality or
436data types.
437