This article is a good primer on why process isolation is more robust/separated than threads/coroutines in general, though ironically I don't think it fully justifies why process isolation is better for tests as the specific use case benefiting from that isolation. For tests specifically, some considerations I found to be missing:
- Given the speed and representativeness requirements for tests, it's often beneficial to refrain from too much isolation so that multiple tests can exercise paths that use pre-primed in-memory state (caches, open sockets, etc.). It's odd that the article calls out protection from global-ish state mutation as a specific benefit of process isolation, given that it's often substantially faster and more representative of real production environments to run tests in the presence of already-primed global state. Other commenters have pointed this out.
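To make that concrete, here's a minimal sketch of the pattern I mean. Everything here (`CACHE`, `primed_cache`, the two test functions) is hypothetical; the point is just that only the first test pays the priming cost, and later tests exercise the already-warm path the way production code would:

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Hypothetical expensive fixture: primed once, then shared by every
// test that runs in this process.
static CACHE: OnceLock<HashMap<String, u32>> = OnceLock::new();

fn primed_cache() -> &'static HashMap<String, u32> {
    CACHE.get_or_init(|| {
        // Imagine this loading reference data from disk or a service.
        let mut m = HashMap::new();
        m.insert("answer".to_string(), 42);
        m
    })
}

// Two sequential "tests": only the first triggers priming; the second
// runs against warm state, as a real request would.
fn test_lookup_hit() -> bool {
    primed_cache().get("answer") == Some(&42)
}

fn test_lookup_miss() -> bool {
    primed_cache().get("missing").is_none()
}
```

Per-test process isolation forfeits exactly this sharing: every test pays the priming cost and none of them observes warm-state behavior.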
- I wish the article were clearer about threads as an alternative isolation mechanism for sequential tests versus threads as a means of parallelizing tests. If tests really do need to run in parallel, processes are indeed the way to go in many cases, since thread-parallel tests often exercise a more stringent requirement than production ever imposes. Consider, for example, a global connection pool that is primed sequentially on webserver start, before the webserver begins (possibly parallel) request servicing. That setup code doesn't need to be thread-safe, so using threads to test it in parallel may surface concurrency issues that are not realistic.
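A sketch of that connection-pool shape, with hypothetical names throughout. The priming step runs on one thread before any workers exist, so it needs no synchronization of its own; the parallel phase only reads:

```rust
use std::sync::OnceLock;
use std::thread;

// Hypothetical connection pool. The priming path below is not
// thread-safe and doesn't need to be: production primes it once,
// sequentially, before any request-serving threads start.
static POOL: OnceLock<Vec<String>> = OnceLock::new();

fn prime_pool_sequentially() {
    // Single-threaded startup path; the only guarantee we lean on is
    // OnceLock's publish-once semantics.
    POOL.set((0..4usize).map(|i| format!("conn-{i}")).collect())
        .expect("pool primed twice");
}

fn parallel_requests_read_pool() -> usize {
    // Request servicing may be parallel, but it only *reads* the pool.
    let handles: Vec<_> = (0..4usize)
        .map(|i| thread::spawn(move || POOL.get().expect("pool not primed")[i].len()))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Running `prime_pool_sequentially` itself from multiple test threads would demand thread-safety the production startup path never requires.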
- On the other hand, enough benefits are lost when running clean-slate test-per-process that it's sometimes more appropriate to have the test harness orchestrate a series of parallel executors and schedule multiple tests to each one. Many testing frameworks support this on other platforms; I'm not as sure about Rust--my testing needs tend to be very simple (and, shamefully, my coverage of fragile code lower than it should be), so take this with a grain of salt.
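The executor-scheduling idea can be sketched roughly like this. I'm using threads as stand-in executors and invented names (`run_with_executors`, a `Test` alias); a real harness might use long-lived worker processes instead, but the scheduling shape is the same:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

type Test = fn() -> bool;

// Hypothetical harness loop: a fixed pool of executors pulls tests
// from a shared queue, so each executor amortizes its setup cost
// across many tests instead of paying it once per test.
fn run_with_executors(tests: Vec<Test>, executors: usize) -> usize {
    let queue = Arc::new(Mutex::new(tests));
    let passed = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..executors)
        .map(|_| {
            let (queue, passed) = (Arc::clone(&queue), Arc::clone(&passed));
            thread::spawn(move || loop {
                // Take the next scheduled test, or stop once the queue drains.
                let Some(test) = queue.lock().unwrap().pop() else { break };
                if test() {
                    *passed.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let n = *passed.lock().unwrap();
    n
}
```

The clean-slate-per-process model is the degenerate case of this: one executor per test, maximum isolation, minimum amortization.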
- Many testing scenarios want to abort testing on the first failure, in which case processes vs. threads is largely moot. If you run your tests alongside a thread (or otherwise-backgrounded routine) that can observe a timeout, it doesn't matter whether your test harness can reliably kill a single test and keep going; aborting the entire test harness (including all processes/threads involved) is sufficient in those cases.
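The watchdog I have in mind is only a few lines. A sketch, with the timeout value and test list purely illustrative:

```rust
use std::process;
use std::thread;
use std::time::Duration;

// Background watchdog: if the tests overrun their budget, take down
// the entire harness -- every thread and, in a multi-process setup,
// anything that dies with the process group.
fn spawn_watchdog(timeout: Duration) {
    thread::spawn(move || {
        thread::sleep(timeout);
        eprintln!("test harness timed out; aborting");
        process::exit(1); // no need to kill individual tests cleanly
    });
}

fn run_tests_with_watchdog() -> bool {
    spawn_watchdog(Duration::from_secs(30));
    // Sequential tests on this thread; first failure stops the run.
    let tests: [(fn() -> bool, &str); 2] =
        [(|| 2 + 2 == 4, "arith"), (|| "ab".len() == 2, "len")];
    for (test, name) in tests {
        if !test() {
            eprintln!("FAIL: {name}");
            return false;
        }
    }
    true
}
```

Note that nothing here needs to recover from a wedged test; `process::exit` is the recovery, which is exactly why the process-vs-thread distinction stops mattering under abort-on-first-failure.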
- Debugging tools are often friendlier to in-process test code. It's usually possible to get debuggers to understand process-based test harnesses, but that rarely works out of the box. If you want to breakpoint/debug during testing, running your tests in-process and on the main thread (with a background thread aborting the harness or auto-starting a debugger on timeout) is generally the most debugger-friendly practice. This is true on most platforms, not just Rust.
- fork() is a middle ground here as well: it can be slow (though mitigations exist), but it can also speed things up considerably by sharing, e.g., primed in-memory caches and socket state with tests when they run. Given fork()'s sharp edges around filehandle sharing, this, too, works best with sequential rather than parallel test execution. Depending on the libraries in use in the code under test, though, it is often more trouble than it's worth. Dealing with a mixture of fork-aware and fork-unaware code is miserable; better to do as the article suggests if you find yourself in that situation. How to set up library/reusable code to strike the right balance between fork-awareness/fork-safety and environment-agnosticism is a big, complicated question with no easy answers (and one that rules out the easy rejoinder of "fork is obsolete/bad/harmful; don't bother supporting it and don't use it, just read Baumann et al.!").
- In many ways, this article makes a good case for something it doesn't explicitly mention: a means of annotating/interrogating the in-memory global state, like caches/lazy_static/connections, used by code under test. With such an annotation, it's relatively easy to let invocations of the test harness choose how they want to work: reuse a process for testing and reset global state before each test, have the harness itself (rather than tests, by side effect) set up the global state, run each test with and/or without pre-primed global state and see whether behavior differs, etc. Annotating such global state interactions isn't trivial, though, if third-party code is in the mix. A robust combination of annotations in first-party code and a clear place to manually observe/prime/reset-if-possible state that isn't annotated is a good harness feature to strive for. Even if you don't get 100% of the way there, incremental progress in this direction yields considerable rewards.
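One possible shape for such an annotation, entirely invented for illustration (the trait, `QueryCache`, and `apply_policy` are not from any existing crate): first-party global state implements a fixture trait, and the harness walks a registry before each test applying whatever policy this invocation selected.

```rust
use std::sync::Mutex;

// Hypothetical annotation: global state the harness can interrogate,
// prime, and reset, instead of tests mutating it by side effect.
trait GlobalFixture: Send {
    fn name(&self) -> &'static str;
    fn prime(&mut self);
    fn reset(&mut self);
    fn is_primed(&self) -> bool;
}

// Example fixture: an in-memory query cache.
struct QueryCache {
    entries: Vec<(String, u32)>,
}

impl GlobalFixture for QueryCache {
    fn name(&self) -> &'static str { "query_cache" }
    fn prime(&mut self) {
        // Stand-in for real priming work (warming from disk, etc.).
        self.entries = vec![("answer".to_string(), 42)];
    }
    fn reset(&mut self) {
        self.entries.clear();
    }
    fn is_primed(&self) -> bool { !self.entries.is_empty() }
}

// Before each test, the harness applies the invocation's policy:
// clean slate, or pre-primed state, or one run of each to diff behavior.
fn apply_policy(fixtures: &Mutex<Vec<Box<dyn GlobalFixture>>>, pre_primed: bool) {
    for f in fixtures.lock().unwrap().iter_mut() {
        f.reset();
        if pre_primed {
            f.prime();
        }
    }
}
```

The hard part, as noted above, is third-party state that implements no such trait; that's where the manual observe/prime/reset escape hatch has to carry the load.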