programming: composable virtual file systems
This commit is contained in:
parent
b522732a67
commit
3673036bec
|
@ -9,6 +9,10 @@
|
||||||
% id = "01HPD4XQQ5GPQ20C6BPA8G670F"
|
% id = "01HPD4XQQ5GPQ20C6BPA8G670F"
|
||||||
- :folder: *blog*
|
- :folder: *blog*
|
||||||
|
|
||||||
|
% content.link = "programming/blog/vfs"
|
||||||
|
id = "01JDJJHCGTAJ78K8WRXV00ZEJC"
|
||||||
|
+ :page: composable virtual file systems
|
||||||
|
|
||||||
% content.link = "programming/blog/cstring-starts-with"
|
% content.link = "programming/blog/cstring-starts-with"
|
||||||
id = "01JCGAM56KS2C6D4XZ1MRATXH4"
|
id = "01JCGAM56KS2C6D4XZ1MRATXH4"
|
||||||
+ :page: prefix matches with C strings
|
+ :page: prefix matches with C strings
|
||||||
|
@ -69,3 +73,4 @@
|
||||||
+ :folder: *opinions*
|
+ :folder: *opinions*
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
504
content/programming/blog/vfs.tree
Normal file
504
content/programming/blog/vfs.tree
Normal file
|
@ -0,0 +1,504 @@
|
||||||
|
%% title = "composable virtual file systems"
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BRZDSYTZCH0S357B8"
|
||||||
|
- you know what I hate?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BTT39GCS96D3N7GF1"
|
||||||
|
- [State.](https://overwerk.bandcamp.com/album/state){.secret title="the linked album is pretty good, though."}
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BSXT52HB8A240XME2"
|
||||||
|
- guess what file systems are:
|
||||||
|
giant balls of state controlled fully by imperative code!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BEFG210RZV2G82TNC"
|
||||||
|
- how the heck are we supposed to write reliable code in an environment where we can't even be sure everything is in the state we expect it to?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BV6PQ1F6QK79M9RKJ"
|
||||||
|
- having to handle state desynchronizations results in a massive amount of complexity and space for bugs.
|
||||||
|
and unfortunately... there's really nothing you can do about this, because the whole _point_ of file systems is to enable persistent storage. :oh:
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B88P2JXYMV1TS3EMY"
|
||||||
|
- so I've been thinking if _maybe_ there was some way to make file systems more declarative.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BKTJDSQ4BNWAZKZ66"
|
||||||
|
- after all, functional programming exists, and it enables you to think about data and transformations, rather than sending commands to the computer.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BGSEKHKSP75CFCQ5V"
|
||||||
|
+ systemd exists, NixOS also exists... both enable you to specify what state your system _should_ reach, rather than _how_ it should be done.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BCNEBQWTYKFG523K3"
|
||||||
|
- well, right.
|
||||||
|
actually, systemd _does_ have an imperative element to it, because you have to tell it what command should be executed to get your process spinning.
|
||||||
|
I'd argue that is still way better than a shell script, because it can restart failing services selectively without any extra work on your part.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B1M21FZ2HQ2T83JWY"
|
||||||
|
- my idea was basically to enable the programmer to specify an expected file system structure declaratively, like so:
|
||||||
|
|
||||||
|
```lua
|
||||||
|
root = dir {
|
||||||
|
crates = dir "crates",
|
||||||
|
readme = file "README.md",
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
and then you'd be able to access the readme file via `root.readme`, rather than by the usual fopen-fwrite-fclose interface.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B22EV79JSJ95QGW6Y"
|
||||||
|
- but the more I thought about the idea in this way, the less sense it made.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BZS445KBG16AMWHGS"
|
||||||
|
- _sure_ it may be fine for asserting the expected structure of files in the file system, but what if some of the files don't exist?
|
||||||
|
what about directories where you explicitly _don't_ know about the file structure?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BSKPBDZ6M5BGQNFCW"
|
||||||
|
- basically, this idea was weirdly bidirectional in a way that my little cat brain couldn't process.
|
||||||
|
so that never went anywhere.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B09FX0BEDK6A133G6"
|
||||||
|
- a few weeks ago however, I had a different revelation: what if instead of interfacing with the underlying file system, we... build one?
|
||||||
|
like... you know. a _virtual_ file system?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BKKFS0758KXN22VFQ"
|
||||||
|
- I was familiar with the idea of creating an API exposing a virtual file system through [LÖVE](https://love2d.org/), which exposes bindings to [PhysicsFS](https://icculus.org/physfs/).
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B1SYGAADP1ECXA05G"
|
||||||
|
- I know of [at least one game](https://thelettervsixtim.es/) that makes use of PhysicsFS outside of LÖVE, but I had never used it myself in a project.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BCRY04ZZYYQ0DHVSS"
|
||||||
|
- looking at the API exposed by LÖVE though, it's pretty simple---you get a (global) file system, to which you can add mount points, which represent physical directories on your computer.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B5AR4NQC23ZY6HENB"
|
||||||
|
- this idea seemed incredibly cool, though going with my functional zen I just _needed_ to make it composable---so my end goal was to have a file system made out of pieces you can fit together like LEGO bricks!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BK00WN1B8WGAXEC07"
|
||||||
|
- ### a [fractal](https://www.youtube.com/watch?v=8lF2PQpwTEU){.secret} of files
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B633NSHE870KR761Q"
|
||||||
|
- I started designing.
|
||||||
|
I knew I at least needed the ability to enumerate and read files, so I needed at least these two functions:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
trait Dir {
|
||||||
|
/// List all entries under the given path.
|
||||||
|
fn dir(&self, path: &VPath) -> Vec<VPathBuf>;
|
||||||
|
|
||||||
|
/// Return the byte content of the entry at the given path, or `None` if the path does not
|
||||||
|
/// contain any content.
|
||||||
|
fn content(&self, path: &VPath) -> Option<Vec<u8>>;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
this alone already gave me an _insane_ amount of insight!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BPEW2NAT7XSB5F97E"
|
||||||
|
- first of all, from the perspective of the program, do we _really_ need to differentiate between directories and files?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BHQWZDS17DY0096F6"
|
||||||
|
- from a GUI design standpoint, they're a useful abstraction for us humans---the more concrete an object, the easier it is to grasp.
|
||||||
|
_files_ and _folders_ seem a lot more easy to grasp than abstract _entries_ which _may_ represent documents, folders, or both.
|
||||||
|
but for a program, it couldn't care less.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BTKRY1YQ31E5WGGBQ"
|
||||||
|
- compare this code for walking the file system:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn walk_dir_rec(dir: &dyn Dir, path: &VPath, mut f: impl FnMut(&VPath)) {
|
||||||
|
for entry in dir.dir(path) {
|
||||||
|
f(&entry);
|
||||||
|
walk_dir_rec(dir, &entry, f);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn process_all_png_files(dir: &dyn Dir) {
|
||||||
|
walk_dir_rec(dir, VPath::ROOT, |path| {
|
||||||
|
if path.extension() == Some("png") {
|
||||||
|
if let Some(content) = dir.content(path) {
|
||||||
|
// do stuff with the file
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
% id = "01JDJH0ZNAHM8JPQNKMPZKQ2ZT"
|
||||||
|
- to this code, which has to differentiate between files and directories, because calling `dir` on a file or `content` on a directory is an error:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn walk_dir_rec(dir: &dyn Dir, path: &VPath, mut f: impl FnMut(&VPath)) {
|
||||||
|
for entry in dir.dir(path) {
|
||||||
|
f(&entry);
|
||||||
|
if entry.kind == DirEntryKind::Dir {
|
||||||
|
walk_dir_rec(dir, &entry, f);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn process_all_png_files(dir: &dyn Dir) {
|
||||||
|
walk_dir_rec(dir, VPath::ROOT, |entry| {
|
||||||
|
if entry.path.extension() == Some("png")
|
||||||
|
&& entry.kind == DirEntryKind::File
|
||||||
|
{
|
||||||
|
if let Some(content) = dir.content(entry.path) {
|
||||||
|
// do stuff with the file
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
to me, the logic seems a lot simpler in the former case, separating the concerns of walking the directory in `walk_dir_rec` from the concerns of reading the files in `process_all_png_files`!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B6GFRVG842PFEEC8E"
|
||||||
|
+ this does not _automatically_ mean it's a good idea to design an operating system around this, but it's interesting to think about the properties that emerge from removing the separation.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BDF8NWFJ90NN1EXSN"
|
||||||
|
- it may not even be the greatest idea to interface with the physical file system in this way, if the communication has to be bidirectional---since real world file systems separate files from directories, think about what happens if your program tries to write `content` to an entry which already has a `dir`.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BHFXM4XR1S4Y9XHBK"
|
||||||
|
- in that manner, this is a leaky abstraction.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BDB79H5S7WJHA53DM"
|
||||||
|
- second... this looks a lot like [resource forks](https://en.wikipedia.org/wiki/Fork_%28file_system%29)!
|
||||||
|
so imagine that you can add even more metadata to file system entries, by adding more methods to this trait.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BG0K62PK3YEH4NAJA"
|
||||||
|
- feels like an incredibly useful way to propagate auxiliary data through the program!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BJ41R94JDY78EYEWZ"
|
||||||
|
- with these two functions, the ability to join paths, and remove their prefixes, this is enough to start building interesting things.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B2GXQGZD3YGB6XYGF"
|
||||||
|
- since we'd like our file system to be composable, we'll need a composition operator first.
|
||||||
|
I'm naming mine `MemDir`, because it represents an in-*mem*ory *`dir`* with entries.
|
||||||
|
I'll spare you the implementation details, but it acts more or less like a hash map:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let mut dir = MemDir::new();
|
||||||
|
dir.add(VPath::new("README.txt"), readme_txt);
|
||||||
|
dir.add(VPath::new("src"), src);
|
||||||
|
```
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B96AM20AQ5YGN7XQR"
|
||||||
|
- and now for the _opposite_ of the composition operator: we'll need an operator that decomposes `Dir`s into smaller ones.
|
||||||
|
the name of this one I'll take from the command line---`Cd`, meaning _change directory_:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let src = Cd::new(dir, VPath::new("src"));
|
||||||
|
```
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B1N3EF1V2Q8MGK1N7"
|
||||||
|
- and voilá, the file system is composable!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BH8TN1WNZNJEVYEBB"
|
||||||
|
- interestingly enough, assuming your virtual paths cannot represent parent directories, this already forms the foundation of a capability-based file system.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BWSEYK4CFD7KDB0R3"
|
||||||
|
- if you want to give a program access to your `src` directory, but _nothing_ else, give it that `Cd` from above, and it cannot access anything outside of it.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BF77CE3N07NN9F2MX"
|
||||||
|
- ### [get real](https://www.youtube.com/watch?v=rwNSpBYszIk){.secret}
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BQZAY4TP1WG5N6QR3"
|
||||||
|
- having a design in mind, I thought it would be interesting to integrate it into a real project.
|
||||||
|
the treehouse seemed like the right thing to test it out on, since it's effectively a compiler---it transforms a set of source directories into a target directory.
|
||||||
|
|
||||||
|
% id = "01JDJH59SN5219QPGJPZS7FMA6"
|
||||||
|
- and I have to say: so far, I'm liking it!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BVTP35M0QG74A15RJ"
|
||||||
|
- I ended up needing a few more resource forks to implement all the existing functionality.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait Dir: Debug {
|
||||||
|
/// List all entries under the provided path.
|
||||||
|
fn dir(&self, path: &VPath) -> Vec<DirEntry>;
|
||||||
|
|
||||||
|
/// Return the byte content of the entry at the given path.
|
||||||
|
fn content(&self, path: &VPath) -> Option<Vec<u8>>;
|
||||||
|
|
||||||
|
/// Get a string signifying the current version of the provided path's content.
|
||||||
|
/// If the content changes, the version must also change.
|
||||||
|
///
|
||||||
|
/// Returns None if there is no content or no version string is available.
|
||||||
|
fn content_version(&self, path: &VPath) -> Option<String>;
|
||||||
|
|
||||||
|
/// Returns the size of the image at the given path, or `None` if the entry is not an image
|
||||||
|
/// (or its size cannot be known.)
|
||||||
|
fn image_size(&self, _path: &VPath) -> Option<ImageSize> {
|
||||||
|
None
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Returns a path relative to `config.site` indicating where the file will be available
|
||||||
|
/// once served.
|
||||||
|
///
|
||||||
|
/// May return `None` if the file is not served.
|
||||||
|
fn anchor(&self, _path: &VPath) -> Option<VPathBuf> {
|
||||||
|
None
|
||||||
|
}
|
||||||
|
|
||||||
|
/// If a file can be written persistently, returns an [`EditPath`] representing the file in
|
||||||
|
/// persistent storage.
|
||||||
|
///
|
||||||
|
/// An edit path can then be made into an [`Edit`].
|
||||||
|
fn edit_path(&self, _path: &VPath) -> Option<EditPath> {
|
||||||
|
None
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BNDFRCSB9B58MF9H0"
|
||||||
|
- `content_version` and `anchor` are both used to assemble URLs out of `Dir` entries.
|
||||||
|
I have a function `url` which, given a root URL, returns a URL with a `?v=` parameter for cache busting.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn url(site: &str, dir: &dyn Dir, path: &VPath) -> Option<String> {
|
||||||
|
let anchor = dir.anchor(path)?;
|
||||||
|
if let Some(version) = dir.content_version(path) {
|
||||||
|
Some(format!("{}/{anchor}?v={version}", site))
|
||||||
|
} else {
|
||||||
|
Some(format!("{}/{anchor}", site))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BY8KF3AVJDQBED21Z"
|
||||||
|
+ `image_size` is used to automatically determine the size of images at build time.
|
||||||
|
that way I can add `width="" height=""` attributes to all `<img>` tags, preventing [layout shift](https://web.dev/articles/cls).
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BTGNABZVPCV38XFBV"
|
||||||
|
- technically, as of writing this, not _all_ images have this.
|
||||||
|
notably, ones I paste into the markup, like this one:
|
||||||
|
|
||||||
|
![goofy close up screenshot of Hat Kid staring down into the camera][pic:01JDJCD4205JHQ01XJTYPP5KB3]
|
||||||
|
|
||||||
|
if you refresh the page with this branch open, you will notice the layout shift.
|
||||||
|
|
||||||
|
unless I've already fixed it as of the time you're reading this, in which case... :partying:
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BVFG1T741YMEAWMEY"
|
||||||
|
- one notable piece of functionality that is currently missing is _version history_.
|
||||||
|
to be honest, I'm still figuring that one out in my head;
|
||||||
|
I have the feeling it's not exactly going to be simple, but it should end up being a lot more principled than [whatever this was][vh].
|
||||||
|
|
||||||
|
[vh]: https://src.liquidev.net/liquidex/treehouse/src/commit/86b4bf5b2ddb38f3b5fdf362da3c3e75704bfd48/crates/treehouse/src/generate.rs#L491
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BJAKXR98XPJV4FWPC"
|
||||||
|
- ### [Radio Edit (radio edit)](https://www.youtube.com/watch?v=WQzx9o2-0d0){.secret}
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BH3VFBZ4MG5TXJT25"
|
||||||
|
- _but wait liquidex! what's that `edit_path` do?_
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B7WD9RA8KZ1RT4MPS"
|
||||||
|
- one notable thing about this virtual file system is that it doesn't allow writing to the virtual files.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BRFGQX5DK3XKCTH14"
|
||||||
|
- I mean, think about it.
|
||||||
|
it just doesn't make sense!
|
||||||
|
we have a `&` immutable reference, and allowing the program to edit files as it's compiling could wreck some real havoc...!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B4YT790RRS3ME6600"
|
||||||
|
- and not only that, there's also the question of _useless edits_, edits that tweak in-memory files.
|
||||||
|
those don't persist across restarts, so they don't really make much sense, do they?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B2EF87ZMN66P0KJDZ"
|
||||||
|
- it may seem pointless to want to have the treehouse---again, a compiler of sorts---capable of editing source files, but there _is_ one legit use case: I have a `fix` command, which fills in all the missing branch IDs in a file, optionally saving it in place.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B8SAJ90Z3XWW59N0G"
|
||||||
|
- there's also a `fix-all` command, which does this to all files.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BM5ZX03KDAW376WW9"
|
||||||
|
- so I needed to devise an API that would let me _write_ to files after the command is done running.
|
||||||
|
that's what the `edit_path` is for.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B0BCHNHQPZFXVNFA9"
|
||||||
|
- `edit_path` returns an `EditPath`, which represents a location somewhere in persistent storage.
|
||||||
|
having an `EditPath`, you can construct an `Edit`.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Represents a pending edit operation that can be written to persistent storage later.
|
||||||
|
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||||
|
pub enum Edit {
|
||||||
|
/// An edit that doesn't do anything.
|
||||||
|
NoOp,
|
||||||
|
|
||||||
|
/// Write the given string to a file.
|
||||||
|
Write(EditPath, String),
|
||||||
|
|
||||||
|
/// Execute a sequence of edits in order.
|
||||||
|
Seq(Vec<Edit>),
|
||||||
|
/// Execute the provided edits in parallel.
|
||||||
|
All(Vec<Edit>),
|
||||||
|
|
||||||
|
/// Makes an edit dry.
|
||||||
|
///
|
||||||
|
/// A dry edit only logs what operations would be performed, does not perform the I/O.
|
||||||
|
Dry(Box<Edit>),
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`Edit`s take many shapes and forms, but the most important one for us is `Write`: it allows you to write a file to the disk.
|
||||||
|
|
||||||
|
the other ones are for composing `Edit`s together into larger ones.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BCT46CXXVKMXATAA3"
|
||||||
|
- `Seq` can be used to implement transactions, where all edits have to succeed for the parent edit to be considered successful.
|
||||||
|
|
||||||
|
% id = "01JDJHDZ8XWEZAA94AW1VS4P53"
|
||||||
|
- I use this for writing backups.
|
||||||
|
if writing a backup fails, we wouldn't want to overwrite the original file, because we wouldn't be able to restore it!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B05H6079QHXAXC8Y4"
|
||||||
|
- `All` can be used to aggregate independent edits together and execute them in parallel.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BKXXJSWYTGX7XA979"
|
||||||
|
- `Dry` can be used to implement a `--dry-run` command, only printing an edit instead of applying it.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B7MNC3CA5DEP1HXM6"
|
||||||
|
+ `NoOp` can be used when you need to produce an `Edit`, but don't actually want to perform any operations.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BJRSMS1E4NQRVZ1H2"
|
||||||
|
- this runs contrary to [my opinion on `None` enums][branch:01HCG7KTGGAFS07QYJXZG6WHJJ], for one reason: would you rather have to handle `Option<Edit>` everywhere, or just assume whatever `Edit` you're being passed is valid?
|
||||||
|
|
||||||
|
% id = "01JDJHJ69797960A42P068X4F2"
|
||||||
|
- try replacing recursive `Edit` references in the example above with `Option<Edit>`s.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BNBB2HQHHZ0WCDA1R"
|
||||||
|
- while `Edit`s are meant to be composed, they aren't really meant to be inspected outside of the code that applies edits to disk---therefore having a `NoOp` variant actually _improves_ readability, since `Edit` is constructed more often than it is deconstructed.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B0NV9B16QJKKK15NG"
|
||||||
|
- if this doesn't yell "you shouldn't treat anyone's opinions as dogma---not even your own ones," I don't know what will.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B988WXSERGV95345R"
|
||||||
|
- although I said before that the fork-based virtual file system is a leaky abstraction when you introduce writing to the physical file system, I don't think this particular API is susceptible to this---since it can expose `EditPath`s for entries that _can_ actually be written (ones with a `content`), you can disallow writing to directories that way.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BGJQXTVZ6Z9BK12BG"
|
||||||
|
- of course then you cannot create directories.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B2JTWKJVEE1VP6P68"
|
||||||
|
- also, [TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use) bugs are a thing, but I disregard those as they don't really fit into a compiler's threat model.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BQVB72WJJ2DV1F91H"
|
||||||
|
- as I said in the beginning, I don't really like mutable state...
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BQSKCS5GC93QF9EE1"
|
||||||
|
- ### improvise, _adapt_, overcome
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BAN5P6KGHM5PVFQ78"
|
||||||
|
- thanks to the `Dir`'s inherent composability, it is trivial to build adapters on top of it.
|
||||||
|
I have a few in the treehouse myself.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BCG0W3SWSYR647ZCC"
|
||||||
|
- `TreehouseDir` is a file system which renders `.tree` files into HTML lazily.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BBJCR1TQCBKBHMBNS"
|
||||||
|
- since generating all this HTML is an expensive operation, I have it wrapped in a `ContentCache` adapter, which caches any successfully generated pages in memory.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B5458Y8KJKZ2K0HVR"
|
||||||
|
- I pre-warm the cache with all pages too, so that I can see any warnings that arise during generation.
|
||||||
|
this is done on multiple threads, because all my `Dir`s are `Send + Sync`.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BKJW4SJGK2MSX1HY3"
|
||||||
|
- I also have a `Dir` called `HtmlCanonicalize` layered on top of the `ContentCache`, which removes `html` extensions from paths.
|
||||||
|
`TreehouseDir` exposes canonical paths _without_ an `html`, but I have to support it for compatibility with old links.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BRMRT5YMTCTA4KQ38"
|
||||||
|
- `Blake3ContentVersionCache` is the sole implementor of `Dir::content_version`.
|
||||||
|
its purpose is to compute `content_version`s and cache them in memory for each path.
|
||||||
|
as the name suggests, versions are computed using a (truncated) [BLAKE3](https://en.wikipedia.org/wiki/BLAKE_%28hash_function%29#BLAKE3) hash.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BKKFMW6PQX3FZS4D4"
|
||||||
|
- `ImageSizeCache` is the sole implementor of `Dir::image_size`.
|
||||||
|
for paths with a supported extension (`png`, `svg`, `jpg`, `jpeg`, `webp`), it reads the stored image file's size and caches it in memory.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B29N205TP3NT2CR4R"
|
||||||
|
- `Anchored` is the sole implementor of `Dir::anchor`.
|
||||||
|
it lets me specify that the source file system's `static` directory ends up being available under `/static` on the website.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BCNF5DNH8FN8WP7H8"
|
||||||
|
- `Overlay` combines a base directory with an overlay directory, first routing requests to the overlay directory, and if those fail, routing them to the base directory.
|
||||||
|
this allows me to overlay a `MemDir` with the `static` directory and a `robots.txt` on top of a `TreehouseDir`---which together form the compiler's target directory.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BXA6G2AX5F2RVNH35"
|
||||||
|
- and guess what---the server serves straight from that virtual target directory, too!
|
||||||
|
after all, what's the point of writing it to disk if you already have everything assembled.
|
||||||
|
_It's all dynamic.™_
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B2KFPFAB8BRV735YA"
|
||||||
|
- for the curious, here's roughly how the treehouse's virtual file systems are structured:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
source: ImageSizeCache(Blake3ContentVersionCache(MemDir {
|
||||||
|
"treehouse.toml": BufferedFile(..), // content read at startup
|
||||||
|
"static": Anchored(PhysicalDir("static"), "static"),
|
||||||
|
"template": PhysicalDir("template"),
|
||||||
|
"content": PhysicalDir("content"),
|
||||||
|
}))
|
||||||
|
|
||||||
|
target: Overlay(
|
||||||
|
HtmlCanonicalize(ContentCache(TreehouseDir)),
|
||||||
|
MemDir {
|
||||||
|
"static": Cd(source, "static"),
|
||||||
|
"robots.txt": Cd(source, "static/robots.txt"),
|
||||||
|
},
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BZKPH3ATEE1YP2MYS"
|
||||||
|
- I'm not too fond of that `treehouse.toml` hack, but this is because `PhysicalDir` cannot really be attached to singular files right now.
|
||||||
|
I'll definitely fix that one day...
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B1DG4Y4PTVRT0NQFV"
|
||||||
|
- ### a fatal [flaw](https://musicbyflaws.bandcamp.com/){.secret}
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BVV81GF2GRTQVCK7E"
|
||||||
|
- there is one flaw I want to point out with the current implementation of `Dir`: it uses trait methods to add new resource forks.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BJS79G1VJN1G29GRK"
|
||||||
|
- any time you add a new resource fork that's implemented only by one or two `Dir`s, you have to duplicate a stub across all existing adapters!
|
||||||
|
this means introducing a new fork means performing up to `N` edits, where `N` is the number of implementers of `Dir`.
|
||||||
|
|
||||||
|
% id = "01JDJJDAPMEVMJNQX7NF2AWEJE"
|
||||||
|
- this problem doesn't exist for `Dir`s which are _not_ adapters (aka suppliers), but in practice there are far less suppliers than adapters.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B7VDMV5HT6Q1W664D"
|
||||||
|
- one idea I've had to [fix this](https://anilist.co/anime/9253/SteinsGate/){.secret title="My Fork."} was to change the API shape to a single trait method.
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub trait Dir {
|
||||||
|
fn forks(&self, path: &VPath, forks: &mut Forks);
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Dir for MyDir {
|
||||||
|
fn forks(&self, path: &VPath, forks: &mut Forks) {
|
||||||
|
forks.insert(|| MyFork);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl<T> Dir for AdapterDir<T> {
|
||||||
|
fn forks(&self, path: &VPath, forks: &mut Forks) {
|
||||||
|
self.inner.forks(path, forks);
|
||||||
|
|
||||||
|
forks.insert(|| MyMomsEpicSilverware);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
but that hasn't come to fruition yet, as I have no idea how to make it efficient yet object-safe...
|
||||||
|
I'm yet to add profiling to the treehouse, so I don't want to make risky performance decisions like this at this point.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B9799AC57MYCSA9X1"
|
||||||
|
- dynamic typing has a cost, after all!
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BTM4MH9PC4M15MJWW"
|
||||||
|
- maybe one day I'll post an update on that.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BMVXNKZPJ6YHR7BMB"
|
||||||
|
- and that basically concludes all this virtual file system shenaniganry!
|
||||||
|
I hope you got something useful out of it.
|
||||||
|
|
||||||
|
% id = "01JDJGVC7B60FZDTJFJMQ09EVV"
|
||||||
|
- I look forward to finding out how this system will fare in the long run.
|
||||||
|
guess I'll report my progress, I dunno. next year?
|
||||||
|
|
||||||
|
% id = "01JDJGVC7BG54TARPJ20T48ZGP"
|
||||||
|
- see you then!
|
|
@ -14,6 +14,18 @@ if you've been wondering what I've been up to, you've come to the right place.
|
||||||
% id = "01JCGWPM6TMAJT0B50GQSA4BDW"
|
% id = "01JCGWPM6TMAJT0B50GQSA4BDW"
|
||||||
- there is currently no RSS or Atom feed for this page, sorry!
|
- there is currently no RSS or Atom feed for this page, sorry!
|
||||||
|
|
||||||
|
% id = "01JDJJSEWASRWJGKMBNYMFD9B5"
|
||||||
|
- ### [composable virtual file systems][page:programming/blog/vfs]
|
||||||
|
|
||||||
|
% id = "01JDJJSEWAVZGJN3PWY94SJMXT"
|
||||||
|
- I recently rebuilt the treehouse to use a virtual file system for its source and target directories.
|
||||||
|
|
||||||
|
% id = "01JDJJSEWA7K5T3Z0Y6NQ8RBGX"
|
||||||
|
- this is an exploration of how I built my abstraction, how it works, and what I learned from it.
|
||||||
|
|
||||||
|
% id = "01JCGAM553TJJCEJ96ADEWETQC"
|
||||||
|
- ### [prefix matches with C strings][page:programming/blog/cstring-starts-with]
|
||||||
|
|
||||||
% id = "01JBAGZAZ30K443QYPK0XBNZWM"
|
% id = "01JBAGZAZ30K443QYPK0XBNZWM"
|
||||||
- ### [the curious case of Amon Tobin's Creatures][page:music/creatures]
|
- ### [the curious case of Amon Tobin's Creatures][page:music/creatures]
|
||||||
|
|
||||||
|
@ -157,3 +169,4 @@ if you've been wondering what I've been up to, you've come to the right place.
|
||||||
% id = "01HQ6G30PT1D729Z29NYVDCFDB"
|
% id = "01HQ6G30PT1D729Z29NYVDCFDB"
|
||||||
- this post explores basically just that.
|
- this post explores basically just that.
|
||||||
|
|
||||||
|
|
||||||
|
|
BIN
static/pic/01JDJCD4205JHQ01XJTYPP5KB3-hatkid+width752.jpg
Executable file
BIN
static/pic/01JDJCD4205JHQ01XJTYPP5KB3-hatkid+width752.jpg
Executable file
Binary file not shown.
After Width: | Height: | Size: 145 KiB |
Loading…
Reference in a new issue