diff --git a/content/fmt2.dj b/content/fmt2.dj index 8fe4649..5399a67 100644 --- a/content/fmt2.dj +++ b/content/fmt2.dj @@ -19,8 +19,7 @@ So I started thinking about how I could improve on that. I went back and forth trying to come up with something sensible, but nothing _simple_ was ever coming to mind.\ Until today. -This write-up describes this an improved version of the library in detail, with support for positional arguments, and much smaller generated code size! -I hope you like it. +This write-up describes this alternative version of the library in detail. ## Usage @@ -64,7 +63,7 @@ assert(strcmp( ) == 0); ``` -Other than that, the same invariants are upheld as in the previous library: the library never writes past the input buffer, gives back the amount of characters that would be written, and never allows reading arguments out of bounds. +Other than that, the same invariants are upheld as in the previous library: the library never writes past the input buffer, and never allows reading arguments out of bounds. As a reminder, the previous library printed holes without corresponding arguments verbatim (as the string `{}`). This library does the same thing, though using the new format string syntax. @@ -154,10 +153,10 @@ This trick with a template erasing `T*` into `void*` is actually really useful i If there's any knowledge worth remembering from this article, it would be this technique. Aside from that, we once again make use of parameter packs. -This time not with [fold expressions](https://en.cppreference.com/w/cpp/language/fold.html), but with an [expansion inside a brace-enclosed initialiser](https://en.cppreference.com/w/cpp/language/parameter_pack.html#Brace-enclosed_initializers). +This time not with [fold expressions](https://en.cppreference.com/w/cpp/language/fold.html), but with an ordinary [expansion inside a brace-enclosed initialiser](https://en.cppreference.com/w/cpp/language/parameter_pack.html#Brace-enclosed_initializers). 
-Among the [`const` soup](https://cdecl.org/), you may notice the `ffuncs` array being `static const`. -This little trick reduces code size a bit, because it will make the compiler generate a lookup table in the executable's read-only data section, instead of generating code to write the function pointers onto the stack. +Among the `const` soup, you may notice the `ffuncs` array being `static const`. +This little trick reduces code size a bit, because it will make the compiler generate a lookup table in the executable's `.rodata` section instead of copying the function pointers onto the stack. Finally, we get to `format_untyped`, which parses the format string, writing out the verbatim parts, and calling the appropriate format function whenever a hole is encountered. @@ -301,14 +300,9 @@ The same [extra goodies][page:fmt#Extras] as in the previous post can be used (i This library is slightly larger than the previous, being 73 lines of code long. I think the extra bit of functionality is useful enough that it's a worthy tradeoff, though. -...is what I would say if I really _cared_ about squeezing every last line of code out of the library, but I don't! -I wrote this little library to be simple, extensible, and maintainable, so don't treat it as code golf. -Go with the extra lines of code. -This version of the library is better. +I only really flailed the 65 loc figure in the title of the last post to poke fun at the complexity of popular template-heavy string formatting libraries. -It's in the same ballpark either way, and honestly I only really flailed the 65 loc figure in the title of the last post to poke fun at the complexity of popular template-heavy string formatting libraries.\ And `printf`. - Don't forget about `printf`. 
@@ -328,241 +322,20 @@ If you find the 0-based indexing unnatural, it's easy enough to switch it to 1-b ### Code size The assembly for the formatting code comes out a lot more compact, because the compiler no longer has to generate potentially very long and repetitive code for calling `next_hole` and `write_value` repeatedly. - -The new string formatter instead initialises a lookup table with value pointers on the stack, and passes it to `format_untyped`, along with a static table containing pointers to functions which can print the values. +Instead, it initialises the lookup table with value pointers on the stack, and passes it to `format_untyped`, along with a static table containing pointers to functions which can print the values. This is a good thing for embedded use cases. You should prefer this implementation over the previous one for that. -In my game, the size emitted into the executable for instantiations of `format` is *61.7%* of the previous version! -Read on for a detailed analysis. +I couldn't get clang to inline references to `write_value` into the function table, though. +There's always an intermediary function generated from the instantiation of `write_value_erased`. ---- +This can get pretty bad when passing array types into format arguments (e.g. string literals)---generating many redundant variants of the function, like `write_value_erased`, `write_value_erased`, etc. -To illustrate this a bit, let's set an example. -I have a function `usages` which uses `format` in a few different ways. +I believe this is due to function pointers of different types not always being interchangeable according to the C++ standard. 
+You could work around this little inefficiency by casting between function pointers, and it _does_ [seem](https://stackoverflow.com/a/559671) [safe](https://stackoverflow.com/q/11647220) to me in this case (the function pointers have the same return type, the same number of arguments, and the types of arguments `const void*` and `const char*` are compatible)---but I don't really think it's worth it. -```cpp -void usages( - String_Buffer& buf, - const char* filename, - const char* part_name, int part_index, - Entity_Id entity_id, Vec3 position) -{ - format(buf, "/prop/{}", filename); - format(buf, "Part #{} ({})", part_index, part_name); - format(buf, "{} at {}", entity_id, position); -} -``` - -This will generate three separate template instantiations of `format`: - -```cpp -void format(String_Buffer&, char const*, char const* const&) -void format(String_Buffer&, char const*, int const&, char const* const&) -void format(String_Buffer&, char const*, Entity_Id const&, Vec3 const&) -``` - -Clang 21.1.0 with `-O3` inlines these into the calling function, but let's consider them out of line for this example. -Each instantiation, after inlining, results in code that looks like this: - -```cpp -void format( - String_Buffer& buf, - char const* fstr, - int const& a1, - char const* const& a2) -{ - if (next_hole(buf, fstr)) - write_value(buf, a1); - if (next_hole(buf, fstr)) - write_value(buf, a2); - while (next_hole(buf, fstr)) {} -} -``` - -From the machine code perspective, this comes out to 120 bytes of code. -This doesn't seem like a lot, but it quickly multiplies when you consider that format function calls are going to end up having different sets of arguments types. - -My game currently has about 16k lines of code, though barely any user-facing text right now---most of it is logs and ImGui strings---but there are quite a few unique instantiations of `format` (35 to be exact.) -Here's the full list with their byte sizes (click the *bold* header to unfold the list). 
- -:::: details - -::: summary - -List of `fmt::format` instantiations sorted by byte size - -::: - -::: details-content - -```cpp -0x5c <> -0x6c -0x6c -0x6c -0x6c -0x6c -0x6c -0x6c -0x6c -0x6c -0x6c -0x7c -0x7c -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0x8e -0xb0 -0xc0 -0xc0 -0xc0 -0xc0 -0xc0 -0xd2 -0xd2 -0xf2 -0xf2 -``` - -::: - -:::: - -Summing it all up, that's 5164 bytes of machine code. -For 35 unique combinations of arguments! - -Now, let's replace the previous function with the new one. -Granted, this is using the new `%n` syntax which is incompatible with the old `{}`, and I haven't replaced the format strings---but the format string themselves do not affect the machine code size, so that's fine. - -First, there's the static data for the function lookup tables. - -:::: details - -::: summary - -List of lookup tables from instantiations of the new `fmt::format` - -::: - -::: details-content - -```cpp -0x00 <> -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x08 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x10 -0x18 -0x18 -0x18 -0x18 -0x18 -0x18 -0x20 -0x20 -0x28 -0x28 -``` - -::: - -:::: - -This comes out at 576 bytes total, obviously with tables for more arguments taking up more space. - -In an embedded setting, this will likely be a lot less due to a smaller (16-bit or 32-bit) memory space, and therefore 2× or 4× smaller pointers. - -Now, for the functions themselves. -Remember that these instantiations only set up the lookup tables for `format_untyped`, so they're likely to be inlined into the caller---though I've inhibited that with the `[[gnu::noinline]]` attribute, to sum up the figures for this post. 
- -:::: details - -::: summary - -List of instantiations of the new `fmt::format` - -::: - -::: details-content - -```cpp -0x3f <> -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x47 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x49 -0x4e -0x4e -0x4e -0x4e -0x4e -0x4e -0x53 -0x53 -0x5d -0x5d -``` - -::: - -:::: - -That's 2611 bytes of machine code, and summing it up with the space taken up by lookup tables, comes out at 3187 bytes in the executable. -That's *61.7%* the size of the previous version---quite a hefty save! - -And this would only multiply in larger codebases. -Imagine the megabytes of disk space saved if a refactor of this scale were done on the Unreal Engine... +It's not an insurmountable task, but _not_ doing it and letting the compiler do all the necessary ABI shuffling---at the expense of an extra `jmp` in case it's not needed---is probably not a big performance cost, either. ### C version @@ -575,6 +348,3 @@ The worst part would probably be emulating the parameter packs, because the prep Maybe in another post. ---- - -Thank you once again to my friend Tori for reviewing a draft of this post! diff --git a/src/html/djot.rs b/src/html/djot.rs index 5e475de..45a6f8d 100644 --- a/src/html/djot.rs +++ b/src/html/djot.rs @@ -311,9 +311,6 @@ impl<'a> Writer<'a> { } match c { - Container::Heading { id, .. } => { - write!(out, r##">"##)?; - } Container::TableCell { alignment, .. } if !matches!(alignment, Alignment::Unspecified) => { @@ -448,7 +445,7 @@ impl<'a> Writer<'a> { } out.push_str("

"); } - Container::Heading { level, .. } => write!(out, "
")?, + Container::Heading { level, .. } => write!(out, "")?, Container::TableCell { head: false, .. } => out.push_str(""), Container::TableCell { head: true, .. } => out.push_str(""), Container::Caption => out.push_str(""), diff --git a/static/css/doc.css b/static/css/doc.css index b4c6349..dbcfa38 100644 --- a/static/css/doc.css +++ b/static/css/doc.css @@ -56,12 +56,6 @@ main.doc { grid-column: main; } - & hr, - & pre, - & th-literate-program { - grid-column: left-code / right-wide; - } - & p { padding-top: 0.5lh; padding-bottom: 0.5lh; @@ -73,8 +67,7 @@ main.doc { padding-bottom: 0.5lh; } - & h3, - & h4 { + & h3 { margin: 0; padding-top: 0.5lh; padding-bottom: 0.5lh; @@ -97,6 +90,7 @@ main.doc { & pre, & th-literate-program { padding: 0.8rem var(--code-block-h-padding); + grid-column: left-code / right-wide; & code { --recursive-wght: 500; @@ -155,69 +149,6 @@ main.doc { text-align: center; } - & details { - /* I wanted this to work on grid layout, but currently it is impossible to set -
<details> to display: grid; across all browsers. - Instead you have to include a <details-content> element after the summary. */ - - --details-marker-size: var(--code-block-h-padding); - --details-indent-size: var(--code-block-h-padding); - - grid-column: left-code / right-wide; - - padding-top: 0.5lh; - padding-bottom: 0.5lh; - - & > summary { - display: flex; - flex-direction: row; - align-items: center; - - --recursive-wght: 600; - border-bottom: 1px solid var(--border-1); - - cursor: pointer; - - &::before { - content: ""; - - display: block; - width: calc(2 * var(--details-marker-size)); - height: calc(2 * var(--details-marker-size)); - flex-shrink: 0; - - background-image: var(--icon-expand); - background-position: 50% 50%; - background-repeat: no-repeat; - } - } - - &[open] > summary::before { - background-image: var(--icon-collapse); - } - - & > details-content { - display: grid; - grid-template-columns: - [indent] auto - [main] 1fr; - - border-bottom: 1px solid var(--border-1); - - &::before { - content: ""; - display: block; - width: 100%; - margin: 0 var(--details-indent-size); - border-left: 1px solid var(--border-1); - } - - & > * { - grid-column: main; - } - } - } - & .wide { grid-column: left-wide / right; } @@ -278,18 +209,15 @@ main.doc { & .doc-text { --code-block-grid-space: 0; - & details { - --details-marker-size: 1.6rem; - --details-indent-size: 1.6rem; - } - - & > pre, - & > th-literate-program { + & pre, + & th-literate-program { /* Stretch to whole page. This way of doing it feels a bit brittle, though. It might be good to refactor this to CSS grid at some point. 
*/ padding-left: var(--doc-padding); padding-right: var(--doc-padding); + margin-left: calc(var(--doc-padding) * -1); + margin-right: calc(var(--doc-padding) * -1); border-radius: 0; border-left: none; border-right: none; @@ -301,20 +229,6 @@ main.doc { } } - & > pre, - & > th-literate-program, - & > details { - margin-left: calc(var(--doc-padding) * -1); - margin-right: calc(var(--doc-padding) * -1); - } - - & > details { - & > summary, - & > details-content { - padding-right: var(--doc-padding); - } - } - & figure figcaption { &.overlay-bottom-right { position: static; diff --git a/static/css/main.css b/static/css/main.css index 0a4bb77..4e77930 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -376,41 +376,6 @@ a.secret { text-decoration: none; } -/* Links to headings should be invisible by default, only appearing on hover. */ - -h1, -h2, -h3, -h4, -h5, -h6 { - & > a { - color: var(--text-color); - text-decoration: none; - - &:visited { - color: var(--text-color); - } - - &:hover { - text-decoration: underline; - } - } -} - -@media (hover: none) { - h1, - h2, - h3, - h4, - h5, - h6 { - & > a { - text-decoration: underline; - } - } -} - /* Make blockquotes a bit prettier */ blockquote { @@ -535,17 +500,17 @@ section.feed { /* Titles */ & h2 { - & a { + & a, + & a:visited { color: var(--text-color); - text-decoration: underline; + } - &:visited { - color: color-mix( - in srgb, - var(--background-color), - var(--text-color) 60% - ); - } + & a:visited { + color: color-mix( + in srgb, + var(--background-color), + var(--text-color) 60% + ); } } @@ -753,6 +718,8 @@ h1.page-title { text-decoration: underline; text-decoration-color: transparent; + transition: var(--transition-duration) text-decoration-color; + &:hover { text-decoration-color: var(--text-color); }