title = "A simple string formatting library, round two"
+++
_This is a draft post.
Please don't share it around just yet!_
---
After implementing [a simple string formatting library in 65 lines of code][page:fmt.dj], I thought I was done.
The API was nice, it did what I needed it to do, so: _project complete_.
One piece of functionality that's commonly available in more complete libraries was missing though, and that was _positional arguments_.
I got a [comment on Lobsters](https://lobste.rs/c/bbwxbz) about it at the time, though I waved it away thinking I wouldn't need positional arguments anyways.
So far, I've been right---there still haven't been any cases in my game where I _needed_ positional arguments---but without them, the library feels a bit... incomplete.
So I started thinking about how I could improve on that.
I went back and forth trying to come up with something sensible, but nothing _simple_ was ever coming to mind.\
This write-up describes this an improved version of the library in detail, with support for positional arguments, and much smaller generated code size!
`{}` no longer needs escaping, but the `%0` pattern does---and it's done in a manner similar to that of libc `printf`.
To print a single `%` character followed by a digit, double the `%`, like so:
```cpp
fmt::format(buf, "%%0");
assert(strcmp(str, "%0") == 0);
```
Any other patterns are interpreted verbatim, and do not need escaping---though you may choose to double the `%`, for consistency with cases where it does need escaping.
```cpp
fmt::format(buf, "%0% complete", "3.14");
assert(strcmp(str, "3.14% complete") == 0);
```
The big advantage of this library comes from the fact that arguments can be reordered or repeated.
```cpp
fmt::format(
buf,
"Words of the day: '%2', '%1', '%0'. You chose '%1'.",
"matrix", "crystal", "rivulet");
assert(strcmp(
str,
"Words of the day: 'rivulet', 'crystal', 'matrix'. You chose 'matrix'."
Other than that, the same invariants are upheld as in the previous library: the library never writes past the input buffer, gives back the amount of characters that would be written, and never allows reading arguments out of bounds.
As a reminder, the previous library printed holes without corresponding arguments verbatim (as the string `{}`).
This library does the same thing, though using the new format string syntax.
```cpp
fmt::format(buf, "%0% complete" /* no arguments */);
assert(strcmp(str, "%0% complete") == 0);
```
## Implementation walkthrough
The library starts off with the same boilerplate as [the first one][page:fmt.dj].
If you haven't read the original write-up, I would strongly recommend doing that right now.
```cpp
#include <cstring>
struct String_Buffer
{
char* str;
int cap;
int len = 0;
};
static void write(String_Buffer& buf, const char* str, int len)
{
int remaining_cap = buf.cap - buf.len - 1; // leave one byte for NUL
int write_len = len > remaining_cap ? remaining_cap : len;
if (write_len > 0)
memcpy(buf.str + buf.len, str, write_len);
buf.len += len;
}
```
The signatures for functions writing values into the output string stay the same.
This library uses function overloading to resolve which function a value should be printed with, based on the value's type---just like the first library did.
The difference comes from how parsing the string and writing out format arguments is driven.
Previously, parsing the string was driven by an expansion of a parameter pack.
It allowed the library to remain compact and simple, but it's what prevented reordering or repeating of arguments.
There is no way you can reorder the expansion of a parameter pack (or, a compile-time operation) using run-time values.
This time around, arguments are passed into the library through an array of *type-erased pointers to the values*, as well as *functions that format the values behind those pointers.*
```cpp
using Format_Function_Untyped = void(String_Buffer& buf, const void* value_ptr);
void format_untyped(
String_Buffer& buf,
const char* fstr,
int nargs,
const void* const* values,
Format_Function_Untyped* const* ffuncs);
```
`format` then constructs those arrays in a type-safe manner, helping itself with a template function, which _erases_ its argument's type into `const void*`---allowing its instantiations to be stuffed into an array of function pointers of the same type.
This time not with [fold expressions](https://en.cppreference.com/w/cpp/language/fold.html), but with an [expansion inside a brace-enclosed initialiser](https://en.cppreference.com/w/cpp/language/parameter_pack.html#Brace-enclosed_initializers).
Among the [`const` soup](https://cdecl.org/), you may notice the `ffuncs` array being `static const`.
This little trick reduces code size a bit, because it will make the compiler generate a lookup table in the executable's read-only data section, instead of generating code to write the function pointers onto the stack.
Finally, we get to `format_untyped`, which parses the format string, writing out the verbatim parts, and calling the appropriate format function whenever a hole is encountered.
```cpp
void format_untyped(
String_Buffer& buf,
const char* fstr,
int nargs,
const void* const* values,
Format_Function_Untyped* const* ffuncs)
{
const char* start = fstr;
while (*fstr != 0) {
if (*fstr == '%') {
write(buf, start, fstr - start);
++fstr;
start = fstr;
if (*fstr >= '0' && *fstr <= '9') {
int index = *fstr - '0';
if (index < nargs) {
(ffuncs[index])(buf, values[index]);
++fstr;
start = fstr;
} else {
start = fstr - 1; // include %n sequence verbatim
}
}
} else {
++fstr;
}
}
write(buf, start, fstr - start);
}
```
The full code listing, split into a header and an implementation file, and packed into a namespace, is available below.
The same [extra goodies][page:fmt#Extras] as in the previous post can be used (implementations of `write_value` for various types, functions improving ergonomics).
## Remarks
### Source code length
This library is slightly larger than the previous, being 73 lines of code long.
I think the extra bit of functionality is useful enough that it's a worthy tradeoff, though.
It's in the same ballpark either way, and honestly I only really flailed the 65 loc figure in the title of the last post to poke fun at the complexity of popular template-heavy string formatting libraries.\
I chose `%n` as the hole syntax this time, because it was about as simple to parse as the previous choice of `{}`.
If I went with `{n}` instead, I would have to add extra code checking that there's only one digit inside the braces.
The downside of course is that you need to write out the indices manually in case they're a monotonic sequence (`%1, %2, %3`), as there is no natural syntax for an indexless hole (like there is with `{}`).
Also, due to the `%n` syntax only parsing one character, the library is limited to accepting 10 format arguments `%0`--`%9`.
That should be more than enough for 99.999% of your use cases, though.
It would be easier to add support for multiple digits with the `{n}` syntax, but the runtime cost would be bigger than supporting only a single digit.
If you find the 0-based indexing unnatural, it's easy enough to switch it to 1-based by tweaking the digit parsing code in `format_untyped`.
### Code size
The assembly for the formatting code comes out a lot more compact, because the compiler no longer has to generate potentially very long and repetitive code for calling `next_hole` and `write_value` repeatedly.
The new string formatter instead initialises a lookup table with value pointers on the stack, and passes it to `format_untyped`, along with a static table containing pointers to functions which can print the values.
From the machine code perspective, this comes out to 120 bytes of code.
This doesn't seem like a lot, but it quickly multiplies when you consider that format function calls are going to end up having different sets of arguments types.
My game currently has about 16k lines of code, though barely any user-facing text right now---most of it is logs and ImGui strings---but there are quite a few unique instantiations of `format` (35 to be exact.)
Here's the full list with their byte sizes (click the *bold* header to unfold the list).
:::: details
::: summary
List of `fmt::format` instantiations sorted by byte size
:::
::: details-content
```cpp
0x5c <>
0x6c <bool>
0x6c <char [128]>
0x6c <char [20]>
0x6c <char [32]>
0x6c <char const*>
0x6c <char*>
0x6c <double>
0x6c <int>
0x6c <unsigned int>
0x6c <unsigned long>
0x7c <Format_Hex>
0x7c <float>
0x8e <Writer::Chunk*, unsigned long long>
0x8e <char [128], char const*>
0x8e <char const*, Entity_Id>
0x8e <char const*, Vec2i>
0x8e <char const*, Vec3>
0x8e <char const*, char [80]>
0x8e <char const*, char const*>
0x8e <char const*, char*>
0x8e <char const*, unsigned int>
0x8e <int, char [32]>
0x8e <int, char const*>
0x8e <int, int>
0xb0 <char [32], char const*, char const*>
0xc0 <Writer::Chunk*, unsigned long long, char const*>
Summing it all up, that's 5164 bytes of machine code.
For 35 unique combinations of arguments!
Now, let's replace the previous function with the new one.
Granted, this is using the new `%n` syntax which is incompatible with the old `{}`, and I haven't replaced the format strings---but the format string themselves do not affect the machine code size, so that's fine.
First, there's the static data for the function lookup tables.
:::: details
::: summary
List of lookup tables from instantiations of the new `fmt::format`
:::
::: details-content
```cpp
0x00 <>
0x08 <Format_Hex>
0x08 <bool>
0x08 <char [128]>
0x08 <char [20]>
0x08 <char [32]>
0x08 <char const*>
0x08 <char*>
0x08 <double>
0x08 <float>
0x08 <int>
0x08 <unsigned int>
0x08 <unsigned long>
0x10 <Writer::Chunk*, unsigned long long>
0x10 <char [128], char const*>
0x10 <char const*, Entity_Id>
0x10 <char const*, Vec2i>
0x10 <char const*, Vec3>
0x10 <char const*, char [80]>
0x10 <char const*, char const*>
0x10 <char const*, char*>
0x10 <char const*, unsigned int>
0x10 <int, char [32]>
0x10 <int, char const*>
0x10 <int, int>
0x18 <Writer::Chunk*, unsigned long long, char const*>
This comes out at 576 bytes total, obviously with tables for more arguments taking up more space.
In an embedded setting, this will likely be a lot less due to a smaller (16-bit or 32-bit) memory space, and therefore 2× or 4× smaller pointers.
Now, for the functions themselves.
Remember that these instantiations only set up the lookup tables for `format_untyped`, so they're likely to be inlined into the caller---though I've inhibited that with the `[[gnu::noinline]]` attribute, to sum up the figures for this post.
:::: details
::: summary
List of instantiations of the new `fmt::format`
:::
::: details-content
```cpp
0x3f <>
0x47 <Format_Hex>
0x47 <bool>
0x47 <char [128]>
0x47 <char [20]>
0x47 <char [32]>
0x47 <char const*>
0x47 <char*>
0x47 <double>
0x47 <float>
0x47 <int>
0x47 <unsigned int>
0x47 <unsigned long>
0x49 <Writer::Chunk*, unsigned long long>
0x49 <char [128], char const*>
0x49 <char const*, Entity_Id>
0x49 <char const*, Vec2i>
0x49 <char const*, Vec3>
0x49 <char const*, char [80]>
0x49 <char const*, char const*>
0x49 <char const*, char*>
0x49 <char const*, unsigned int>
0x49 <int, char [32]>
0x49 <int, char const*>
0x49 <int, int>
0x4e <Writer::Chunk*, unsigned long long, char const*>
It feels to me like with a bit of elbow grease, this library could be adapted to plain old C as well.
You would have to replace the template bits with macros and `...` / `va_list` though, and I'm not entirely sure how to resolve types into functions---feels like C11 `_Generic` could be of help.
The worst part would probably be emulating the parameter packs, because the preprocessor doesn't make it easy to transform the individual elements in a `__VA_ARGS__` list---but it [seems possible](https://github.com/pfultz2/Cloak/wiki/C-Preprocessor-tricks,-tips,-and-idioms), if a bit horrid.