Description
In 665c428, I've added a benchmark that focuses on the performance of decoding strings. Here's the performance of decoding 4KB of JSON, most of which is just 30-character strings:
```
benchmarked json/url/100/decode
time                 8.053 μs   (7.736 μs .. 8.411 μs)
                     0.996 R²   (0.989 R² .. 0.999 R²)
mean                 8.089 μs   (8.051 μs .. 8.139 μs)
std dev              74.41 ns   (48.48 ns .. 107.8 ns)

benchmarked aeson/url/100/decode
time                 22.66 μs   (21.50 μs .. 24.51 μs)
                     0.994 R²   (0.989 R² .. 0.999 R²)
mean                 22.72 μs   (22.49 μs .. 22.84 μs)
std dev              276.1 ns   (159.0 ns .. 414.7 ns)
```
Not bad. We're ahead of aeson by a factor of three, but can this number be improved further? String decoding currently walks the string byte by byte until it finds the end, keeping track along the way of whether anything will need to be unescaped. I think it should be possible to instead walk the string w64-by-w64. This could be done by adapting the approach in bytestring-encodings to work with `ByteArray` instead of `Ptr`, and adding some additional bit-twiddling hacks. The general idea would be:
- Fail if you encounter a backslash
- Fail if you encounter a byte less than 0x20
- Fail if you encounter a byte greater than 0x7E
- Fail if your read would give you a w64 that straddled the end of the string (simplifies things a little)
- Succeed if you encounter a `"`
Failure just means falling back to the existing string-decode logic, and success means that we may perform a memcpy (as we do now). This whole thing is a little bit tricky because it's possible to encounter both a `"` and a failing byte in the same w64, and then the order they showed up in matters. But I think that the conservative action of always failing, even if the quote showed up first, is probably the best course. It keeps things simple, and the bytes after a string ends are probably ASCII characters anyway, so this shouldn't cost a ton of performance.
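As a sketch of how the loop and the conservative failure might fit together (everything here is hypothetical: `indexW64` stands in for an unaligned read from the `ByteArray`, and the lane masks are the standard has-zero/has-less/has-more word tricks):

```haskell
import Data.Bits (complement, countTrailingZeros, xor, (.&.), (.|.))
import Data.Word (Word64)

-- Nonzero iff some lane is zero; high bits mark matching lanes.
zeroLanes :: Word64 -> Word64
zeroLanes w =
  (w - 0x0101010101010101) .&. complement w .&. 0x8080808080808080

-- Nonzero iff some lane is 0x22 (double quote).
quoteLanes :: Word64 -> Word64
quoteLanes w = zeroLanes (w `xor` 0x2222222222222222)

-- Nonzero iff some lane is a backslash, below 0x20, or above 0x7E.
badLanes :: Word64 -> Word64
badLanes w =
  zeroLanes (w `xor` 0x5C5C5C5C5C5C5C5C)
    .|. ((w - 0x2020202020202020) .&. complement w
           .&. 0x8080808080808080)
    .|. (((w + 0x0101010101010101) .|. w) .&. 0x8080808080808080)

-- Scan [off, end) eight bytes at a time, little-endian lane order.
-- Just ix: clean unescaped ASCII string whose closing quote is at ix.
-- Nothing: fall back to the byte-by-byte decoder.
fastScan :: (Int -> Word64) -> Int -> Int -> Maybe Int
fastScan indexW64 off end = go off
  where
    go i
      | i + 8 > end = Nothing      -- read would straddle the end
      | badLanes w /= 0 = Nothing  -- conservative: fail even if the
                                   -- quote showed up before the bad byte
      | q /= 0 = Just (i + countTrailingZeros q `div` 8)
      | otherwise = go (i + 8)
      where
        w = indexW64 i
        q = quoteLanes w
```

Note that `badLanes` is checked before `quoteLanes`, which is what makes the failure unconditional when both appear in the same word.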
This needs to be implemented and benchmarked, but I think it could make this benchmark at least 2x faster.