std: rework HTTP and TLS for new I/O API #24698

Merged
merged 19 commits into from
Aug 8, 2025

Conversation

andrewrk
Member

@andrewrk andrewrk commented Aug 5, 2025

Followup from #24329.

Bonus breakage:

  • delete std.Io.LimitedReader
  • delete std.Io.BufferedReader
  • delete std.fifo

Performance Data Points

Compiler Binary Size (ReleaseSmall)

13.5 -> 13.4 MiB (-1%)

Merge checklist

  • get std lib unit tests passing on windows
  • finish updating fetch logic with respect to copying Resource
  • do some QA to ensure non regressions of HTTP and TLS code

@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. standard library This issue involves writing Zig code for the standard library. release notes This PR should be mentioned in the release notes. labels Aug 5, 2025
@andrewrk andrewrk force-pushed the http branch 2 times, most recently from 628de83 to 309a945 Compare August 5, 2025 05:38
@squeek502
Collaborator

Side note: I'll work on upstreaming all the resinator changes so far to double check it's still working as intended before 0.15.0.

@andrewrk
Member Author

andrewrk commented Aug 5, 2025

Thanks. If you have the time and motivation, it would be nice to do a more thorough upgrade, too. For instance I noticed a lot of those seeks could become no-ops once they are using File.Reader API.

@ianic
Contributor

ianic commented Aug 5, 2025

Just a small observation, if I'm reading tls.Client and http.Reader code correctly.

The tls.Client stream method requires that the reader calling it has enough unused capacity to fit the cleartext record. If the unused capacity is smaller than the decoded record, we get error.OutputBufferUndersize. For 'safe' operation with tls.Client we must ensure tls.min_buffer_len of unused capacity on each call to the stream method; it is not enough to ensure that the initial reader buffer is at least tls.min_buffer_len.

For example, in https.Client, if we get partial headers in the first TLS record, the HTTP parser will call fillMore while leaving the partial headers in the buffer. If the next TLS record is big enough and the reader buffer is initialized with tls.min_buffer_len, the decrypted record can't fit into the unused space.

This adds extra requirements for sizing the tls.Client read buffer.

@mlugg
Member

mlugg commented Aug 5, 2025

Note that because the ziglang.org master builds were purged recently and std.http is regressed on master, this PR is blocking the usually-daily Zig community mirror health check: ziglang/www.ziglang.org#516

@andrewrk
Member Author

andrewrk commented Aug 5, 2025

Note that because the ziglang.org master builds were purged recently

what do you mean by this?

@andrewrk
Member Author

andrewrk commented Aug 5, 2025

For 'safe' operation with tls.Client we must ensure tls.min_buffer_len of unused capacity on each call to the stream method; it is not enough to ensure that the initial reader buffer is at least tls.min_buffer_len.

not quite because it will rebase if necessary

@mlugg
Member

mlugg commented Aug 5, 2025

what do you mean by this?

Uhhh, I assumed this was intentional? The 0.15.0 dev build which check-mirrors was previously built against, https://ziglang.org/builds/zig-x86_64-linux-0.15.0-dev.885+e83776595.tar.xz, started 404'ing.

@andrewrk
Member Author

andrewrk commented Aug 5, 2025

@mlugg
Member

mlugg commented Aug 5, 2025

Ah, I see, I didn't know you'd automated that -- thanks for pointing it out!

@ianic
Contributor

ianic commented Aug 5, 2025

not quite because it will rebase if necessary

Please ignore if this is only my misunderstanding...

I'm thinking about something like this:
We have an HTTP page which consists of 4k of headers and a 100k body.
The first two TLS packets (ignoring TLS overhead) are 3k (3k headers) and 16k (1k headers + 15k body).
tls.Client is initialized with min_buffer_len ~ 16k.
The HTTP parser gets the first 3k of headers and asks for more without consuming; seek: 0, end: 3k.
tls.Client decrypts the next 16k record but can't push it into the reader's 13k of free buffer space.
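
The arithmetic in this scenario can be checked with a small model. This is only a sketch: `BufferModel` and its methods are hypothetical names invented for illustration, not the std.Io.Reader API, and the 16 KiB figure stands in for the "~ 16k" quoted above. It shows why rebasing does not help when nothing has been consumed (seek: 0).

```zig
const std = @import("std");

/// Hypothetical model of a reader buffer (not the std.Io.Reader type).
/// Bytes in [seek, end) are buffered but not yet consumed.
const BufferModel = struct {
    len: usize,
    seek: usize,
    end: usize,

    /// Slide the unconsumed bytes to the front of the buffer.
    /// Only the already-consumed prefix [0, seek) is reclaimed.
    fn rebase(self: *BufferModel) void {
        self.end -= self.seek;
        self.seek = 0;
    }

    /// Space left after the buffered bytes.
    fn unusedCapacity(self: BufferModel) usize {
        return self.len - self.end;
    }
};

pub fn main() void {
    const k = 1024;
    // Buffer sized to ~16k, holding 3k of unconsumed headers.
    var buf: BufferModel = .{ .len = 16 * k, .seek = 0, .end = 3 * k };
    buf.rebase(); // seek is already 0, so nothing is reclaimed

    const record_len: usize = 16 * k; // next decrypted record
    const verdict: []const u8 = if (record_len <= buf.unusedCapacity()) "yes" else "no";
    std.debug.print("unused: {d} KiB, record: {d} KiB, fits: {s}\n", .{
        buf.unusedCapacity() / k,
        record_len / k,
        verdict,
    });
}
```

Had the parser consumed the 3k before asking for more, the rebase would have reclaimed it and the full 16k would be free again.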

@andrewrk
Member Author

andrewrk commented Aug 7, 2025

@ianic yes you are right, thank you for laying it out so plainly. In this case the buffer size requirement is cumulative.
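
One way to read "cumulative", sketched under assumptions: the read buffer has to cover whatever the consumer may leave unconsumed plus room for one full decrypted record. The helper name `requiredReadBufferLen` and both constants below are hypothetical, chosen to match the numbers in the scenario above; this is not the sizing code from the PR.

```zig
const std = @import("std");

/// Hypothetical helper: smallest read buffer that still leaves
/// `min_record_capacity` bytes free after `max_retained` unconsumed bytes.
fn requiredReadBufferLen(max_retained: usize, min_record_capacity: usize) usize {
    return max_retained + min_record_capacity;
}

pub fn main() void {
    const k = 1024;
    // Assumption: stand-in for tls.min_buffer_len ("~ 16k" in the thread).
    const record_capacity: usize = 16 * k;
    // Assumption: the HTTP parser retains at most 4k of headers.
    const max_headers: usize = 4 * k;

    const needed = requiredReadBufferLen(max_headers, record_capacity);
    std.debug.print("read buffer should be at least {d} KiB\n", .{needed / k});
}
```

With these numbers the 3k of retained headers plus a 16k record needs 19k, which fits in the 20k the helper returns.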

@andrewrk andrewrk force-pushed the http branch 2 times, most recently from b7a94cf to aac26f3 Compare August 7, 2025 05:42
@andrewrk andrewrk enabled auto-merge August 7, 2025 08:30
@squeek502
Collaborator

squeek502 commented Aug 7, 2025

Consistently hitting TlsInitializationFailed when using this branch to fetch packages on Windows.

Example urls that fail:

  • https://github.com/squeek502/compressed_mingw_includes/releases/download/1.0.2/compressed_mingw_includes.tar
  • https://www.ryanliptak.com/misc/notin.html (just a random page on my site, not related to Zig packages)

Can also reproduce it with http.Client.fetch:

const std = @import("std");

pub fn main() !void {
    var debug_allocator: std.heap.DebugAllocator(.{}) = .{};
    defer std.debug.assert(debug_allocator.deinit() == .ok);
    const gpa = debug_allocator.allocator();

    var client: std.http.Client = .{ .allocator = gpa };
    defer client.deinit();

    var body: std.ArrayListUnmanaged(u8) = .empty;
    defer body.deinit(gpa);

    const res = try client.fetch(.{
        .location = .{ .url = "https://www.ryanliptak.com/misc/notin.html" },
        .method = .GET,
        .response_storage = .{ .allocator = gpa, .list = &body },
    });

    std.debug.print("{}\n", .{res.status});
}

error: TlsInitializationFailed
C:\Users\Ryan\Programming\Zig\zig\lib\std\crypto\aes_gcm.zig:108:17: 0x7ff77b21f336 in decrypt (httpclient_zcu.obj)
                return error.AuthenticationFailed;
                ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\crypto\tls\Client.zig:374:29: 0x7ff77b1e85ce in init (httpclient_zcu.obj)
                            return error.TlsBadRecordMac;
                            ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\http\Client.zig:338:25: 0x7ff77b150593 in create (httpclient_zcu.obj)
                ) catch return error.TlsInitializationFailed,
                        ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\http\Client.zig:1419:24: 0x7ff77b14065f in connectTcpOptions (httpclient_zcu.obj)
            const tc = try Connection.Tls.create(client, proxied_host, proxied_port, stream);
                       ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\http\Client.zig:1377:5: 0x7ff77b14085b in connectTcp (httpclient_zcu.obj)
    return connectTcpOptions(client, .{ .host = host, .port = port, .protocol = protocol });
    ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\http\Client.zig:1552:14: 0x7ff77b138870 in connect (httpclient_zcu.obj)
    } orelse return client.connectTcp(host, port, protocol);
             ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\http\Client.zig:1668:18: 0x7ff77b1326ee in request (httpclient_zcu.obj)
        break :c try client.connect(host_name, uriPort(uri, protocol), protocol);
                 ^
C:\Users\Ryan\Programming\Zig\zig\lib\std\http\Client.zig:1757:15: 0x7ff77b12e7cc in fetch (httpclient_zcu.obj)
    var req = try request(client, method, uri, .{
              ^
C:\Users\Ryan\Programming\Zig\tmp\httpclient.zig:14:17: 0x7ff77b12e32c in main (httpclient_zcu.obj)
    const res = try client.fetch(.{
                ^

EDIT: Same fetch reproduction on Linux hits integer overflow:

thread 6887 panic: integer overflow
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:1305:31: 0x1072801 in defaultRebase (std.zig)
    if (r.end <= r.buffer.len - capacity) return;
                              ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:1301:27: 0x115b9a3 in rebase (std.zig)
    return r.vtable.rebase(r, capacity);
                          ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:1071:15: 0x1147d91 in fillUnbuffered (std.zig)
    try rebase(r, n);
              ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:1059:26: 0x112fc6f in fill (std.zig)
    return fillUnbuffered(r, n);
                         ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:498:15: 0x111d520 in peek (std.zig)
    try r.fill(n);
              ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:546:30: 0x11040c0 in take (std.zig)
    const result = try r.peek(n);
                             ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:563:23: 0x18669c6 in takeArray__anon_373530 (std.zig)
    return (try r.take(n))[0..n];
                      ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:1146:36: 0x1865afe in takeStructPointer__anon_373072 (std.zig)
    return @ptrCast(try r.takeArray(@sizeOf(T)));
                                   ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:1176:51: 0x18611c7 in streamInner (std.zig)
                var res = (try r.takeStructPointer(T)).*;
                                                  ^
/home/ryan/Programming/zig/zig/lib/std/compress/flate/Decompress.zig:248:23: 0x185385a in streamFallible (std.zig)
    return streamInner(d, w, limit) catch |err| switch (err) {
                      ^
/home/ryan/Programming/zig/zig/lib/std/compress/flate/Decompress.zig:171:23: 0x185443c in streamIndirectInner (std.zig)
    _ = streamFallible(d, &writer, .limited(writer.buffer.len - writer.end)) catch |err| switch (err) {
                      ^
/home/ryan/Programming/zig/zig/lib/std/compress/flate/Decompress.zig:244:31: 0x1867f95 in streamIndirect (std.zig)
    return streamIndirectInner(d);
                              ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:385:34: 0x10b6ce3 in appendRemainingUnlimited__anon_10086 (std.zig)
        const n = r.vtable.stream(r, &writer, .limited(list.unusedCapacitySlice().len)) catch |err| switch (err) {
                                 ^
/home/ryan/Programming/zig/zig/lib/std/Io/Reader.zig:317:61: 0x1173df8 in appendRemaining__anon_23361 (std.zig)
    if (limit == .unlimited) return appendRemainingUnlimited(r, gpa, alignment, list, 1);
                                                            ^
/home/ryan/Programming/zig/zig/lib/std/http/Client.zig:1802:31: 0x1168ff9 in fetch (std.zig)
        reader.appendRemaining(allocator, null, list, storage.append_limit) catch |err| switch (err) {
                              ^
/home/ryan/Programming/zig/tmp/httpclient.zig:14:33: 0x1165a51 in main (httpclient.zig)
    const res = try client.fetch(.{
                                ^
/home/ryan/Programming/zig/zig/lib/std/start.zig:627:37: 0x1166589 in posixCallMainAndExit (std.zig)
            const result = root.main() catch |err| {
                                    ^
/home/ryan/Programming/zig/zig/lib/std/start.zig:232:5: 0x1165441 in _start (std.zig)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)
Aborted (core dumped)
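
The panic at the top of this trace is on the unsigned subtraction `r.buffer.len - capacity`, which can only trip the safety check when the requested capacity exceeds the buffer length. A minimal sketch of that failure mode and a guarded form follows. The function `fitsWithoutRebase` is a hypothetical name and is not claimed to be the fix that landed; `std.math.sub` is the standard checked subtraction.

```zig
const std = @import("std");

/// Hypothetical guarded form of the check in the trace: true when
/// `capacity` more bytes already fit after `end`.
fn fitsWithoutRebase(buffer_len: usize, end: usize, capacity: usize) bool {
    // On usize, `buffer_len - capacity` panics with "integer overflow" in
    // safety-checked builds when capacity > buffer_len, so test that first.
    if (capacity > buffer_len) return false;
    return end <= buffer_len - capacity;
}

pub fn main() void {
    const buffer_len: usize = 64;
    const capacity: usize = 128; // larger than the whole buffer

    // std.math.sub reports the same condition as an error instead of a panic.
    if (std.math.sub(usize, buffer_len, capacity)) |room| {
        std.debug.print("room before rebase is needed: {d}\n", .{room});
    } else |err| {
        std.debug.print("request cannot be satisfied: {s}\n", .{@errorName(err)});
    }

    const verdict: []const u8 = if (fitsWithoutRebase(buffer_len, 0, capacity)) "yes" else "no";
    std.debug.print("fits without rebase: {s}\n", .{verdict});
}
```

Whether the right response is an assertion on the caller's buffer size or an error return is a design choice the trace alone does not settle.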

andrewrk and others added 12 commits August 7, 2025 10:04
I never liked how this data structure took its API as a parameter.

This use case is now served by std.Io buffering.
Just enough to get things working correctly again
and fix bad unit test API usage that it finds
let's see if anybody notices it missing
respect the case when there is existing buffer
* TLS: add missing assert for output buffer length requirement
* TLS: add missing flushes
* TLS: add flush implementation
* TLS: finish drain implementation
* HTTP: correct buffer sizes for TLS
* HTTP: expose a getReadError method on Connection
* HTTP: add missing flush on sendBodyComplete
* Fetch: remove unwanted deinit
* Fetch: improve error reporting
@andrewrk
Member Author

andrewrk commented Aug 7, 2025

We're already in a regressed state relative to a couple weeks back, so I'm going to let this through since it's progress towards the solution, and #24732 blocks the release.

Comment on lines -1231 to 1214
- var adapter = resource.reader().adaptToNewApi(&adapter_buffer);
- var decompress: std.compress.zstd.Decompress = .init(&adapter.new_interface, window_buffer, .{
+ var decompress: std.compress.zstd.Decompress = .init(resource.reader(), window_buffer, .{
      .verify_checksum = false,
  });
  return try unpackTarball(f, tmp_directory.handle, &decompress.reader);
Collaborator

Unrelated to the changes in this PR, but this won't work for all tar.zst files, see #24735

@andrewrk andrewrk disabled auto-merge August 8, 2025 02:54
@andrewrk andrewrk merged commit 5998a8c into master Aug 8, 2025
10 of 12 checks passed
@andrewrk andrewrk deleted the http branch August 8, 2025 02:54