Comments on return values for PSGI streaming functions #27

avar · 2012-10-01T22:09:32Z

Add comments on the proposal to optionally add meaningful return values to the streaming functions here.

miyagawa · 2012-10-29T17:31:55Z

This came up on irc/mailing list, and i think it makes sense to return write buffer length in the synchronous/blocking server implementations, and it is actually done so with the current version of HTTP::Server::PSGI and Starlet, while it doesn't make much sense in async implementation since there, write() is just a queue up method.

I'm not sure if I like that complicated optional return value array format, i would just rather have a boolean.

avar · 2012-10-29T22:45:37Z

It would be useful to do some review of several servers to see what
they commonly support.

While a boolean would make for a simpler API be simpler for the Plack
application programmer it would mean that e.g. the Apache glue layer
would have to first find out the length of the string you passed it,
then compare that to how many bytes were written, and finally return
$bytes_written == $bytes_received.

It also means you can't use the Plack API to see how many bytes you
managed to write, when you're dealing with a lot of clients it's
interesting to note the historical patterns of how many you can write
compared to how many you wanted to write over time.

Anyway, I didn't mean that they would return an array, rather that it
would be used like this:

my $app = sub {
    my $env = shift;

    # immediately starts the response and stream the content
    return sub {
        my $responder = shift;
        my $writer = $responder->(
            [ 200, [ 'Content-Type', 'application/json' ]]);

        wait_for_events(sub {
            my $new_event = shift;
            if ($new_event) {
                my $str = get_content();
                my $write = $writer->write();
                if (exists $env->{'psgi.streaming.write.return_value'}) {
                    if ($env->{'psgi.streaming.write.return_value'} eq 'bool') {
                        die "Couldn't write()" unless $write;
                    } elsif ($env->{'psgi.streaming.write.return_value'} eq 'bytes_written') {
                        die "Couldn't write()" unless bytes::length($str) == $write;
                    } elsif ($env->{'psgi.streaming.write.return_value'} eq 'chars_written') {
                        die "Couldn't write()" unless length($str) == $write;
                    }
                }
            } else {
                my $close = $writer->close;
                if (exists $env->{'psgi.streaming.close.return_value'} and
                           $env->{'psgi.streaming.close.return_value'} eq 'bool') {
                    die "Failed to close()" unless $close;
                }
            }
        });
    };
};

Which is certainly a lot more verbose than:

my $app = sub {
    my $env = shift;

    # immediately starts the response and stream the content
    return sub {
        my $responder = shift;
        my $writer = $responder->(
            [ 200, [ 'Content-Type', 'application/json' ]]);

        wait_for_events(sub {
            my $new_event = shift;
            if ($new_event) {
                my $str = get_content();
                my $write = $writer->write();
                die "Couldn't write()" if $env->{'psgi.streaming.close.has_return_value'} and !$write;
            } else {
                my $close = $writer->close;
                die "Failed to close()" if $env->{'psgi.streaming.close.has_return_value'} and $!close;
            }
        });
    };
};

But in the case where you actually care about getting the full
information the webserver can give you about how many bytes were
successfully written you won't have to go and write something
webserver specific.

miyagawa · 2012-10-29T22:56:10Z

I see what you mean but the complexity of the code checking them sounds like a sign of over complication.

avar · 2012-10-29T23:12:52Z

Yeah, I hate it (really I just wrote the original proposal to start
just such a discussion).

I actually think in practice a better way to handle this is to
just say that if you support this you should always return the number
of bytes (or characters, whatever makes sense in the environment) you
wrote, but if you can't do so just fake it in the server and do:

# In the server code
sub write {
    my ($writer, $content) = @_;
    my $was_successful = _internal_write($content);
    return $was_successful ? length $content : 0;
}

I checked out e.g. the uwsgi code and it does:

    body = SvPV(ST(1), blen);

    /* Can give us a return value but doesn't */
    wsgi_req->response_size += wsgi_req->socket->proto_write(wsgi_req, body, blen);

    XSRETURN(0);

Where the write does:

ssize_t uwsgi_proto_uwsgi_write(struct wsgi_request * wsgi_req, char *buf, size_t len) {
    ssize_t wlen;
    char *ptr = buf;
    if (len == 0) return 0;

    while(len > 0) {
        wlen = write(wsgi_req->poll.fd, ptr, len);
        if (wlen <= 0) {
            if (!uwsgi.ignore_write_errors) {
                uwsgi_req_error("write()");
            }
            wsgi_req->write_errors++;
            return ptr-buf;
        }
        ptr+=wlen;
        len -= wlen;
    }

    return ptr-buf;
}

I.e. returns the number of bytes written. I think any server that's
likely to support returning this at all likely ends up doing a
write(2) anyway, so most of them can give you how many bytes they
wrote.

So just saying:

Return the number of bytes you wrote on write(), boolean on close()
If you can't say how many bytes you wrote but know it all went
through return the length of the buffer we gave you

But we'll need a variable in $env to indicate whether the server
supports this to tell us whether we should be checking these return
values, and also so that we'll know that something returning an empty
list from close() is doing so because it doesn't support this, not
because it couldn't close the connection.

We can also just give suggestions for returning whatever makes sense
under that server and make clients sort it out for themselves, which
makes writing portable code harder.

But that brings us back to a similar discussion we had about overly
verbose flags in $env to exhaustively indicate server support.

Personally I lean heavily towards verbosely declaring API's up-front,
it makes for shittier code examples, but especially for something like
this that's relatively obscure it's usually something most people
ignore anyway, and those that don't really want to make sure they get
it right.

ISim mentioned this issue Apr 16, 2015

Issue #36 - Twiggy died when client close connection during streaming miyagawa/Twiggy#46

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments on return values for PSGI streaming functions #27

Comments on return values for PSGI streaming functions #27

avar commented Oct 1, 2012

miyagawa commented Oct 29, 2012

avar commented Oct 29, 2012

miyagawa commented Oct 29, 2012

avar commented Oct 29, 2012