A couple of times I have written code which basically does this:
- accept() an incoming connection
- do some magic to get another socket
- start passing bytes between the two sockets
- perform a correct close down sequence
An example use case is a proxy implementing the HTTP CONNECT method which, for some known hostnames, logs a message and mangles the hostname before proceeding. This has been used as a legacy fallback solution while changing a network setup. But my uses are not restricted to HTTP proxies: I have done the same for a few legacy protocols, where the magic has been of varying complexity.
The two final steps are quite general and it would be nice to have a module doing just that. Take two sockets and make it easy (or even automatic) to pass bytes between them.
The naïve non-blocking solution would use a scalar string buffer for each direction and perform a select loop while maintaining the write vector depending on which buffers contain data. I have written this code multiple times. In development this is usually quite successful, in production less so. While Perl might be quite suited for the magic in step 2, the naïve way of passing bytes has quite an overhead for the buffer management.
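For reference, the naïve approach could look roughly like the sketch below. It is only an illustration under some assumptions: relay() is a made-up helper name, the buffers are allowed to grow without limit, and a handle is half-closed with shutdown() once its peer has reached EOF and the pending bytes are flushed.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# relay() is a hypothetical helper, not part of any existing module.
# It passes bytes between two connected sockets until both directions hit EOF.
sub relay {
    my ($fh1, $fh2) = @_;
    local $SIG{PIPE} = 'IGNORE';    # get an EPIPE error instead of a signal
    $_->blocking(0) for $fh1, $fh2;

    my %peer = (fileno($fh1) => $fh2, fileno($fh2) => $fh1);
    my %out  = (fileno($fh1) => '',   fileno($fh2) => '');  # bytes waiting per destination
    my %half_close;                 # destinations to shut down once drained
    my $readset  = IO::Select->new($fh1, $fh2);
    my $writeset = IO::Select->new();

    while ($readset->count || $writeset->count) {
        my ($can_read, $can_write) = IO::Select->select($readset, $writeset, undef);
        unless (defined $can_read) {
            next if $!{EINTR};
            die "select: $!";
        }

        for my $fh (@$can_read) {
            my $dst = $peer{ fileno $fh };
            my $key = fileno $dst;
            my $n   = sysread($fh, my $chunk, 65536);
            if (!defined $n) {
                next if $!{EAGAIN} || $!{EWOULDBLOCK} || $!{EINTR};
                die "read: $!";
            }
            if ($n == 0) {          # EOF: stop reading, remember to half-close the peer
                $readset->remove($fh);
                $half_close{$key} = 1;
            }
            else {
                $out{$key} .= $chunk;
            }
            # The write set only holds handles that have something to do.
            $writeset->add($dst) if length($out{$key}) || $half_close{$key};
        }

        for my $fh (@$can_write) {
            my $key = fileno $fh;
            if (length $out{$key}) {
                my $n = syswrite($fh, $out{$key});
                if (!defined $n) {
                    next if $!{EAGAIN} || $!{EWOULDBLOCK} || $!{EINTR};
                    die "write: $!";
                }
                substr($out{$key}, 0, $n, '');   # drop what was written
            }
            if (!length $out{$key}) {
                $writeset->remove($fh);
                shutdown($fh, 1) if $half_close{$key};   # 1 == SHUT_WR
            }
        }
    }
    return;
}
```

The append on the read side and the four-argument substr() on the write side are where the buffer management happens in this version.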
A less naïve way would use an array of strings for buffers, but I’m not quite sure if this would be a win in all cases. You might be able to get away with fewer string operations on the read side of the buffer, but it might be more expensive on the write side. I have not benchmarked this.
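To make the trade-off concrete, here is a minimal sketch of such an array-of-strings buffer. The package name ChunkQueue and its methods are invented for illustration, and nothing here says whether it is actually faster.

```perl
use strict;
use warnings;

{
    package ChunkQueue;    # hypothetical name, for illustration only

    sub new      { my ($class) = @_; return bless { chunks => [], bytes => 0 }, $class }
    sub is_empty { my ($self) = @_; return !@{ $self->{chunks} } }
    sub bytes    { my ($self) = @_; return $self->{bytes} }

    # Read side: keep each chunk as its own string, no concatenation.
    sub enqueue {
        my ($self, $chunk) = @_;
        return unless length $chunk;
        push @{ $self->{chunks} }, $chunk;
        $self->{bytes} += length $chunk;
        return;
    }

    # Write side: try to write the oldest chunk to $fh.
    # Returns the number of bytes written, or undef on error (check $!).
    sub flush_one {
        my ($self, $fh) = @_;
        return 0 if $self->is_empty;
        my $n = syswrite($fh, $self->{chunks}[0]);
        return unless defined $n;
        if ($n == length($self->{chunks}[0])) {
            shift @{ $self->{chunks} };              # whole chunk went out
        }
        else {
            substr($self->{chunks}[0], 0, $n, '');   # partial write: trim the head chunk
        }
        $self->{bytes} -= $n;
        return $n;
    }
}
```

The read side becomes a plain push, but the write side now needs one syswrite() call per chunk, which is the possible extra cost mentioned above.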
Most of the time I don’t care about Perl level IO handles. I know that there is a real C level file descriptor beneath. So an even better POSIX-compliant solution might be to use XS to have plain C strings and use readv()/writev() with an iovec structure as buffer.
Can we do even better? At least on Linux we can. With the Linux-specific splice() system call it is possible to use a pipe as buffer and never have to copy data to and from user space.
I have not been able to find any off-the-shelf solution on CPAN. So I think I need to write it myself, but what would a nice and general API be? I guess the basic interface would be something like:
my $chain = IO::Splice->new($fh1, $fh2);
$chain->pump(); # read and write from both handles if possible and needed
$chain->read($fh1); # read to buffer from one specific handle
$chain->write($fh2); # write from buffer to one specific handle
$chain->can_write(); # returns the handles it needs to write to
but it might be simpler to have two callbacks for setting a file handle in write or no-write state:
my $readset  = IO::Select->new( $fh1, $fh2 );
my $writeset = IO::Select->new();
my $chain = IO::Splice->new( $fh1, $fh2,
    writable   => sub { $writeset->add( shift ) },
    unwritable => sub { $writeset->remove( shift ) },
);
while ( ... select ... ) {
    $chain->pump();
}
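To get a feel for the callback variant, here is a rough sketch with a naïve scalar buffer behind it. Everything in it is an assumption made for illustration: the package name Sketch::Pump stands in for the proposed module, finished() is an extra method I made up, and the callbacks receive the handle as their only argument. It also shows one gap: nothing tells the caller when to drop a handle from the read set after EOF, so a real API would probably need a callback for that too.

```perl
use strict;
use warnings;
use IO::Handle;

{
    package Sketch::Pump;    # hypothetical stand-in, not a real CPAN module

    sub new {
        my ($class, $fh1, $fh2, %cb) = @_;
        $_->blocking(0) for $fh1, $fh2;
        return bless {
            writable   => $cb{writable}   || sub { },
            unwritable => $cb{unwritable} || sub { },
            # One entry per direction: read from src, buffer, write to dst.
            dirs => [
                { src => $fh1, dst => $fh2, buf => '', eof => 0, done => 0 },
                { src => $fh2, dst => $fh1, buf => '', eof => 0, done => 0 },
            ],
        }, $class;
    }

    # True once both directions have seen EOF and flushed their buffers.
    sub finished {
        my ($self) = @_;
        my $pending = grep { !$_->{done} } @{ $self->{dirs} };
        return $pending == 0;
    }

    # Move whatever can be moved right now, without blocking.
    sub pump {
        my ($self) = @_;
        for my $d (@{ $self->{dirs} }) {
            next if $d->{done};
            my $was_empty = !length $d->{buf};
            if (!$d->{eof}) {
                my $n = sysread($d->{src}, my $chunk, 65536);
                if (!defined $n) {
                    die "read: $!" unless $!{EAGAIN} || $!{EWOULDBLOCK} || $!{EINTR};
                }
                elsif ($n == 0) { $d->{eof} = 1 }
                else            { $d->{buf} .= $chunk }
            }
            if (length $d->{buf}) {
                my $n = syswrite($d->{dst}, $d->{buf});
                if (defined $n) {
                    substr($d->{buf}, 0, $n, '');
                }
                elsif (!($!{EAGAIN} || $!{EWOULDBLOCK} || $!{EINTR})) {
                    die "write: $!";
                }
            }
            # Report transitions between "needs write" and "nothing to write".
            my $is_empty = !length $d->{buf};
            $self->{writable}->($d->{dst})   if $was_empty  && !$is_empty;
            $self->{unwritable}->($d->{dst}) if !$was_empty && $is_empty;
            if ($d->{eof} && $is_empty) {
                shutdown($d->{dst}, 1);      # 1 == SHUT_WR
                $d->{done} = 1;
            }
        }
        return;
    }
}
```

With this shape the callbacks only fire when a buffer changes between empty and non-empty, so the caller's write set stays small in the common case where every write succeeds immediately.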
As said, I think I have plenty of implementations of the naïve way, but before releasing some code it would be nice to get some input on the API. The best feedback, though, would be a module that already has a usable API but might not implement the Linux-specific way. That would allow me to steal the interface…