VIP30: Plumbing: vcl_raw() and vcl_pipe()
Tl;dr: Improve websocket handling, pipe efficiency
It was obvious from the very start that non-HTTP traffic would arrive at Varnish instances, and therefore we have 'pipe' processing, in which Varnish just moves bytes between the client and the backend.
Back then more than now, there were protocols which sent an HTTP header as a preamble to entirely different protocols, TN3270 for instance, without going through the (back then badly defined) "Upgrade:" protocol.
Because some of these protocols are still in use and hard to get rid of, and because pipe is an incredibly powerful coping mechanism, it is not going away.
Because pipe is, and can only be, an HTTP/1 mechanism, and because of increasing websocket usage, I looked at the "non-HTTP traffic" issues afresh, to see a) what we can do, and b) whether we should do it.
As an initial matter, the current use of pipe for enormous objects, because pass does not have a read-ahead limit, is not considered further; that should be fixed in the pass code.
The following corner-cases have been considered:
- Protocols like TN3270 hiding behind an HTTP header
- Proper HTTP/1 upgrades which take over the entire connection, including CONNECT
- HTTP/2 CONNECT frames which take over a stream
- Expect: 100-continue
- HTTP/2 SETTINGS
Special cases which need handling up front in VCL have always run the risk that users would forget to copy the code from the builtin VCL.
One solution to this, which we never adopted, was to put the "magic" code in a plain VCL subroutine which the builtin vcl_recv{} would simply call, thus making it just one line users could paste at the front of their vcl_recv{}. (I am not entirely sure why we never did that, and the idea might still be worth considering for "regular" HTTP magic.)
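As a rough sketch of that never-adopted idea (the subroutine name builtin_recv_checks is made up here, and user subroutines cannot use the vcl_ prefix), it would look something like this:

```vcl
# Hypothetical sketch only: the "magic" checks live in a plain subroutine
# that the builtin vcl_recv{} also calls, so users would only need to paste
# the single "call" line at the top of their own vcl_recv{}.
sub builtin_recv_checks {
    if (req.method == "PRI") {
        /* (see: RFC7540) */
        return (synth(405));
    }
    if (!req.http.host && req.proto ~ "^(?i)HTTP/1.1") {
        /* In HTTP/1.1, Host is required. */
        return (synth(400));
    }
}

sub vcl_recv {
    call builtin_recv_checks;
    # ... the user's own logic follows here ...
}
```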
Given the very early and special circumstances applicable to the listed corner-cases, vcl_recv{} seems too late to handle them, and adding a new "raw" processing state seems warranted.
On receiving a corner case HTTP request, the following actions seem sensible:
A) Rejection: TCP RST
B) Close: TCP FIN / H2 stream close
C) Synth: synthetic HTTP response
D) Pipe, as today: VCL processing of req.* (Rejection, Close, Synth); send req.* to backend; don't await a response; pass bytes forth and back
E) Tunnel: VCL processing of req.* (Rejection, Close, Synth); send req.* to backend; await be.resp; VCL processing of be.resp (Rejection, Close, Synth); pass bytes forth and back
F) Recv: send "100 Continue" if expected; pass req.* to vcl_recv{} for normal processing
For starters, we can split the current builtin vcl_recv{} in two:
sub vcl_raw {
    if (req.http.host) {
        set req.http.host = req.http.host.lower();
    }
    if (req.method == "PRI") {
        /* (see: RFC7540) */
        return (synth(405));
    }
    if (!req.http.host && req.proto ~ "^(?i)HTTP/1.1") {
        /* In HTTP/1.1, Host is required. */
        return (synth(400));
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE" &&
        req.method != "PATCH") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }

    /* MARKER_A */

    return (recv);
}

sub vcl_recv {
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}
Note that the "req.esi_level == 0" condition disappears from the Host: check, as ESI include requests will go directly to vcl_recv{}.
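For reference, the corresponding check in today's builtin VCL reads approximately as follows:

```vcl
# Approximate current builtin check: ESI sub-requests (req.esi_level > 0)
# are exempted from the Host requirement; in the split above that exemption
# becomes unnecessary because such requests never enter vcl_raw{}.
if (!req.http.host &&
    req.esi_level == 0 &&
    req.proto ~ "^(?i)HTTP/1.1") {
    /* In HTTP/1.1, Host is required. */
    return (synth(400));
}
```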
We can continue by lifting the code we put in C processing (to make sure it happened before vcl_recv{}) into vcl_raw{} at MARKER_A:
std.collect(req.http.X-Forwarded-For);
if (req.http.X-Forwarded-For) {
    set req.http.X-Forwarded-For += "," + client.ip;
} else {
    set req.http.X-Forwarded-For = client.ip;
}
std.collect(req.http.Cache-Control);
(We would have to give all VCL an implicit "import std;" to do this.)
Even if we do nothing else, this part seems worthwhile to me.
`return(reset)` is straightforward.

`return(close)` should take an optional second argument which determines whether HTTP/2 should close the stream or the session, and with which error code.
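A hedged sketch of how these could be used from vcl_raw{}; the exact spelling of the optional close() argument is deliberately left open, and the conditions are just placeholders:

```vcl
sub vcl_raw {
    if (req.method == "CONNECT") {
        # Reject outright: TCP RST towards the client
        return (reset);
    }
    if (req.http.User-Agent ~ "(?i)badbot") {
        # Orderly shutdown: TCP FIN, or for HTTP/2 a stream or session
        # close (plus error code) selected by the proposed second argument
        return (close);
    }
    return (recv);
}
```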
Add a new return option to vcl_raw{}: `return(tunnel(backend));`

This sends the possibly modified req.* to the chosen backend, and calls into vcl_tunnel{} when a response, or a failure to get a response, is a reality.
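For example, websocket upgrades could be routed through the tunnel path to a dedicated backend; a hedged sketch, where the backend name ws_backend is made up:

```vcl
sub vcl_raw {
    if (req.http.Upgrade ~ "(?i)websocket") {
        # Hand the connection to the websocket backend and let
        # vcl_tunnel{} inspect the backend's 101 response.
        return (tunnel(ws_backend));
    }
    return (recv);
}
```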
In vcl_tunnel{}, which can see and modify both req.* and beresp.*, processing can continue with:

- `return(reset)`: TCP RST
- `return(close)`: TCP FIN / H2 stream close
- `return(synth)`: deliver a synthetic HTTP response
- `return(pipe)`: send beresp to the client, then pass bytes (NB: not going through `vcl_pipe{}`, but straight to passing bytes)
- `return(tunnel(backend))`: try another backend and come back to `vcl_tunnel{}` (this obviously needs a counter/limit)
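A hedged sketch of what a user's vcl_tunnel{} might then do: accept a successful 101 Switching Protocols and start passing bytes, otherwise answer with a synthetic error (the synth(502) spelling assumes the same argument form as in vcl_recv{}):

```vcl
sub vcl_tunnel {
    if (beresp.status == 101) {
        # Backend accepted the upgrade: send beresp, then pass bytes
        return (pipe);
    }
    # Backend refused or failed: deliver a synthetic response instead
    return (synth(502));
}
```

Putting the pieces together, the builtin code with the MARKER_A part lifted in could then read: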
sub vcl_raw {
    if (req.http.host) {
        set req.http.host = req.http.host.lower();
    }
    if (req.method == "PRI") {
        /* (see: RFC7540) */
        return (synth(405));
    }
    if (!req.http.host && req.proto ~ "^(?i)HTTP/1.1") {
        /* In HTTP/1.1, Host is required. */
        return (synth(400));
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE" &&
        req.method != "PATCH") {
        /* Non-RFC2616 or CONNECT which is weird. */
        set req.http.connection = "close";
        return (tunnel); // use default backend
    }
    std.collect(req.http.X-Forwarded-For);
    if (req.http.X-Forwarded-For) {
        set req.http.X-Forwarded-For += "," + client.ip;
    } else {
        set req.http.X-Forwarded-For = client.ip;
    }
    std.collect(req.http.Cache-Control);
    return (recv);
}

sub vcl_tunnel {
    return (pipe);
}

sub vcl_recv {
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}
It seems trivial to let both vcl_raw{} and vcl_tunnel{} set up independent VDP filters for both directions; VDP seems a more natural choice than VFP here.
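Purely as an illustration of what that could look like; none of the variable names below exist, they are stand-ins for whatever spelling such a feature would eventually get:

```vcl
sub vcl_tunnel {
    # Hypothetical per-direction delivery-processor chains
    set tunnel.filters_to_client = "rot13";
    set tunnel.filters_to_backend = "";
    return (pipe);
}
```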
A logical way to introduce vcl_raw{} could be to retire vcl_pipe{}, as this will alert pretty much all the users who might have to do something to their VCL as a result of these changes.
I have not thought this one through, but as far as I can tell it could be shoehorned into vcl_raw{}, though it might not be pretty.
There is an argument for having a vcl_$(transport){}, and H2 SETTINGS may be a good argument for it.