
Very poor performance in TechEmpower Multiple Queries #651

Closed
codygman opened this issue Dec 16, 2016 · 27 comments

Comments

@codygman

Hi all, I hope you feel this is an appropriate place. I feel like the multiple-queries benchmark that TechEmpower runs is one of the more useful ones, and I was surprised to see dismal results for the hasql/servant combination:

[Screenshot: TechEmpower multiple-queries benchmark results]

I feel that servant/hasql should be on par with Go (4,195 requests per second), but it reaches only about 10% of that, at 402 requests per second. I'd like to be able to recommend servant over Go for web APIs because of its improved safety, but I can't while performance is so much worse.

In the past I tried using Servant for some of my freelance clients and had to use a different tech stack when the result was too slow.

**I do acknowledge the problem seems to be concurrency around the database bindings**; however, I think database-binding problems are a servant problem if broader adoption is a major goal. It could be something simpler, such as that benchmark not being compiled with -threaded, though I think that's unlikely.

I'll be looking into this more in the coming weeks.

@codygman
Author

If I'm reading this right, the multiple queries benchmark creates a connection pool of 500?

https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Haskell/servant/src/ServantBench.hs#L130

I know that Go, for instance, resizes the connection pool based on the number of queries. When I debugged this issue in the past, drivers that resize the connection pool based on query volume and concurrency seemed to do best.
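The sizing idea can be sketched with a toy, base-only pool. This is not the benchmark's actual code (that uses hasql-pool); `Pool`, `createPool`, and `withResource` here are invented names for illustration. The point is that the pool's capacity (`n`) should track peak request concurrency rather than a fixed constant:

```haskell
import Control.Concurrent.MVar (MVar, newMVar, modifyMVar, modifyMVar_)
import Control.Exception (bracket)

-- Toy fixed-size resource pool: an MVar holding the free resources.
newtype Pool a = Pool (MVar [a])

-- Create a pool of n resources; n should match expected peak concurrency.
createPool :: Int -> IO a -> IO (Pool a)
createPool n mk = do
  rs <- mapM (const mk) [1 .. n]
  Pool <$> newMVar rs

-- Check a resource out, run an action, and return it even on exceptions.
withResource :: Pool a -> (a -> IO b) -> IO b
withResource (Pool mv) act = bracket takeRes putRes act
  where
    takeRes = modifyMVar mv $ \rs -> case rs of
      (r : rest) -> pure (rest, r)
      []         -> ioError (userError "pool exhausted (real pools block or grow here)")
    putRes r = modifyMVar_ mv (pure . (r :))
```

A real pool would block (or grow, as Go's driver does) instead of failing when empty, and would recycle idle connections.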

@phadej
Contributor

phadej commented Dec 17, 2016

I guess we must play the benchmark game, i.e. use an unbounded resource pool. In real life it's not a good idea, but to win the game we have to be DB-bound?

@codygman
Author

Well, with only 402 requests per second, I'm convinced some tweaking, like you would do in the real world, is needed anyway.

@jkarni
Member

jkarni commented Dec 17, 2016

Turning off content-type negotiation would make it significantly better and fairer (servant is probably one of the only frameworks doing it), but it's sort of annoying to do and really only useful for doing well in the benchmark.

@codygman
Author

Wondering if this could be related to nikita-volkov/hasql#47 as well.

@flip111

flip111 commented Dec 12, 2017

servant (and the other Haskell frameworks) is slow in all the TechEmpower benchmarks and comes in below many PHP, JS, Python, and Ruby solutions.

@qrilka

qrilka commented Apr 26, 2018

The latest numbers are even more confusing for that test: the top contender's result is 10,340, for yesod it's 2,165, and for servant only 71.

@alpmestan
Contributor

I think we all agree that it would be nice to do well in those benchmarks, but it seems relatively low priority compared to recent/ongoing/future developments, which are about improving the life of existing and new users. Those benchmarks are pretty much about "technical marketing". If anyone's willing to put in the time, servant devs would definitely be around to help in the investigation. But actual servant (or more generally wai/warp) applications do perform well, which is why I'm calling the benchmarks "marketing".

@saurabhnanda

We can sponsor a small bounty if anyone wants to take a shot at this. http://www.vacationlabs.com/haskell-bounty-program/

@alpmestan
Contributor

@saurabhnanda That would be very generous of Vacation Labs, thanks! We'd definitely do our best to help anyone taking up that offer with the time we've got available.

@flip111

flip111 commented Apr 27, 2018

@alpmestan when you say "users", do you mean the developers who use servant, or the end users? I don't know about the current developers who use servant, but I don't think you can assume that new developers don't care about performance. When you say "servant performs well", that's not clear without some measurements. That is actually what I like about the TechEmpower benchmarks: they give you six different metrics, and, even better, you can compare against so many other frameworks. To me it's actual performance numbers that matter (not necessarily the ones from TechEmpower); this is not marketing, it directly affects how beefy your server needs to be to serve X amount of users. If you agree with me that performance is important, then we can argue about the TechEmpower benchmarks specifically. I don't believe these benchmarks are so many percentage points off that they wouldn't reflect real-world usage; their goal is to measure as realistically as possible. Of course your app also depends on business logic and other things, but those aren't part of the servant code base either, and we can't optimize code that we don't know about. It would be nice to acknowledge the impact the lower-level code (the framework) has on the overall application (business logic included).

@alpmestan
Contributor

@flip111 We have investigated those benchmarks significantly in the past (@jkarni in particular). They never really revealed any problem with servant. For instance, in the DB benchmarks the performance is determined mostly by the performance of the database library. In some of the other benchmarks, what costs us is that we always do content-type negotiation; the benchmarks do so little work that this matters. And of course many other implementations completely skip it, which violates the HTTP spec. Also, a lot of the performance is determined by warp; servant-server's layer is quite thin, and a non-hello-world application is never going to spend much time in that code. We've profiled and benchmarked servant apps before, and the only bottleneck we found got fixed.

But I'm not saying those benchmarks are completely useless. I'd be happy if someone worked on this. I'm just saying that we have put a decent amount of time into this before and, I believe, got everything we could out of that effort at the time. Moreover, a bunch of development and documentation tasks are already waiting for us, and we know for sure that handling those will address problems people actually have. So this is only a matter of prioritising, which is why I'm saying that I'd be happy to help if someone wants to pick this up.

@erewok
Contributor

erewok commented Apr 27, 2018

@flip111 I'm not a servant contributor, but I use and appreciate the project, and while I agree with some of what you are saying (performance is important to measure and consider), I believe there are some things about TechEmpower that make it a lower priority for the servant team, and I think that's a perfectly fine stance to take.

I certainly can't speak for the team but perhaps some context will be useful here:

  • The TechEmpower developers have remarked in the past that the most useful test they run is the "Fortunes" test, because it comes closest to measuring the true performance of a framework. The other tests also measure things like your database-connection library.
  • While it's interesting as a horse race, having watched TechEmpower over the years, the results have shifted so dramatically that any particular year for a framework is less interesting to me than the overall trends across languages. Given that, an independent focus on performance is a fine pursuit, but I wouldn't recommend spending a lot of time optimizing for TechEmpower specifically.
  • Some teams are driven to push performance with applications that are not representative of real-world use (this is my reservation any time TechEmpower comes up in a discussion at work: would we actually build something resembling the system whose stated performance we're comparing against?). Admittedly, for their part, the TechEmpower team tries to flag projects as "real-world" or not; they do a number of things well, in fact, and I always appreciate this distinction myself.
  • As @jkarni has pointed out above and elsewhere, servant does automatic content-type negotiation, which various high-performing frameworks don't do. This is a philosophical difference that arises from strictly interpreting and designing around the HTTP spec.

Now, with all that said, I think it would be a fine pursuit for a group of people interested in supporting and marketing servant to try to build something to game the benchmarks, or to optimize the existing effort. I'd be interested in contributing to that project (if I had the time, of course...), but I have no qualms whatsoever with the servant team continuing to make progress on the project itself instead of focusing on TechEmpower, and I appreciate their efforts in this regard.

Lastly, perhaps it would be constructive to add some sort of performance-testing suite to servant-server, completely independent of TechEmpower? This may allay the concerns of people who are surprised by its TechEmpower results; in the future it could be pointed to by those looking for numbers, and it could also be used to spot inefficiencies as they arise.

For my own uses, servant has always been performant enough for my needs. (I typically run load testing on my projects using wrk or ApacheBench until I'm confident I'm in the ballpark of where I need to be.)

@flip111

flip111 commented Apr 27, 2018

@alpmestan you say a lot of the performance depends on warp (and thus wai?). I took a look at wai performance some time ago, but I was not able to draw a conclusion from the information I collected. It might be interesting to someone else though; it can be found here: yesodweb/wai#663

@alpmestan
Contributor

@flip111 Mostly warp, because it is the library that implements the actual HTTP server, i.e. the one that turns wai Applications into webservers. And this in turn relies on the performance of the runtime system's I/O manager, described in this paper, with some benchmarks.

Now, it would be useful to have benchmarks for servant-server's routing mechanism, because that's pretty much the only place in servant-server where applications spend any measurable time. We don't change that code often, but it would be nice to have a benchmarking suite, as that would make it easier for people to measure the performance of that code and to compare different runs when trying potential optimisations.

Warp seems to have a benchmark for its HTTP request parser, but not a more comprehensive benchmarking suite that would, say, use wrk to measure how well the server copes with increasing numbers of requests. That in turn could lead to more (Haskell-specific) criterion/gauge benchmarks of specific parts of wai/warp, to track the performance of those critical code paths and perhaps investigate improvements to them.

In summary:

  • the only thing worth writing benchmarks for in servant-server, I think, is the router; it would be great if someone could look into this, we can help and would be glad to, and might pick this up ourselves eventually if nobody beats us to it -- it's just not at the top of our TODO at the moment and seems like a relatively fun and accessible task for a new contributor;
  • warp doesn't seem to have "general" benchmarks, this is something possibly worth looking into; I may simply have missed them though;
  • regarding the TechEmpower benchmarks, @erewok summed things up much better than me.
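For anyone picking this up, a typical wrk run against a locally running servant app might look like the following. The URL, port, and route are placeholders; the flags (threads, open connections, duration, latency statistics) are standard wrk options:

```shell
# Hypothetical load test: 4 threads, 128 concurrent connections,
# 30 seconds, with a latency distribution report at the end.
wrk -t4 -c128 -d30s --latency http://localhost:8080/json
```

Varying `-c` while watching throughput and tail latency is the usual way to see how a server copes with increasing concurrency.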

naushadh added a commit to naushadh/FrameworkBenchmarks that referenced this issue Mar 10, 2019
- Pool size now matches the max concurrency of requests used by the benchmark. Many other frameworks appear to do similar matching.
- Idea inspired by: haskell-servant/servant#651 (comment)
- This finally recovers all of the performance regression caused by `114b1b8`. Additionally, we now blow past the performance of master at `6250eb8`.
NateBrady23 pushed a commit to TechEmpower/FrameworkBenchmarks that referenced this issue Mar 11, 2019
* Bump to latest stable compiler, stackage resolver and libs.

- Removed upper bounds for libs since the stackage resolver already takes care of pinning versions for us.
- Removed extra-deps from stack config since resolver now contains `hasql-pool`.
- Addressed `hasql` incompatibilities arising from upgrades to latest version.
- Addressed a runtime issue caused by `servant` upgrade changes, where invalid parameters are now an error instead of no value. Added a datatype that coerces invalid input to 1, as the benchmark rules expect.
- Error responses now describe the cause for a 500 to help debug issues.

* Add `--pedantic` flag to catch even more warnings.

* Re-use a single session across statements to regain some lost performance from `114b1b8`.

- Switch to the `unit` decoder for the `updateSingle` statement, as it now fails when used in a session with other statements. We really don't need/use the result, so we can safely move to returning `()`.

* Bump pool size to workaround `libpq` locking.

- Pool size now matches the max concurrency of requests used by the benchmark. Many other frameworks appear to do similar matching.
- Idea inspired by: haskell-servant/servant#651 (comment)
- This finally recovers all of the performance regression caused by `114b1b8`. Additionally, we now blow past the performance of master at `6250eb8`.
@naushadh

@jkarni: Based on your comments:
#651 (comment)
https://www.reddit.com/r/haskell/comments/4zomu9/what_would_make_this_yesod_code_15_faster_than/d6xhjy6/

I'd like to try to create a version of the servant benchmark without content-type negotiation. From my dig into the source code, I believe the escape hatch for disabling the negotiation is to use Raw?

instance HasServer Raw context where

@phadej
Contributor

phadej commented Mar 25, 2019

Using Raw is not using servant.

@alpmestan
Contributor

To elaborate a little more: if your servant app solely consists of a Raw endpoint, you'll basically not be doing any routing and it should be very, very close (performance wise) to just serving the equivalent WAI-only Application.
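A minimal sketch of such a Raw-only app is below. It assumes the servant-server, wai, warp, and http-types packages; treat it as an illustration rather than the benchmark's actual code. With the whole API being a single Raw endpoint, servant does no routing and no content-type negotiation, so it should perform essentially like serving the WAI Application directly:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp (run)
import Servant

-- The entire API is one Raw endpoint: no servant routing, no negotiation.
type API = Raw

-- A Server Raw is just a Tagged WAI Application.
app :: Application
app = serve (Proxy :: Proxy API) $ Tagged $ \_req respond ->
  respond $ responseLBS status200 [("Content-Type", "text/plain")] "Hello, World!"

main :: IO ()
main = run 8080 app
```

Comparing this against the plain warp equivalent (`run 8080` on the same handler, without `serve`) would isolate whatever overhead servant's entry point adds.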

@flip111

flip111 commented Mar 25, 2019

If anyone is interested in speeding up WAI, please comment on this yesodweb/wai#663

@saurabhnanda

I'd like to try to create a version of the servant benchmark without content-type negotiation. From my dig into the source code, I believe the escape hatch for disabling the negotiation is to use Raw?

To give some context to what @naushadh is doing: we want to benchmark the overhead of content-type negotiation. I believe using Raw drops down to the WAI level, right? Which means you don't even get routing, right?

Is it possible to keep routing, but disable content-type negotiation for the purpose of a synthetic benchmark?

Also, what exactly does "content-type negotiation" mean? Is it what this combinator does...

Post '[HTML] (Html ())

If yes, then is it possible to have a build where the first matching route is used, disregarding the content type specified in the route?

@saurabhnanda

And, just to check whether the "content-type" line of thought is still relevant: @naushadh, is servant better than yesod on your version of the benchmarks now? If not, then this is still relevant.

@phadej
Contributor

phadej commented Mar 25, 2019

Disregarding Content-Type is against the HTTP spirit (if not the spec). Browsers pass an Accept: ... */* header; that's why in the end you get "something". Machine clients (e.g. servant-client) are not lenient: they say specifically, e.g., Accept: application/json, and they don't want to get HTML back from a Post '[Html] (Html ()) endpoint... the server would signal an HTTP 4xx (I don't remember which code).
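To make the negotiation step concrete, here is a toy, base-only sketch of the core matching logic: split the Accept header, strip parameters, and pick the first offered media type the client accepts. All names here are invented for illustration; real servers (servant via the http-media package) also honour q-values, `type/*` wildcards, and precedence rules:

```haskell
import Data.Char (isSpace)
import Data.List (find)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (a, [])       -> [a]
  (a, _ : rest) -> a : splitOn c rest

-- Drop surrounding whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Pick the first offered media type the Accept header allows.
-- Parameters like ";q=0.9" are stripped rather than interpreted.
negotiate :: String -> [String] -> Maybe String
negotiate accept offered
  | "*/*" `elem` wanted = case offered of
      (o : _) -> Just o
      []      -> Nothing
  | otherwise = find (`elem` wanted) offered
  where
    wanted = map (takeWhile (/= ';') . trim) (splitOn ',' accept)
```

When `negotiate` returns Nothing, a spec-abiding server responds with a 4xx rather than sending a representation the client didn't ask for.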

@saurabhnanda

@phadej I understand where you're coming from. I'm not suggesting we actually remove content-type negotiation from servant. This is just an academic exercise to understand how much content-type negotiation costs in terms of performance.

@phadej
Contributor

phadej commented Mar 25, 2019

I would be surprised if it costs measurably more than the other things we do in routing (e.g. parsing capture fields); but I don't know.

That can be benchmarked in isolation.

@jkarni
Member

jkarni commented Mar 25, 2019 via email

@alpmestan
Contributor

Yes: if you want the usual APIs but no content-type negotiation, you'd define custom HasServer instances that bypass the MimeRender/MimeUnrender machinery that our existing instances use.

@naushadh

Most results are from 2019-03-18; yesod's are from 2019-03-12, as Citrine (the TFB test server) unexpectedly crashed before all frameworks had completed.

| Framework | JSON Serialization | Single Query | Multiple Queries | Fortunes | Updates | Plaintext |
| --- | --- | --- | --- | --- | --- | --- |
| servant | 297,444 | 115,204 | 14,958 | 91,456 | 2,584 | 340,137 |
| servant-mysql-haskell | 301,422 | 155,806 | 10,260 | 97,185 | 2,572 | 344,312 |
| yesod | 337,215 | 64,931 | 4,144 | 44,544 | 0 | 328,363 |
| snap | 370,986 | 89,014 | 1,013 | - | - | 618,250 |
| spock | 40,216 | 28,405 | 945 | 21,796 | 501 | 8,779 |

@saurabhnanda: servant is about as fast as yesod in plaintext; yesod's slight slowness could be attributed to the much older stack resolver it uses (lts-6.3) compared to servant's (lts-13). servant also recently gained a 35% boost since round 17.

Now snap is an interesting contender.
