
Very poor performance in TechEmpower Multiple Queries #651

Closed
codygman opened this issue Dec 16, 2016 · 27 comments

Comments

@codygman

Hi all, I hope you feel this is an appropriate place. I feel like the multiple-queries benchmark that TechEmpower runs is one of the more useful ones, and I was surprised to see dismal results for the hasql/servant combination:

[Screenshot: TechEmpower multiple-queries benchmark results]

I feel that servant/hasql should be on par with Go (4,195 requests per second), but it reaches only about 10% of that, at 402 requests per second. I'd like to be able to recommend servant over Go for web APIs because of its improved safety, but I can't while performance is so much worse.

In the past I tried using Servant for some of my freelance clients and had to use a different tech stack when the result was too slow.

**I do acknowledge the problem seems to be concurrency around the database bindings**; however, I think database-binding problems are a servant problem if broader adoption is a major goal. It could be something simpler, such as that benchmark not being compiled with -threaded, though I think that's unlikely.

I'll be looking into this more in the coming weeks.

@codygman
Author

If I'm reading this right, the multiple queries benchmark creates a connection pool of 500?

https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Haskell/servant/src/ServantBench.hs#L130

I know that Go, for instance, resizes the connection pool based on the number of queries. When I debugged this issue in the past, drivers that resize the connection pool based on query volume and concurrency seemed to do best.
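The sizing idea can be sketched with a toy, base-only pool. This is not the benchmark's actual code (that uses hasql-pool); `Pool`, `createPool`, and `withResource` here are invented names for illustration. The point is that the pool's capacity (`n`) should track peak request concurrency rather than a fixed constant:

```haskell
import Control.Concurrent.MVar (MVar, newMVar, modifyMVar, modifyMVar_)
import Control.Exception (bracket)

-- Toy fixed-size resource pool: an MVar holding the free resources.
newtype Pool a = Pool (MVar [a])

-- Create a pool of n resources; n should match expected peak concurrency.
createPool :: Int -> IO a -> IO (Pool a)
createPool n mk = do
  rs <- mapM (const mk) [1 .. n]
  Pool <$> newMVar rs

-- Check a resource out, run an action, and return it even on exceptions.
withResource :: Pool a -> (a -> IO b) -> IO b
withResource (Pool mv) act = bracket takeRes putRes act
  where
    takeRes = modifyMVar mv $ \rs -> case rs of
      (r : rest) -> pure (rest, r)
      []         -> ioError (userError "pool exhausted (real pools block or grow here)")
    putRes r = modifyMVar_ mv (pure . (r :))
```

A real pool would block (or grow, as Go's driver does) instead of failing when empty, and would recycle idle connections.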

@phadej
Contributor

phadej commented Dec 17, 2016

I guess we must play the benchmark game, i.e. use an unbounded resource pool. In real life it's not a good idea, but to win the game we have to be DB-bound?

@codygman
Author

Well, with only 402 requests per second, I'm convinced some tweaking, like you would do in the real world, is needed anyway.

@jkarni
Member

jkarni commented Dec 17, 2016

Turning off content-type negotiation would make it significantly better and fairer (servant is probably one of the only frameworks doing it), but it's sort of annoying to do and really only useful for doing well in the benchmark.

@codygman
Author

Wondering if this could be related to nikita-volkov/hasql#47 as well.

@flip111

flip111 commented Dec 12, 2017

servant (and the other Haskell frameworks) is slow in all the TechEmpower benchmarks and comes in below many PHP, JS, Python, and Ruby solutions.

@qrilka

qrilka commented Apr 26, 2018

The latest numbers are even more confusing for that test: the top contender's result is 10,340, for yesod it's 2,165, and for servant only 71.

@alpmestan
Contributor

I think we all agree that it would be nice to do well in those benchmarks, but it seems relatively low priority compared to recent/ongoing/future developments, which are about improving the life of existing and new users. Those benchmarks are pretty much about "technical marketing". If anyone's willing to put in the time, servant devs would definitely be around to help in the investigation. But actual servant (or more generally wai/warp) applications do perform well, which is why I'm calling the benchmarks "marketing".

@saurabhnanda

We can sponsor a small bounty if anyone wants to take a shot at this. http://www.vacationlabs.com/haskell-bounty-program/

@alpmestan
Contributor

@saurabhnanda That would be very generous of Vacation Labs, thanks! We'd definitely do our best to help anyone taking up that offer with the time we've got available.

@flip111

flip111 commented Apr 27, 2018

@alpmestan when you say "users", do you mean the developers who use servant, or the end users? I don't know about the current developers who use servant, but I don't think you can assume that new developers don't care about performance. When you say "servant performs well", that's not clear without some measurements. That is actually what I like about the TechEmpower benchmarks: they give you six different metrics, and, even better, you can compare against so many other frameworks. To me it's actual performance numbers that matter (not necessarily the ones from TechEmpower); this is not marketing, it directly affects how beefy your server needs to be to serve X amount of users. If you agree with me that performance is important, then we can argue about the TechEmpower benchmarks specifically. I don't believe these benchmarks are so many percentage points off that they wouldn't reflect real-world usage; their goal is to measure as realistically as possible. Of course your app also depends on business logic and other things, but those aren't part of the servant code base either, and we can't optimize code that we don't know about. It would be nice to acknowledge the impact the lower-level code (the framework) has on the overall application (business logic included).

@alpmestan
Contributor

@flip111 We have investigated those benchmarks significantly in the past (@jkarni in particular). They never really revealed any problem with servant. For instance, in the DB benchmarks the performance is determined mostly by the performance of the database library. In some of the other benchmarks, what costs us is that we always do content-type negotiation; the benchmarks do so little work that this matters. And of course many other implementations completely skip it, which violates the HTTP spec. Also, a lot of the performance is determined by warp; servant-server's layer is quite thin, and a non-hello-world application is never going to spend much time in that code. We've profiled and benchmarked servant apps before, and the only bottleneck we found got fixed.

But I'm not saying those benchmarks are completely useless. I'd be happy if someone worked on this. I'm just saying that we have put a decent amount of time into this before and, I believe, got everything we could out of that effort at the time. Moreover, a bunch of development and documentation tasks are already waiting for us, and we know for sure that handling those will address problems people actually have. So this is only a matter of prioritising, which is why I'm saying that I'd be happy to help if someone wants to pick this up.

@erewok
Contributor

erewok commented Apr 27, 2018

@flip111 I'm not a servant contributor, but I use and appreciate the project, and while I agree with some of what you are saying (performance is important to measure and consider), I believe there are some things about TechEmpower that make it a lower priority for the servant team, and I think that's a perfectly fine stance to take.

I certainly can't speak for the team but perhaps some context will be useful here:

  • The TechEmpower developers have remarked in the past that the most useful test they run is the "Fortunes" test, because it comes closest to measuring the true performance of a framework. The other tests also measure things like your database-connection library.
  • While it's interesting as a horse race, having watched TechEmpower over the years, the results have shifted so dramatically that any particular year for a framework is less interesting to me than the overall trends across languages. Given that, an independent focus on performance is a fine pursuit, but I wouldn't recommend spending a lot of time optimizing for TechEmpower specifically.
  • Some teams are driven to push performance with applications that are not representative of real-world use (this is my reservation any time TechEmpower comes up in a discussion at work: would we actually build something resembling the system whose stated performance we're comparing against?). Admittedly, for their part, the TechEmpower team tries to flag projects as "real-world" or not; they do a number of things well, in fact, and I always appreciate this distinction myself.
  • As @jkarni has pointed out above and elsewhere, servant does automatic content-type negotiation, which various high-performing frameworks don't do. This is a philosophical difference that arises from strictly interpreting and designing around the HTTP spec.

Now, with all that said, I think it would be a fine pursuit for a group of people interested in supporting and marketing servant to try to build something to game the benchmarks, or to optimize the existing effort. I'd be interested in contributing to that project (if I had the time, of course...), but I have no qualms whatsoever with the servant team continuing to make progress on the project itself instead of focusing on TechEmpower, and I appreciate their efforts in this regard.

Lastly, perhaps it would be constructive to add some sort of performance-testing suite to servant-server, completely independent of TechEmpower? This may allay the concerns of people who are surprised by its TechEmpower results; in the future it could be pointed to by those looking for numbers, and it could also be used to spot inefficiencies as they arise.

For my own uses, servant has always been performant enough for my needs. (I typically run load testing on my projects using wrk or ApacheBench until I'm confident I'm in the ballpark of where I need to be.)

@flip111

flip111 commented Apr 27, 2018

@alpmestan you say a lot of the performance depends on warp (and thus wai?). I took a look at wai performance some time ago, but I was not able to draw a conclusion from the information I collected. It might be interesting to someone else though; it can be found here: yesodweb/wai#663

@alpmestan
Contributor

@flip111 Mostly warp, because it is the library that implements the actual HTTP server, i.e. the one that turns wai Applications into webservers. And this in turn relies on the performance of the runtime system's I/O manager, described in this paper, with some benchmarks.

Now, it would be useful to have benchmarks for servant-server's routing mechanism, because that's pretty much the only place in servant-server where applications spend any measurable time. We don't change that code often, but it would be nice to have a benchmarking suite, as that would make it easier for people to measure the performance of that code and to compare different runs when trying potential optimisations.

Warp seems to have a benchmark for its HTTP request parser, but not a more comprehensive benchmarking suite that would, say, use wrk to measure how well the server copes with increasing numbers of requests. That in turn could lead to more (Haskell-specific) criterion/gauge benchmarks of specific parts of wai/warp, to track the performance of those critical code paths and perhaps investigate improvements to them.

In summary:

  • the only thing worth writing benchmarks for in servant-server, I think, is the router; it would be great if someone could look into this, we can help and would be glad to, and might pick this up ourselves eventually if nobody beats us to it -- it's just not at the top of our TODO at the moment and seems like a relatively fun and accessible task for a new contributor;
  • warp doesn't seem to have "general" benchmarks, this is something possibly worth looking into; I may simply have missed them though;
  • regarding the TechEmpower benchmarks, @erewok summed things up much better than me.
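For anyone picking this up, a typical wrk run against a locally running servant app might look like the following. The URL, port, and route are placeholders; the flags (threads, open connections, duration, latency statistics) are standard wrk options:

```shell
# Hypothetical load test: 4 threads, 128 concurrent connections,
# 30 seconds, with a latency distribution report at the end.
wrk -t4 -c128 -d30s --latency http://localhost:8080/json
```

Varying `-c` while watching throughput and tail latency is the usual way to see how a server copes with increasing concurrency.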

naushadh added a commit to naushadh/FrameworkBenchmarks that referenced this issue Mar 10, 2019
- Pool size now matches the max concurrency of requests used by the benchmark. Many other frameworks appear to do similar matching.
- Idea inspired by: haskell-servant/servant#651 (comment)
- This finally recovers all of the performance regression caused by `114b1b8`. Additionally, we now blow past the performance of master at `6250eb8`.
NateBrady23 pushed a commit to TechEmpower/FrameworkBenchmarks that referenced this issue Mar 11, 2019
* Bump to latest stable compiler, stackage resolver and libs.

- Removed upper bounds for libs since the stackage resolver already takes care of pinning versions for us.
- Removed extra-deps from stack config since resolver now contains `hasql-pool`.
- Addressed `hasql` incompatibilities arising from upgrades to latest version.
- Addressed a runtime issue caused by `servant` upgrade changes, where invalid parameters are now an error instead of no value. Added a datatype that coerces invalid input to 1, as the benchmark rules expect.
- Error responses now describe the cause for a 500 to help debug issues.

* Add `--pedantic` flag to catch even more warnings.

* Re-use a single session across statements to regain some lost performance from `114b1b8`.

- Switch to the `unit` decoder for the `updateSingle` statement, as it now fails when used in a session with other statements. We really don't need/use the result, so we can safely move to returning `()`.

* Bump pool size to workaround `libpq` locking.

- Pool size now matches the max concurrency of requests used by the benchmark. Many other frameworks appear to do similar matching.
- Idea inspired by: haskell-servant/servant#651 (comment)
- This finally recovers all of the performance regression caused by `114b1b8`. Additionally, we now blow past the performance of master at `6250eb8`.
@naushadh

@jkarni: Based on your comments:
#651 (comment)
https://www.reddit.com/r/haskell/comments/4zomu9/what_would_make_this_yesod_code_15_faster_than/d6xhjy6/

I'd like to try to create a version of the servant benchmark without content-type negotiation. From my dig into the source code, I believe the escape hatch for disabling the negotiation is to use Raw?

instance HasServer Raw context where

@phadej
Contributor

phadej commented Mar 25, 2019

Using Raw is not using servant.

@alpmestan
Contributor

To elaborate a little more: if your servant app solely consists of a Raw endpoint, you'll basically not be doing any routing and it should be very, very close (performance wise) to just serving the equivalent WAI-only Application.
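A minimal sketch of such a Raw-only app is below. It assumes the servant-server, wai, warp, and http-types packages; treat it as an illustration rather than the benchmark's actual code. With the whole API being a single Raw endpoint, servant does no routing and no content-type negotiation, so it should perform essentially like serving the WAI Application directly:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp (run)
import Servant

-- The entire API is one Raw endpoint: no servant routing, no negotiation.
type API = Raw

-- A Server Raw is just a Tagged WAI Application.
app :: Application
app = serve (Proxy :: Proxy API) $ Tagged $ \_req respond ->
  respond $ responseLBS status200 [("Content-Type", "text/plain")] "Hello, World!"

main :: IO ()
main = run 8080 app
```

Comparing this against the plain warp equivalent (`run 8080` on the same handler, without `serve`) would isolate whatever overhead servant's entry point adds.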

@flip111

flip111 commented Mar 25, 2019

If anyone is interested in speeding up WAI, please comment on this yesodweb/wai#663

@saurabhnanda

I'd like to try to create a version of the servant benchmark without content-type negotiation. From my dig into the source code, I believe the escape hatch for disabling the negotiation is to use Raw?

To give some context to what @naushadh is doing: we want to benchmark the overhead of content-type negotiation. I believe using Raw drops down to the WAI level, right? Which means you don't even get routing, right?

Is it possible to keep routing, but disable content-type negotiation for the purpose of a synthetic benchmark?

Also, what exactly does "content-type negotiation" mean? Is it what this combinator does...

Post '[HTML] (Html ())

If yes, then is it possible to have a build where the first matching route is used, disregarding the content type specified in the route?

@saurabhnanda

And, just to check whether the "content-type" line of thought is still relevant: @naushadh, is servant better than yesod on your version of the benchmarks now? If not, then this is still relevant.

@phadej
Contributor

phadej commented Mar 25, 2019

Disregarding Content-Type is against the HTTP spirit (if not the spec). Browsers pass an Accept: ... */* header; that's why in the end you get "something". Machine clients (e.g. servant-client) are not lenient: they say specifically, e.g., Accept: application/json, and they don't want to get HTML back from a Post '[Html] (Html ()) endpoint... the server would signal an HTTP 4xx (I don't remember which code).
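To make the negotiation step concrete, here is a toy, base-only sketch of the core matching logic: split the Accept header, strip parameters, and pick the first offered media type the client accepts. All names here are invented for illustration; real servers (servant via the http-media package) also honour q-values, `type/*` wildcards, and precedence rules:

```haskell
import Data.Char (isSpace)
import Data.List (find)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (a, [])       -> [a]
  (a, _ : rest) -> a : splitOn c rest

-- Drop surrounding whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Pick the first offered media type the Accept header allows.
-- Parameters like ";q=0.9" are stripped rather than interpreted.
negotiate :: String -> [String] -> Maybe String
negotiate accept offered
  | "*/*" `elem` wanted = case offered of
      (o : _) -> Just o
      []      -> Nothing
  | otherwise = find (`elem` wanted) offered
  where
    wanted = map (takeWhile (/= ';') . trim) (splitOn ',' accept)
```

When `negotiate` returns Nothing, a spec-abiding server responds with a 4xx rather than sending a representation the client didn't ask for.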

@saurabhnanda

@phadej I understand where you're coming from. I'm not suggesting we actually remove content-type negotiation from servant. This is just an academic exercise to understand how much content-type negotiation costs in terms of performance.

@phadej
Contributor

phadej commented Mar 25, 2019

I would be surprised if it costs measurably more than the other things we do in routing (e.g. parsing capture fields); but I don't know.

That can be benchmarked in isolation.

@jkarni
Member

jkarni commented Mar 25, 2019 via email

@alpmestan
Contributor

Yes: if you want the usual APIs but no content-type negotiation, you'd define custom HasServer instances that bypass the MimeRender/MimeUnrender machinery that our existing instances use.

@naushadh

Most results are from 2019-03-18; yesod's are from 2019-03-12, as Citrine (the TFB test server) unexpectedly crashed before all frameworks had completed.

| Framework | JSON Serialization | Single Query | Multiple Queries | Fortunes | Updates | Plaintext |
| --- | --- | --- | --- | --- | --- | --- |
| servant | 297,444 | 115,204 | 14,958 | 91,456 | 2,584 | 340,137 |
| servant-mysql-haskell | 301,422 | 155,806 | 10,260 | 97,185 | 2,572 | 344,312 |
| yesod | 337,215 | 64,931 | 4,144 | 44,544 | 0 | 328,363 |
| snap | 370,986 | 89,014 | 1,013 | - | - | 618,250 |
| spock | 40,216 | 28,405 | 945 | 21,796 | 501 | 8,779 |

@saurabhnanda: servant is about as fast as yesod in plaintext; yesod's slight slowness could be attributed to the much older stack resolver it uses (lts-6.3) compared to servant's (lts-13). servant also recently gained a 35% boost since round 17.

Now snap is an interesting contender.
