Serialize Counterexamples (a la Hypothesis)? #248
Regarding the how-to: Hmm... I think that in order to have less flaky storage, we could do something like this:

The problem with this would be the tracking involved. It'd require some compiler-level semantic knowledge, right? Or at least access to a data structure that represents the objects modeled in code and their relationships.

Regarding the point of property testing: I agree with you. But, just like the Generators in SwiftCheck allow, couldn't we have the generators create new, unknown instances as well as the counterexamples? I mean giving the generator a fixed set (the counterexamples) plus an unbounded set (the rest of the domain). And if the counterexamples become so numerous that they actually clog the generators, we could give the counterexample set a weight, telling the main generator to take only as many counterexamples as it needs, not all of them.
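For illustration, a minimal sketch of that weighting idea using SwiftCheck's Gen.frequency and Gen.fromElements(of:); the stored counterexamples and the 1:9 weighting are made-up numbers, not a proposal for real defaults:

```swift
import SwiftCheck

// Hypothetical: counterexamples previously recorded for an Int property.
// In a real implementation these would be loaded from an on-disk archive.
let storedCounterexamples: [Int] = [0, -1, Int.max]

// Mix a fixed set (the stored counterexamples) with an unbounded set
// (the rest of the domain), weighted so replay doesn't clog generation:
// roughly 1 replayed value for every 9 freshly generated ones.
let mixedGen = Gen<Int>.frequency([
    (1, Gen<Int>.fromElements(of: storedCounterexamples)),
    (9, Int.arbitrary),
])

property("addition commutes") <- forAll(mixedGen, mixedGen) { a, b in
    return a &+ b == b &+ a
}
```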
Would work, but it would be terribly inefficient. Suppose I gather a large database of a hundred failing seeds, each of which fails around test 99 of 100. Replaying them means re-running everything before each failure too: 100 seeds at ~99 runs apiece is a little less than 10,000 runs of the property testing block just to reproduce them all.
Now we're serializing the user's API as well? We don't have the capability to even introspect which function we're called from (hence the DSL). This is 110% the domain of a compiler plugin given the current state of Swift.
A generator of counterexamples still needs to be created somehow. At the end of the day, something has to do the grunt work of coming up with them and storing them somewhere. Or, better yet, coming up with a consistent procedure for generating failures.
Which kinda defeats the point of serializing them all in the first place. Hypothesis also keeps track of "interesting" test cases in their database, and they specifically do this so they don't clog up the pipes. It just feels like a waste of effort on their part.
Hmm. And then... what do you think? Should this exist? How useful is it in practice?
For context, I watched this talk from one of our users who brought this up as a desirable feature. It sounds like a good and convenient thing to have if the right infrastructure is in place.
Hmm. Yes. I think I see what you mean. What about the following?
And consider this to address its shortcomings:
Since addressing this automatically would be difficult, there should be a manual editor for the Archive where we show the number of times a regression test has passed and group the entries by property or something. Those are my thoughts atm. I think it's a nice idea, but it feels like it should be separated from the property testing in order to be meaningful and to keep both clean. Also, I'm assuming here that there is a way to ensure consistency or to warn of issues. Maybe there is a protocol that we could make that would enforce some kind of consistency through time...? Idk. That part might as well be a compiler plugin 🤔
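For what it's worth, here is one way that consistency-through-time idea could look in Swift; the protocol and type names below are hypothetical, not part of any existing API:

```swift
// Hypothetical sketch: a counterexample promises a stable serialized form
// across releases by carrying an explicit schema version.
protocol StableCounterexample: Codable {
    // Bump this when the serialized layout changes; the Archive could then
    // warn about, migrate, or discard entries whose version no longer matches.
    static var schemaVersion: Int { get }
}

struct PointCounterexample: StableCounterexample {
    static let schemaVersion = 1
    let x: Int
    let y: Int
}
```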
Our sister framework Hypothesis has a feature they call "The Database", where they serialize counterexamples to disk and then run them before starting the actual testing loop.
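As a rough illustration of the mechanics (not a proposal for an actual API; the store type and replay flow below are assumptions), the idea is: load previously failing inputs, assert they still pass, then hand control to the normal random loop:

```swift
import Foundation
import SwiftCheck

// A minimal sketch, assuming counterexamples are Codable.
// CounterexampleStore and check are hypothetical names.
struct CounterexampleStore<A: Codable> {
    let url: URL

    func load() -> [A] {
        guard let data = try? Data(contentsOf: url) else { return [] }
        return (try? JSONDecoder().decode([A].self, from: data)) ?? []
    }

    func save(_ examples: [A]) {
        if let data = try? JSONEncoder().encode(examples) {
            try? data.write(to: url)
        }
    }
}

// Replay stored failures first, then fall through to the usual random loop.
func check<A: Codable & Arbitrary>(
    _ name: String,
    store: CounterexampleStore<A>,
    _ predicate: @escaping (A) -> Bool
) {
    for old in store.load() {
        assert(predicate(old), "\(name): stored counterexample regressed: \(old)")
    }
    property(name) <- forAll { (a: A) in predicate(a) }
}
```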
I have some gut reactions to this, so I've laid out some problems that need to be addressed first.
I'm most worried about this. Hypothesis goes to quite a lot of effort to make sure their example database is consistent and useful and even then they break it every so often between releases. On the one hand, it is incredibly useful - especially while practicing TDD - to write a property and have the framework just track your counterexamples, but there's got to be a less flaky way.
Python kind of has it easy - everything is a key-value store at the end of the day, so everything serializes for free: pretty much all of the Python middleware frameworks are capable of automatically deriving what would need to be Encodable and Decodable instances for data.

The expected user model when a test fails now becomes:
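By contrast, in Swift the conformance has to be declared on every type we want to store, even though the compiler can synthesize the implementation. A minimal round-trip, with ShrunkInput as a hypothetical stand-in for a property's input type:

```swift
import Foundation

// ShrunkInput stands in for whatever a property consumes. Codable must be
// declared; the compiler synthesizes the implementation, but only when
// every stored property is itself Codable.
struct ShrunkInput: Codable, Equatable {
    let seed: UInt64
    let value: String
}

let failing = ShrunkInput(seed: 42, value: "")
let data = try JSONEncoder().encode(failing)
let restored = try JSONDecoder().decode(ShrunkInput.self, from: data)
assert(restored == failing)
```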
This would seem to encourage growing test suites that exist solely to run through counterexamples, which is not the point of property testing.