Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multicore support #410

Open
wants to merge 67 commits into
base: master
Choose a base branch
from
Open

Multicore support #410

wants to merge 67 commits into from

Conversation

Rewbert
Copy link
Collaborator

@Rewbert Rewbert commented Jun 18, 2024

Support for running tests in parallel, fixes #180, and addresses #352.

Packages like Tasty already have support for running multiple properties in parallel (calling quickCheck more than once in parallel), whereas this work distributes all the tests concerning a single property over the available cores. This is more helpful when you as a developer are working on something, as opposed to when tests are being executed in CI.

I have used things only found in base to not increase the already large list of dependencies for QC. Primarily Control.Concurrent and Control.Exception

I wrote up some more text regarding this work here: https://www.krook.dev/posts/quickercheck/quickercheck.html
The paper presenting the work was published at IFL 23', and can be found here: https://www.krook.dev/papers/quickercheck.pdf Please give it a read to get a feeling for how the implementation works. There are no details like code, but it explains in broad terms how the implementation works.

I expect it will be a nightmare to merge this (and make sure related packages like Tasty and HSpec don't break), but it will be worth it in the end. The change is quite large with some 1500 LOC added and 500 LOC removed, and so far I am the only one that's really looked at it. I've had meetings with Nick and discussed some design alternatives, but not really looked at the code. It is good if more people than me look at it. Below follows a non-exhaustive list of things that needs to be addressed, aside from the obvious crap that makes this not mergeable at the moment

  • Features.hs is commented out at the moment, and needs to be fixed

  • The shrinker uses the UserInterrupt exception at the moment but should use a QC internal exception instead

  • Graceful combinator also uses UserInterrupt, but should use the same internal exception the shrinker uses (they must be the same)

  • Should we support both offset and stride for size computations? I am not a fan of how the size is computed, but what I've done is retrofit this solution over the existing implementation.

  • I moved some things around. I can't remember everything, but I think I e.g. moved State to another module to avoid some cyclic problems

  • I added a CTRL-C handler, but I know @nick8325 doesn't like this as users might have their own handlers installed. This should go (easy to do)

  • Currently, the same flag is used to indicate the number of testers as well as the number of shrinkers (the -N flag populates this field). Experimentation showed that fewer workers are better when shrinking. On my 4-core 8-thread machine, even if I could utilise 8 threads for testing, I reached an optimal when shrinking if I used 3-4 workers.

  • Currently, the right to run a test is claimed before the test actually starts running. An effect of this is that if you run 100 tests and 99 of them finish very fast, whereas the 100th one is an outlier and takes a while, it will seem as if the tests are stuck at 99/100. Another possible design is that the right to run a test is claimed when you report the results, meaning that you have already run the test. An effect of this is that we won't hang like described above, but also that we will explore a different space of tests. If a slow-running test is an indication that a bug is present, we are less likely to find it.

  • When I made my fork there was very little action in the repository. Since then, @MaximilianAlgehed has managed to squash a lot of old issues! I tried to reapply some of the changes that were applied to the testing loop, but there were some that I've not reapplied yet. TODO Look up which (I have this written on a note at work) and add as points here

  • I need to go over all the documentation and update it to reflect the current state of the implementation. I documented it some while back but can't quite remember if I changed relevant stuff since then.

While this MR is quite unpolished, with input from the other maintainers I can put in some more work and get this in a proper state.

@Rewbert Rewbert added this to the 2.16 milestone Jun 18, 2024
@Rewbert Rewbert self-assigned this Jun 18, 2024
Comment on lines +1 to +24
Roberts fork: this fork contains experimental support for running tests in parallel, and shrinking in parallel.
Only the internal evaluation of a property is changed, so the API of QC remains unchanged. There is a module
`Test.QuickCheck.Features` which I've completely commented out for now, so that will obviously not work.

In order to try this out yourself, you must follow three steps:

1: You need to make sure cabal knows where my fork of QC exists. You do this by either cloning this repository onto your local machine and pointing your `cabal.project` to it.
You do this e.g. by adding the line `packages: *.cabal <path-to-qc>/QuickCheck.cabal`. You can also optionally point your cabal to this remote repository. You do this by
editing your `cabal.project` to say
```
source-repository-package
type: git
location: https://github.com/Rewbert/quickcheck.git
-- optionally also add this to point to a particular commit, otherwise you will always get the freshest master commit
-- tag: <commit hash of the version you want>
```

2: You need to add some flags when you compile your code. Specifically, `-threaded -feager-blackholing -rtsopts`.

3: Finally, all that is left is to change the call to `quickCheck` with a call to `quickCheckPar`. If you don't want parallel shrinking, you should call `quickCheckParWith (stdArgs { parallelShrinking = False}) property`.

4: You also need to pass in the runtime option that actually creates more HECs. You need to either instead of `cabal run executable` do `cabal run executable -- +RTS -N` or `-Nx` where x is a number between 1 and the number of cores you have. You can also ddd another compilation option `-with-rtsopts=-N` or `-with-rtsopts=-Nx`

As a sanity check of whether you are using my fork or not, if you run `quickCheckPar` with just 1 HEC available, the word `donkey` will be printed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably not something we want in the final readme ;)

Comment on lines +19 to +28
-- prop_noNewFeatures :: Testable prop => Set String -> prop -> Property
-- prop_noNewFeatures feats prop =
-- mapResult f prop
-- where
-- f res =
-- case ok res of
-- Just True
-- | not (features (P.labels res) (Set.fromList (P.classes res)) `Set.isSubsetOf` feats) ->
-- res{ok = Just False, P.reason = "New feature found"}
-- _ -> res
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this needs to be commented out?

Comment on lines +286 to +292
, maybeDiscardedRatio :: Maybe Int
-- ^ maximum number of discarded tests per succesful test
, maybeMaxShrinks :: Maybe Int
-- ^ maximum number of shrinks
, maybeMaxTestSize :: Maybe Int
-- ^ maximum test size
, labels :: [String]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly concerned this is showing up in the diff - these are on master. What's going on there?

@MaximilianAlgehed
Copy link
Collaborator

Ok, before I do a proper review of this I would really like the conflicts to be resolved so that I can get a more clean view of what's going on.

The big thing we will have to think about here is how we make sure we maintain compatibility with tools like hspec and tasty. For that I would really like @Bodigrim and @sol to weigh in on how these tools interact (if at all?) with the internals of QuickCheck. Specifically, I want to know how they interact with the Result and State types and what interfaces we'd need to add to make sure we hide that interaction.

There is now way to avoid some breaking changes with this PR as far as I can tell. That's not a problem as far as I'm concerned so long as we also clean up the interface a bit and maybe move some things to Test.QuickCheck.Internal (still exposing it, but making it clear that it's not to be relied on stability-wise). That's for a different PR after this has been merged, however.

@sol
Copy link
Contributor

sol commented Jul 1, 2024

This is the contact surface of Hspec with QuikCheck:

https://github.com/hspec/hspec/blob/main/hspec-core/src/Test/Hspec/Core/QuickCheck/Util.hs

https://github.com/hspec/hspec/blob/main/hspec-core/src/Test/Hspec/Core/QuickCheck.hs#L66-L105

@Bodigrim
Copy link
Contributor

Bodigrim commented Jul 1, 2024

QuickCheck-specific parts of tasty are all in one file: https://github.com/UnkindPartition/tasty/blob/master/quickcheck/Test/Tasty/QuickCheck.hs

Both tasty and, I imagine, hspec parallelize test execution by running individual tests in separate threads. It's likely to be unhelpful if underlying QuickCheck properties try to spawn multiple threads as well. Would it be possible to keep default singlethreaded?

@Rewbert
Copy link
Collaborator Author

Rewbert commented Jul 1, 2024

QuickCheck-specific parts of tasty are all in one file: https://github.com/UnkindPartition/tasty/blob/master/quickcheck/Test/Tasty/QuickCheck.hs

Both tasty and, I imagine, hspec parallelize test execution by running individual tests in separate threads. It's likely to be unhelpful if underlying QuickCheck properties try to spawn multiple threads as well. Would it be possible to keep default singlethreaded?

Yes, the default behavior in this PR is singlethreaded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run tests on multiple cores
4 participants