Allow shrinking of generated values into wider strategies for their type #4713
Conversation
some late-night takes
Annotate spans that produced a primitive value with that value, and use those annotations in a new shrink pass that tries to widen a non-zero one_of selector down to zero by replacing the selected span's choices with a single forced choice holding the annotated value. To make this work for ``just`` and ``sampled_from``, add a ``maybe_add_choice_node_for`` method on ``ConjectureData`` that records a forced marker choice when the generated value is one of the primitive choice types, so these strategies' spans are no longer empty and the widening pass has something to replace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
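The marker-choice rule described above can be pictured with a small sketch. All names here are illustrative, not the actual Hypothesis internals; the real ``maybe_add_choice_node_for`` is a method on ``ConjectureData`` that records a forced choice rather than returning one.

```python
# Toy model of the rule: strategies like just() and sampled_from() get a
# non-empty span only when their generated value is one of the primitive
# choice types, so the widening pass has something to replace.

PRIMITIVE_CHOICE_TYPES = (bool, int, float, str, bytes)

def marker_choice_for(value):
    """Return the forced choice to record for ``value``, or None if the
    value is not a primitive choice type."""
    if isinstance(value, PRIMITIVE_CHOICE_TYPES):
        return value
    return None

assert marker_choice_for("hello") == "hello"  # just("hello") gets a span
assert marker_choice_for([1, 2]) is None      # non-primitive: no marker
```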
``Span.annotation`` is now ``Span.generated_primitive_value``, and the corresponding backing fields on ``Spans`` / ``SpanRecord`` follow suit. ``SpanRecord.annotate_current_span`` is renamed to ``record_value_for_span``, and the primitive-type check is now done inside that method so callers don't need to guard it themselves. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``add_choice_node_for`` now always records a choice - primitive values still produce a forced choice of the matching type, and non-primitive values produce a forced ``True`` boolean as a minimal marker. To avoid polluting one_of's selector span with those markers (which would break the widening pass's "immediately followed by" condition), ``OneOfStrategy.do_draw`` now draws an integer index directly instead of routing the alternative selection through ``SampledFromStrategy``. Update two invariant tests that asserted the old ``does not draw`` behaviour, and refresh the ``@reproduce_failure`` blob whose encoded choice sequence shape has changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
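The selector change can be sketched as follows (hypothetical names; the real ``OneOfStrategy.do_draw`` draws through ``ConjectureData``, not ``random``):

```python
import random

# Sketch of the change: one_of picks its alternative with a bare bounded
# integer, so no sampled_from-style marker choices land inside the
# selector's span and the "immediately followed by" condition holds.

def draw_one_of(rng, alternatives):
    index = rng.randrange(len(alternatives))  # plain integer index
    return index, alternatives[index]()

rng = random.Random(0)
index, value = draw_one_of(rng, [lambda: "from text()", lambda: "specific"])
assert 0 <= index < 2
assert value in ("from text()", "specific")
```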
df6dafc to 687ba30
…n slices

- Change the placeholder choice added by ``add_choice_node_for`` for non-primitive values from ``draw_boolean(forced=True)`` to ``draw_boolean(forced=False)``. False is the trivial boolean value, so the resulting span has the same minimal sort-key as an empty span would, which preserves the relative complexity ordering of strategies like ``none()`` and ``booleans()``.
- Refresh the ``@reproduce_failure`` blob to match the new forced value.
- Skip slices containing only forced choices in the explain-phase loop that decides when to add ``# or any other generated value`` to an argument. If every node in the slice is forced, there's nothing to vary and the comment would be misleading.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
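As a toy illustration of why ``forced=False`` preserves the ordering (this ``shrink_key`` is a simplification, not Hypothesis's real sort key):

```python
# False is the trivial boolean, so a span holding only a forced False
# marker is as small as a non-empty span can be, and none()'s marker
# never sorts above the minimal draw from booleans().

def shrink_key(choices):
    # shorter sequences first, then element-wise size
    return (len(choices), [int(c) for c in choices])

none_marker = [False]   # marker recorded for e.g. none()
bool_min = [False]      # minimal draw from booleans()
bool_max = [True]

assert shrink_key(none_marker) <= shrink_key(bool_min) < shrink_key(bool_max)
```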
- Add direct unit tests for ``_choice_node_for_value`` (all primitive branches + the non-primitive assertion), which the quality tests were exercising via integration but the conjecture-only coverage run was not.
- Add a direct test for ``LazyStrategy.do_draw``; the normal ``data.draw`` path unwraps lazy strategies, so this method is only reachable via a direct call.
- Use a ``# pragma: no cover`` for the forced-only-slice skip in the explain pass - it's only reachable through a full explain-phase run, not through conjecture-level unit tests.
- Use double backticks for ``my_primitive_strategy | some_more_complicated_strategy`` in ``RELEASE.rst`` so Sphinx treats it as a literal rather than a Python object reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9cb877a to 25b74d5
@Liam-DeVoe I investigated the failures from always inserting a choice node and it turns out that they were 100% that it makes

@Zac-HD I'm... open in principle to an invert-based approach to this, but I think it turns this from a minor nice-to-have chunk of work into a major feature that I'm not going to get around to implementing any time soon. It requires some very intrusive reworking of the way that the shrinker interacts with data generation, in a way that I'm not quite prepared to think through the implementation details of right now. Agreed it would be nice to make this work in generality, but I actually do think the primitive values are the highest-priority version of this (actually I think
There seems to be a flaky test made worse by this PR. I've not investigated enough yet. Here's what Claude says about it:

The flaky test was tests/pandas/test_argument_validation.py::test_raise_invalid_argument[...rows=just(['x'])...] (and an adjacent variant with rows=just({'a': 'x'})).

What it checks: that constructing a data_frames() strategy with a row shape that doesn't match the declared columns raises InvalidArgument. Concretely: a pdst.data_frames(...) call with two columns but each row of length 1 → should raise InvalidArgument.

What actually happened on those runs: instead of InvalidArgument, pytest saw a RuntimeWarning: overflow encountered in scalar add. With -bb -X dev on CI, that warning gets escalated.

Why my changes made it more likely to flake: previously just([...]) / just({...}) never touched the choice sequence. Now add_choice_node_for inserts a forced boolean marker.

It passed locally 10× in a row with -W error and passed on rerun every time on CI, so the test is racy, not broken. Not a correctness issue with our shrinker changes: the
Zac-HD left a comment
I'm not convinced we should do this now, rather than waiting for a more-general ._invert()-based solution.
Further notes:

- All the tests are currently for ``text() | sampled_from(...)`` or equivalents. We could handle this with a strategy-level change, which taught string strategies about these additional constants using our existing constants machinery.
- For more complex tests, e.g. using ``text() | specific_json().map(json.dumps)``, I worry that our primitive shrinking logic is not actually that sophisticated - we've historically relied on the structure from the strategy to make this easier, rather than shrinkray-style intelligence.
| """Quality tests for ``basic_strategy | specific_strategy`` combinations. | ||
|
|
||
| Each predicate below is chosen so that random draws from the basic (open) | ||
| strategy satisfy it with probability well under 1%. That ensures the | ||
| initial failing example essentially always comes from the specific | ||
| (complicated) branch, so the test genuinely exercises the shrinker's | ||
| ability to widen out of the specific branch into the open one. |
I worry this might be wrong, given that we find and upweight magic constants?
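An illustrative predicate of the kind the quoted docstring describes (hypothetical, not taken from the test file): uniformly random short strings essentially never satisfy it, so the initial failure comes from the specific branch.

```python
import random
import string

def predicate(s):
    # satisfied by the specific branch's constant, essentially never by
    # uniformly random text of a different length
    return s == "a very specific value"

rng = random.Random(0)
hits = sum(
    predicate("".join(rng.choice(string.ascii_letters) for _ in range(10)))
    for _ in range(10_000)
)
assert hits == 0  # far below the 1% threshold mentioned above
```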
| """Try to navigate away from a specific ``one_of`` alternative into | ||
| an earlier one by using the span's recorded generated primitive value. | ||
|
|
||
| If we have an integer choice with ``min_value == 0`` currently set to | ||
| a non-zero value, and it is immediately followed by a span whose | ||
| corresponding strategy produced a primitive value, we replace the | ||
| integer with ``0`` and the span's choices with a single choice | ||
| holding that primitive value. The engine then re-runs the test | ||
| against the earlier alternative with that value. | ||
|
|
||
| This is useful for ``basic_strategy | specific_strategy``, where | ||
| the specific branch produced a primitive that the basic branch could | ||
| also have produced: we slip the primitive across into the basic | ||
| branch so that normal shrinking can take it the rest of the way. |
This feels uncomfortably brittle; it's obviously useful if the strategy is exactly like this, but there are some very sharp edges in how it works. (for example: you're just out of luck if you wrote branches in any other order)
Plausibly still worth doing, but it weighs against the PR for me.
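Under the assumptions spelled out in that docstring, the rewrite itself is a small splice of the choice sequence. A toy sketch with plain lists (not the real ``ConjectureData``/``Span`` machinery, and with hypothetical names):

```python
def widen_one_of(choices, selector_index, span_end, primitive):
    """Replace a non-zero one_of selector with 0 and collapse the span
    that immediately follows it into the single recorded primitive."""
    if choices[selector_index] == 0:
        return None  # already the first alternative; nothing to try
    return (
        choices[:selector_index]
        + [0]           # force the earliest alternative
        + [primitive]   # one choice holding the recorded value
        + choices[span_end:]
    )

# one_of selected alternative 1 (a just()); its span is the forced
# marker at index 1 and recorded the primitive "abc"
before = [1, True]
after = widen_one_of(before, selector_index=0, span_end=2, primitive="abc")
assert after == [0, "abc"]  # re-run against the earlier alternative
```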
This is perhaps heresy, but a case I've been finding myself caring about recently is that sometimes you want shrinking to not go via generation. You care about the entire input space, but you want generators that can explore parts of the space that you'll never hit by chance. e.g. you might want to not crash on arbitrary bytes, but the interesting space corresponds to some grammar.
This PR makes it so that the shrinker can, under some limited cases, do this. If you e.g. have ``st.text() | some_strategy_that_generates_text``, the shrinker can now reliably take the text produced by the second strategy and shrink it as if it were arbitrary text.

We achieve this in two ways:

- ``just`` and ``sampled_from`` will insert artificial choice nodes for primitive values, so they are not artificially simpler than their generated values and this sort of shrinking is possible.

This is a bit of a gimmick and I'm not 100% sure that it's worth it, but I think it's a really nice feature for the small subset of cases where it matters. The shrink pass is also very narrowly designed so it's pretty cheap-to-free in cases where it's not useful.