Allow shrinking of generated values into wider strategies for their type#4713

Open
DRMacIver wants to merge 7 commits into master from DRMacIver/value-based-shrinking

Conversation

@DRMacIver
Member

This is perhaps heresy, but a case I've found myself caring about recently is that sometimes you want shrinking not to go via generation. You care about the entire input space, but you want generators that can explore parts of the space you'll never hit by chance. For example, you might want your code not to crash on arbitrary bytes, even though the interesting inputs correspond to some grammar.

This PR makes it so that the shrinker can, in some limited cases, do this. If you have e.g. st.text() | some_strategy_that_generates_text, the shrinker can now reliably take the text produced by the second strategy and shrink it as if it were arbitrary text.

We achieve this in two ways:

  1. We annotate spans with their generated primitive value.
  2. just and sampled_from insert artificial choice nodes for primitive values, so that their spans are not artificially simpler than the values they generate, which makes this sort of shrinking possible.

This is a bit of a gimmick and I'm not 100% sure it's worth it, but I think it's a really nice feature for the small subset of cases where it matters. The shrink pass is also very narrowly designed, so it's pretty cheap-to-free in cases where it's not useful.
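As a toy illustration of the idea in plain Python: the names and data representation below are hypothetical simplifications, not Hypothesis's actual internals, which operate on annotated spans in the choice sequence inside the shrinker.

```python
# Hypothetical sketch of the "widen out of a one_of branch" idea.
# A one_of draw is modeled here as (selector index, generated value).

PRIMITIVES = (bool, int, float, str, bytes)

def widen_one_of(selector, annotated_value):
    """If a non-zero one_of selector produced a primitive value, retarget
    that value at alternative 0 (the widest strategy), so that ordinary
    shrinking can take it the rest of the way."""
    if selector > 0 and isinstance(annotated_value, PRIMITIVES):
        return 0, annotated_value  # replay the value via the open branch
    return selector, annotated_value

# Text from the second branch of ``st.text() | grammar_text()`` gets
# handed to the first branch and shrunk as arbitrary text:
print(widen_one_of(1, "SELECT * FROM t"))  # → (0, 'SELECT * FROM t')
```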

@Zac-HD
Member

Zac-HD commented Apr 24, 2026

some late-night takes

  • the UX is good, so we should do whatever crimes it takes to offer it, but we should also see if there are pointers to a nice abstraction
  • restricting it to the primitive types is ugly and unprincipled
  • we should instead go with a minimal SearchStrategy._invert() method, use that, and give OneOfStrategy special integration with the choice sequence and Just/SampledFromStrategy.
    • basics should be about as easy, with a clear pathway to wildly more power later (search, and maybe solvers). Recording the object ids associated with choice-sequence-spans might also be helpful here, and maybe shared with call-pprinting?
    • SearchStrategy._invert() would also let us pull off complete bullshit features like "paste an externally-reported failure into @example(...), and we'll shrink it for you". I want this.

DRMacIver and others added 5 commits April 24, 2026 09:35
Annotate spans that produced a primitive value with that value, and use
those annotations in a new shrink pass that tries to widen a non-zero
one_of selector down to zero by replacing the selected span's choices
with a single forced choice holding the annotated value.

To make this work for ``just`` and ``sampled_from``, add a
``maybe_add_choice_node_for`` method on ``ConjectureData`` that records
a forced marker choice when the generated value is one of the primitive
choice types, so these strategies' spans are no longer empty and the
widening pass has something to replace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``Span.annotation`` is now ``Span.generated_primitive_value``, and the
corresponding backing fields on ``Spans`` / ``SpanRecord`` follow suit.
``SpanRecord.annotate_current_span`` is renamed to
``record_value_for_span``, and the primitive-type check is now done
inside that method so callers don't need to guard it themselves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``add_choice_node_for`` now always records a choice - primitive values
still produce a forced choice of the matching type, and non-primitive
values produce a forced ``True`` boolean as a minimal marker.

To avoid polluting one_of's selector span with those markers (which
would break the widening pass's "immediately followed by" condition),
``OneOfStrategy.do_draw`` now draws an integer index directly instead
of routing the alternative selection through ``SampledFromStrategy``.

Update two invariant tests that asserted the old ``does not draw``
behaviour, and refresh the ``@reproduce_failure`` blob whose encoded
choice sequence shape has changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@DRMacIver force-pushed the DRMacIver/value-based-shrinking branch from df6dafc to 687ba30 on April 24, 2026 09:35
DRMacIver and others added 2 commits April 24, 2026 10:08
…n slices

- Change the placeholder choice added by ``add_choice_node_for`` for
  non-primitive values from ``draw_boolean(forced=True)`` to
  ``draw_boolean(forced=False)``. False is the trivial boolean value, so
  the resulting span has the same minimal sort-key as an empty span
  would, which preserves the relative complexity ordering of strategies
  like ``none()`` and ``booleans()``.
- Refresh the ``@reproduce_failure`` blob to match the new forced value.
- Skip slices containing only forced choices in the explain-phase loop
  that decides when to add ``# or any other generated value`` to an
  argument. If every node in the slice is forced, there's nothing to
  vary and the comment would be misleading.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
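The forced-only-slice skip described in the last bullet of the commit message above amounts to a filter like the following sketch; ``ChoiceNode`` and ``slices_worth_varying`` are stand-in names, not the real explain-pass code.

```python
from dataclasses import dataclass

# Stand-in for Hypothesis's choice node; only the ``was_forced`` flag
# matters for this sketch.
@dataclass
class ChoiceNode:
    value: object
    was_forced: bool = False

def slices_worth_varying(slices):
    """Drop slices in which every choice was forced: they can never
    produce a different value, so adding the
    '# or any other generated value' comment would be misleading."""
    for nodes in slices:
        if all(node.was_forced for node in nodes):
            continue  # fully forced, nothing to vary
        yield nodes

forced_only = [ChoiceNode(True, was_forced=True)]
free = [ChoiceNode(0), ChoiceNode(1)]
assert list(slices_worth_varying([forced_only, free])) == [free]
```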
- Add direct unit tests for ``_choice_node_for_value`` (all primitive
  branches + the non-primitive assertion), which the quality tests were
  exercising via integration but the conjecture-only coverage run was
  not.
- Add a direct test for ``LazyStrategy.do_draw``; the normal
  ``data.draw`` path unwraps lazy strategies, so this method is only
  reachable via a direct call.
- Use a ``# pragma: no cover`` for the forced-only-slice skip in the
  explain pass - it's only reachable through a full explain-phase run,
  not through conjecture-level unit tests.
- Use double backticks for ``my_primitive_strategy |
  some_more_complicated_strategy`` in ``RELEASE.rst`` so Sphinx treats
  it as a literal rather than a Python object reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@DRMacIver force-pushed the DRMacIver/value-based-shrinking branch from 9cb877a to 25b74d5 on April 24, 2026 10:54
@DRMacIver
Member Author

@Liam-DeVoe I investigated the failures from always inserting a choice node, and it turns out they were entirely down to it making one_of worse because of its internal use of sampled_from, so I just changed it to not do that and everything is happy.

@Zac-HD I'm... open in principle to an invert based approach to this, but I think it turns this from a minor nice-to-have chunk of work to a major feature that I'm not going to get around to implementing any time soon. It requires some very intrusive reworking of the way that the shrinker interacts with data generation in a way that I'm not quite prepared to think through the implementation details of right now.

Agreed it would be nice to make this work in generality, but I actually do think the primitive values are the highest-priority version of this. (Actually, I think bytes and text are the highest-priority versions, and the other primitive values are just along for the ride because we might as well.)

@DRMacIver
Member Author

DRMacIver commented Apr 24, 2026

There seems to be a flaky test made worse by this PR. I've not investigated enough yet. Here's what Claude says about it:

Details: the flaky test was tests/pandas/test_argument_validation.py::test_raise_invalid_argument[...rows=just(['x'])...] (and an adjacent variant with rows=just({'a': 'x'})).

What it checks: that constructing a data_frames() strategy with a row shape that doesn't match the declared columns raises InvalidArgument. Concretely:

pdst.data_frames(
    columns=pdst.columns(["a", "b"], dtype=str, elements=st.text()),  # 2 columns
    rows=st.just(["x"]),  # 1-element row
    index=pdst.indexes(dtype=int, min_size=1),
)

Two columns but each row has length 1 → should raise InvalidArgument.

What actually happened on those runs: instead of InvalidArgument, pytest saw a RuntimeWarning: overflow encountered in scalar add. With -bb -X dev on CI that gets escalated to a hard error. The overflow happens somewhere inside pandas/numpy scalar arithmetic when pdst.indexes(dtype=int, min_size=1) draws a large-magnitude int and pandas does an internal addition on it — before Hypothesis's own validation path rejects the bad shape.

Why my changes made it more likely to flake: previously just([...]) / just({...}) never touched the choice sequence. Now add_choice_node_for inserts a forced boolean marker. Every choice that indexes(...) draws comes out of a slightly different sort-key position in the sequence, so the distribution of the int values it produces shifted, and with Python 3.14 + pandas 3.0.2 that new distribution hits the overflow corner more often.

It passed locally 10× in a row with -W error and passed on rerun every time on CI — so the test is racy, not broken. Not a correctness issue with our shrinker changes: the InvalidArgument path is still the outcome the majority of the time; we're just widening the window where pandas trips over the specific int. A proper fix (not part of this PR) would be to bound pdst.indexes in that parametrization, or to suppress the overflow warning for argument-validation tests.

Member

@Zac-HD left a comment


I'm not convinced we should do this now, rather than waiting for a more-general ._invert()-based solution.

Further notes:

  • All the tests are currently for text() | sampled_from(...) or equivalents. We could handle this with a strategy-level change that teaches string strategies about these additional constants, using our existing constants machinery.
  • For more complex tests, e.g. using text() | specific_json().map(json.dumps), I worry that our primitive shrinking logic is not actually that sophisticated - we've historically relied on the structure from the strategy to make this easier, rather than shrinkray-style intelligence.

Comment on lines +11 to +17
"""Quality tests for ``basic_strategy | specific_strategy`` combinations.

Each predicate below is chosen so that random draws from the basic (open)
strategy satisfy it with probability well under 1%. That ensures the
initial failing example essentially always comes from the specific
(complicated) branch, so the test genuinely exercises the shrinker's
ability to widen out of the specific branch into the open one.
Member


I worry this might be wrong, given that we find and upweight magic constants?

Comment on lines +1574 to +1587
"""Try to navigate away from a specific ``one_of`` alternative into
an earlier one by using the span's recorded generated primitive value.

If we have an integer choice with ``min_value == 0`` currently set to
a non-zero value, and it is immediately followed by a span whose
corresponding strategy produced a primitive value, we replace the
integer with ``0`` and the span's choices with a single choice
holding that primitive value. The engine then re-runs the test
against the earlier alternative with that value.

This is useful for ``basic_strategy | specific_strategy``, where
the specific branch produced a primitive that the basic branch could
also have produced: we slip the primitive across into the basic
branch so that normal shrinking can take it the rest of the way.
Member


This feels uncomfortably brittle; it's obviously useful if the strategy is exactly like this, but there are some very sharp edges in how it works. (for example: you're just out of luck if you wrote branches in any other order)

Plausibly still worth doing, but it weighs against the PR for me.
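For concreteness, the transformation that docstring describes can be sketched in plain Python; ``Node``, ``try_widen``, and the flat ``annotations`` mapping are simplified stand-ins for Hypothesis's real choice-node and span machinery.

```python
from dataclasses import dataclass

@dataclass
class Node:
    kind: str                 # "integer", "string", ...
    value: object
    min_value: object = None  # only meaningful for integer nodes

def try_widen(nodes, annotations):
    """Find a one_of selector (integer node with min_value == 0 and a
    non-zero value) immediately followed by a span that recorded a
    primitive value; rewrite to selector 0 plus that value as a single
    forced choice. ``annotations`` maps span-start indices to recorded
    values, and each span is assumed to be exactly one node, both of
    which are simplifications of the real span bookkeeping."""
    for i, node in enumerate(nodes):
        if (node.kind == "integer" and node.min_value == 0
                and node.value != 0 and i + 1 in annotations):
            primitive = annotations[i + 1]
            replacement = [Node("integer", 0, 0), Node("string", primitive)]
            return nodes[:i] + replacement + nodes[i + 2:]
    return None  # no applicable selector/span pair

nodes = [Node("integer", 1, 0), Node("string", "grammar-text")]
assert try_widen(nodes, {1: "grammar-text"})[0].value == 0
```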
