
Conversation

josephdviviano (Collaborator)

  • I've read the .github/CONTRIBUTING.md file
  • My code follows the typing guidelines
  • I've added appropriate tests
  • I've run pre-commit hooks locally

Description

  • Added 3 new hypergrid tasks that should be more challenging. Note that the specifics are very much up for debate. I tried to identify environments that are easy to divide and conquer vs. those that require compositional knowledge (and therefore some amount of knowledge sharing among agents in a multi-agent setting).
  • Added mode verification logic (to ensure that your particular configuration actually contains modes to find).
  • Added lots of tests around these new rewards.
  • Added visualizations of the reward landscape for these various rewards.

@josephdviviano self-assigned this Oct 3, 2025
@josephdviviano added the enhancement label Oct 3, 2025
@younik (Collaborator) left a comment

I am not able to review 1,000+ math-dense LOC for hypergrid.py :(
If you want a careful review, consider splitting this.


self._n_modes_via_ids_estimate = float(torch.unique(ids).numel())
self._mode_stats_kind = "approx"
except Exception:
warnings.warn("+ Warning: Failed to compute mode_stats (skipping).")
Collaborator

Better to use logger.exception here, to print the exception as well.

Also, it would be better to avoid catching Exception in general. Why can this fail?

Collaborator

This would catch the ValueError in the "exact" branch as well. Is this what we want? Should we catch at all?
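
For example (a sketch only; the logger setup, the exception types, and the _compute_mode_ids helper are assumptions for illustration, not the PR's actual code):

```python
import logging

logger = logging.getLogger(__name__)

try:
    # Hypothetical helper standing in for the mode-id computation above.
    ids = self._compute_mode_ids()
    self._n_modes_via_ids_estimate = float(torch.unique(ids).numel())
    self._mode_stats_kind = "approx"
except (RuntimeError, MemoryError):
    # logger.exception records the message *and* the traceback of the active
    # exception, and the narrow exception tuple avoids swallowing e.g. the
    # ValueError raised by the "exact" branch.
    logger.exception("Failed to compute approximate mode_stats; skipping.")
```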

Comment on lines +564 to +570
# Cheap exact threshold (up to ~200k states)
if self.n_states <= 200_000:
axes = [
torch.arange(self.height, dtype=torch.long) for _ in range(self.ndim)
]
grid = torch.cartesian_prod(*axes)
rewards = self.reward_fn(grid)
Collaborator

How did you come up with this number? Doing the cartesian product seems memory-intensive.

Collaborator Author

This number might need to be lowered; it was arbitrary.
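
For a rough sense of the cost: the full grid is height**ndim rows of ndim int64 values, so 200,000 states at ndim = 4 is already ~6 MB before the reward tensor. One alternative is to evaluate the grid in chunks instead of materializing the whole cartesian product; a sketch (function and constant names are made up for illustration):

```python
import torch

CHUNK_SIZE = 65_536  # assumed per-chunk budget


def iter_grid_chunks(height: int, ndim: int, chunk_size: int = CHUNK_SIZE):
    """Yield the height**ndim grid as (chunk, ndim) tensors, never all at once."""
    n_states = height**ndim
    for start in range(0, n_states, chunk_size):
        idx = torch.arange(start, min(start + chunk_size, n_states), dtype=torch.long)
        # Mixed-radix decode: digit d of flat index i in base `height`.
        digits = [(idx // height**d) % height for d in range(ndim)]
        yield torch.stack(digits, dim=1)


# e.g. rewards = torch.cat([reward_fn(chunk) for chunk in iter_grid_chunks(height, ndim)])
```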

Comment on lines +572 to +574
except Exception:
# Fall back to heuristic paths below
pass
Collaborator

Maybe add a logger. In general, I don't think it is a good idea to hide this much from the user: sometimes we compute the exact mode existence and sometimes we use a heuristic.

Collaborator Author

yes, agreed
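
Something like this, perhaps (every name below is a hypothetical stand-in, just to show the fallback being surfaced instead of a silent pass):

```python
import logging

logger = logging.getLogger(__name__)


def count_modes_exact() -> int:      # hypothetical stand-in for the exact branch
    raise MemoryError("grid too large")


def count_modes_heuristic() -> int:  # hypothetical stand-in for the heuristic paths below
    return 0


try:
    n_modes = count_modes_exact()
except (MemoryError, RuntimeError):
    # Surface the fallback instead of a silent `pass`, so the user knows the
    # reported mode statistics are heuristic rather than exact.
    logger.warning("Exact mode verification failed; falling back to the heuristic check.")
    n_modes = count_modes_heuristic()
```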

Comment on lines +849 to +874
for col in range(m):
# Find pivot
piv = None
for r in range(row, k):
if A[r, col]:
piv = r
break
if piv is None:
continue
# Swap
if piv != row:
A[[row, piv]] = A[[piv, row]]
c[[row, piv]] = c[[piv, row]]
# Eliminate below
for r in range(row + 1, k):
if A[r, col]:
A[r, :] ^= A[row, :]
c[r] ^= c[row]
row += 1
if row == k:
break
# Check for inconsistency: 0 = 1 rows
for r in range(k):
if not A[r, :].any() and c[r]:
return False
return True
Collaborator

I didn't check the details, tbh, but it seems quite inefficient and not easily readable. Can we rely on SciPy for this?

https://stackoverflow.com/questions/15638650/is-there-a-standard-solution-for-gauss-elimination-in-python

Collaborator Author

I'll look into it
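
For what it's worth, SciPy's linear-algebra routines work over the reals rather than GF(2), so they don't directly replace this; a vectorized NumPy rank-style check is one option (a standalone sketch, independent of the PR's code; the third-party galois package is another route):

```python
import numpy as np


def gf2_consistent(A: np.ndarray, c: np.ndarray) -> bool:
    """Return True if A x = c has a solution over GF(2) (A: (k, m) 0/1, c: (k,) 0/1)."""
    aug = np.concatenate([A.astype(np.uint8) % 2, (c.astype(np.uint8) % 2)[:, None]], axis=1)
    k, n = aug.shape
    row = 0
    for col in range(n - 1):  # never pivot on the augmented column
        pivots = np.nonzero(aug[row:, col])[0]
        if pivots.size == 0:
            continue
        piv = row + int(pivots[0])
        aug[[row, piv]] = aug[[piv, row]]  # swap the pivot row into place
        # XOR the pivot row into every other row with a 1 in this column (vectorized).
        targets = np.nonzero(aug[:, col])[0]
        targets = targets[targets != row]
        aug[targets] ^= aug[row]
        row += 1
        if row == k:
            break
    # Inconsistent iff some row reduces to 0 = 1.
    return not np.any(~aug[:, :-1].any(axis=1) & (aug[:, -1] == 1))
```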

"""
with torch.no_grad():
device = torch.device("cpu")
B = min(2048, max(128, 8 * self.ndim))
Collaborator

What are these numbers? Maybe use constants to improve clarity.
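
For example (constant names are assumptions; the values are the ones currently hard-coded above):

```python
MIN_MC_BATCH = 128       # floor so low-dimensional grids still get a useful sample
MAX_MC_BATCH = 2048      # ceiling to bound memory use
MC_SAMPLES_PER_DIM = 8   # scale the batch size with dimensionality

B = min(MAX_MC_BATCH, max(MIN_MC_BATCH, MC_SAMPLES_PER_DIM * self.ndim))
```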

Comment on lines +470 to +488
try:
all_states = self.all_states
if all_states is not None:
mask = self.mode_mask(all_states)
ids = self.mode_ids(all_states)
ids = ids[mask]
ids = ids[ids >= 0]
return int(torch.unique(ids).numel())
except Exception:
pass
if self._mode_stats_kind == "exact" and self._n_modes_via_ids_exact is not None:
return int(self._n_modes_via_ids_exact)
if (
self._mode_stats_kind == "approx"
and self._n_modes_via_ids_estimate is not None
):
return int(self._n_modes_via_ids_estimate)

return 2**self.ndim
Collaborator

do we need to recompute this every time?

Collaborator Author

No, you're right, it should be stored.
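
e.g. with functools.cached_property, which computes the value on first access and caches it on the instance; a minimal sketch with made-up names (not the PR's actual class):

```python
from functools import cached_property

import torch


class ModeCounterSketch:
    """Illustration only; the attribute names are assumptions."""

    def __init__(self, ids: torch.Tensor, mask: torch.Tensor):
        self._ids = ids
        self._mask = mask

    @cached_property
    def n_modes(self) -> int:
        # Runs once; subsequent accesses return the cached value.
        ids = self._ids[self._mask]
        ids = ids[ids >= 0]
        return int(torch.unique(ids).numel())
```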

Comment on lines +478 to +479
except Exception:
pass
Collaborator

Similar to the other comment, this is not nice for debuggability.

@josephdviviano (Collaborator Author)

Hi @younik - I hear you, this is a big PR. The "splits" would have to be along tasks, though, so the resulting PRs would still be large.

I appreciate your comments on the code. I think it would also be worth looking at the tasks themselves (the ones plotted in the notebook) to see whether they make sense; I'm not convinced by all of them.

I would be open to removing a task or two. I think the one that works best for its intended purpose is the coprime reward.

@hyeok9855 (Collaborator)

In the commit above, I fixed the comments for the Deceptive Reward and also fixed a pyright error.

@hyeok9855 (Collaborator)

> I would be open to removing a task or two. I think the one that works best for its intended purpose is the coprime reward.

I do think the Template Minkowski and Bitwise/XOR rewards are not very interesting to benchmark, especially if you care about mode coverage. Multiplicative/Coprime seems challenging, but you may want to increase the reward for modes further from the origin.
