
Add the capability to do adjoint transforms #633


Open
wants to merge 47 commits into master

Conversation

mreineck
Collaborator

This is a first outline of how I propose to add adjoint transforms; it is mainly meant as a basis for discussions and measurements. The computation of the actual adjoint is still completely untested, but the standard functionality should still be OK, as far as my tests show.

@ahbarnett, @DiamonDinoia please let me know your thoughts on this one!

@mreineck
Collaborator Author

Just to be clear: at this point there is no interface support for the new functionality. I just want to show what kind of impact the new feature has on the existing implementation.

@mreineck
Collaborator Author

I've added the C and Python interfaces, as well as basic Python unit tests.

@mreineck
Collaborator Author

Test failures seem to be "near misses" in adjoint type 3 transforms. No idea why that direction should be less accurate, or why it only happens in some of the tests.
This needs more investigation, but the approach seems to work well in principle.

@mreineck
Collaborator Author

mreineck commented Feb 19, 2025

Implements #571 and #566 (on CPU).

@mreineck
Collaborator Author

I think this is ready for "technical" review. If there is agreement that the change is desirable, I can try to

  • provide more language-specific interfaces
  • document the new function

Collaborator Author

@mreineck left a comment

Added a few explanatory comments

@@ -20,7 +20,7 @@
strt = time.time()

#plan
plan = fp.Plan(1,(N,),dtype='single')
Collaborator Author

This change is unrelated to the PR, but without it, CI simply fails. I don't know why this hasn't caused issues so far.

Member

It seems to be because we removed the conversion for real dtypes in https://github.com/flatironinstitute/finufft/pull/606/files
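For context, a minimal sketch of what the updated example line would presumably look like (the exact dtype string is an assumption; the point is simply that, after PR #606, the plan expects a complex dtype rather than a real one):

```python
# Hypothetical fix, assuming the Plan now requires a complex dtype
# (the exact replacement used in this PR is not shown here):
plan = fp.Plan(1, (N,), dtype='complex64')   # instead of dtype='single'
```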

@@ -34,7 +34,7 @@

# instantiate the plan (note n_trans must be set here), also setting tolerance:
t0 = time.time()
plan = finufft.Plan(nufft_type, (N1, N2), eps=1e-4, n_trans=K, dtype='float32')
Collaborator Author

This change is unrelated to the PR, but without it, CI simply fails. I don't know why this hasn't caused issues so far.

Member

Thanks for catching this. It seems all the *f.py examples (in python/finufft/examples) should change according to PR #606; otherwise the check

is_single = is_single_dtype(dtype)

will fail.

@@ -86,6 +120,19 @@ def test_finufft3_plan(dtype, dim, n_source_pts, n_target_pts, output_arg):

utils.verify_type3(source_pts, source_coefs, target_pts, target_coefs, 1e-6)

# test adjoint type 3
plan = Plan(3, dim, dtype=dtype, isign=-1, eps=1e-5)
Collaborator Author

I'm increasing eps from 1e-6 to 1e-5 here, because I get occasional failures with single precision otherwise. Given that 1e-6 is uncomfortably close to machine epsilon, I'm not too worried about this change.
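For reference, a quick check of that headroom with plain NumPy (nothing FINUFFT-specific):

```python
import numpy as np

eps32 = np.finfo(np.float32).eps   # ~1.19e-07 for single precision
print(eps32, 1e-6 / eps32)         # a tolerance of 1e-6 is only ~8.4x machine epsilon
```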

Collaborator

I agree. Type 3 errors are usually 2-3x bigger than type 1 or 2 errors at the same tolerance.

@@ -154,7 +154,7 @@ def verify_type1(pts, coefs, shape, sig_est, tol):

type1_rel_err = np.linalg.norm(fk_target - fk_est) / np.linalg.norm(fk_target)

assert type1_rel_err < 25 * tol
Collaborator Author

Switching from assert to np.testing.assert_allclose here, because the latter will provide more information in case of failure, which speeds up debugging a lot.
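For illustration, one possible form of the new check, reusing the type1_rel_err and tol names from the snippet above (a sketch, not necessarily the exact line in the PR):

```python
import numpy as np

# Compare the scalar relative error against 0 with an absolute tolerance;
# on failure, assert_allclose reports the actual value instead of a bare AssertionError.
np.testing.assert_allclose(type1_rel_err, 0, atol=25 * tol)
```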

Collaborator

good idea

}
}
}
#endif
#else
p->fftPlan->execute(); // if thisBatchSize<batchSize it wastes some flops
Collaborator Author

This needs discussion: how do we want to deal with this situation? The trick used here is nice and simple, but it will make the adjoint call slower than the forward one.

Collaborator

I actually don't understand the trick or why it's slower - sorry for my stupidity. I'd also like to understand how FFTW is handled. Since your change to fft.h adds the data pointer, can that work with FFTW too (and allow us to allocate at the execute stage)?

Collaborator Author

The trick is the equivalence:
fft_adjoint(x) = conj(fft(conj(x)))
(since we do not have a plan for fft_adjoint, we just mess around with the sign of the imaginary part of input and output to get the same effect as if we had switched isign).
The additional complex conjugate of input and output is of course not completely free, but it might be sub-dominant compared to the cost of the FFT itself.
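A tiny NumPy demo of that identity (illustrative only, not the code used in the PR):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# DFT matrix with the same sign convention as np.fft.fft (exponent -2*pi*i*j*k/n)
j = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(j, j) / n)

adjoint_direct = F.conj().T @ x                      # apply the adjoint (flipped sign) directly
adjoint_via_conj = np.conj(np.fft.fft(np.conj(x)))   # conj -> forward FFT -> conj

print(np.allclose(adjoint_direct, adjoint_via_conj))  # True
```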

Collaborator Author

And yes, we do allocate the temporary at the execute stage in this PR, also for FFTW.

@ahbarnett
Collaborator

ahbarnett commented Feb 25, 2025 via email

Collaborator

@ahbarnett left a comment

Hi Martin,

[This is supposed to go as a PR review but appeared as a comment]

Your work here is great. As usual you are able to slot something in with minimal disruption. It will certainly allow power users to slip in the adjoint when needed. I don't understand how the FFTW part works - there seem to be no changes which would allow the fftw_execute to change type (or handle the new data ptr in the fft.h interface) - or did I miss it?

However, I am thinking about the interface from the new user's perspective, and I think it is getting more confusing than it needs to be. In your PR a user plans a type 1, then can freely do its adjoint (a type 2 with flipped isign). Users will want to know if that's the same as planning a type 2 and doing a forward transform - it's not, because of the isign flip. (Extra confusion: in Europe t1 is called the "adjoint" and t2 the "forward" NUDFT.) I see the isign flip between going from 1 to 2 vs from 1 to its adjoint as a major source of confusion. The motivation for your PR was users wanting a single plan for t1 and t2, or for t1 and its adjoint. I think we should bounce around ideas for a new interface that fits the need. One idea to reduce confusion is to have a "plan 1 and 2", and then execute_1(c,f,isign) and execute_2(c,f,isign). No confusion: 1+ is the adjoint of 2-, and 1- the adjoint of 2+. The question is whether we can do it in a neat way that preserves the existing guru interface (add a type=12 to the plan?), since legacy users still need to be able to plan a type 2 explicitly and have the plan's execute(c,f) do that t2.
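For reference, the sign bookkeeping behind "1+ is adj of 2-" can be written out explicitly (a restatement using the 1D FINUFFT conventions, nothing beyond what is said above):

$$
(A_\sigma)_{kj} = e^{i\sigma k x_j}
\qquad\Longrightarrow\qquad
(A_\sigma^{\mathsf H})_{jk} = e^{-i\sigma k x_j},
$$

i.e. the adjoint of a type 1 with isign sigma acts exactly like a type 2 with isign -sigma, so 1+ is the adjoint of 2-, and 1- the adjoint of 2+.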
Another question: can the FFTW plan handle the isign switching?
This would all be for the plan (guru) interface. It would expand that interface from four to six commands. Under the hood (as you do in this PR with execute_internal()), execute() could call execute_1 or execute_2.

I have lots of deadlines in the next month so will have to wait a bit, but I'd love to discuss this and hash out the best interface, since I don't want a confusing interface to become locked in...

Best, Alex


@@ -424,72 +425,76 @@ static void deconvolveshuffle3d(int dir, T prefac, std::vector<T> &ker1,
// --------- batch helper functions for t1,2 exec: ---------------------------

template<typename T>
static int spreadinterpSortedBatch(int batchSize, FINUFFT_PLAN_T<T> *p,
std::complex<T> *fwBatch, std::complex<T> *cBatch)
static int spreadinterpSortedBatch(int batchSize, const FINUFFT_PLAN_T<T> &p,
Collaborator

This is just part of your ongoing cleanup, right?

Collaborator Author

mreineck commented Apr 14, 2025

Partly yes. But the change to const also ensures that we don't do things like

  innerT2plan->ntrans = thisBatchSize;

any more (which we had in execute() before). This sort of "messing with the state of another object" can cause a lot of trouble once plans may be invoked in parallel ... and that's what I wanted to make possible with this patch.
The switch from pointer to reference is cosmetic, and I should perhaps have delayed that until later ... sorry.

/* See ../docs/cguru.doc for current documentation.

For given (stack of) weights cj or coefficients fk, performs NUFFTs with
existing (sorted) NU pts and existing plan.
For type 1 and 3: cj is input, fk is output.
For type 2: fk is input, cj is output.
For adjoint == false:
Collaborator

these comments are super helpful

- 0 < ntrans_actual <= batchSize: instead of doing ntrans transforms,
perform only ntrans_actual

scratch_size, aligned_scratch:
Collaborator

Is this something we should do anyway with t3? (regardless of adjoint?)

@mreineck
Collaborator Author

I just added two lines to the Changelog mentioning the thread safety and allocate-on-execute feature.

From my side this is ready to merge (assuming that the missing interfaces/docs/tests can be provided in a separate PR).

mreineck marked this pull request as ready for review on June 26, 2025, 06:23
@DiamonDinoia
Collaborator

> I just added two lines to the Changelog mentioning the thread safety and allocate-on-execute feature.
>
> From my side this is ready to merge (assuming that the missing interfaces/docs/tests can be provided in a separate PR).

I think only Fortran is missing?

For the GPU, it will take some time to implement this. I plan to understand everything while reviewing, and to integrate it once I have time.

@mreineck
Collaborator Author

> I think only Fortran is missing?

Interface-wise, yes. Comprehensive tests still need to be done (together with the standard guru interface tests).
Actually, the necessary changes for the Fortran interface should be minimal; I can probably add them quickly. But without testing they shouldn't be trusted.

@ahbarnett
Collaborator

uh-oh, I just did the same thing!

@ahbarnett
Collaborator

Hang on and let me merge too. I did the complementary example. I changed finufftfort.cpp, which I see you didn't yet...

@mreineck
Collaborator Author

Oops, I actually did, but I forgot the git add ...
Please go ahead!

@ahbarnett
Collaborator

Ok, merged and pushed - diff finufftfort.cpp and see if we are bitwise identical :)

@ahbarnett
Collaborator

I see your rel err is 0.44E-02 when I run your guru1d2_adjoint{f} ...

@ahbarnett
Collaborator

You forgot to negate isign :)

@ahbarnett
Collaborator

I'll fix and clean up the headers for those files. I can also finish up the fortran docs.

For the guru tests, one idea is to have a dedicated test of adjointness that checks <f,A^g> = <Af,g>, where f,g are random vectors, A^ means execute_adjoint, and A means execute. This would be somewhat easier to write than a new "full math test of guru" C++ code.
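For concreteness, a minimal Python sketch of that adjointness check (it assumes the new plan method is called execute_adjoint, the name used above; everything else is the existing finufft Plan API):

```python
import numpy as np
import finufft

rng = np.random.default_rng(42)
M, N = 1000, 64                          # number of NU points, number of modes
x = 2 * np.pi * rng.random(M) - np.pi    # nonuniform points in [-pi, pi)

plan = finufft.Plan(1, (N,), eps=1e-12, isign=+1)
plan.setpts(x)

f = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # random strengths (input of A)
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # random coefficients (input of A^)

Af = plan.execute(f)            # A f: type 1, strengths -> Fourier modes
Ahg = plan.execute_adjoint(g)   # A^ g: adjoint, Fourier modes -> strengths (assumed method name)

lhs = np.vdot(f, Ahg)           # <f, A^ g>  (np.vdot conjugates its first argument)
rhs = np.vdot(Af, g)            # <A f, g>
print(abs(lhs - rhs) / abs(lhs))  # should be of the order of the requested tolerance
```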

@mreineck
Collaborator Author

> You forgot to negate isign :)

True ;-)

We now have a file called guru1d1_adjoint.f, which is identical to guru1d2_adjoint.f. Should we remove it?

@ahbarnett
Collaborator

Have a look at guru1d1_adjoint.f - the math is different, and it's basically a new type-2 tester (with iflag flipped). I'd like to keep it. But your ones need commenting for users, and maybe I'll simplify them to a single guru call...

@ahbarnett
Collaborator

ok, fortran done now :)

@mreineck
Collaborator Author

Beautiful, thanks!

@DiamonDinoia
Collaborator

@mreineck, @ahbarnett is it ready for a thorough review?

@mreineck
Collaborator Author

I'd say yes.

@ahbarnett
Collaborator

ahbarnett commented Jun 27, 2025 via email

Successfully merging this pull request may close these issues.

allow type 1 and 2 plans to be interchangeable to reduce from two to one plan for transform pairs