Skip to content

Fix ssp correction #69

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

krisvanneste
Copy link
Collaborator

Claudio,

Here is my pull request to fix the interpolation issue when residual correction is applied. As you suggested, I created the branch from the fix_source_residuals branch, but now it says that it can't automatically merge... Let me know if I need to do it differently.

I ran the fix with my test case and it appears to work, but you should be aware that in the spectrum returned by the station_correction function, the linear-spaced and log-spaced versions may not have the same length and frequency range. The linear one is truncated to the common frequency range with the station residual, while the logarithmic one keeps the original length, with values outside the common frequency range set to NaN. I'm not sure if this could cause any problems further in the code, but I did not notice anything.

@krisvanneste
Copy link
Collaborator Author

Note that I won't be in the office tomorrow. We can continue on Thursday or Friday.

@claudiodsf claudiodsf force-pushed the fix_ssp_correction branch from 146d642 to 8d6f0e6 Compare May 7, 2025 08:45
@claudiodsf
Copy link
Member

Dear Kris, thanks for the PR.

There were indeed many conflicts (my suggestion to start from the previous branch was bad).
To fix that, I started a new branch from current upstream/main and "cherry-picked" the three commits, then renamed the branch to fix_ssp_correction and force-pushed.
Please force-pull on your side.

I can now check the PR contents... 😉

@claudiodsf
Copy link
Member

I added a couple of commits to move common interpolation code to a function and to remove the unnecessary return spec_st (spec_st is altered in place)

@krisvanneste
Copy link
Collaborator Author

I force-pulled the branch.

@claudiodsf
Copy link
Member

There is still a problem in my tests when computing spectral weights from S/N: signal spectrum and noise spectrum sometimes do not have the same length.

Today's is holiday in France and tomorrow I'll take the day off.
I'll be back working on this PR on Monday 😉

@krisvanneste
Copy link
Collaborator Author

On my side it still works.
Are the spectral weights computed before or after the residual correction?

@claudiodsf
Copy link
Member

I think the problem is here:

specnoise is computed independently of spec, based on the assumption that both are computed from a time window that has the same length, and therefore they must have the same frequencies.

This assumption is no longer valid if the frequency range spec is modified during the correction.

A possible solution would be to pass the frequency range as an extra optional argument when calling _build_spectrum() for spec_noise, and interpolate spec_noise to this new frequency range.

So something like:

specnoise = _build_spectrum(config, trace_noise, interpolate_freqs=spec.freqs)

@krisvanneste
Copy link
Collaborator Author

OK, I will have a look as well. Enjoy your long weekend!

@krisvanneste
Copy link
Collaborator Author

I understand the problem now. I think it would be better not to change the frequency range of the signal spectrum, and set NaN values outside the frequency range of the residual for the linear-spaced spectrum as well, thus also keeping linear-spaced and log-spaced versions in sync. We then need to check if the NaN values do not give any problem when calculating spectral SNR. I will improve the ssp_correction.station_correction function.

@krisvanneste
Copy link
Collaborator Author

Claudio, I changed the station correction function in a way that the original frequency range is not modified. Can you check if this solves your problem with noise weighting?

@claudiodsf
Copy link
Member

Hi Kris, thanks for this additional work.

I had to make some further modifications:

  • replace NaNs in the weight spectrum with small values: otherwise the logspaced spectrum becomes all NaNs
  • replace np.min() and np.max() with np.nanmin() and np.nanmax() in stacked spectral plotting

Now it seems to work, but I need to test it more extensively.

Could you also check on your side?

@krisvanneste
Copy link
Collaborator Author

OK

@claudiodsf
Copy link
Member

One more commit related to NaN data 😉

@krisvanneste
Copy link
Collaborator Author

My results look identical.
Note that you can't use scipy.interpolate.interp1d with arrays containing NaN values. I had to ignore these to interpolate the log-spaced spectrum in station_correction.

@claudiodsf
Copy link
Member

My results look identical.

Great!

Note that you can't use scipy.interpolate.interp1d with arrays containing NaN values. I had to ignore these to interpolate the log-spaced spectrum in station_correction.

Ok, so maybe I can use a similar approach for weight computation, instead of replacing NaN with 1e-9...

@krisvanneste
Copy link
Collaborator Author

Ok, so maybe I can use a similar approach for weight computation, instead of replacing NaN with 1e-9...

I did it like this:

nan_idxs = np.isnan(spec_corr.data_mag)
f = interp1d(spec_corr.freq[~nan_idxs], spec_corr.data_mag[~nan_idxs],
             bounds_error=False)
spec_corr.data_mag_logspaced = f(spec_corr.freq_logspaced)

This is possible because we know the NaN values only occur at the start and/or end. The interpolated spectrum will also be NaN where the original one is NaN, because we don't allow extrapolation.

@claudiodsf claudiodsf force-pushed the fix_ssp_correction branch from c7cdfeb to e818baa Compare May 19, 2025 12:48
@claudiodsf
Copy link
Member

claudiodsf commented May 19, 2025

Hi Kris,

I just force-pushed to this branch a commit that moves all the interpolation into the Spectrum class, so that everything that depends on interp1d is in there.

I still have to rework the weighting part (where I replaced NaN with small values), but I would like you to test this implementation, whenever you have time.

In particular, I reimplemented your search and removal of NaN values before interpolation, which is now made by making sure that only NaN values at the edges are removed. However, I found that in my tests this removal is not essential. You can test with or without the removal by setting nan_idxs = [] at line 434 in spectrum.py.

Oh, and if you have better names for the new methods in the Spectrum class, don't hesitate to propose!

Looking forward to your feedback 😉

P.S. You must force-pull since I rebased this PR over a few additional commits added to main (which are unrelated)

if self.freq_logspaced.size == 0:
raise ValueError('Logspaced frequency axis is empty')
# Find NaN indexes at the beginning and end of the data
nan_idxs = _find_nan_edges(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to nan_idxs = [] if you want to ignore this step.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claudio,

Thanks for the update. I will try to run it one of the coming days (Friday at the latest). Is that OK?
Kris

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! thanks!

@krisvanneste
Copy link
Collaborator Author

Claudio,

I ran my test case again after force-pulling the branch. The results (in terms of source parameters) are practically the same. However, there again appear some edge effects that were mostly gone in the previous version.

Previous run:
17540 ssp 00

New run:
17540 ssp 00

I also noticed that the frequency ranges of the corrected spectra reported in the log file are not the same:

Previous run:

2025-05-12 14:19:03,360 ssp_correction       INFO     BE.BEBN..HHH: corrected, frequency range is: 4.00 38.29 Hz
2025-05-12 14:19:03,363 ssp_correction       INFO     BE.MRG..EHH: corrected, frequency range is: 0.59 44.94 Hz
2025-05-12 14:19:03,366 ssp_correction       INFO     BE.MOL2..HHH: corrected, frequency range is: 0.60 74.33 Hz
2025-05-12 14:19:03,369 ssp_correction       INFO     BE.SKQ..EHH: corrected, frequency range is: 2.13 39.06 Hz
2025-05-12 14:19:03,371 ssp_correction       INFO     BE.MOL3..HHH: corrected, frequency range is: 0.80 47.75 Hz
2025-05-12 14:19:03,374 ssp_correction       INFO     BE.TNL..EHH: corrected, frequency range is: 0.60 44.92 Hz
2025-05-12 14:19:03,377 ssp_correction       INFO     BE.HOU..HHH: corrected, frequency range is: 1.63 22.63 Hz
2025-05-12 14:19:03,380 ssp_correction       INFO     BE.MOLS..HHH: corrected, frequency range is: 0.60 39.92 Hz
2025-05-12 14:19:03,383 ssp_correction       INFO     BE.MEMS..EHH: corrected, frequency range is: 0.55 44.92 Hz
2025-05-12 14:19:03,386 ssp_correction       INFO     BE.MOL4..HHH: corrected, frequency range is: 0.80 57.34 Hz
2025-05-12 14:19:03,388 ssp_correction       INFO     BE.DOU..HHH: corrected, frequency range is: 1.30 30.31 Hz
2025-05-12 14:19:03,392 ssp_correction       INFO     BE.RCHB..HHH: corrected, frequency range is: 0.51 40.00 Hz
2025-05-12 14:19:03,396 ssp_correction       INFO     BE.MOL5..HHH: corrected, frequency range is: 0.80 56.74 Hz
2025-05-12 14:19:03,398 ssp_correction       INFO     BE.MOLT..HHH: corrected, frequency range is: 0.80 44.91 Hz
2025-05-12 14:19:03,402 ssp_correction       INFO     BE.OPTB..EHH: corrected, frequency range is: 1.40 23.15 Hz
2025-05-12 14:19:03,404 ssp_correction       INFO     BE.UCCB..HHH: corrected, frequency range is: 0.49 40.05 Hz
2025-05-12 14:19:03,407 ssp_correction       INFO     BE.MOL1..HHH: corrected, frequency range is: 0.80 49.35 Hz

New run:

2025-05-21 16:05:12,801 ssp_correction       INFO     BE.MOLS..HHH: corrected, frequency range is: 0.60 39.92 Hz
2025-05-21 16:05:12,803 ssp_correction       INFO     BE.HOU..HHH: corrected, frequency range is: 1.63 22.63 Hz
2025-05-21 16:05:12,806 ssp_correction       INFO     BE.MEMS..EHH: corrected, frequency range is: 0.55 44.92 Hz
2025-05-21 16:05:12,808 ssp_correction       INFO     BE.BEBN..HHH: corrected, frequency range is: 4.00 38.29 Hz
2025-05-21 16:05:12,810 ssp_correction       INFO     BE.MOL3..HHH: corrected, frequency range is: 0.80 47.75 Hz
2025-05-21 16:05:12,813 ssp_correction       INFO     BE.MOL1..HHH: corrected, frequency range is: 0.80 49.35 Hz
2025-05-21 16:05:12,816 ssp_correction       INFO     BE.MOL2..HHH: corrected, frequency range is: 0.60 74.33 Hz
2025-05-21 16:05:12,818 ssp_correction       INFO     BE.MOLT..HHH: corrected, frequency range is: 0.80 44.91 Hz
2025-05-21 16:05:12,821 ssp_correction       INFO     BE.UCCB..HHH: corrected, frequency range is: 0.49 40.05 Hz
2025-05-21 16:05:12,823 ssp_correction       INFO     BE.TNL..EHH: corrected, frequency range is: 0.60 44.92 Hz
2025-05-21 16:05:12,825 ssp_correction       INFO     BE.MRG..EHH: corrected, frequency range is: 0.59 44.94 Hz
2025-05-21 16:05:12,828 ssp_correction       INFO     BE.SKQ..EHH: corrected, frequency range is: 2.13 39.06 Hz
2025-05-21 16:05:12,830 ssp_correction       INFO     BE.MOL4..HHH: corrected, frequency range is: 0.80 57.34 Hz
2025-05-21 16:05:12,833 ssp_correction       INFO     BE.MOL5..HHH: corrected, frequency range is: 0.80 56.74 Hz
2025-05-21 16:05:12,836 ssp_correction       INFO     BE.DOU..HHH: corrected, frequency range is: 1.30 30.31 Hz
2025-05-21 16:05:12,839 ssp_correction       INFO     BE.OPTB..EHH: corrected, frequency range is: 1.40 23.15 Hz
2025-05-21 16:05:12,841 ssp_correction       INFO     BE.RCHB..HHH: corrected, frequency range is: 0.51 40.00 Hz

Any idea what could be causing this?

@claudiodsf
Copy link
Member

Thanks, Kris for the test.

The frequency ranges are actually the same, when you sort the two files 😉

I need to think on why there is again this edge case on BE.MOLT...

@krisvanneste
Copy link
Collaborator Author

The frequency ranges are actually the same, when you sort the two files 😉

Oops, I didn't see that...

@krisvanneste
Copy link
Collaborator Author

I need to think on why there is again this edge case on BE.MOLT...

Note that it's not just BE.MOLT. Also BE.MOL1, and a number of other stations (I did not include all the plots).

@claudiodsf
Copy link
Member

Note that it's not just BE.MOLT. Also BE.MOL1, and a number of other stations (I did not include all the plots).

Probably the best would be if you can send me a working example

@krisvanneste
Copy link
Collaborator Author

Claudio,

Sorry, I have been very busy the past few weeks.

I had another look at the issue, and I think the edge effects are still due to extrapolation.
If I modify ssp_plot_spectra.py to plot the linear spectra instead of the log-spaced ones, the problem is much less.
Next, I could verify that the problem occurs in the make_logspaced_from_linear method of the Spectrum class, when there are no NaN values at the start or end. In this case, it is still possible that the outermost logspaced frequencies are (slightly) outside the linear frequency range, in which case extrapolation occurs. However, when I try set these values to NaN after the interpolation, an error occurs elsewhere in the code. So, this should only be done in this specific case. I temporarily solved it by adding an 'extrapolate' argument with default value True, which overrides the value based on nan_idxs.size, and calling make_logspaced_from_linear with extrapolate=False from the station_correction function. Then the edge effects largely disappear.

We can discuss next week if this solution is OK for you or if we can further improve it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants