@franzpoeschel
Contributor

@franzpoeschel franzpoeschel commented May 15, 2025

Fix #5355 by upgrading dependencies toml11 and nlohmann_json to versions containing the fix:

  • nlohmann_json: upgraded to version 3.12.0
  • toml11: upgraded to current development branch (commit 2a18a89008d3daac6d8f9db03ddd582173032c7a)

This is a relatively simple workaround for the issue documented there. In particular, it requires no changes in openPMD and therefore also fixes the problem for older openPMD versions.

Explanation: an inline namespace is a namespace that including code does not need to name explicitly. Its only effect here is that the namespace becomes part of the qualified (mangled) symbol names that the compiler emits, which avoids the problematic symbol clashes between different versions.
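
As a minimal sketch of the mechanism (the library name mylib, the namespace v3_12_0 and the function version_major are hypothetical and not taken from either library), an inline namespace leaves call sites unchanged while the version tag still becomes part of each symbol's qualified name:

```cpp
// Hypothetical header-only library; all names here are illustrative only.
namespace mylib
{
    // The version tag becomes part of every symbol's qualified (and mangled) name,
    // so a copy compiled elsewhere with a different tag yields different symbols.
    inline namespace v3_12_0
    {
        constexpr int version_major()
        {
            return 3;
        }
    } // namespace v3_12_0
} // namespace mylib

// Including code does not need to name the inline namespace ...
constexpr int seen_by_user = mylib::version_major();

// ... although the fully qualified name refers to the same function.
constexpr int fully_qualified = mylib::v3_12_0::version_major();
```

Two copies of such a library built with different version tags would then export distinct symbols, even though user code calls mylib::version_major() in both cases.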

Aside from this, I am currently experimenting in openPMD/openPMD-api#1757 with automatically installing the internally shipped versions of toml11 / nlohmann-json along with openPMD upon make install. The long-term goal would be to use the logic described there as a "protocol" for openPMD and PIConGPU to agree on common versions of internal header-only libraries.

TODO:

  • Document the scripts a bit.

@franzpoeschel
Contributor Author

I just ran a test: the segfault that I saw is fixed by this.

@psychocoderHPC psychocoderHPC added the labels bug (a bug in the project's code), affects latest release (a bug that affects the latest stable release) and component: third party (third party libraries that are shipped and/or linked) on May 16, 2025
@psychocoderHPC psychocoderHPC added this to the 0.9.0 / next stable milestone on May 16, 2025
@franzpoeschel
Contributor Author

I was just going to fix this for nlohmann-json, but noticed that it has already been fixed in version 3.11 (nlohmann/json#3590). Internally, we still use version 3.9.1. I will remove my workaround and upgrade the single header instead.
I submitted a similar PR to toml11, which has been merged by now: ToruNiina/toml11#291
It is not yet part of any release, though, so for the moment I will leave that part as it is.

@franzpoeschel franzpoeschel force-pushed the fix-toml-json-version-mismatches branch 5 times, most recently from 7436f3f to f1007d3, on June 2, 2025 12:31
@franzpoeschel
Contributor Author

Upgrading nlohmann_json is not entirely trivial, since #4812 apparently also added the library to targets compiled by the device compiler. Previously it had been provided only to the host compiler, since nlohmann_json does not support NVIDIA compilers and there had been other issues in the past. As a result, we again run into the issue that nlohmann_json has with some versions of NVCC. There is now a draft to add minimal NVCC support in nlohmann/json#4796, but this particular issue will not be fixed, as it seems to be a compiler/STL implementation bug.
The workaround is to set #define JSON_HAS_RANGES 0. This needs to be done in CMake, since even some Param files seem to include nlohmann_json by now and we have no control over those. Even in CMake, it's a bit of an adventure to catch every usage.
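
A minimal sketch of what the define amounts to at the source level, assuming it has to be visible before the first inclusion of the nlohmann_json header. The CMake side that passes it to every target is not shown, and the include is commented out so that the snippet stands alone:

```cpp
// Assumption: this must precede the first inclusion of the nlohmann_json header
// in a translation unit. A build system would typically pass the equivalent
// compiler flag -DJSON_HAS_RANGES=0 instead of editing sources.
#ifndef JSON_HAS_RANGES
#    define JSON_HAS_RANGES 0
#endif

// #include <nlohmann/json.hpp> // would follow here in real code

// Guard that fails the build if ranges support was left enabled by accident.
static_assert(JSON_HAS_RANGES == 0, "JSON_HAS_RANGES must be 0 for device-compiled targets");
```

Setting the macro through the build system rather than in a header is what makes it reach files that include nlohmann_json on their own.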

CI is currently failing due to 403 errors; I will try again later.

@franzpoeschel franzpoeschel force-pushed the fix-toml-json-version-mismatches branch 4 times, most recently from 6f1bf4f to 2ed37ef, on June 4, 2025 17:19
@psychocoderHPC
Member

psychocoderHPC commented Jun 10, 2025

@franzpoeschel ~~I am wondering why there are toml changes in this PR. I thought it was about json only?~~

#5367 (comment) answers my question.

@psychocoderHPC
Member

@franzpoeschel could you commit the toml and json update as a generic user

GIT_AUTHOR_NAME="Third Party" GIT_AUTHOR_EMAIL="[email protected]" git commit

and put all CMake changes in a separate commit?

# https://github.com/ComputationalRadiationPhysics/picongpu/pull/5367
#
# This script helps applying the patch (manual intervention likely needed).

Member

Uhh, I was not aware that we change the third-party code.

Contributor Author

@franzpoeschel franzpoeschel Jun 10, 2025

It is a temporary patch until this fix is part of a toml11 release. I do not want to wait for a release since this behavior causes actual crashes, and this is the only way to deploy a bugfix without having to wait.

The script for nlohmann_json can be removed though.

@psychocoderHPC
Member

Why do we not use the openPMD-shipped json library if a target nlohmann_json::nlohmann_json is already available?
Let's discuss it in our next developer meeting.

@franzpoeschel
Contributor Author

> why do we not use the openPMD shipped json library if a target nlohmann_json::nlohmann_json is already available? Let's discuss it in our next developer meeting.

openPMD does not ship the JSON library; it uses it internally but does not install it. With the current setup, a consumer of openPMD should not even notice in its build system that the library has been used. This PR adds fixes to restore this behavior.

openPMD/openPMD-api#1757 adds an option to install the internal nlohmann-json and toml11 alongside openPMD, but we cannot rely on that in PIConGPU.

@franzpoeschel
Contributor Author

> @franzpoeschel could you commit the toml and json update as generic user
>
> GIT_AUTHOR_NAME="Third Party" GIT_AUTHOR_EMAIL="[email protected]" git commit
>
> and all cmake changes within a separate commit.

The updated JSON tree is already in a separate commit with Niels Lohmann as the author: 59213d3

@franzpoeschel franzpoeschel force-pushed the fix-toml-json-version-mismatches branch from 2ed37ef to f78a5dc on June 11, 2025 15:24
@franzpoeschel franzpoeschel changed the title from "toml11/nlohmann-json: avoid symbol clashes with different versions in upstream dependencies" to "[WIP] toml11/nlohmann-json: avoid symbol clashes with different versions in upstream dependencies" on Jun 11, 2025
@franzpoeschel
Contributor Author

The current version of toml11 uses std::source_location, which Clang supports only from version 15 onward (https://en.cppreference.com/w/cpp/compiler_support/20), and hence breaks our CI runs with Clang 12, 13 and 14. Patching that out of toml11 is a single line (0cd31f4).

Should we do one of the following?

  • Drop support for Clang < 15 (I've not even been able to set things up locally since those Clang versions have many other problems with C++20)
  • Keep the patch in our internal clone of toml11
  • Add some define like TOML11_DISABLE_SOURCE_LOCATION to override toml11's detection mechanics and submit that patch to the mainline
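
The third option could look roughly like the following sketch. It is not toml11's actual detection logic: the MYLIB_ prefixed macros and the function current_line are hypothetical, and only the standard feature-test macro __cpp_lib_source_location is real.

```cpp
// Hypothetical opt-out switch; a user would normally set it on the command line,
// e.g. with -DMYLIB_DISABLE_SOURCE_LOCATION. It is defined here to show the effect.
#define MYLIB_DISABLE_SOURCE_LOCATION

#if defined(__has_include)
#    if __has_include(<version>)
#        include <version> // provides __cpp_lib_source_location where available
#    endif
#endif

// The opt-out takes precedence over automatic detection.
#if !defined(MYLIB_DISABLE_SOURCE_LOCATION) && defined(__cpp_lib_source_location)
#    define MYLIB_HAS_SOURCE_LOCATION 1
#    include <source_location>
#else
#    define MYLIB_HAS_SOURCE_LOCATION 0
#endif

// Library code then branches on the result instead of on the compiler version.
#if MYLIB_HAS_SOURCE_LOCATION
inline unsigned current_line(std::source_location loc = std::source_location::current())
{
    return loc.line();
}
#else
inline unsigned current_line()
{
    return 0; // fallback: no location information available
}
#endif
```

With such a switch, the CI jobs for the older Clang versions could set the define instead of carrying a patched copy of the library.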

If successful, this will become a PR to toml11
@franzpoeschel franzpoeschel force-pushed the fix-toml-json-version-mismatches branch from abafd39 to 9258b5e on June 16, 2025 09:54
@psychocoderHPC
Member

@franzpoeschel You set the toml and nlohmann json developers as authors. I thought about it, and we should not do that. We should set them as co-authors and commit as a tools user. The reason is that they currently show up in the statistics as project contributors, although they contributed only indirectly because we use their projects as dependencies.
It could be that they do not want to be shown as direct contributors.

So after thinking a lot about it, I would say we should push the changes clearly as a subtree, as in #5364, so that we do not distort the project statistics with external developers.

@franzpoeschel franzpoeschel marked this pull request as draft on July 9, 2025 11:21
Development

Successfully merging this pull request may close these issues.

Conflicting versions of toml11 and nlohmann_json when used as internally shipped dependencies at multiple points in the software stack

4 participants