From fd89493c58b2b5829c9289e4f095c4ed36c25a81 Mon Sep 17 00:00:00 2001 From: Melissa Linkert Date: Wed, 25 Sep 2024 19:38:51 -0500 Subject: [PATCH 1/4] Outline of conversion with supplemental metadata --- .../Convert_with_metadata.ipynb | 107 ++++++++++++++++++ 1 file changed, 107 insertions(+) create mode 100644 notebooks/advanced_topics/Convert_with_metadata.ipynb diff --git a/notebooks/advanced_topics/Convert_with_metadata.ipynb b/notebooks/advanced_topics/Convert_with_metadata.ipynb new file mode 100644 index 0000000..6cbd351 --- /dev/null +++ b/notebooks/advanced_topics/Convert_with_metadata.ipynb @@ -0,0 +1,107 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "022cf89b", + "metadata": {}, + "source": [ + "## Adding metadata when converting to DICOM\n", + "\n", + "\n", + "When converting data to DICOM (as described in the conversion tools notebook), DICOM tags can be inserted during conversion, so that experimental metadata is included in the output DICOM dataset. This eliminates the need for a post-processing step to add metadata, which speeds up the total time to create a complete DICOM dataset." 
+ ] + }, + { + "cell_type": "markdown", + "id": "ef608704", + "metadata": {}, + "source": [ + "### Recap of required packages" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a55547d9", + "metadata": {}, + "outputs": [], + "source": [ + "# Required for downloading data from IDC\n", + "!pip install idc-index\n", + "\n", + "# Install bfconvert via bftools\n", + "# Install bfconvert via bftools\n", + "!wget https://downloads.openmicroscopy.org/bio-formats/7.3.1/artifacts/bftools.zip\n", + "!unzip bftools.zip" + ] + }, + { + "cell_type": "markdown", + "id": "ed29c909", + "metadata": {}, + "source": [ + "### Download SVS input data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "44288420", + "metadata": {}, + "outputs": [], + "source": [ + "# Download sample data from OpenSlide\n", + "!wget https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/CMU-1-Small-Region.svs" + ] + }, + { + "cell_type": "markdown", + "id": "104fa6a4", + "metadata": {}, + "source": [ + "### Write supplemental metadata file\n", + "\n", + "DICOM tags to be written are provided as a JSON file." 
+ ] + }, + { + "cell_type": "markdown", + "id": "b4953b13", + "metadata": {}, + "source": [ + "### Convert SVS to DICOM with supplemental metadata" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "724609cf", + "metadata": {}, + "outputs": [], + "source": [ + "!./bftools/bfconvert -noflat CMU-1-Small-Region.svs CMU-1.dcm -extra-metadata supplemental-metadata.json" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 013255744cd9ec290890298607c666e0ab53d3ec Mon Sep 17 00:00:00 2001 From: Melissa Linkert Date: Mon, 7 Oct 2024 19:27:18 -0500 Subject: [PATCH 2/4] Fill in more information about the JSON tag format --- .../Convert_with_metadata.ipynb | 100 +++++++++++++++++- 1 file changed, 97 insertions(+), 3 deletions(-) diff --git a/notebooks/advanced_topics/Convert_with_metadata.ipynb b/notebooks/advanced_topics/Convert_with_metadata.ipynb index 6cbd351..4fef446 100644 --- a/notebooks/advanced_topics/Convert_with_metadata.ipynb +++ b/notebooks/advanced_topics/Convert_with_metadata.ipynb @@ -8,7 +8,9 @@ "## Adding metadata when converting to DICOM\n", "\n", "\n", - "When converting data to DICOM (as described in the conversion tools notebook), DICOM tags can be inserted during conversion, so that experimental metadata is included in the output DICOM dataset. This eliminates the need for a post-processing step to add metadata, which speeds up the total time to create a complete DICOM dataset." 
+ "When converting data to DICOM (as described in the conversion tools notebook), DICOM tags can be inserted during conversion, so that experimental metadata is included in the output DICOM dataset. This eliminates the need for a post-processing step to add metadata, which speeds up the total time to create a complete DICOM dataset.\n", + "\n", + "See also relevant [Bio-Formats documentation](https://bio-formats.readthedocs.io/en/stable/users/comlinetools/conversion.html#cmdoption-bfconvert-extra-metadata)" ] }, { @@ -61,7 +63,99 @@ "source": [ "### Write supplemental metadata file\n", "\n", - "DICOM tags to be written are provided as a JSON file." + "DICOM tags to be written are provided as a JSON file.\n", + "\n", + "The structure of the JSON file is based on that used by [dcmqi](https://github.com/QIICR/dcmqi/tree/master/doc/examples), but with several additions.\n", + "\n", + "Each DICOM tag is a single JSON object, e.g.:\n", + "\n", + "```\n", + "{\n", + " \"BodyPartExamined\": {\n", + " \"Value\": \"BRAIN\",\n", + " \"VR\": \"CS\",\n", + " \"Tag\": \"(0018,0015)\"\n", + " }\n", + "}\n", + "```\n", + "\n", + "The object's name (`BodyPartExamined`) should be the name of the tag in the DICOM dictionary, with spaces removed.\n", + "There is only 1 required key/value pair:\n", + "\n", + "- `Value` (here, `BRAIN`), which is the tag's value\n", + "\n", + "\n", + "There are also 3 optional key/value pairs:\n", + "\n", + "- `Tag` (here, `(0018,0015)`, which is the tag corresponding to the object name in the DICOM dictionary. If not defined, this will be looked up automatically.\n", + "- `VR` (here `CS`), which is the value representation to use when writing the tag. If not defined, the default VR will be looked up in the DICOM dictionary.\n", + "- `ResolutionStrategy`, which defines what to do with this tag it was defined multiple times. Valid values are `IGNORE`, `APPEND`, and `REPLACE`. 
`APPEND` is the default if the `VR` is `SQ` (a sequence), or `REPLACE` for all other VRs.\n", + "\n", + "\n", + "In the example above, tag `(0018,0015)` (`BodyPartExamined`) would always be set to `BRAIN`. In this example though:\n", + "\n", + "```\n", + "{\n", + " \"BodyPartExamined\": {\n", + " \"Value\": \"BRAIN\",\n", + " \"VR\": \"CS\",\n", + " \"Tag\": \"(0018,0015)\",\n", + " \"ResolutionStrategy\": \"IGNORE\"\n", + " }\n", + "}\n", + "```\n", + "\n", + "tag `(0018,0015)` (`BodyPartExamined`) would only be set to `BRAIN` if the tag wasn't previously defined.\n", + "\n", + "`ResolutionStrategy` is particularly useful when trying to alter metadata that Bio-Formats' DICOM writer already writes. For example, Bio-Formats will automatically write an `OpticalPathSequence` with the appropriate number of channels, but may have missing wavelengths or other metadata. To fully replace the default `OpticalPathSequence`, the entire sequence can be defined with a `ResolutionStrategy` of `REPLACE`:\n", + "\n", + "```\n", + " \"OpticalPathSequence\": {\n", + " \"VR\": \"SQ\",\n", + " \"Tag\": \"(0048,0105)\",\n", + " \"Sequence\": {\n", + " \"IlluminationTypeCodeSequence\": {\n", + " \"VR\": \"SQ\",\n", + " \"Tag\": \"(0022,0016)\",\n", + " \"Sequence\": {\n", + " \"CodeValue\": {\n", + " \"VR\": \"SH\",\n", + " \"Tag\": \"(0008,0100)\",\n", + " \"Value\": \"111743\"\n", + " },\n", + " \"CodingSchemeDesignator\": {\n", + " \"VR\": \"SH\",\n", + " \"Tag\": \"(0008,0102)\",\n", + " \"Value\": \"DCM\"\n", + " },\n", + " \"CodeMeaning\": {\n", + " \"VR\": \"LO\",\n", + " \"Tag\": \"(0008,0104)\",\n", + " \"Value\": \"Epifluorescence illumination\"\n", + " }\n", + " }\n", + " },\n", + " \"IlluminationWaveLength\": {\n", + " \"VR\": \"FL\",\n", + " \"Tag\": \"(0022,0055)\",\n", + " \"Value\": \"488.0\"\n", + " },\n", + " \"OpticalPathIdentifier\": {\n", + " \"VR\": \"SH\",\n", + " \"Tag\": \"(0048,0106)\",\n", + " \"Value\": \"1\"\n", + " },\n", + " \"OpticalPathDescription\": 
{\n", + " \"VR\": \"ST\",\n", + " \"Tag\": \"(0048,0107)\",\n", + " \"Value\": \"replacement channel\"\n", + " }\n", + " },\n", + " \"ResolutionStrategy\": \"REPLACE\"\n", + " }\n", + " ```\n", + " \n", + " Additional technical discussion of how to represent DICOM tags in JSON is available [here](https://github.com/ome/bioformats/pull/4016)." ] }, { @@ -79,7 +173,7 @@ "metadata": {}, "outputs": [], "source": [ - "!./bftools/bfconvert -noflat CMU-1-Small-Region.svs CMU-1.dcm -extra-metadata supplemental-metadata.json" + "!./bftools/bfconvert -noflat -precompressed CMU-1-Small-Region.svs CMU-1.dcm -extra-metadata supplemental-metadata.json" ] } ], From 185edf9d723e982a25f4449e864d7e1082352cfb Mon Sep 17 00:00:00 2001 From: Melissa Linkert Date: Wed, 9 Oct 2024 15:12:23 -0500 Subject: [PATCH 3/4] Fill in a few more details of `Value` attribute in JSON elements --- .../Convert_with_metadata.ipynb | 61 +++++++++++++++++-- 1 file changed, 56 insertions(+), 5 deletions(-) diff --git a/notebooks/advanced_topics/Convert_with_metadata.ipynb b/notebooks/advanced_topics/Convert_with_metadata.ipynb index 4fef446..771b00a 100644 --- a/notebooks/advanced_topics/Convert_with_metadata.ipynb +++ b/notebooks/advanced_topics/Convert_with_metadata.ipynb @@ -67,6 +67,10 @@ "\n", "The structure of the JSON file is based on that used by [dcmqi](https://github.com/QIICR/dcmqi/tree/master/doc/examples), but with several additions.\n", "\n", + "Additional technical discussion of how to represent DICOM tags in JSON is available [here](https://github.com/ome/bioformats/pull/4016).\n", + "\n", + "#### Basic tag structure\n", + "\n", "Each DICOM tag is a single JSON object, e.g.:\n", "\n", "```\n", @@ -84,7 +88,6 @@ "\n", "- `Value` (here, `BRAIN`), which is the tag's value\n", "\n", - "\n", "There are also 3 optional key/value pairs:\n", "\n", "- `Tag` (here, `(0018,0015)`, which is the tag corresponding to the object name in the DICOM dictionary. 
If not defined, this will be looked up automatically.\n", @@ -92,7 +95,28 @@ "- `ResolutionStrategy`, which defines what to do with this tag it was defined multiple times. Valid values are `IGNORE`, `APPEND`, and `REPLACE`. `APPEND` is the default if the `VR` is `SQ` (a sequence), or `REPLACE` for all other VRs.\n", "\n", "\n", - "In the example above, tag `(0018,0015)` (`BodyPartExamined`) would always be set to `BRAIN`. In this example though:\n", + "#### Writing values for different VRs\n", + "\n", + "The `Value` is interpreted according to the VR that was either defined or looked up in the dictionary.\n", + "\n", + "For VRs representing a string of characters (e.g. `SH`), the `Value` is used directly. It is not necessary to ensure that `Value` contains an even number of characters. If needed, Bio-Formats' DICOM writer will pad the string to the correct width.\n", + "\n", + "For VRs representing a numeric type (e.g. `US`), the `Value` is parsed and then saved to DICOM as the correct type (e.g. uint16 for `US`). When a value multiplicity greater than 1 (i.e. an array of values) is needed, the values should be separated by a comma:\n", + "\n", + "\n", + "```\n", + "{\n", + " \"ReferencedFrameNumber\": {\n", + " \"Value\": \"1,3,5,9\",\n", + " \"VR\": \"IS\",\n", + " \"Tag\": \"(0008,1160)\"\n", + " }\n", + "}\n", + "```\n", + "\n", + "#### Handling duplicate or conflicting tags\n", + "\n", + "In the first example above, tag `(0018,0015)` (`BodyPartExamined`) would always be set to `BRAIN`. In this example though:\n", "\n", "```\n", "{\n", @@ -153,9 +177,7 @@ " },\n", " \"ResolutionStrategy\": \"REPLACE\"\n", " }\n", - " ```\n", - " \n", - " Additional technical discussion of how to represent DICOM tags in JSON is available [here](https://github.com/ome/bioformats/pull/4016)." 
+ " ```" ] }, { @@ -166,6 +188,26 @@ "### Convert SVS to DICOM with supplemental metadata" ] }, + { + "cell_type": "code", + "execution_count": null, + "id": "217d21df", + "metadata": {}, + "outputs": [], + "source": [ + "# save one of the JSON examples to a file\n", + "# edit this as needed, or paste a different example from above\n", + "json = '''{\n", + " \"BodyPartExamined\": {\n", + " \"Value\": \"BRAIN\",\n", + " \"VR\": \"CS\",\n", + " \"Tag\": \"(0018,0015)\"\n", + " }\n", + "}'''\n", + "with open('supplemental-metadata.json', 'w') as f:\n", + " f.write(json)" + ] + }, { "cell_type": "code", "execution_count": null, @@ -173,8 +215,17 @@ "metadata": {}, "outputs": [], "source": [ + "!cat supplemental-metadata.json\n", "!./bftools/bfconvert -noflat -precompressed CMU-1-Small-Region.svs CMU-1.dcm -extra-metadata supplemental-metadata.json" ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3e98476c", + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { From 49e63e0df71ec82b6a1ac08aaebafba58d764847 Mon Sep 17 00:00:00 2001 From: Melissa Linkert Date: Thu, 10 Oct 2024 13:41:42 -0500 Subject: [PATCH 4/4] Fix a few typos --- notebooks/advanced_topics/Convert_with_metadata.ipynb | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/notebooks/advanced_topics/Convert_with_metadata.ipynb b/notebooks/advanced_topics/Convert_with_metadata.ipynb index 771b00a..132451f 100644 --- a/notebooks/advanced_topics/Convert_with_metadata.ipynb +++ b/notebooks/advanced_topics/Convert_with_metadata.ipynb @@ -32,7 +32,6 @@ "!pip install idc-index\n", "\n", "# Install bfconvert via bftools\n", - "# Install bfconvert via bftools\n", "!wget https://downloads.openmicroscopy.org/bio-formats/7.3.1/artifacts/bftools.zip\n", "!unzip bftools.zip" ] @@ -92,7 +91,7 @@ "\n", "- `Tag` (here, `(0018,0015)`, which is the tag corresponding to the object name in the DICOM dictionary. 
If not defined, this will be looked up automatically.\n", "- `VR` (here `CS`), which is the value representation to use when writing the tag. If not defined, the default VR will be looked up in the DICOM dictionary.\n", - "- `ResolutionStrategy`, which defines what to do with this tag it was defined multiple times. Valid values are `IGNORE`, `APPEND`, and `REPLACE`. `APPEND` is the default if the `VR` is `SQ` (a sequence), or `REPLACE` for all other VRs.\n", + "- `ResolutionStrategy`, which defines what to do with this tag if it was defined multiple times. Valid values are `IGNORE`, `APPEND`, and `REPLACE`. `APPEND` is the default if the `VR` is `SQ` (a sequence), or `REPLACE` for all other VRs.\n", "\n", "\n", "#### Writing values for different VRs\n",
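The supplemental metadata file described in the notebook text above can also be generated programmatically instead of being pasted in as a string. The following is a minimal sketch in Python, assuming only the key names documented in the patch (`Value`, `VR`, `Tag`, `ResolutionStrategy`). `make_tag` is a hypothetical helper introduced here for illustration; it is not part of Bio-Formats or of the notebook. Values are kept as strings, matching the examples in the patch.

```python
import json

# Strategy names as listed in the notebook text.
VALID_STRATEGIES = {"IGNORE", "APPEND", "REPLACE"}


def make_tag(value, vr=None, tag=None, resolution_strategy=None):
    """Build one tag entry (hypothetical helper, not a Bio-Formats API).

    Only "Value" is required; "VR", "Tag" and "ResolutionStrategy" are
    optional and are omitted from the entry when not given.
    """
    entry = {"Value": str(value)}
    if vr is not None:
        entry["VR"] = vr
    if tag is not None:
        entry["Tag"] = tag
    if resolution_strategy is not None:
        if resolution_strategy not in VALID_STRATEGIES:
            raise ValueError(f"unknown ResolutionStrategy: {resolution_strategy}")
        entry["ResolutionStrategy"] = resolution_strategy
    return entry


metadata = {
    # Only written if the tag was not previously defined (IGNORE).
    "BodyPartExamined": make_tag(
        "BRAIN", vr="CS", tag="(0018,0015)", resolution_strategy="IGNORE"
    ),
    # Multiple values are joined with commas, as in the notebook example.
    "ReferencedFrameNumber": make_tag(
        ",".join(str(n) for n in (1, 3, 5, 9)), vr="IS", tag="(0008,1160)"
    ),
}

# Same file name that the notebook's bfconvert cell expects.
with open("supplemental-metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```

The resulting file can be passed to `bfconvert` with `-extra-metadata supplemental-metadata.json`, as in the notebook's conversion cell. Sequence-valued tags (those using a `Sequence` key) are not covered by this sketch.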