diff --git a/doc/_static/analysis_tools/histPlotExample.png b/doc/_static/analysis_tools/histPlotExample.png
new file mode 100644
index 000000000..2521bb904
Binary files /dev/null and b/doc/_static/analysis_tools/histPlotExample.png differ
diff --git a/doc/_static/analysis_tools/stellarLocusExample.png b/doc/_static/analysis_tools/stellarLocusExample.png
new file mode 100644
index 000000000..8eecc68d8
Binary files /dev/null and b/doc/_static/analysis_tools/stellarLocusExample.png differ
diff --git a/doc/lsst.analysis.tools/currentActions.rst b/doc/lsst.analysis.tools/currentActions.rst
new file mode 100644
index 000000000..a0181913d
--- /dev/null
+++ b/doc/lsst.analysis.tools/currentActions.rst
@@ -0,0 +1,34 @@
+===============
+Current Actions
+===============
+
+Here is a list of the current actions implemented in ``analysis_tools``. Please look at these carefully before
+adding new actions to avoid duplication.
+
+Plot actions are covered in a separate :doc:`section `.
+
+Vector Actions
+==============
+
+.. automodapi:: lsst.analysis.tools.actions.vector
+    :no-inheritance-diagram:
+    :no-inherited-members:
+    :skip: ConfigurableActionField
+    :skip: ConfigurableActionStructField
+    :skip: DictField
+    :skip: Field
+    :skip: Vector
+    :skip: VectorAction
+
+Scalar Actions
+==============
+
+.. automodapi:: lsst.analysis.tools.actions.scalar
+    :no-inheritance-diagram:
+    :no-inherited-members:
+    :skip: ChoiceField
+    :skip: Field
+    :skip: Scalar
+    :skip: ScalarAction
+    :skip: Vector
+
diff --git a/doc/lsst.analysis.tools/currentPlots.rst b/doc/lsst.analysis.tools/currentPlots.rst
new file mode 100644
index 000000000..f9c4a16e7
--- /dev/null
+++ b/doc/lsst.analysis.tools/currentPlots.rst
@@ -0,0 +1,9 @@
+Current Plot Types
+==================
+Details of the current plot types are given here. These plots all live in ``python/lsst/analysis/tools/actions/plot`` and
+can all be found on `github `__.
+
+
+.. automodapi:: lsst.analysis.tools.actions.plot
+    :no-inheritance-diagram:
+    :no-inherited-members:
diff --git a/doc/lsst.analysis.tools/gettingStartedGuide.rst b/doc/lsst.analysis.tools/gettingStartedGuide.rst
new file mode 100644
index 000000000..33503db4b
--- /dev/null
+++ b/doc/lsst.analysis.tools/gettingStartedGuide.rst
@@ -0,0 +1,490 @@
+.. _analysis-tools-getting-started:
+
+Getting Started Guide
+=====================
+
+What is analysis_tools?
+-----------------------
+``analysis_tools`` is a package for the creation of plots and metrics. It allows
+them to be created from the same code to ensure that they are consistent
+and repeatable.
+
+It has a lot of very powerful and flexible functionality but can be a little
+bit overwhelming at first. We are going to cover getting the stack set up
+and cloning ``analysis_tools``, then give an overview of the package, and follow
+that by walking through a series of examples of increasing complexity.
+
+``analysis_tools`` is designed to work with any sort of keyed data, but to make it
+more intuitive we will initially talk about tables and column names.
+
+Setting Up the Package and Getting Started With The Stack
+---------------------------------------------------------
+To set up the stack you need to source your version of:
+
+``source /opt/lsst/software/stack/loadLSST.bash``
+
+and then set up lsst_distrib.
+
+``setup lsst_distrib``
+
+If you don't have a local version of analysis_tools but you want to contribute or change things
+then you can git clone the package from https://github.com/lsst/analysis_tools.
+
+``git clone git@github.com:lsst/analysis_tools.git``
+
+Once the package has been cloned it needs to be set up and ``scons`` needs to be run.
+
+``setup -j -r repos/analysis_tools``
+``scons repos/analysis_tools``
+
+More details can be found here:
+https://pipelines.lsst.io/install/package-development.html?highlight=github#un-set-up-the-development-package
+You can check what is set up using this command:
+
+``eups list -s``
+
+Hopefully this will show your local version of analysis_tools.
+
+--------------
+
+Package Layout
+==============
+There are a bunch of files in analysis_tools but we are going to focus on two directories,
+``python/lsst/analysis/tools/`` and ``pipelines``, which contain the Python code and the
+pipelines that run it respectively.
+
+Pipelines
+---------
+**visitQualityCore.yaml**
+
+| The core plots for analysing the quality of the visit level data. The core pipeline is run as standard as part of the regular reprocessing. The most helpful plots go into this pipeline.
+
+**visitQualityExtended.yaml**
+
+| An extended pipeline of plots and metrics to study the visit level data. The extended pipeline is run when a problem comes up or we need to look deeper into the data quality. Useful plots that address specific issues but don't need to be run on a regular basis get added to this pipeline.
+
+**coaddQualityCore.yaml**
+
+| The core plots for analysing the quality of the coadd level data.
+
+**coaddQualityExtended.yaml**
+
+| The extended plots for analysing the coadd data.
+
+**matchedVisitQualityCore.yaml**
+
+| Plots and metrics that assess the repeatability of sources per tract by matching them between visits.
+
+**apCcdVisitQualityCore.yaml**
+
+| The core plots to assess the quality of the ccd visit dataset.
+
+python/lsst/analysis/tools
+--------------------------
+**actions**
+
+| This contains the actions that plot things and calculate things.
+| Check here before adding new actions to avoid duplication.
+
+  **scalar**
+
+  Contains a lot of useful actions that return scalar values,
+  e.g. the median or sigma MAD.
+
+  **vector**
+
+  These actions run on vectors and return vectors,
+  e.g. the S/N selector which returns an array of bools.
+
+  **keyedData**
+
+  These actions are base classes for other actions. You
+  shouldn't need to add stuff here; use the scalar or
+  vector actions.
+
+  **plots**
+
+  The plotting code lives in here. You shouldn't need to touch
+  this unless you have to add a new plot type. Try to use one of
+  the existing ones first rather than duplicating things.
+
+**analysisMetrics**
+
+| Metric classes go in here. One-off metrics and very simple metrics go into analysisMetrics.py. Sets of metrics go into their own file, e.g. psfResidualMetrics.py.
+
+**analysisParts**
+
+| Shared code between plots and metrics goes in here. Try to have as much of this as possible so that nothing changes between the plots and their associated metrics.
+| E.g. shapeSizeFractionalDiff.py.
+
+**analysisPlots**
+
+| Plotting classes go in here. One-off plots and very simple plots go into analysisPlots.py. Sets of plots go into their own file, e.g. skyObject.py.
+
+**contexts**
+
+| Generic settings to be applied in a given circumstance. For example, overrides that are specific to a coadd or visit level plot/metric, such as the default flag selector, which differs between coadd and visit analysis.
+
+**tasks**
+
+| Each different dataset type requires its own task to handle the reading of the inputs.
+| For example: objectTableTractAnalysis.py, which handles the reading in of object tables.
+
+-------------------------
+
+A Simple Plotting Example
+=========================
+The first example we are going to look at is a very simple one and then we can build
+up from there. We're going to start by adapting an existing plot to our needs: we'll use a
+sky plot to show the on-sky distribution of the values of a column in the table.
+
+We use 'actions' to tell the code what to plot on the axes; these can be defined by anyone,
+but standard ones exist already. This example will showcase some of these standard ones and
+then we'll look more into how to define them. One of the great things about actions is that
+they allow us to only read in the columns we need from large tables.
+
+Each plot or metric is its own class, and each one has a prep, process and produce section.
+The prep section applies things like flag cuts and signal-to-noise cuts to the data.
+The process section builds the data required for the plot. For example, if the plot
+is of a magnitude difference against a magnitude, then the actions defined in the
+process section identify which flux columns need to be read in, and another action
+takes the fluxes needed, turns them into magnitudes and then calculates their
+difference. The produce section takes the prepared and pre-calculated data and plots it on
+the graph. The plot options, such as axis labels, are set in this section.
+
+When naming new classes it is recommended to include the word Plot in the name and to have the class name
+match the one that is used in the pipeline (detailed later). This name can be further expanded to include
+the plot type as well.
+
+.. code-block:: python
+
+    class newPlot(AnalysisPlot):
+        def setDefaults(self):
+            super().setDefaults()
+            self.prep.selectors.flagSelector = CoaddPlotFlagSelector()
+            self.prep.selectors.flagSelector.bands = ["{band}"]
+
+            self.prep.selectors.snSelector = SnSelector()
+            self.prep.selectors.snSelector.fluxType = "{band}_psfFlux"
+            self.prep.selectors.snSelector.threshold = 300
+
+            self.prep.selectors.starSelector = StarSelector()
+            self.prep.selectors.starSelector.vectorKey = "{band}_extendedness"
+
+            self.process.buildActions.xStars = LoadVector()
+            self.process.buildActions.xStars.vectorKey = "coord_ra"
+            self.process.buildActions.yStars = LoadVector()
+            self.process.buildActions.yStars.vectorKey = "coord_dec"
+
+            self.process.buildActions.starStatMask = SnSelector()
+            self.process.buildActions.starStatMask.fluxType = "{band}_psfFlux"
+
+            self.process.buildActions.zStars = ExtinctionCorrectedMagDiff()
+            self.process.buildActions.zStars.magDiff.col1 = "{band}_ap12Flux"
+            self.process.buildActions.zStars.magDiff.col2 = "{band}_psfFlux"
+
+            self.produce = SkyPlot()
+            self.produce.plotTypes = ["stars"]
+            self.produce.plotName = "ap12-psf_{band}"
+            self.produce.xAxisLabel = "R.A. (degrees)"
+            self.produce.yAxisLabel = "Dec. (degrees)"
+            self.produce.zAxisLabel = "Ap 12 - PSF [mag]"
+            self.produce.plotOutlines = False
+
+Let's look at what the bits do in more detail.
+
+.. code-block:: python
+
+    self.prep.selectors.flagSelector = CoaddPlotFlagSelector()
+    self.prep.selectors.flagSelector.bands = ["{band}"]
+
+The flag selector option lets us apply selectors based on flags to cut the data down. Multiple can be applied
+at once and any flag that is in the input can be used. However, prebuilt selectors already exist for the
+common and recommended flag combinations.
+
+CoaddPlotFlagSelector - this is the standard set of flags for coadd plots. The "{band}" syntax means it gets
+applied in the band the plot is being made in.
+
+.. code-block:: python
+
+    self.prep.selectors.snSelector = SnSelector()
+    self.prep.selectors.snSelector.fluxType = "{band}_psfFlux"
+    self.prep.selectors.snSelector.threshold = 300
+
+SnSelector - this is the standard way of cutting the data down on S/N; you can set the flux type that is used
+to calculate the ratio and the threshold that the data must be above to be kept.
+
+.. code-block:: python
+
+    self.prep.selectors.starSelector = StarSelector()
+    self.prep.selectors.starSelector.vectorKey = "{band}_extendedness"
+
+The starSelector option is for defining a selector which picks out the specific type of object that you want
+to look at. You can define this any way you want, but there are predefined ones that can be used to choose
+stars or galaxies. You can also plot both at the same time, either separately or as one dataset, but the
+different dynamic ranges they often cover can make the resulting plot sub-optimal.
+
+starSelector - this is the standard selector for stars. It uses the extendedness column, though any column can
+be specified; the threshold in starSelector is defined for the extendedness column.
+
+.. code-block:: python
+
+    self.process.buildActions.xStars = LoadVector()
+    self.process.buildActions.xStars.vectorKey = "coord_ra"
+    self.process.buildActions.yStars = LoadVector()
+    self.process.buildActions.yStars.vectorKey = "coord_dec"
+
+This section, the xStars and yStars options, sets what is plotted on each axis. In this case it is just the
+column, post selectors applied, that is directly plotted. To do this the LoadVector action is used; it just
+takes a vectorKey, which in this case is the column name. However, this can be any action: common actions are
+already defined, but you can define whatever you need and use it here.
+
+.. code-block:: python
+
+    self.process.buildActions.starStatMask = SnSelector()
+    self.process.buildActions.starStatMask.fluxType = "{band}_psfFlux"
+
+The sky plot prints some statistics on the plot; the mask that selects the points to use for these stats is
+defined by the starStatMask option. In this case it uses a PSF flux based S/N selector.
+
+.. code-block:: python
+
+    self.process.buildActions.zStars = ExtinctionCorrectedMagDiff()
+    self.process.buildActions.zStars.magDiff.col1 = "{band}_ap12Flux"
+    self.process.buildActions.zStars.magDiff.col2 = "{band}_psfFlux"
+
+The points on the sky plot are color coded by the value defined in the zStars action. Here we have gone for
+the ExtinctionCorrectedMagDiff, which calculates the magnitude from each of the columns specified as col1 and
+col2, applies extinction corrections and then subtracts them. If there are no extinction corrections for the
+data then it defaults to a straight difference between the magnitudes.
+
+.. code-block:: python
+
+    self.produce = SkyPlot()
+    self.produce.plotTypes = ["stars"]
+    self.produce.plotName = "ap12-psf_{band}"
+    self.produce.xAxisLabel = "R.A. (degrees)"
+    self.produce.yAxisLabel = "Dec. (degrees)"
+    self.produce.zAxisLabel = "Ap 12 - PSF [mag]"
+    self.produce.plotOutlines = False
+
+This final section declares the plot type and adds labels. We declare that we want to make a sky plot that
+plots only objects of type star. Next we give the plot a name that is informative for later identification
+and add axis labels. The final option specifies whether we want patch outlines plotted.
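+
+Before moving on, here is a minimal sketch (not taken from the walkthrough above) of how the same skeleton
+could be reconfigured to colour code the points by a different quantity; the ``cModelFlux`` column name is
+only an assumption, so substitute whichever flux columns exist in your table. Everything else, the selectors,
+the x and y actions and the SkyPlot produce action, stays exactly as before.
+
+.. code-block:: python
+
+    # Hypothetical variant: colour code the sky plot by CModel - PSF instead of Ap 12 - PSF.
+    self.process.buildActions.zStars = ExtinctionCorrectedMagDiff()
+    self.process.buildActions.zStars.magDiff.col1 = "{band}_cModelFlux"
+    self.process.buildActions.zStars.magDiff.col2 = "{band}_psfFlux"
+
+    self.produce.plotName = "cmodel-psf_{band}"
+    self.produce.zAxisLabel = "CModel - PSF [mag]"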
The plot + + +This new class then needs to be added to a file in analysisPlots, one off and simple plots go into the +analysisPlots file directly and the others are filed by category. For example all sky object related plots are +in the skyObjects.py file. + +Once we have added the class to the relevant file we can now run it from the command line. To do this we need +to add the class to a pipeline. + +.. code-block:: yaml + + description: | + An example pipeline to run our new plot + tasks: + testNewPlot: + class: lsst.analysis.tools.tasks.ObjectTableTractAnalysisTask + config: + connections.outputName: testNewPlot + plots.newPlot: newPlot + python: | + from lsst.analysis.tools.analysisPlots import * + +The class line assumes that we want to run the plot on an objectTable_tract. Each different dataset type has +its own assocaited task. Many tasks already exist for different dataset types but depending on what you want +to look at you might need to make your own. + +Once we have the pipeline we can run it, the same as we would run other pipetasks. + +.. code-block:: bash + + pipetask run -p pipelines/myNewPipeline.yaml + -b /sdf/group/rubin/repo/main/butler.yaml + -i HSC/runs/RC2/w_2022_28/DM-35609 + -o u/sr525/newPlotTest + --register-dataset-types --prune-replaced=purge --replace-run + +Let's look at each of the parts that go into the command. + +.. code-block:: bash + + pipetask run -p pipelines/myNewPipeline.yaml + +-p is the pipeline file, the location is relative to the directory that the command is run from. + +.. code-block:: bash + + -b /sdf/group/rubin/repo/main/butler.yaml + +-b is the location of the butler for the data that you want to process. This example is using the HSC data at the USDF. + +.. code-block:: bash + + -i HSC/runs/RC2/w_2022_28/DM-35609 + +-i is the input collection to plot from, here we are using one of the weekly reprocessing runs of the RC2 data. This path is relative to the one given for the butler.yaml file in the -b option. + +.. code-block:: bash + + -o u/sr525/newPlotTest + +-o is the output collection that you want the plots to go into. The standard way of organising things is to put them into u/your-user-name. + +.. code-block:: bash + + --register-dataset-types --prune-replaced=purge --replace-run + +The other options are sometimes necessary when running the pipeline. --register-dataset-types is needed when you have a dataset type that hasn't been made before and needs to be added. --prune-replaced=purge and --replace-run are useful if you are running the same thing multiple times into the same output, for example when debugging. They replace the previous versions of the plot and just keep the most recent version. + +If you don't want to include all of the data in the input collection then you need to specify a data id which +is done with the -d option. + +.. code-block:: bash + + -d "instrument='HSC' AND (band='g' or band='r' or band='i' or band='z' or band='y') AND skymap='hsc_rings_v1' + AND tract=9813 AND patch=68" + +This example data id tells the processing that the instrument being used is HSC, that we want to make the plot +in the g, r, i, z and y bands, that the skymap used is the hsc_rings_v1 map, that the tract is 9813 and that +we only want to process data from patch 68 rather than all the data. + +Making A New Metric +------------------- +Metrics work in a very similar way to plots and we won't go through another full example of them. They can be +added to the same pipelines as the plots and the pipeline is run as detailled above. 
+Metrics follow the same structure and have a prep, process and produce step. If a plot and metric are going
+to be made of the same quantity then the shared code should be factored out into a shared class in
+``analysisParts``; see the `stellar locus base class `__ for examples of how to do this. The shared code
+is in ``analysisParts`` with very little code in ``analysisPlots.py`` and ``analysisMetrics.py``. The plots
+and metrics from these files are then called in `pipelines/coaddQualityCore.yaml `__ and make a good
+reference for how to make new plots/metrics/combinations of plots and metrics.
+
+------------
+
+Adding an Action
+================
+
+Actions go in one of the sub folders of the actions directory depending on what type they are; this is covered in the package layout section. Before you add a new action, check whether it already exists to avoid adding a duplicate. Sometimes it will probably be better to generalise an existing action rather than making a new one that is very similar to something that already exists. If the new action is long or specific to a given circumstance then add it to a new file, for example the ellipticity actions in `python/lsst/analysis/tools/actions/vector/ellipticity.py `__.
+
+Let's look at some examples of actions. The first one is a scalar action.
+
+.. code-block:: python
+
+    class MedianAction(ScalarAction):
+        vectorKey = Field[str]("Key of Vector to median.")
+
+        def getInputSchema(self) -> KeyedDataSchema:
+            return ((self.vectorKey, Vector),)
+
+        def __call__(self, data: KeyedData, **kwargs) -> Scalar:
+            mask = self.getMask(**kwargs)
+            return cast(Scalar, float(np.nanmedian(cast(Vector, data[self.vectorKey.format(**kwargs)])[mask])))
+
+Let's go through what each bit of the action does.
+
+.. code-block:: python
+
+    vectorKey = Field[str]("Key of Vector to median.")
+
+This is a config option; when you use the action you declare the column name using this field. This is
+consistent across all actions.
+
+.. code-block:: python
+
+    def getInputSchema(self) -> KeyedDataSchema:
+        return ((self.vectorKey, Vector),)
+
+Every action needs a getInputSchema; this is what it uses to know which columns to read in from the table.
+This means that only the needed columns are read in, allowing large tables to be accessed without memory
+issues. This is one of the bonus benefits of using the ``analysis_tools`` framework.
+
+.. code-block:: python
+
+    def __call__(self, data: KeyedData, **kwargs) -> Scalar:
+        mask = self.getMask(**kwargs)
+        return cast(Scalar, float(np.nanmedian(cast(Vector, data[self.vectorKey.format(**kwargs)])[mask])))
+
+This actually does the work. It uses a mask, if one is given, and then takes the NaN median of the relevant
+column from the data. The various calls to cast and the type declarations are there because it is made to
+work on very generic input data, any sort of keyed data type. Also, we have to keep the type checking happy,
+otherwise we can't merge to main.
+
+Next we have an example of a vector action; these take vectors and return vectors.
+
+.. code-block:: python
+
+    class SubtractVector(VectorAction):
+        """Calculate (A-B)"""
+
+        actionA = ConfigurableActionField(doc="Action which supplies vector A", dtype=VectorAction)
+        actionB = ConfigurableActionField(doc="Action which supplies vector B", dtype=VectorAction)
+
+        def getInputSchema(self) -> KeyedDataSchema:
+            yield from self.actionA.getInputSchema()  # type: ignore
+            yield from self.actionB.getInputSchema()  # type: ignore
+
+        def __call__(self, data: KeyedData, **kwargs) -> Vector:
+            vecA = self.actionA(data, **kwargs)  # type: ignore
+            vecB = self.actionB(data, **kwargs)  # type: ignore
+
+            return vecA - vecB
+
+Vector actions are similar to scalar actions but we will break this one down and look at the components.
+
+.. code-block:: python
+
+    actionA = ConfigurableActionField(doc="Action which supplies vector A", dtype=VectorAction)
+    actionB = ConfigurableActionField(doc="Action which supplies vector B", dtype=VectorAction)
+
+These lines are the config options; here they are the actions which give you the two vectors to subtract.
+These actions can be the LoadVector action, which just reads in a column without changing it in any way.
+
+.. code-block:: python
+
+    def getInputSchema(self) -> KeyedDataSchema:
+        yield from self.actionA.getInputSchema()  # type: ignore
+        yield from self.actionB.getInputSchema()  # type: ignore
+
+Here we get the column names from each of the actions being used; you can nest actions as deeply as you want.
+
+.. code-block:: python
+
+    def __call__(self, data: KeyedData, **kwargs) -> Vector:
+        vecA = self.actionA(data, **kwargs)  # type: ignore
+        vecB = self.actionB(data, **kwargs)  # type: ignore
+
+        return vecA - vecB
+
+This section does the work: it evaluates the two actions and then subtracts them, returning the result.
+
+These are two very simple examples of actions and how they can be used. They can be as complicated or as
+simple as you want and can be composed of multiple other actions, allowing common segments to be their own
+actions and then reused.
+
+------------------
+
+Adding a Plot Type
+==================
+Hopefully there will be very few instances where you will need to add a new plot type; if you do, please
+check open ticket branches to make sure that you are not duplicating someone else's work. Try to use already
+existing plot types so that we don't end up with lots of very similar plot types. Hopefully you won't really
+need to touch the plotting code and can just define new classes and actions.
+
+If you add a new plot then please make sure that you include enough provenance information on the plot. There
+should be enough information that anyone can recreate the plot and access the full dataset for further
+investigation. See the other plots for more information on how to do this. Also please add docstrings to the
+plot and then add documentation here for other users so that they can easily see what already exists.
+
+------------------
+
+Current Plot Types
+==================
+The current plot types that are available are detailed :doc:`here`. Most common plots are already coded up,
+so please try to reuse them before making your own. Before adding a new plot type please think about whether
+some of the existing ones can be adapted to your needs rather than making multiple plots that are basically
+identical.
+
+---------------
+
+Current Actions
+===============
+The current actions that are available are detailed :doc:`here`.
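+
+Because actions operate on plain keyed data, it can be handy to try one out interactively before wiring it
+into a plot or metric. The snippet below is only a sketch: it assumes ``MedianAction`` is importable from
+``lsst.analysis.tools.actions.scalar`` as laid out in the package layout section, and the column name and
+values are made up.
+
+.. code-block:: python
+
+    import numpy as np
+
+    from lsst.analysis.tools.actions.scalar import MedianAction
+
+    # Any mapping of column name to array can stand in for a table here.
+    data = {"g_psfFlux": np.array([1.0, 2.0, 3.0, 100.0])}
+
+    # Configure the action with the column to operate on, then call it on the data.
+    median = MedianAction(vectorKey="g_psfFlux")(data)
+    print(median)  # 2.5
+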
Most common requests are already coded up and +please try to reuse actions that already exist before making your own. Please also try to make actions as +reusable as possible so that other people can also use them. diff --git a/doc/lsst.analysis.tools/index.rst b/doc/lsst.analysis.tools/index.rst index 251fd74e2..b4abdd4a6 100644 --- a/doc/lsst.analysis.tools/index.rst +++ b/doc/lsst.analysis.tools/index.rst @@ -8,21 +8,36 @@ lsst.analysis.tools .. Paragraph that describes what this Python module does and links to related modules and frameworks. +``analysis_tools`` is the plotting and metric framework that is used to perform QA on the pipeline products. +It is a very powerful way to explore and interact with the pipeline outputs. + .. .. _lsst.analysis.tools-using: -.. Using lsst.analysis.tools -.. ========================= +Using lsst.analysis.tools +========================= +For a tutorial on working with +``analysis_tools`` please see the :ref:`getting started guide `. .. toctree linking to topics related to using the module's APIs. .. .. toctree:: -.. :maxdepth: 1 +.. :glob: +.. currentActions +.. currentPlots + +Need Help? +========== + +If you get stuck with ``analysis_tools`` then feel free to reach out to the ``#rubinobs-analysis-tools`` +channel on slack and hopefully someone will help you! + .. _lsst.analysis.tools-contributing: Contributing ============ + ``lsst.analysis.tools`` is developed at https://github.com/lsst/analysis_tools. You can find Jira issues for this module under the `analysis_tools `_ component. diff --git a/doc/manifest.yaml b/doc/manifest.yaml index 5c412cfd7..d43e1190c 100644 --- a/doc/manifest.yaml +++ b/doc/manifest.yaml @@ -8,5 +8,5 @@ modules: # Name of the static content directories (subdirectories of `_static`). # Static content directories are usually named after the package. # Most packages do not need a static content directory (leave commented out). -# statics: -# - "_static/analysis_tools" + statics: + - "_static/analysis_tools" diff --git a/python/lsst/analysis/tools/actions/plot/__init__.py b/python/lsst/analysis/tools/actions/plot/__init__.py index e69de29bb..d5dd310d9 100644 --- a/python/lsst/analysis/tools/actions/plot/__init__.py +++ b/python/lsst/analysis/tools/actions/plot/__init__.py @@ -0,0 +1,9 @@ +from lsst.analysis.tools.actions.plot.barPlots import * +from lsst.analysis.tools.actions.plot.colorColorFitPlot import * +from lsst.analysis.tools.actions.plot.diaSkyPlot import * +from lsst.analysis.tools.actions.plot.histPlot import * +from lsst.analysis.tools.actions.plot.multiVisitCoveragePlot import * +from lsst.analysis.tools.actions.plot.rhoStatisticsPlot import * +from lsst.analysis.tools.actions.plot.scatterplotWithTwoHists import * +from lsst.analysis.tools.actions.plot.skyPlot import * +from lsst.analysis.tools.actions.plot.xyPlot import * diff --git a/python/lsst/analysis/tools/actions/plot/barPlots.py b/python/lsst/analysis/tools/actions/plot/barPlots.py index aa93a70c6..16976f4e3 100644 --- a/python/lsst/analysis/tools/actions/plot/barPlots.py +++ b/python/lsst/analysis/tools/actions/plot/barPlots.py @@ -54,6 +54,10 @@ class BarPanel(Config): class BarPlot(PlotAction): + """A plotting tool which can take multiple keyed data inputs + and can create one or more bar graphs. 
+ """ + panels = ConfigDictField( doc="A configurable dict describing the panels to be plotted, and the bar graphs for each panel.", keytype=str, @@ -88,22 +92,22 @@ def makePlot( plotInfo : `dict` An optional dictionary of information about the data being plotted with keys: - `"run"` - Output run for the plots (`str`). - `"tractTableType"` - Table from which results are taken (`str`). - `"plotName"` - Output plot name (`str`) - `"SN"` - The global signal-to-noise data threshold (`float`) - `"skymap"` - The type of skymap used for the data (`str`). - `"tract"` - The tract that the data comes from (`int`). - `"bands"` - The bands used for this data (`str` or `list`). - `"visit"` - The visit that the data comes from (`int`) + `"run"` + Output run for the plots (`str`). + `"tractTableType"` + Table from which results are taken (`str`). + `"plotName"` + Output plot name (`str`) + `"SN"` + The global signal-to-noise data threshold (`float`) + `"skymap"` + The type of skymap used for the data (`str`). + `"tract"` + The tract that the data comes from (`int`). + `"bands"` + The bands used for this data (`str` or `list`). + `"visit"` + The visit that the data comes from (`int`) Returns ------- diff --git a/python/lsst/analysis/tools/actions/plot/colorColorFitPlot.py b/python/lsst/analysis/tools/actions/plot/colorColorFitPlot.py index 0ae6a3fac..bd3a1d57d 100644 --- a/python/lsst/analysis/tools/actions/plot/colorColorFitPlot.py +++ b/python/lsst/analysis/tools/actions/plot/colorColorFitPlot.py @@ -40,6 +40,13 @@ class ColorColorFitPlot(PlotAction): + """Makes a color-color plot and overplots a + prefited line to the specified area of the plot. + This is mostly used for the stellar locus plots + and also includes panels that illustrate the + goodness of the given fit. + """ + xAxisLabel = Field[str](doc="Label to use for the x axis", optional=False) yAxisLabel = Field[str](doc="Label to use for the y axis", optional=False) magLabel = Field[str](doc="Label to use for the magnitudes used to color code by", optional=False) @@ -108,42 +115,63 @@ def makePlot( Parameters ---------- - catPlot : `pandas.core.frame.DataFrame` + data : `KeyedData` The catalog to plot the points from. plotInfo : `dict` A dictionary of information about the data being plotted with keys: - ``"run"`` - The output run for the plots (`str`). - ``"skymap"`` - The type of skymap used for the data (`str`). - ``"filter"`` - The filter used for this data (`str`). - ``"tract"`` - The tract that the data comes from (`str`). + + * ``"run"`` + The output run for the plots (`str`). + * ``"skymap"`` + The type of skymap used for the data (`str`). + * ``"filter"`` + The filter used for this data (`str`). + * ``"tract"`` + The tract that the data comes from (`str`). fitParams : `dict` The parameters of the fit to the stellar locus calculated elsewhere, they are used to plot the fit line on the figure. - ``"bHW"`` - The hardwired intercept to fall back on. - ``"b_odr"`` - The intercept calculated by the orthogonal distance - regression fitting. - ``"mHW"`` - The hardwired gradient to fall back on. - ``"m_odr"`` - The gradient calculated by the orthogonal distance - regression fitting. - ``"magLim"`` - The magnitude limit used in the fitting. - ``"x1`"`` - The x minimum of the box used in the fit. - ``"x2"`` - The x maximum of the box used in the fit. - ``"y1"`` - The y minimum of the box used in the fit. - ``"y2"`` - The y maximum of the box used in the fit. + + * ``"bHW"`` + The hardwired intercept to fall back on. 
+ * ``"bODR"`` + The intercept calculated by the orthogonal distance + regression fitting. + * ``"bODR2"`` + The intercept calculated by the second iteration of + orthogonal distance regression fitting. + * ``"mHW"`` + The hardwired gradient to fall back on. + * ``"mODR"`` + The gradient calculated by the orthogonal distance + regression fitting. + * ``"mODR2"`` + The gradient calculated by the second iteration of + orthogonal distance regression fitting. + * ``"xMin`"`` + The x minimum of the box used in the fit. + * ``"xMax"`` + The x maximum of the box used in the fit. + * ``"yMin"`` + The y minimum of the box used in the fit. + * ``"yMax"`` + The y maximum of the box used in the fit. + * ``"mPerp"`` + The gradient of the line perpendicular to the line from + the second ODR fit. + * ``"bPerpMin"`` + The intercept of the perpendicular line that goes through xMin. + * ``"bPerpMax"`` + The intercept of the perpendicular line that goes through xMax. + * ``f"{self.plotName}_sigmaMAD"`` + The sigma mad of the distances to the line fit. + * ``f"{self.identity or ''}_median"`` + The median of the distances to the line fit. + * ``f"{self.identity or ''}_hardwired_sigmaMAD"`` + The sigma mad of the distances to the initial fit. + * ``f"{self.identity or ''}_hardwired_median"`` + The median of the distances to the initial fit. Returns ------- @@ -152,14 +180,38 @@ def makePlot( Notes ----- - Makes a color-color plot of `self.config.xColName` against - `self.config.yColName`, these points are color coded by i band - CModel magnitude. The stellar locus fits calculated from - the calcStellarLocus task are then overplotted. The axis labels - are given by `self.config.xLabel` and `self.config.yLabel`. - The selector given in `self.config.sourceSelectorActions` - is used for source selection. The distance of the points to + The axis labels are given by `self.config.xLabel` and + `self.config.yLabel`. The perpendicular distance of the points to the fit line is given in a histogram in the second panel. + + For the code to work it expects various quantities to be + present in the 'data' that it is given. + + The quantities that are expected to be present are: + + * Statistics that are shown on the plot or used by the plotting code: + approxMagDepth, + f"{self.plotName}_sigmaMAD", + f"{self.plotName}_sigmaMAD", + f"{self.plotName}_sigmaMAD", + f"{self.plotName}_sigmaMAD", + f"{self.plotName}_sigmaMAD" + + * Parameters from the fitting code that are illustrated on the plot: + xMin, xMax, yMin, yMax, mHW, bHW, mODR, bODR, + yBoxMin, yBoxMax, bPerpMin, bPerpMax, mODR2, bODR2, mPerp + + * The main inputs to plot: + x, y, mag + + Examples + -------- + An example of the plot produced from this code is here: + + .. image:: /_static/analysis_tools/stellarLocusExample.png + + For a detailed example of how to make a plot from the command line + please see the :ref:`getting started guide`. """ # Define a new colormap diff --git a/python/lsst/analysis/tools/actions/plot/histPlot.py b/python/lsst/analysis/tools/actions/plot/histPlot.py index a2180142b..6afd88a1c 100644 --- a/python/lsst/analysis/tools/actions/plot/histPlot.py +++ b/python/lsst/analysis/tools/actions/plot/histPlot.py @@ -68,8 +68,6 @@ class HistStatsPanel(Config): `~lsst.pex.config.DictField`'s in HistPanel for each parameter for clarity and consistency. 
- - Notes ----- This is intended to be used as a configuration of the HistPlot/HistPanel @@ -110,6 +108,10 @@ def validate(self): class HistPanel(Config): + """A Config class that holds parameters to configure a single panel of a + histogram plot. This class is intended to be used within the ``HistPlot`` + class. + """ label = Field[str]( doc="Panel x-axis label.", default="label", @@ -193,6 +195,10 @@ def validate(self): class HistPlot(PlotAction): + """Make an N-panel plot with a configurable number of histograms displayed + in each panel. Reference lines showing values of interest may also be added + to each histogram. Panels are configured using the ``HistPanel`` class. + """ panels = ConfigDictField( doc="A configurable dict describing the panels to be plotted, and the histograms for each panel.", keytype=str, @@ -248,6 +254,14 @@ def makePlot( fig : `matplotlib.figure.Figure` The resulting figure. + Examples + -------- + An example histogram plot may be seen below: + + .. image:: /_static/analysis_tools/histPlotExample.png + + For further details on how to generate a plot, please refere to the + :ref:`getting started guide`. """ # set up figure diff --git a/python/lsst/analysis/tools/actions/plot/multiVisitCoveragePlot.py b/python/lsst/analysis/tools/actions/plot/multiVisitCoveragePlot.py index 7c653704a..4b2772bc5 100644 --- a/python/lsst/analysis/tools/actions/plot/multiVisitCoveragePlot.py +++ b/python/lsst/analysis/tools/actions/plot/multiVisitCoveragePlot.py @@ -252,10 +252,10 @@ def makePlot( plotInfo : `dict` [`str`], optional A dictionary of information about the data being plotted with (at least) keys: - `"run"` - Output run for the plots (`str`). - `"tableName"` - Name of the table from which results are taken (`str`). + `"run"` + Output run for the plots (`str`). + `"tableName"` + Name of the table from which results are taken (`str`). camera : `lsst.afw.cameraGeom.Camera`, optional The camera object associated with the data. This is to enable the conversion of to focal plane coordinates (if needed, i.e. for the diff --git a/python/lsst/analysis/tools/actions/plot/scatterplotWithTwoHists.py b/python/lsst/analysis/tools/actions/plot/scatterplotWithTwoHists.py index 0185fc693..db61289ff 100644 --- a/python/lsst/analysis/tools/actions/plot/scatterplotWithTwoHists.py +++ b/python/lsst/analysis/tools/actions/plot/scatterplotWithTwoHists.py @@ -52,6 +52,10 @@ class ScatterPlotStatsAction(KeyedDataAction): + """Calculates the statistics needed for the + scatter plot with two hists. + """ + vectorKey = Field[str](doc="Vector on which to compute statistics") highSNSelector = ConfigurableActionField[SnSelector]( doc="Selector used to determine high SN Objects", default=SnSelector(threshold=2700) @@ -134,6 +138,10 @@ class _StatsContainer(NamedTuple): class ScatterPlotWithTwoHists(PlotAction): + """Makes a scatter plot of the data with a marginal + histogram for each axis. + """ + yLims = ListField[float]( doc="ylimits of the plot, if not specified determined from data", length=2, @@ -240,28 +248,32 @@ def makePlot( ) -> Figure: """Makes a generic plot with a 2D histogram and collapsed histograms of each axis. + Parameters ---------- - data : `pandas.core.frame.DataFrame` + data : `KeyedData` The catalog to plot the points from. plotInfo : `dict` A dictionary of information about the data being plotted with keys: - ``"run"`` - The output run for the plots (`str`). - ``"skymap"`` - The type of skymap used for the data (`str`). - ``"filter"`` - The filter used for this data (`str`). 
- ``"tract"`` - The tract that the data comes from (`str`). + + * ``"run"`` + The output run for the plots (`str`). + * ``"skymap"`` + The type of skymap used for the data (`str`). + * ``"filter"`` + The filter used for this data (`str`). + * ``"tract"`` + The tract that the data comes from (`str`). sumStats : `dict` A dictionary where the patchIds are the keys which store the R.A. and dec of the corners of the patch, along with a summary statistic for each patch. + Returns ------- fig : `matplotlib.figure.Figure` The resulting figure. + Notes ----- Uses the axisLabels config options `x` and `y` and the axisAction @@ -272,6 +284,28 @@ def makePlot( of the resultant plot. The code uses the selectorActions to decide which points to plot and the statisticSelector actions to determine which points to use for the printed statistics. + + If this function is being used within the pipetask framework + that takes care of making sure that data has all the required + elements but if you are runnign this as a standalone function + then you will need to provide the following things in the + input data. + + * If stars is in self.plotTypes: + xStars, yStars, starsHighSNMask, starsLowSNMask and + {band}_highSNStars_{name}, {band}_lowSNStars_{name} + where name is median, sigma_Mad, count and approxMag. + + * If it is for galaxies/unknowns then replace stars in the above + names with galaxies/unknowns. + + * If it is for any (which covers all the points) then it + becomes, x, y, and any instead of stars for the other + parameters given above. + + * In every case it is expected that data contains: + lowSnThreshold, highSnThreshold and patch + (if the summary plot is being plotted). """ if not self.plotTypes: noDataFig = Figure() diff --git a/python/lsst/analysis/tools/actions/plot/skyPlot.py b/python/lsst/analysis/tools/actions/plot/skyPlot.py index 050f594b3..f9ec2f029 100644 --- a/python/lsst/analysis/tools/actions/plot/skyPlot.py +++ b/python/lsst/analysis/tools/actions/plot/skyPlot.py @@ -40,6 +40,14 @@ class SkyPlot(PlotAction): + """Plots the on sky distribution of a parameter. + + Plots the values of the parameter given for the z axis + according to the positions given for x and y. Optimised + for use with RA and Dec. Also calculates some basic + statistics and includes those on the plot. + """ + xAxisLabel = Field[str](doc="Label to use for the x axis.", optional=False) yAxisLabel = Field[str](doc="Label to use for the y axis.", optional=False) zAxisLabel = Field[str](doc="Label to use for the z axis.", optional=False) @@ -134,21 +142,25 @@ def makePlot( sumStats: Optional[Mapping] = None, **kwargs, ) -> Figure: - """Prep the catalogue and then make a skyPlot of the given column. + """Make a skyPlot of the given data. Parameters ---------- - catPlot : `pandas.core.frame.DataFrame` + data : `KeyedData` The catalog to plot the points from. - dataId : - `lsst.daf.butler.core.dimensions._coordinate._ExpandedTupleDataCoordinate` - The dimensions that the plot is being made from. - runName : `str` - The name of the collection that the plot is written out to. - skymap : `lsst.skymap` - The skymap used to define the patch boundaries. - tableName : `str` - The type of table used to make the plot. + plotInfo : `dict` + A dictionary of information about the data being plotted with keys: + ``"run"`` + The output run for the plots (`str`). + ``"skymap"`` + The type of skymap used for the data (`str`). + ``"filter"`` + The filter used for this data (`str`). 
+ ``"tract"`` + The tract that the data comes from (`str`). + sumStats : `dict` + A dictionary where the patchIds are the keys which store the R.A. + and dec of the corners of the patch. Returns ------- @@ -158,55 +170,27 @@ def makePlot( Notes ----- - The catalogue is first narrowed down using the selectors specified in - `self.config.selectorActions`. - If the column names are 'Functor' then the functors specified in - `self.config.axisFunctors` are used to calculate the required values. - After this the following functions are run: - - `parsePlotInfo` which uses the dataId, runName and tableName to add - useful information to the plot. - - `generateSummaryStats` which parses the skymap to give the corners of - the patches for later plotting and calculates some basic statistics - in each patch for the column in self.config.axisActions['zAction']. + Expects the data to contain slightly different things + depending on the types specified in plotTypes. This + is handled automatically if you go through the pipetask + framework but if you call this method separately then you + need to make sure that data contains what the code is expecting. - `SkyPlot` which makes the plot of the sky distribution of - `self.config.axisActions['zAction']`. + If stars is in the plot types given then it is expected that + data contains: xStars, yStars, zStars and starStatMask. - Makes a generic plot showing the value at given points on the sky. + If galaxies is present: xGalaxies, yGalaxies, zGalaxies and + galaxyStatsMask. - Parameters - ---------- - catPlot : `pandas.core.frame.DataFrame` - The catalog to plot the points from. - plotInfo : `dict` - A dictionary of information about the data being plotted with keys: - ``"run"`` - The output run for the plots (`str`). - ``"skymap"`` - The type of skymap used for the data (`str`). - ``"filter"`` - The filter used for this data (`str`). - ``"tract"`` - The tract that the data comes from (`str`). - sumStats : `dict` - A dictionary where the patchIds are the keys which store the R.A. - and dec of the corners of the patch. + If unknown is present: xUnknowns, yUnknowns, zUnknowns and + unknownStatMask. - Returns - ------- - fig : `matplotlib.figure.Figure` - The resulting figure. + If any is specified: x, y, z, statMask. - Notes - ----- - Uses the config options `self.config.xColName` and - `self.config.yColName` to plot points color coded by - `self.config.axisActions['zAction']`. - The points plotted are those selected by the selectors specified in - `self.config.selectorActions`. + These options are not exclusive and multiple can be specified + and thus need to be present in data. """ + fig = plt.figure(dpi=300) ax = fig.add_subplot(111)