Commit 1a779f1

Fix error when loading fits for pandas dataframe (#85)
* Fixed error that occurred when reading a fits file and converting it to pandas using tables_io and python in version 3.10.
* Changed the opening of the file with the default type 'astropy table' in the get_product function
* update tutorial notebook: date of last verification

---------

Co-authored-by: Julia Gschwend <[email protected]>
1 parent 30c4576 commit 1a779f1

4 files changed: +119 -24 lines changed

docs/notebooks/intro_notebook.ipynb

Lines changed: 109 additions & 17 deletions
@@ -9,9 +9,9 @@
 "\n",
 "# Photo-z Server - Tutorial Notebook\n",
 "\n",
-"Contact author: [Julia Gschwend](mailto:[email protected])\n",
+"Contact author: Julia Gschwend ([[email protected]](mailto:[email protected]))\n",
 "\n",
-"Last verified run: 2024-Jun-04<br>"
+"Last verified run: 2024-Jun-25<br>"
 ]
 },
 {
@@ -29,6 +29,7 @@
 " - [How to upload a data product to the PZ Server](#how-to-upload-a-data-product-to-the-pz-server)\n",
 " - [How to download a data product from the PZ Server](#how-to-download-a-data-product-from-the-pz-server)\n",
 "- PZ Server API (Python library pzserver)\n",
+"- PZ Server API (Python library pzserver)\n",
 " - [How to get general info from PZ Server](#how-to-get-general-info-from-pz-server)\n",
 " - [How to display the metadata of a data product](#how-to-display-the-metadata-of-a-data-product)\n",
 " - [How to download data products as .zip files](#how-to-download-data-products-as-zip-files) \n",
@@ -63,28 +64,33 @@
 "The Photo-z (PZ) Server is an online service available for the LSST Community to host and share lightweight photo-z related data products. The upload and download of data and metadata can be done at the website [pz-server.linea.org.br](https://pz-server.linea.org.br/) (during the development phase, a test environment is available at [pz-server-dev.linea.org.br](https://pz-server-dev.linea.org.br/)). There, you will find two separate pages containing a list of data products each: one for LSST Data Management's oficial data products, and other for user-generated data products. **The registered data products can also be accessed directly from Python code using the PZ Server's data access API, as demonstrated below.**\n",
 "\n",
 "The PZ Server is developed and delivered as part of the in-kind contribution program BRA-LIN, from LIneA to the Rubin Observatory's LSST. The service is hosted in the Brazilian IDAC, not directly connected to the [Rubin Science Platform (RSP)](https://data.lsst.cloud/). However, it requires RSP credentials for user's authentication. For a comprehensive documentation about the PZ Server, please visit the [PZ Server's documentation page](https://linea-it.github.io/pz-lsst-inkind-doc/). There, you will find also an overview of all LIneA's contributions related to Photo-zs. The internal documentation of the API functions is available on the [API's documentation page](https://linea-it.github.io/pzserver/html/index.html). "
+"The PZ Server is developed and delivered as part of the in-kind contribution program BRA-LIN, from LIneA to the Rubin Observatory's LSST. The service is hosted in the Brazilian IDAC, not directly connected to the [Rubin Science Platform (RSP)](https://data.lsst.cloud/). However, it requires RSP credentials for user's authentication. For a comprehensive documentation about the PZ Server, please visit the [PZ Server's documentation page](https://linea-it.github.io/pz-lsst-inkind-doc/). There, you will find also an overview of all LIneA's contributions related to Photo-zs. The internal documentation of the API functions is available on the [API's documentation page](https://linea-it.github.io/pzserver/html/index.html). "
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
+"## How to upload a data product on the PZ Server website\n",
 "## How to upload a data product on the PZ Server website\n",
 "\n",
 "To upload a data product, click on the button **NEW PRODUCT** on the top left of the **User-generated Data Products** page and fill in the Upload Form with relevant metadata. Alternatively, the user can upload files to the PZ Server programatically via the `pzserver` Python Library (described below). \n",
+"To upload a data product, click on the button **NEW PRODUCT** on the top left of the **User-generated Data Products** page and fill in the Upload Form with relevant metadata. Alternatively, the user can upload files to the PZ Server programatically via the `pzserver` Python Library (described below). \n",
 "\n",
 "The photo-z-related products are organized into four categories (product types):\n",
 "\n",
 "- **Spec-z Catalog:** Catalog of spectroscopic redshifts and positions (usually equatorial coordinates).\n",
 "- **Training Set:** Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts.\n",
 "- **Photo-z Validation Results:** Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set, photo-z validation metrics, validation plots, etc.\n",
 "- **Photo-z Table:** Results of a photo-z estimation procedure. Ideally in the same format as the photo-z tables delivered by the DM as part of the LSST data releases. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (and instructions on accessing the data should be provided in the description field). "
+"- **Photo-z Table:** Results of a photo-z estimation procedure. Ideally in the same format as the photo-z tables delivered by the DM as part of the LSST data releases. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (and instructions on accessing the data should be provided in the description field). "
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
+"## How to download a data product from the PZ Server website\n",
 "## How to download a data product from the PZ Server website\n",
 "\n",
 "To download a data product available on the Photo-z Server, go to one of the two pages by clicking on the card \"LSST PZ Data Products\" (for official products released by LSST DM Team) or \"User-generated Data Products\" (for products uploaded by the members of LSST community. The download button is on the left side of each data product (each row of the list). "
@@ -95,6 +101,7 @@
 "metadata": {},
 "source": [
 "# The PZ Server API (Python library `pzserver`)"
+"# The PZ Server API (Python library `pzserver`)"
 ]
 },
 {
@@ -171,6 +178,25 @@
 "outputs": [],
 "source": [
 "# pz_server = PzServer(token=\"<your token>\", host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase "
+"# pz_server = PzServer(token=\"<your token>\", host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase "
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"For convenience, the token can be saved into a file named as `token.txt` (which is already listed in the .gitignore file in this repository). "
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"with open('token.txt', 'r') as file:\n",
+" token = file.read()\n",
+"pz_server = PzServer(token=token, host=\"pz-dev\") # \"pz-dev\" is the temporary host for test phase "
 ]
 },
 {
@@ -202,6 +228,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
+"The object `pz_server` just created above can provide access to data and metadata stored in the PZ Server. It also brings useful methods for users to navigate through the available contents. The methods with the preffix `get_` return the result of a query on the PZ Server database as a Python dictionary, and are most useful to be used programatically (see detaials on the [API documentation page](https://linea-it.github.io/pzserver/html/index.html)). Alternatively, those with the preffix `display_` show the results as a styled [_Pandas DataFrames_](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), optimized for Jupyter Notebook (note: column names might change in the display version). For instance:\n",
 "The object `pz_server` just created above can provide access to data and metadata stored in the PZ Server. It also brings useful methods for users to navigate through the available contents. The methods with the preffix `get_` return the result of a query on the PZ Server database as a Python dictionary, and are most useful to be used programatically (see detaials on the [API documentation page](https://linea-it.github.io/pzserver/html/index.html)). Alternatively, those with the preffix `display_` show the results as a styled [_Pandas DataFrames_](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), optimized for Jupyter Notebook (note: column names might change in the display version). For instance:\n",
 "\n",
 "Display the list of product types supported with a short description;"
@@ -345,6 +372,14 @@
 "## How to upload a data product to via Python API (alternative method) "
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"The default method to upload a data product to the PZ Server is the upload tool on PZ Server website, as shown above. Alternatively, data products can be sent to the host service using the `pzserver` Python library. \n",
+"## How to upload a data product to via Python API (alternative method) "
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -416,6 +451,75 @@
 "upload.product_id"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"First, prepare a dictionary with the relevant information about your data product: "
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"data_to_upload = {\n",
+" \"name\":\"upload example 1\",\n",
+" \"product_type\": \"specz_catalog\", \n",
+" \"release\": None, # LSST release, use None if not LSST data \n",
+" \"main_file\": \"example.csv\", # full path \n",
+" \"auxiliary_files\": [\"example.html\", \"example.ipynb\"] # full path\n",
+"}"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"upload = pz_server.upload(**data_to_upload) "
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": []
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"columns_dict = {\"ID\" : \"ID\", \n",
+" \"RA\" : \"RA\", \n",
+" \"Dec\": \"DEC\",\n",
+" \"z\" : \"Z\",\n",
+" \"z_err\" : \"ERR_Z\",\n",
+" \"z_flag\": \"FLAG_DES\" \n",
+" }"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"upload.make_columns_association(columns_dict) "
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"upload.product_id"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -427,6 +531,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
+"The metadata of a given data product is the information provided by the user on the upload form. This information is attached to the data product contents and is available for consulting on the PZ Server page or using this Python API (`pzserver`). \n",
 "The metadata of a given data product is the information provided by the user on the upload form. This information is attached to the data product contents and is available for consulting on the PZ Server page or using this Python API (`pzserver`). \n",
 "\n",
 "All data products stored on PZ Server are identified by a unique **id** number or an unique name, a _string_ called **internal_name**, which is created automatically at the moment of the upload by concatenating the product **id** to the name given by its owner (replacing blank spaces by \"_\", lowering cases, and removing special characters). "
@@ -562,7 +667,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"catalog.data"
+"catalog.data\n"
 ]
 },
 {
@@ -981,20 +1086,6 @@
 "\n",
 "Is something important missing? [Click here to open an issue in the PZ Server library repository on GitHub](https://github.com/linea-it/pzserver/issues/new). "
 ]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": []
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": []
 }
 ],
 "metadata": {
@@ -1014,6 +1105,7 @@
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
 "version": "3.10.10"
+"version": "3.10.10"
 },
 "nbsphinx": {
 "execute": "never"
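
Note on the token-loading cell added in this notebook: it passes the result of `file.read()` straight to `PzServer`, so a trailing newline in `token.txt` (which many editors add) would become part of the token. Whether the server tolerates that is not stated in this commit. A minimal, more defensive sketch using only the standard library follows; the helper name `read_token` is hypothetical and not part of `pzserver`, and the temporary file only stands in for a real `token.txt`.

```python
import tempfile
from pathlib import Path


def read_token(path="token.txt"):
    """Return the token stored in `path`, without surrounding whitespace."""
    # strip() drops the trailing newline most editors append to the file.
    return Path(path).read_text(encoding="utf-8").strip()


# Demonstration with a throwaway file standing in for token.txt.
with tempfile.TemporaryDirectory() as tmp_dir:
    token_file = Path(tmp_dir) / "token.txt"
    token_file.write_text("my-secret-token\n", encoding="utf-8")
    token = read_token(token_file)

print(repr(token))
```

The returned string could then be passed as `PzServer(token=token, host="pz-dev")`, as in the notebook cell.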

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ pandas>=1.2.0
 requests>=2.23.0
 astropy>=5.0.0
 matplotlib>=3.6.0
-tables_io>=0.7.9
+tables_io>=0.9.6
 Jinja2>=3.1.2
 ipython>=8.5.0
 h5py>=3.8.0

src/pzserver/catalog.py

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,6 @@
 """
 
 import matplotlib.pyplot as plt
-import pandas as pd
 from IPython.display import display
 
 
@@ -16,7 +15,8 @@ def __init__(self, data=None, metadata=None, metadata_df=None):
         """
         Catalog class constructor
         """
-        self.data = pd.DataFrame(data)
+
+        self.data = data
         self.metadata = metadata
         self.columns = metadata.get("main_file").get("columns_association")
         self.metadata_df = metadata_df

src/pzserver/core.py

Lines changed: 7 additions & 4 deletions
@@ -377,13 +377,16 @@ def get_product(self, product_id=None, fmt=None):
                 return dataframe
             results = self.__transform_df(dataframe, metadata)
         else:
+            dataframe = tables_io.read(file_path, tables_io.types.AP_TABLE)
+
             if fmt == "astropy":
-                return tables_io.read(file_path, tType=tables_io.types.AP_TABLE)
-            dataframe = tables_io.read(
-                file_path, tType=tables_io.types.PD_DATAFRAME
-            )
+                return dataframe
+
+            dataframe = dataframe.to_pandas()
+
             if fmt == "pandas":
                 return dataframe
+
             results = self.__transform_df(dataframe, metadata)
 
         print("Done!")
