Add Galaxy wrapper for LEMUR: Latent Embedding Multivariate Regression #7173
Cool 🎉
Super nice!!
Thanks @dianichj
Please add a |
Co-authored-by: Amirhossein Nilchi <[email protected]>
…column param, updated R script, XML and test data (Zenodo links used, metadata.tsv local), added .shed.yml
tools/lemur/lemur.R
Outdated
@@ -9,10 +9,19 @@ suppressPackageStartupMessages({
library(ggplot2)
})

#----- Function to save plots in different formats ----
save_plot <- function(filename, plot, format = "pdf", width = 6, height = 5, dpi = 300) { |
Please make the dpi, width, and height configurable.
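One way to expose these settings, as a sketch under assumptions: the parameter names `plot_width`, `plot_height`, and `plot_dpi` are hypothetical, and lemur.R would need matching options that feed into `save_plot`. The defaults below mirror the current `save_plot` signature.

```xml
<!-- Sketch only: parameter names are hypothetical, not taken from the PR. -->
<inputs>
    <!-- existing params omitted -->
    <!-- Defaults copied from save_plot(): width = 6, height = 5, dpi = 300 -->
    <param name="plot_width" type="float" value="6" min="1" label="Plot width (inches)"/>
    <param name="plot_height" type="float" value="5" min="1" label="Plot height (inches)"/>
    <param name="plot_dpi" type="integer" value="300" min="72" label="Plot resolution (dpi)"/>
</inputs>
```

In the command block these could then be passed to the script, for example as `--width '$plot_width'`, assuming the R script defines such a flag.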
<param name="meta_table" type="data" format="tabular" label="Sample metadata table (TSV)" help="TSV file with one row per sample/cell. Must contain columns for condition and optionally for batch." />
<param name="condition_column" type="data_column" data_ref="meta_table" label="Condition column" help="Select the condition column (e.g., treatment vs control). Only appears after loading metadata." />
<param name="batch_column" type="data_column" data_ref="meta_table" optional="true" label="Batch column (optional)" help="Optional batch variable (e.g., patient ID). Only appears after loading metadata." />
<param name="contrast_condition" type="text" value="panobinostat" label="Condition for contrast (e.g. treatment)" /> |
I'm not sure if it is OK here to have a default value.
Maybe at least keep it as "treatment" please.
And please add validators for both text params.
ping for the validator
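For reference, a minimal sketch of what validators on a text param could look like. The allowed character set in the regex is an assumption and should be adjusted to the condition labels that actually occur in the metadata; the default follows the "treatment" suggestion above.

```xml
<!-- Sketch only: the regex and messages are illustrative assumptions. -->
<param name="contrast_condition" type="text" value="treatment" label="Condition for contrast (e.g. treatment)">
    <!-- Reject an empty value -->
    <validator type="empty_field" message="Please provide a condition name."/>
    <!-- Restrict the value to characters that are safe on the command line -->
    <validator type="regex" message="Only letters, digits, '_', '.', and '-' are allowed.">^[A-Za-z0-9_.-]+$</validator>
</param>
```

The same two validators could be added to `reference_condition`.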
@@ -9,10 +9,19 @@ suppressPackageStartupMessages({
library(ggplot2)
})

#----- Function to save plots in different formats ----
save_plot <- function(filename, plot, format = "pdf", width = 6, height = 5, dpi = 300) {
ext <- tolower(format) |
Your formats are already lowercase, no?
tools/lemur/lemur.xml
Outdated
<data name="gene_hist_pdf" format="pdf" from_work_dir="gene_hist.pdf" label="Gene histogram plot" />
<data name="chr_scatter_pdf" format="pdf" from_work_dir="chr_scatter.pdf" label="Chromosome scatter plot" />
<data name="tumor_umap_pdf" format="pdf" from_work_dir="tumor_umap.pdf" label="Tumor UMAP plot" />
<data name="gene_umap" format="pdf" from_work_dir="gene_umap.pdf" label="Gene UMAP plot"> |
Will this work? You are hardcoding it to look for gene_umap.pdf file.
Please add a test using other formats as output.
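A test for a non-PDF format might look roughly like the sketch below. It assumes the `plot_format` select and the `gene_umap` output shown in the diff; the size threshold is a placeholder, not a measured value, and the remaining inputs are omitted.

```xml
<!-- Sketch only: inputs other than plot_format are omitted, threshold is a placeholder. -->
<test>
    <!-- other inputs as in the existing test -->
    <param name="plot_format" value="png"/>
    <output name="gene_umap" ftype="png">
        <assert_contents>
            <!-- Only checks that a non-trivial file was collected -->
            <has_size min="1000"/>
        </assert_contents>
    </output>
</test>
```

If the script writes `gene_umap.png` while the output still declares `from_work_dir="gene_umap.pdf"`, this test would be expected to fail, which is the point of adding it. One possible fix is to have the script write to a fixed file name regardless of format.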
tools/lemur/lemur.xml
Outdated
<when input="plot_format" value="png" format="png"/>
<when input="plot_format" value="jpg" format="jpg"/>
</change_format>
</data>
<data name="de_results" format="tabular" from_work_dir="de_results.tsv" label="LEMUR DE results" /> |
Please use this label for all outputs:
${tool.name} on ${on_string}:
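Applied to two of the outputs from the diff, this could look like the sketch below. The text after the colon is an assumption that reuses the existing labels.

```xml
<!-- Sketch only: the suffix after the colon reuses the current labels. -->
<outputs>
    <data name="de_results" format="tabular" from_work_dir="de_results.tsv" label="${tool.name} on ${on_string}: DE results"/>
    <data name="gene_hist_pdf" format="pdf" from_work_dir="gene_hist.pdf" label="${tool.name} on ${on_string}: Gene histogram plot"/>
</outputs>
```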
tools/lemur/lemur.xml
Outdated
<assert_contents>
<has_line_matching expression="^name\tn_cells\t.*did_lfc$"/>
<has_text text="ENSG00000210082"/>
</assert_contents> |
Please also add has_n_lines so we know the file has the correct number of genes.
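A sketch of the extended assertion block. The value of `n` is a placeholder and must be replaced with the real line count of the expected `de_results.tsv` (header plus one line per gene).

```xml
<assert_contents>
    <has_line_matching expression="^name\tn_cells\t.*did_lfc$"/>
    <has_text text="ENSG00000210082"/>
    <!-- Placeholder count: replace 101 with the actual number of lines in the expected output -->
    <has_n_lines n="101"/>
</assert_contents>
```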
LEMUR uses the `SingleCellExperiment` (SCE) format because it is the standard Bioconductor structure for single-cell data in R. Unlike Seurat or AnnData, SCE is lightweight, interoperable, and not tied to a specific framework. It cleanly separates assays (e.g., expression values), cell metadata (`colData`), feature metadata (`rowData`), and reduced dimensions (`reducedDims`).

For LEMUR to work correctly, the SCE object **must include**:
- **`logcounts()`**: matrix of log-normalized gene expression. |
It also needs counts
Categories [['Statistical Analysis']] unknown.
Co-authored-by: Amirhossein Nilchi <[email protected]>
Ping @nilchia :)!
Co-authored-by: Rand Zoabi <[email protected]>
@@ -0,0 +1,271 @@
<tool id="lemur" name="LEMUR" version="1.0.1+galaxy0" profile="25.0" license="MIT"> |
I think the version should be 1.4.0+galaxy0, to indicate the version of the main dependency.
For that you can use a macro.
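A common pattern for this is a pair of tokens, sketched below. The token names follow the usual convention but are a choice, not a requirement, and `1.4.0` is taken from the comment above rather than verified against the package.

```xml
<!-- Sketch only: 1.4.0 comes from the review comment and is not verified here. -->
<tool id="lemur" name="LEMUR" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="25.0" license="MIT">
    <macros>
        <token name="@TOOL_VERSION@">1.4.0</token>
        <token name="@VERSION_SUFFIX@">0</token>
    </macros>
    <requirements>
        <!-- Pin the main dependency to the same version as the tool -->
        <requirement type="package" version="@TOOL_VERSION@">bioconductor-lemur</requirement>
        <!-- remaining requirements unchanged -->
    </requirements>
    <!-- rest of the tool unchanged -->
</tool>
```

With this layout, a wrapper-only change bumps `@VERSION_SUFFIX@`, and a dependency update changes `@TOOL_VERSION@` and resets the suffix to 0.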
<param name="batch_column" type="data_column" data_ref="meta_table" optional="true" label="Batch column (optional)" help="Optional batch variable (e.g., patient ID). Only appears after loading metadata." />

<param name="contrast_condition" type="text" value="panobinostat" label="Condition for contrast (e.g. treatment)" />
<param name="reference_condition" type="text" value="ctrl" label="Reference condition (e.g. control)" /> |
a validator would be nice here as well
<param name="tumor_annotation_column" type="text" value="chromosome"
label="Tumor annotation column in rowData"
help="Used for tumor classification. Default is 'chromosome'. Override if your SCE uses a different annotation column." />
</inputs> |
indentation is a bit off here, please use the galaxy-language-server to reformat this tool
help="Used for tumor classification. Default is 'chromosome'. Override if your SCE uses a different annotation column." />
</inputs>

<outputs> |
these are a lot of outputs, are all of them always needed? Should the user be able to select only a few of them?
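One way to make outputs selectable is a multi-select param plus a `<filter>` on each optional output. This is a sketch: the param name `plots`, the option values, and the output name `umap_plot` are hypothetical, while the file name and label text follow those already in the diff.

```xml
<!-- Sketch only: names and option values are hypothetical. -->
<tool id="lemur_outputs_sketch" name="LEMUR outputs sketch" version="0.1.0">
    <inputs>
        <param name="plots" type="select" multiple="true" optional="true" label="Plots to generate">
            <option value="umap" selected="true">UMAP</option>
            <option value="volcano" selected="true">Volcano</option>
        </param>
    </inputs>
    <outputs>
        <!-- Always produced -->
        <data name="de_results" format="tabular" from_work_dir="de_results.tsv" label="${tool.name} on ${on_string}: DE results"/>
        <!-- Only created when the user selected the UMAP plot -->
        <data name="umap_plot" format="pdf" from_work_dir="umap.pdf" label="${tool.name} on ${on_string}: UMAP plot">
            <filter>plots and 'umap' in plots</filter>
        </data>
    </outputs>
</tool>
```

Tests for a tool with filtered outputs would then also need `expect_num_outputs` set on each `<test>`.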
--output_umap 'umap.$plot_format'
--output_volcano 'volcano.$plot_format'
--output_de 'de_results.tsv'
#if str($sel_gene): |
"#end if" is missing for some reason.
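For illustration, a closed conditional in the Cheetah command block could look like this. The `--sel_gene` flag is a hypothetical stand-in for whatever option lemur.R defines, and the rest of the command is abbreviated to the lines shown in the diff.

```xml
<!-- Sketch only: the flag inside the conditional is hypothetical. -->
<command detect_errors="exit_code"><![CDATA[
Rscript '$__tool_directory__/lemur.R'
    --output_umap 'umap.$plot_format'
    --output_volcano 'volcano.$plot_format'
    --output_de 'de_results.tsv'
    #if str($sel_gene):
        ## Hypothetical flag; use the option name lemur.R actually expects
        --sel_gene '$sel_gene'
    #end if
]]></command>
```

Every `#if` needs a matching `#end if`; without it Cheetah fails to compile the template and the job does not start.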
FOR CONTRIBUTOR:
This PR adds a new Galaxy tool for the LEMUR R package, designed for fitting a latent embedding multivariate regression model to multi-condition single-cell data.
🧬 Tool purpose
LEMUR provides a parametric framework to:
It is especially suited for complex experimental designs such as treatment vs. control, time-course, or disease progression studies.
📂 Tool contents
lemur.xml: Galaxy wrapper with required parameters and outputs
lemur.R: R script that runs the LEMUR pipeline (fit, align, test)
test-data/: Includes example RDS input and expected PDF/TSV outputs:
🧪 Test data
The test data included are derived from a publicly available glioblastoma single-cell dataset. They demonstrate:
🛠️ Requirements
Conda packages required:
bioconductor-lemur
bioconductor-singlecellexperiment
r-optparse
r-tidyverse
r-uwot
These are defined in the wrapper's <requirements> section.
📌 Notes
Please help me improve this tool wrapper; it still needs work. Thanks so much for reviewing and for your help!
cc: @nilchia 🚀🖖