add viber module #8797

Open: wants to merge 5 commits into base: master

Conversation

giorgiagandolfi

PR checklist

Closes #8712

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs?
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@giorgiagandolfi giorgiagandolfi added the new module Adding a new module label Jul 30, 2025
Contributor

@akaviaLab akaviaLab left a comment

I have quite a few comments on the R code, and some on the NF code. The bits I haven't commented on seem fine to me.


library(VIBER)
library(dplyr)
library(tidyverse)

Your code loads tidyverse, but your conda environment doesn't include it. If you actually need all of tidyverse, I think you need to add it to the conda environment yml.

# Ensure the option vectors are length 2 (key/ value) to catch empty ones
args_vals = lapply(args_vals, function(z){ length(z) = 2; z})

parsed_args = structure(lapply(args_vals, function(x) x[2]), names = lapply(args_vals, function(x) x[1]))

This line seems an odd bit of code. Can you clarify what it does and why it is needed?
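
For reference, a small base-R sketch of what the two quoted lines appear to do. The contents of `args_vals` here are made up (the step that splits the options is not shown in this thread), and `vapply` is used for the names instead of `lapply` purely for readability:

```r
# Hypothetical input: each option already split into c(key, value);
# an option given without a value has length 1.
args_vals <- list(c("K", "10"), c("samples", "S1,S2"), c("verbose"))

# Assigning to length() pads short vectors with NA, so every entry has 2 slots.
args_vals <- lapply(args_vals, function(z) {
  length(z) <- 2
  z
})

# Build a named list: names come from slot 1 (the key), values from slot 2.
parsed_args <- structure(
  lapply(args_vals, function(x) x[2]),
  names = vapply(args_vals, function(x) x[1], character(1))
)

str(parsed_args)
```

Under these assumptions, an option passed without a value ends up as `NA` rather than causing a subscript problem later, which may be the reason for the padding step.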

output:
tuple val(meta), path("*_viber_best_st_fit.rds"), emit: viber_rds
tuple val(meta), path("*_viber_best_st_heuristic_fit.rds"), emit: viber_heuristic_rds
tuple val(meta), path("*_${plot1}"), emit: viber_plots_rds

I find it a bit odd that there are two options for the output, but neither is actually optional.
Could you just use the common pattern as the definition? For example, plot2 would be something like "heuristic.rds".

Why do you call them plots when they are rds files (not plots)?

Don't plot1 and plot2 overlap with viber_rds and viber_heuristic_rds if you have more than one sample? If these are obligatory and always output, just make the other version optional.

input_obj = readRDS("$rds_join")
if (class(input_obj) == "m_cnaqc") {
shared = input_obj %>% get_sample(sample=samples, which_obj="shared")
joint_table = lapply(names(shared),

I find the mixture of lapply and dplyr grating, and would prefer using purrr:: and dplyr::, but that's a personal taste.


## Extract DP (depth)
dp = reads_data %>%
# dplyr::filter(mutation_id %in% non_tail) %>% ## this step should be managed before by other module

If these commented lines are not necessary, please just remove them.

dp = reads_data %>%
# dplyr::filter(mutation_id %in% non_tail) %>% ## this step should be managed before by other module
dplyr::select(dplyr::starts_with("DP")) %>%
dplyr::mutate(dplyr::across(.cols=dplyr::everything(),

Wouldn't line 75 imply there is only one DP column? Then why everything()? Also, since the same code repeats, why not just use tidyr::replace_na() for all the values you want to replace in one go?
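
One possible shape for that suggestion, as a sketch on a made-up table. The column names `DP_S1`, `DP_S2`, `NV_S1` and the fill value `0` are assumptions for illustration, not taken from the PR:

```r
library(dplyr)
library(tidyr)

# Toy stand-in for reads_data; the real columns are not shown in this thread.
reads_data <- data.frame(
  mutation_id = c("m1", "m2", "m3"),
  DP_S1 = c(30, NA, 12),
  DP_S2 = c(NA, 25, 40),
  NV_S1 = c(3, NA, 1)
)

# Replace NA in every DP*/NV* column in a single pass,
# instead of repeating the same mutate() block per prefix.
cleaned <- reads_data %>%
  mutate(across(c(starts_with("DP"), starts_with("NV")), ~ replace_na(.x, 0)))

print(cleaned)
```

Whether `0` is the right replacement for missing depth or variant counts depends on what VIBER expects, so treat the value as a placeholder.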

#try(expr = {multivariate = plot(best_fit) %>% patchwork::wrap_plots()} )
#top_p = patchwork::wrap_plots(marginals, multivariate, design=ifelse(n_samples>2, "ABB", "AAB"))

try(expr = {multivariate = plot(best_fit)})

There's no catch, so is the point only to continue if this fails? Or is the error expected? Does this need to be wrapped in try? If not, just keep the inner code and let it fail.
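
If the intent really is to carry on when plotting fails, one option is an explicit `tryCatch` that reports the failure and sets a known fallback. A base-R sketch, where `make_plot` is a hypothetical stand-in for `plot(best_fit)` that fails on purpose:

```r
# Hypothetical stand-in for plot(best_fit); always errors for the demo.
make_plot <- function(fit) stop("nothing to plot")

multivariate <- tryCatch(
  make_plot(NULL),
  error = function(e) {
    # Report why the plot was skipped instead of swallowing the error.
    message("Multivariate plot skipped: ", conditionMessage(e))
    NULL  # explicit fallback, so later code can test is.null()
  }
)

if (is.null(multivariate)) {
  message("Continuing without the multivariate panel")
}
```

With a bare `try()`, a failure leaves `multivariate` undefined (or holding a stale value), so the next line that uses it fails anyway with a less helpful message.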


try(expr = {multivariate = plot(best_fit)})
try(expr = {multivariate = ggpubr::ggarrange(plotlist = multivariate)})
top_p = ggpubr::ggarrange(plotlist = list(marginals, multivariate), widths=ifelse(n_samples>2, c(1,2), c(2,1)))

Please try running a code formatter, because these kinds of lines are hard to read. RStudio has a good code formatter.

@@ -0,0 +1,175 @@
#!/usr/bin/env Rscript

I would prefer that the script be called something more informative, like
viber_main_script.R
or viber_template.R
or run_viber.R

Calling it main_script.R is too generic to be really useful.

Labels
new module Adding a new module
Development

Successfully merging this pull request may close these issues.

new module: VIBER
2 participants