-
Notifications
You must be signed in to change notification settings - Fork 887
add viber module #8797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
add viber module #8797
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have quite a few comments on the R code, and some on the NF code. The bits I haven't commented on seem fine to me.
|
||
library(VIBER) | ||
library(dplyr) | ||
library(tidyverse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code calls tidyverse, but your conda library doesn't. If you actually need all of tidyverse, I think you need to add it to the conda environment yml.
# Ensure the option vectors are length 2 (key/ value) to catch empty ones | ||
args_vals = lapply(args_vals, function(z){ length(z) = 2; z}) | ||
|
||
parsed_args = structure(lapply(args_vals, function(x) x[2]), names = lapply(args_vals, function(x) x[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems an odd bit of code. Can you clarify what it does and why it is needed?
output: | ||
tuple val(meta), path("*_viber_best_st_fit.rds"), emit: viber_rds | ||
tuple val(meta), path("*_viber_best_st_heuristic_fit.rds"), emit: viber_heuristic_rds | ||
tuple val(meta), path("*_${plot1}"), emit: viber_plots_rds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a bit weird that there are two options for outptut, but neither is actually optional.
Could you just use the common pattern as the defintion, like plot2 would be something like "heuristic.rds"
Why do you call them plots when they are rds (not plots)?
Don't plot1 and plo2 overlap with viber_rds and viber_heuristic_rds if you have more than one sample? If these are obligatory and always outputted, just make the other version optional.
input_obj = readRDS("$rds_join") | ||
if (class(input_obj) == "m_cnaqc") { | ||
shared = input_obj %>% get_sample(sample=samples, which_obj="shared") | ||
joint_table = lapply(names(shared), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the mixture of lapply and dplyr grating, and would prefer using purrr:: and dplyr::, but that's a personal taste.
|
||
## Extract DP (depth) | ||
dp = reads_data %>% | ||
# dplyr::filter(mutation_id %in% non_tail) %>% ## this step should be managed before by other module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these commented lines are not necessary, please just remove them.
dp = reads_data %>% | ||
# dplyr::filter(mutation_id %in% non_tail) %>% ## this step should be managed before by other module | ||
dplyr::select(dplyr::starts_with("DP")) %>% | ||
dplyr::mutate(dplyr::across(.cols=dplyr::everything(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't line 75 imply there is only one DP column? Then why eveything? Also, since the same code repeats, why not just use tidyr::replace_na() for all the values you want to replace in one go?
#try(expr = {multivariate = plot(best_fit) %>% patchwork::wrap_plots()} ) | ||
#top_p = patchwork::wrap_plots(marginals, multivariate, design=ifelse(n_samples>2, "ABB", "AAB")) | ||
|
||
try(expr = {multivariate = plot(best_fit)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no catch, so is the point only to continue if this fails? Or is the error standard? Do you need encapsulated in try? If not, just have the inside code and let it fail.
|
||
try(expr = {multivariate = plot(best_fit)}) | ||
try(expr = {multivariate = ggpubr::ggarrange(plotlist = multivariate)}) | ||
top_p = ggpubr::ggarrange(plotlist = list(marginals, multivariate), widths=ifelse(n_samples>2, c(1,2), c(2,1))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try running a code fromatter, because these kind of lines are hard to read. RStudio has a good code formatter.
@@ -0,0 +1,175 @@ | |||
#!/usr/bin/env Rscript |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer that the script be called something more informtive like
viber_main_script.R
or viber_template.R
or run_viber.R
Calling it main_script.R is too generic to be really useful.
PR checklist
Closes #8712
versions.yml
file.label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda