Description
Generally, we want to be able to pass a subset of our data as the calibration set to fit.workflow() and expect workflows to handle the step of generating predictions for those observations and then use those to fit the calibrator. This also came up in #289.
The documentation for calibration currently says that the dataset needs to already contain those predictions. That's rather inconvenient when you are just trying to fit your primary model!
Lines 23 to 26 in 835ee35
#' @param calibration A data frame of predictors and outcomes to use when
#' fitting the postprocessor. See the "Data Usage" section of [add_tailor()]
#' for more information.
#'
Poking around in fit.action_tailor(), there is a call to generate the predictions already, so we need to update the docs.
workflows/R/post-action-tailor.R
Lines 135 to 145 in 835ee35
fit(
  object = tailor,
  .data = augment(workflow_mock, data),
  outcome = names(extract_mold(workflow_mock)$outcomes),
  estimate = tidyselect::any_of(c(".pred", ".pred_class")),
  probabilities = c(
    tidyselect::contains(".pred_"),
    -tidyselect::matches("^\\.pred$|^\\.pred_class$")
  )
)
Reprex of it working:
library(tidymodels)
library(tailor)
set.seed(1)
dat <- sim_regression(200)
split <- initial_split(dat)
rf_spec <- rand_forest(mode = "regression", trees = 20)
tlr <- tailor() |> adjust_numeric_calibration(method = "linear")
rf_wflow <- workflow(outcome ~ ., rf_spec, tlr)
# set aside data for calibration
set.seed(2)
i_split <- inner_split(split, .get_split_args(split))
i_analysis_dat <- analysis(i_split)
cal_dat <- assessment(i_split)
manual_fit <- fit(rf_wflow, data = i_analysis_dat, calibration = cal_dat)
#> Registered S3 method overwritten by 'butcher':
#> method from
#> as.character.dev_topic generics
Created on 2025-07-16 with reprex v2.1.1
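For comparison, here is a rough sketch of the same steps done by hand, which is approximately what the augment() call in fit.action_tailor() saves the user from doing. This is only an illustration, not the workflows internals: the object names (model_wflow, model_fit, cal_preds, tlr_fit) are made up for the example, the calibration set is carved out with a plain initial_split() of the training data rather than inner_split(), and it assumes the same packages as the reprex above are installed.

```r
library(tidymodels)
library(tailor)

set.seed(1)
dat <- sim_regression(200)
split <- initial_split(dat)

# Set aside part of the training data for calibration
# (assumption: a simple second split stands in for inner_split())
set.seed(2)
cal_split <- initial_split(training(split))
analysis_dat <- training(cal_split)
cal_dat <- testing(cal_split)

# Step 1: fit the primary model on its own, without a postprocessor
rf_spec <- rand_forest(mode = "regression", trees = 20)
model_wflow <- workflow(outcome ~ ., rf_spec)
model_fit <- fit(model_wflow, data = analysis_dat)

# Step 2: generate predictions for the calibration observations;
# augment() adds a .pred column next to the original columns
cal_preds <- augment(model_fit, cal_dat)

# Step 3: fit the tailor on those predictions
tlr <- tailor() |> adjust_numeric_calibration(method = "linear")
tlr_fit <- fit(
  tlr,
  .data = cal_preds,
  outcome = c(outcome),
  estimate = c(.pred)
)
```

Note that cal_dat itself never contains a .pred column; the predictions only appear after the augment() step, which is the part the calibration argument takes care of.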