rsample (development version)

Bug fixes

  • vfold_cv() now uses the breaks argument correctly for repeated cross-validation (@ZWael, #471).

  • Grouped resampling functions now work with an explicit strata = NULL instead of strata being either a name or missing (#485).

Breaking changes

  • The class of grouped MC splits is now group_mc_split instead of grouped_mc_split, aligning it with the other grouped splits (#478).

  • The rsplit objects of an apparent() split now have the correct class inheritance structure. The order is now apparent_split and then rsplit rather than the other way around (#477).

Documentation improvements

rsample 1.2.1

CRAN release: 2024-03-25

rsample 1.2.0

CRAN release: 2023-08-23

rsample 1.1.1

CRAN release: 2022-12-07

rsample 1.1.0

CRAN release: 2022-08-08

  • rset objects now include all parameters used to create them as attributes (#329).

  • Objects returned by sliding functions now have an index attribute, where appropriate, containing the column name used as an index (#329).

  • Objects returned by permutations() now have a permutes attribute containing the column name used for permutation (#329).

  • Added breaks and pool as attributes to all functions which support stratification (#329).

  • Changed the “strata” attribute on rset objects so that it is now either a character vector identifying the column used to stratify the data, or not present (set to NULL) if stratification was not used (#329).

  • Added a new function, reshuffle_rset(), which takes an rset object and generates a new version of it using the same arguments but the current random seed (#79, #329).

  • Added arguments to control how group_vfold_cv() combines groups. Use balance = "groups" to assign (roughly) the same number of groups to each fold, or balance = "observations" to assign (roughly) the same number of observations to each fold.

  • Added a repeats argument to group_vfold_cv() (#330).

  • Added new functions for grouped resampling: group_mc_cv() (#313), group_initial_split() and group_validation_split() (#315), and group_bootstraps() (#316).

  • Added a new function, reverse_splits(), to swap analysis and assessment splits (#319, #284).

  • Improved the error thrown when calling assessment() on a perm_split object created by permutations() (#321, #322).
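
As a rough illustration of the balance argument of group_vfold_cv() and of reverse_splits(), the sketch below uses a small made-up data frame. The toy data and its id grouping column are assumptions for the example and do not come from these release notes. Fold composition depends on the random seed, so no specific output is shown.

```r
library(rsample)

# Hypothetical toy data: six groups of unequal size (illustration only)
set.seed(1)
toy <- data.frame(
  id = rep(letters[1:6], times = c(2, 2, 2, 6, 6, 6)),
  x  = rnorm(24)
)

# Assign (roughly) the same number of groups to each fold
folds_groups <- group_vfold_cv(toy, group = id, v = 3, balance = "groups")

# Assign (roughly) the same number of observations to each fold
folds_obs <- group_vfold_cv(toy, group = id, v = 3, balance = "observations")

# Swap the analysis and assessment sets of every split in the rset
reversed <- reverse_splits(folds_groups)
```

In both cases a group is kept whole, so its rows never appear in the analysis and assessment sets of the same split.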

rsample 1.0.0

CRAN release: 2022-06-24

  Note: Using an external vector in selections is ambiguous.
  i Use `all_of(strata)` instead of `strata` to silence this message.
  i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
  • Added better printing methods for initial split objects.

rsample 0.1.1

CRAN release: 2021-11-08

  • Updated documentation on stratified sampling (#245).

  • Changed make_splits() to an S3 generic, with the original functionality as a method for lists and a new method for data frames that allows users to create a split from existing analysis and assessment sets (@LiamBlake, #246).

  • Added validation_time_split() for a single validation sample taking the first samples for training (@mine-cetinkaya-rundel, #256).

  • Escalated the deprecation of the gather() method for rset objects to a hard deprecation. Use tidyr::pivot_longer() instead (#257).

  • Changed resample “fingerprint” to hash the indices only rather than the entire resample result (including the data object). This is much faster and will still ensure the same resample for the same original data object (#259).
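
The two make_splits() methods mentioned above can be sketched as follows. The use of mtcars and the row ranges are arbitrary choices for illustration, not taken from these release notes.

```r
library(rsample)

# Data frame method: build a split from existing analysis and assessment sets
analysis_df   <- mtcars[1:20, ]
assessment_df <- mtcars[21:32, ]
split_df <- make_splits(analysis_df, assessment = assessment_df)

# List method (the original behaviour): row indices plus the full data
split_list <- make_splits(
  list(analysis = 1:20, assessment = 21:32),
  data = mtcars
)

# Both objects can be queried with the usual accessors
nrow(analysis(split_df))
nrow(assessment(split_list))
```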

rsample 0.1.0

CRAN release: 2021-05-08

rsample 0.0.9

CRAN release: 2021-02-17

  • New rset_reconstruct(), a developer tool to ease creation of new rset subclasses (#210).

  • Added permutations(), a function for creating permutation resamples by performing column-wise shuffling (@mattwarkentin, #198).

  • Fixed an issue where empty assessment sets couldn’t be created by make_splits() (#188).

  • rset objects now contain a “fingerprint” attribute that can be used to check whether two objects use the same resamples.

  • The reg_intervals() function is a convenience function for computing bootstrap confidence intervals for lm(), glm(), survreg(), and coxph() models (#206).

  • A few internal functions were exported so that rsample-adjacent packages can use the same underlying code.

  • The obj_sum() method for rsplit objects was updated (#215).

  • Changed the inheritance structure for rsplit objects from specific to general and simplified the methods for the complement() generic (#216).
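
A minimal sketch of the column-wise shuffling done by permutations(), listed above. The choice of mtcars and of mpg as the permuted column is an assumption made for the example.

```r
library(rsample)

set.seed(42)

# Five permutation resamples in which only the mpg column is shuffled
perms <- permutations(mtcars, permute = mpg, times = 5)

# The analysis set holds the full data with the permuted column reordered
shuffled <- analysis(perms$splits[[1]])

# The permuted column keeps the same values; other columns are untouched
all(sort(shuffled$mpg) == sort(mtcars$mpg))
identical(shuffled$disp, mtcars$disp)
```

Permutation resamples have no assessment set, so only analysis() is used here.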

rsample 0.0.8

CRAN release: 2020-09-23

rsample 0.0.7

CRAN release: 2020-06-04

  • Lowered the threshold for pooling strata to 10% (from 15%) (#149).

  • The print() methods for rsplit and val_split objects were adjusted to show "<Analysis/Assess/Total>" and "<Training/Validation/Total>", respectively.

  • The drinks, attrition, and two_class_dat data sets were removed. They are in the modeldata package.

  • Compatibility with dplyr 1.0.0.

rsample 0.0.6

CRAN release: 2020-03-31

  • Added validation_set() for making a single resample.

  • Corrected the tidy() method for bootstraps (#115).

  • Changes for the upcoming tibble release.

  • Exported constructors for rset and split objects (#40).

  • initial_time_split() and rolling_origin() now have a lag parameter that ensures that previous data are available so that lagged variables can be calculated (#135, #136).
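
The lag parameter can be sketched with initial_time_split() on a made-up series. The toy data frame and its columns are hypothetical, and the sketch assumes that lag extends the test set backwards by that many rows so that lagged predictors can be computed for its first observations.

```r
library(rsample)

# Hypothetical ordered series of 20 observations (illustration only)
set.seed(123)
toy <- data.frame(t = 1:20, y = cumsum(rnorm(20)))

# The first 75% of rows go to training; with lag = 2 the test set is
# assumed to also include the last two training rows
split <- initial_time_split(toy, prop = 0.75, lag = 2)

range(training(split)$t)
range(testing(split)$t)
```

The overlapping rows are there only so that lagged variables can be calculated; they would typically be dropped before evaluating a model.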

rsample 0.0.5

CRAN release: 2019-07-12

rsample 0.0.4

CRAN release: 2019-01-07

Small maintenance release.

Minor improvements and fixes

  • fill() was removed per the deprecation warning.
  • Small changes were made for the new version of tibble.

rsample 0.0.3

CRAN release: 2018-11-20

New features

Minor improvements and fixes

  • fill() has been renamed populate() to avoid a conflict with tidyr::fill().

  • Changed the R version requirement to be R >= 3.1 instead of 3.3.3.

  • The recipes-related prepper() function was moved to the recipes package. This makes the rsample install footprint much smaller.

  • rsplit objects are shown differently inside of a tibble.

  • Moved from the broom package to the generics package.

rsample 0.0.2

CRAN release: 2017-11-12

  • initial_split(), training(), and testing() were added to do training/testing splits prior to resampling.
  • Another resampling method, group_vfold_cv(), was added.
  • caret2rsample() and rsample2caret() can convert rset objects to those used by caret::trainControl() and vice versa.
  • A function called form_pred() can be used to determine the original names of the predictors in a formula or terms object.
  • A vignette and a function (prepper()) were included to facilitate using the recipes package with rsample.
  • A gather() method was added for rset objects.
  • A labels() method was added for rsplit objects. This can help identify which resample is being used even when the whole rset object is not available.
  • A variety of dplyr methods were added (e.g., filter() and mutate()) that work without dropping classes or attributes of the rsample objects.

rsample 0.0.1 (2017-07-08)

CRAN release: 2017-07-08

Initial public version on CRAN