It’s time, yet again, for a simple and useful function I created that helped me at work. I was looking for a way to sample whole rows of a very large data frame with many columns so that I could build a regression model on a subset of my data.
While I was acquainted with the sample() function, I realized today that it couldn’t be used directly to sample whole rows of a data frame. It’s really just meant for vectors. So, I looked up how to get a sample of rows from a data frame and found an answer on the R help forums. The meat of the function below comes from that answer:
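As a minimal sketch of the idea, here is one way such a function could look. The name `sample_rows` and the argument names `df` and `n` are placeholders chosen for illustration, not necessarily what the forum answer used. The trick is to sample row indices with sample() and then use them to subset the data frame:

```r
# Hypothetical helper: return n randomly chosen rows of a data frame.
sample_rows <- function(df, n) {
  # sample(nrow(df), n) draws n distinct row indices (without replacement).
  # drop = FALSE keeps the result a data frame even if df has one column.
  df[sample(nrow(df), n), , drop = FALSE]
}

# Usage with the built-in mtcars data set
set.seed(42)  # only here to make the draw reproducible
subset_df <- sample_rows(mtcars, 10)
dim(subset_df)
```

Because the sampling is done without replacement, asking for more rows than the data frame has will raise an error from sample().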
So, just pass the data frame as the first argument of the function and the size of your sample as the second, and there you have it! You just cut a random sample of rows out of your data frame.