Sampling Rows from a Data Frame in R

It’s time, yet again, for a simple and useful function I wrote that helped me at work. I was looking for a way to sample whole rows of a very large data frame with many columns so that I could build a regression model on a subset of my data.

While I was acquainted with the sample() function, I realized today that it can’t be used directly to sample whole rows of a data frame. It’s really meant for vectors, so the trick is to sample row indices and then use them to subset the data frame. I looked up how to get a sample of rows from a data frame and found an answer on the R help forums. The meat of the function below comes from that answer.
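A minimal sketch of such a function, assuming the approach of sampling row indices with sample() and indexing the data frame with them. The name `sample_rows` and its argument names `df` and `n` are hypothetical choices for illustration, and the built-in `mtcars` data set stands in for the real data:

```r
# Hypothetical helper: return n randomly chosen rows of a data frame.
# The name and argument names are illustrative assumptions.
sample_rows <- function(df, n) {
  stopifnot(is.data.frame(df), n >= 0, n <= nrow(df))
  # sample() works on vectors, so draw row indices (without replacement) ...
  idx <- sample(nrow(df), size = n)
  # ... and use them to pull whole rows, keeping every column.
  df[idx, , drop = FALSE]
}

set.seed(42)                           # make the random draw reproducible
subset_df <- sample_rows(mtcars, 10)   # 10 random rows of a built-in data set
fit <- lm(mpg ~ wt, data = subset_df)  # e.g. fit a regression on the subset
```

The `drop = FALSE` is there so the result stays a data frame even if the input has only one column.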

So, just pass the data frame as the first argument of the function and the size of your sample as the second, and there you have it! You just cut a random sample out of your data frame.
