Q: How do I take a random sample of a large dataset in the SDA interface?

Answer: One option for creating a random sample from one of the large data sets is to use the 'compute' procedure in SDA:

The 'compute' procedure [in SDA] allows you to generate a variable containing random numbers. One possibility would be to create a variable like this:

newvar = duniform(0,99)

Then you could use this variable as a filter in an analysis.

For example, to use a random 10% of the cases, you would specify the selection filter as 'newvar(0-9)' or 'newvar(20-29)' or any set of 10 codes on the random variable.
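
The logic of this approach can be illustrated outside SDA with a short Python sketch. This is not SDA syntax: the function names assign_random_codes and select_cases are hypothetical, and Python's random.randint stands in for duniform(0,99), which is assumed here to return whole numbers from 0 through 99 with equal probability. Under that assumption, any 10 of the 100 codes select roughly 10% of the cases.

```python
import random


def assign_random_codes(n_cases, low=0, high=99, seed=None):
    """Give each case a random integer code in [low, high], inclusive.

    Hypothetical stand-in for creating newvar = duniform(0,99) in SDA.
    """
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n_cases)]


def select_cases(codes, first, last):
    """Return the indices of cases whose code lies in [first, last].

    Mimics a selection filter such as newvar(0-9).
    """
    return [i for i, code in enumerate(codes) if first <= code <= last]


# Simulate a large data set of 100,000 cases (the size is an assumption).
codes = assign_random_codes(100_000, seed=2004)

# Keep the cases coded 0-9, which should be about 10% of the file.
sample = select_cases(codes, 0, 9)
fraction = len(sample) / len(codes)

# A different block of 10 codes gives a second, non-overlapping sample.
other_sample = select_cases(codes, 20, 29)
```

The sample size is only approximately 10%, because each case is assigned its code independently. Because each case carries exactly one code, filters on disjoint code ranges, such as 0-9 and 20-29, never select the same case.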

Solution provided by Tom Piazza, University of California, Berkeley, Nov. 10, 2004


Html by Laine Ruus, Data Library Service, University of Toronto
Created: 2004/11/17