Wednesday, November 24, 2010

MSSQL: Sample Random Data

Recently I was faced by an interesting request. Someone wanted to get sample data from the database for Quality Assurance.

Problem:
Multiple users enter data into the database
A certain percentage must be sampled for QA's validation/verification, i.e. "Sample 10% daily for each user."

Assumptions:
We have the following database tables: Users, Documents, and DocumentsHistory.

Solution:
After some research I was able to find different approaches, like the use of TABLESAMPLE, and the use of NEWID(). I went for the latter, since TABLESAMPLE does not return a consistent number of records.