how to simulate dedup dataset?

Top