Let’s say I have 1M assets and I want to get a random sample of 1000 assets. How can I do this in fetch?

# How to fetch a random subset?

If you know the ids of those 1000 assets, you can store them in an array and use intersects to fetch them. Something like this:

`assetIds = ["123", "234", …]`

`c3Grid(Asset.fetch({filter: Filter.intersects('id', assetIds)}))`
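If you already hold the full list of ids on the client and just need to pick 1000 of them at random, a partial Fisher-Yates shuffle is one way to build `assetIds`. This is only a sketch: the helper name `sampleWithoutReplacement` and the `asset-N` ids are made up for illustration, and it assumes the id list fits in memory.

```javascript
// Pick k distinct items uniformly at random (partial Fisher-Yates shuffle).
// `random` is injectable so the sampling can be made repeatable in tests.
function sampleWithoutReplacement(items, k, random = Math.random) {
  const pool = items.slice(); // copy so the caller's array is not mutated
  const n = Math.min(k, pool.length);
  for (let i = 0; i < n; i++) {
    // Swap position i with a random position in [i, pool.length).
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Hypothetical ids, for illustration only.
const allIds = Array.from({ length: 5000 }, (_, i) => `asset-${i}`);
const assetIds = sampleWithoutReplacement(allIds, 1000);
console.log(assetIds.length);
// assetIds can then be passed to Filter.intersects('id', assetIds) as above.
```

Unlike the hash-based approach, this gives exactly 1000 ids, but it requires fetching every id first.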

**lpoirier**#3

You can use an MD5 hash function in your filter to get a random subset of roughly the size you want.

`Asset.fetch({filter:"md5HashKey(id) % 1000 == 0"})`

MD5 is not perfect but sufficiently random for most applications.

Notes

- The `1000` in the filter corresponds to the 1:1000 sampling ratio you want.
- You can change the remainder for a different subset.
- This method also has the advantage of being repeatable (the remainder acts as a seed)
- You are not guaranteed to get exactly 1000 assets, but the count will be very close.
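To put a number on "very close": if the hash spreads ids evenly (an assumption, not something the platform guarantees), the sample size behaves like a binomial count with probability 1/modulus per asset. The helper `expectedSampleSize` below is a made-up name for a back-of-the-envelope estimate.

```javascript
// Expected size and spread of a hash-bucket sample, assuming each id lands
// in a given bucket independently with probability 1 / modulus.
function expectedSampleSize(total, modulus) {
  const p = 1 / modulus;
  return {
    mean: total / modulus,
    stdDev: Math.sqrt(total * p * (1 - p)), // binomial standard deviation
  };
}

const { mean, stdDev } = expectedSampleSize(1000000, 1000);
console.log(`expected ${mean} assets, give or take about ${stdDev.toFixed(1)}`);
```

For 1M assets and a modulus of 1000, that works out to about 1000 assets with a standard deviation of roughly 32.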


**lpoirier**#4

Additionally, assuming you already have a `source` and you want to know which bucket it falls into, you can compute the same MD5 hash result in JavaScript as follows:

`parseInt(MD5.sumString(source.id).slice(0,7), 16)`
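To see the bucketing behave outside the platform, here is a minimal Node.js sketch of the same idea. It uses Node's built-in `crypto` module in place of `MD5.sumString`, and it assumes, based on the snippet above, that `md5HashKey(id)` equals the first 7 hex digits of the MD5 digest read as an integer. The function names and the `asset-N` ids are made up for illustration.

```javascript
const { createHash } = require("node:crypto");

// Integer read from the first 7 hex digits of the MD5 digest, mirroring
// parseInt(MD5.sumString(id).slice(0, 7), 16) from the post above.
function md5Prefix(id) {
  const hex = createHash("md5").update(String(id), "utf8").digest("hex");
  return parseInt(hex.slice(0, 7), 16);
}

// Bucket index in [0, modulus).
// Assumption: this matches the server-side md5HashKey(id) % modulus.
function bucketOf(id, modulus) {
  return md5Prefix(id) % modulus;
}

// Keep the ids that land in one bucket; the remainder plays the role of a seed.
function sampleByHash(ids, modulus, remainder = 0) {
  return ids.filter((id) => bucketOf(id, modulus) === remainder);
}

// Hypothetical ids, for illustration only.
const ids = Array.from({ length: 100000 }, (_, i) => `asset-${i}`);
const subset = sampleByHash(ids, 1000, 0);
console.log(`kept ${subset.length} of ${ids.length} ids`);
```

Running it twice keeps the same ids, and switching the remainder from 0 to 1 selects a different, non-overlapping subset.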