The data sets that researchers can typically access today are limited to dozens, sometimes hundreds, and only rarely thousands of records. This significantly reduces the reliability of results. Once millions of diverse data sets are accessible through an easy-to-use platform, finding and narrowing down data can be as simple as entering the right search criteria (“give me a random set of 12,000 genomes of females, age 20 to 25, with an ethnicity split of 20% Caucasian, 40% Asian, 40% others”).
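To make the example query concrete, here is a minimal sketch of how such a request could be answered with stratified random sampling. Everything in it is an assumption made for illustration: the `GenomeRecord` fields, the `stratified_sample` and `make_synthetic_records` helpers, and the ethnicity labels are hypothetical, and the records are randomly generated placeholders, not real data or any real platform's API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeRecord:
    """Hypothetical metadata for one genome; field names are illustrative."""
    record_id: int
    sex: str
    age: int
    ethnicity: str


def stratified_sample(records, total, split, *, sex, min_age, max_age, seed=None):
    """Draw `total` records matching sex and age range, split by ethnicity.

    `split` maps an ethnicity label to an integer percentage. The special
    label "other" collects every ethnicity that is not named explicitly.
    """
    if sum(split.values()) != 100:
        raise ValueError("percentages must sum to 100")
    rng = random.Random(seed)
    # Step 1: keep only records that satisfy the hard filters.
    eligible = [r for r in records
                if r.sex == sex and min_age <= r.age <= max_age]
    named = set(split) - {"other"}
    sample = []
    # Step 2: draw each group's quota at random, without replacement.
    for label, percent in split.items():
        if label == "other":
            pool = [r for r in eligible if r.ethnicity not in named]
        else:
            pool = [r for r in eligible if r.ethnicity == label]
        quota = total * percent // 100  # floor; assumes quotas come out whole
        # random.sample raises ValueError if the pool is smaller than the quota.
        sample.extend(rng.sample(pool, quota))
    rng.shuffle(sample)  # avoid returning the groups in blocks
    return sample


def make_synthetic_records(n, seed=None):
    """Generate placeholder records so the sketch can run without real data."""
    rng = random.Random(seed)
    ethnicities = ["caucasian", "asian", "african", "hispanic"]
    return [
        GenomeRecord(i, rng.choice(["female", "male"]),
                     rng.randint(18, 40), rng.choice(ethnicities))
        for i in range(n)
    ]


if __name__ == "__main__":
    records = make_synthetic_records(50_000, seed=1)
    sample = stratified_sample(
        records, total=1_200,
        split={"caucasian": 20, "asian": 40, "other": 40},
        sex="female", min_age=20, max_age=25, seed=2,
    )
    print(len(sample), "records selected")
```

The sketch only works if every group's pool holds at least as many records as its quota, which is exactly why the size of the underlying collection matters: with only a few hundred records, a request like this could not be satisfied at all.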
The sheer size of such a data set makes it far better suited to take advantage of one of the most significant technological advances of the past decade: artificial intelligence (AI), in particular deep learning algorithms. AIs need large data sets, and they are good at what humans cannot do well: sifting through hundreds of millions of records, each billions of letters long, to find patterns and learn from them.