Optimizing Data Partitioning for Data-Parallel Computing
Performance of data-parallel computing (e.g., MapReduce, DryadLINQ) heavily depends on its data partitions. Solutions implemented by the current state-of-the-art systems are far from optimal. Techniques proposed by the database community to find optimal data partitions are not directly applicable when complex user-defined functions