DList: Distributed List Operations

Frank Rehberger

provides
x-ozlib://rehberger/dp/DList.oz
requires

Purpose

This is a set of operations on distributed lists. Currently it supports balancing and distributed, parallel sorting of lists.

Balancing

This operation can be used for load balancing. Data is distributed equally betweeen all participating processes. After execution length of lists of all processes will differ at most one. This operation keeps ordering. Each participating process calls {DList.balance Xs Sync NPEs MyPE ?Ys } where
This operation returns the list Ys within each process. The number of elements differs at most one between all processes.

Sorting

This operation can be used for sorting of distributed lists and makes use of parallelism. This is an implementation of "On the Versatility of Parallel Sorting by Regular Sampling",Jonathon Schaeffer et al., Parallel Computing, vol. 19, pp. 1079-110, 1993. [ps.gz]

Abstract: The algorithm reduces memory and bus contention, which many parallel sorting algorithms suffer from, by using a regular sampling of data to ensure good pivot selection. For n data elements to be sorted and p processors, when n>=p*p*p, the algorithm is within a factor of 2 of achieving ideal load balancing. In practice, there is almost a perfect partitioning of work. On a variety of shared and distributed memory machines, the algorithm achieves better than half-linear speedups.

Each process participating has to apply this function with its local list as argument: { DList.sort Xs Sync NPEs MyPE Op ?Ys }
This operation returns the list Ys within each process. The number of sorted elements differs at most one between all processes. (If number of elements is low, all data will be transfered to process with MyPE==1 where centralized sorting is done. )

Installation

This package is pure OZ code. Get the package from Mogul, extract, change directory and call ozmake --install
Frank Rehberger