This page collects some architecture details and software design notes from the time while RNGTest was converted from OpenMPI to Charm++ parallelism.
Currently Intel's MKL's vector statistical library (VSL) RNGs, RNGSSE's RNGs, and a couple of Random123 RNGs are interfaced. References:
- MKL: https:/
/ software.intel.com/ en-us/ intel-mkl,
- RNGSSE: https:/
/ doi.org/ 10.1016/ j.cpc.2011.03.022 and https:/ / doi.org/ 10.1016/ j.cpc.2013.04.007,
- Random123: http:/
/ www.thesalmons.org/ john/ random123/ releases/ latest/ docs/ index.html
All of these libraries implement a variety of different pseudo random number generators and importantly, all support concurrent sampling from random number streams: MKL supports both block splitting and leap frogging, RNGSSE supports block splitting (jumping ahead and initialization of a large number of independent streams), and Random123 RNGs are counter-based so counters can be initialized and incremented differently on each rank/thread.
Currently TestU01's SmallCrush, Crush, and BigCrush batteries are supported and the design allows for other libraries of statistical tests. Reference for TestU01: http:/
The user selects a number of RNGs (with optionally specifying parameters for each, e.g., seed, sequence length, etc.) as well as a battery of statistical tests. The tests are then run concurrently. RNGs from all RNG libraries can be tested (and thus compared to each other) at the same time.
As the tests of the battery are run, the various RNGs are timed separately, providing a 'generator cost' ranking. The number of passed and failed tests provide a 'generator quality' ranking (if more than one RNGs are tested).
An arbitrary number of RNGs can be subjected to a single battery at a time. The tests are run concurrently using Charm++ on a single machine or across a set of machines distributed over the network. The software design is fully asynchronous yielding 100% CPU utilization at all times, independent of the time taken by the individual tests.
Concurrency via Charm++ requires serializable objects. Charm++ discourages pointers, function pointers, references, while encourages stateless objects (or objects with as little and simple state as possible).
While Charm++ supports migratable objects (chares) that inherit from a pure base class to implement runtime polymorphism via reference semantics, the current design of RNGTest does not rely on Charm++ to do this. Instead we use Sean Parent's concept-based polymorphism which achieves polymorphism without client-side inheritance and enables value semantics. In concept-based polymorphism inheritance in confined to the internals of a "base" class, which, at instantiation, is initialized via a templated constructor by a "derived"-class object. This locality enables an easier reasoning about the code, eliminates the need for client-side heap-allocation, client-side indirection, and in general, leads to tighter and more readable, simpler client-code.
Analogous to the relationship between classes rngtest::
Since the individual statistical tests of TestU01 are very similar in structure, but have some differences, e.g., the C library function used to create the results struct, the number and type of parameters, etc., which require non-trivial initialization, i.e., function pointers and variable number of test parameters, and rngtest::
As discussed above, there are three different dimensions of runtime polymorphism in RNGTest: (1) the RNGs, (2) the statistical tests, and (3) the test suites, all exercising concept-based polymorphism.
In those cases where the instantiation of the object (i.e., which object to instantiate) depends on user input (RNGs and test suites), polymorphism is facilitated by factories. Factories are std::
The factories are implemented using maps via either reference or value semantics. Since concept-based polymorphism enables value semantics, value semantics is used in factories whenever possible. For more information on Boost.factory, see http:/
In the case when all registered objects have to be instantiated, there is no need for a factory, and the base class pointers are initialized by instantiating derived objects held in a std::
Before porting to Charm++ RNGTest used OpenMP for concurrency. The last commit before porting to Charm++ is bfcab8f5 and the commit where the Charm++ port is fully functional is c504a51b. Note that during Charm++ port of rngtest target quinoa does not fully build.
The discussion below details only some of the details encountered during the Charm++ port of RNGTest. More documentation can be found in the source code itself as well as the individual (sometimes rather lenghty) commit messages between the two commits mentioned above.
The above features, facilitating polymorphism, code reuse, e.g., factories, etc., are certainly desirable to keep in a Charm++ port of rngtest. However, the Charm++ implementation imposes additional challenges, especially in the view of the constrains on the global-scope wrappers (used as library-external calls). Some of the following challenges have been identified before porting to Charm++.
- Global-scope trickery, especially with the polymorphic base class RNG pointers, is a challenge as any global-scope object in Charm++ must be initialized in the main chare and must be serializable so that the Charm++ runtime system can migrate them across the network. This means that the vector of base RNG pointers should be re-creatable after migration. This may also mean that its factory must also be in global scope and migratable. How to migrate std::
function is a question. As it turns out Sean Parent's concept-based polymorphism (discussed above) significantly simplifies runtime polymorphism using Charm++: since polymorphism is kept internal to a non-chare class, e.g., rngtest:: StatTest, and the client-code can use value-semantics, the "derived" class rngtest:: TestU01 chare does not need Charm++'s machinery facilitating runtime polymorphism using the PUP::able framework. Though in the final design Charm++'s runtime polymorphism is not used, a working example is in commit 56bdcf73.
- Conveniently holding references to Base is hardly an option with Charm++ objects that should be migratable. Making Base migratable is not a viable option and seems wasteful when only a small part of the data is needed by a class. Passing and storing only what's needed certainly seems like a more walkable route.
- All migratable objects must have as little state as possible. This is for reasons of code complexity (less PUP routines to write and maintain), as well as computational cost (less data to migrate).
- Converting each statistical test in a suite seems like the most straightforward to start with when porting to Charm++. However, if class rngtest::
TestU01 holds nontrivial state, i.e., custom classes, they all require custom PUP routines and must also be polymorphic with base rngtest:: StatTest. Furthermore, once a test was finished there should be a way to pass the control flow back to the invoking suite to evaluate the just-finished test. Does this require the suite be a chare object as well? (That is also polymorphic and holds some nontrivially migratable state.) How much does Charm++ help already with these issues? E.g., PUP routines for STL containers exist, but some are not optimal and don't use the latest C++ standard. As described below, most of these issues are eliminated by a careful design of classes with as little state as possible.
The Charm++ design of RNGTest relies on three interacting Charm++ modules:
- rngtest – Beside global-scope data, this file describes the mainchare Main and a small helper chare, execute, both defined in Main/
- testu01suite – This file describes the chare rngtest::
- testu01 – This file describes the chare rngtest::
TestU01, which is templated on the test properties, and thus lists all possible template specializations, corresponding to the various statistical tests available in the TestU01 library.
Background. In RNGTest both the RNGs and the statistical test suites are provided by third-party libraries. In other words, rngtest is only a glue-code that ties together the RNGs and the statistical tests. Note that neither third-party libraries (neither the test suites nor the RNGs) are aware of Charm++'s fully asynchronous nature. This constrains how the RNGs must be called by the batteries as the RNGs must be passed, as external generators, to the statistical test suites. Since the currently only supported set of batteries (TestU01) are a C library, this is done through global-scope wrappers.
Global-scope wrappers. The RNGs, wrapped through the base tk::