CHAI 1.0 Design
- Implementations for various memory models are tightly coupled, complicating maintenance and making changes difficult
- Some memory models cannot be used in the same build (e.g. copy hiding and single memory space models cannot be used together)
- Build time configuration of CHAI is complicated and varies significantly by platform
- Modularize implementations for different memory models so that maintenance, understanding, and change are easier
- Allow all supported memory models to be used in the same build
- Simplify build time configuration of CHAI (should be able to remove variables like CHAI_DISABLE_RM, CHAI_ENABLE_UM, CHAI_ENABLE_PINNED, CHAI_THIN_GPU_ALLOCATE)
- Maintain or improve performance. In particular, avoid inheritance in device and other performance critical code
- Leave the design space open for shared_ptr-like semantics and container semantics
- Leave the design space open for user customization
Instead of containing multiple memory management implementations, some of which are chosen at configuration time and others at run time, ManagedArray will delegate the memory management to its own instance of an ArrayManager. ManagedArray will contain an active pointer, a size, and a handle to an ArrayManager. This will decrease the size of ManagedArray, which is advantageous because it is designed to be passed around by copy.
In this design, ArrayManagerType is an abstraction rather than a concrete type. It takes on the role of PointerRecord and most of the role of the old ArrayManager class (everything except managing the execution context).
This class replaces the "thin" implementation of ManagedArray. It is designed to be used in the case where there is a unified memory space (same address can be used across execution contexts) but synchronization is still required for coherence.
This class replaces the "main" implementation of ManagedArray. It is designed to be used in the case where there are discrete memory spaces, and copying between them is required for coherence.
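To make the copy-hiding idea concrete, here is a minimal sketch of a manager that copies only when data was last touched in a different context. All names (`CopyHidingArrayManager`, `data(context)`, `copies()`, the `ExecutionContext` values) are hypothetical and not part of CHAI's API, and both "memory spaces" are ordinary host vectors so the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the execution context described in this design.
enum class ExecutionContext { None, Host, Device };

// Hypothetical copy-hiding manager. The "device" space is simulated with a
// second host vector; real code would allocate and copy with a GPU runtime.
template <typename T>
class CopyHidingArrayManager {
public:
  explicit CopyHidingArrayManager(std::size_t size)
    : m_host(size), m_device(size) {}

  // Returns a pointer usable in `context`. If the newest data lives in the
  // other space, it is copied over first.
  T* data(ExecutionContext context) {
    if (context == ExecutionContext::None) {
      return nullptr;
    }
    if (m_touched != ExecutionContext::None && m_touched != context) {
      if (context == ExecutionContext::Host) {
        m_host = m_device;
      } else {
        m_device = m_host;
      }
      ++m_copies;
    }
    // Conservatively assume the caller may write through the pointer.
    m_touched = context;
    return context == ExecutionContext::Host ? m_host.data() : m_device.data();
  }

  int copies() const { return m_copies; }

private:
  std::vector<T> m_host;
  std::vector<T> m_device;
  ExecutionContext m_touched = ExecutionContext::None;
  int m_copies = 0;
};

// Writes on the host, reads and writes on the "device", then reads back.
inline int countCopiesInRoundTrip() {
  CopyHidingArrayManager<int> manager(4);
  int* host = manager.data(ExecutionContext::Host);
  host[0] = 42;
  int* device = manager.data(ExecutionContext::Device);  // copy host -> device
  assert(device[0] == 42);
  device[0] = 7;
  manager.data(ExecutionContext::Device);                // same context: no copy
  int* hostAgain = manager.data(ExecutionContext::Host); // copy device -> host
  assert(hostAgain[0] == 7);
  return manager.copies();
}
```

The sketch marks data as touched on every access, which is simpler than tracking reads and writes separately but can trigger copies that a const-aware implementation would avoid.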
The old ExecutionSpace enum grew in a confusing way to include both execution spaces and memory types. The term "space" has also become limiting. For performance reasons, it is important to distinguish not only whether we are executing on a GPU, but also which GPU stream is being used. That would allow CHAI to synchronize a single stream rather than the whole device; whole-device synchronization is a very expensive operation. ExecutionContext would work with camp::Resource objects, which would provide that finer-grained control. It could also be made to work with the current ExecutionSpace enum in the short term, choosing a default camp resource under the hood.
This class, ExecutionContextManager, takes on some of the role of ArrayManager, namely managing the current execution context. It also keeps track of which execution contexts are unsynchronized. Concrete ArrayManager implementations report to ExecutionContextManager any synchronization that has occurred or is needed, and ExecutionContextManager then performs the synchronization only if it is actually required. This is much more performant than having every ArrayManager instance call cudaDeviceSynchronize(). For example, if five ManagedArrays are copied into a RAJA sequential loop, we only want to synchronize the device once, not five times.
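The deferred synchronization described above could be sketched roughly as follows. The member names (`markUnsynchronized`, `synchronizeIfNeeded`, `syncCount`) and the `ExecutionContext` values are assumptions made for illustration, and a counter stands in for the actual device or stream synchronization call.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the execution context described in this design.
enum class ExecutionContext { None, Host, Device };

// Hypothetical sketch: tracks which contexts have outstanding asynchronous work.
class ExecutionContextManager {
public:
  static ExecutionContextManager& getInstance() {
    static ExecutionContextManager instance;
    return instance;
  }

  // An array manager reports that asynchronous work was launched in `context`.
  void markUnsynchronized(ExecutionContext context) {
    m_unsynchronized.insert(context);
  }

  // An array manager asks for `context` to be synchronized before it uses
  // data last touched there. Work happens only if the context is still dirty.
  void synchronizeIfNeeded(ExecutionContext context) {
    if (m_unsynchronized.erase(context) > 0) {
      // Real code would synchronize the device or stream here,
      // e.g. with cudaDeviceSynchronize(). This sketch only counts the call.
      ++m_syncCount;
    }
  }

  int syncCount() const { return m_syncCount; }

private:
  ExecutionContextManager() = default;

  std::set<ExecutionContext> m_unsynchronized;
  int m_syncCount = 0;
};

// Five arrays each request a device synchronization after one kernel launch.
// Returns how many synchronizations were actually performed.
inline int countSyncsForFiveArrays() {
  ExecutionContextManager& manager = ExecutionContextManager::getInstance();
  const int before = manager.syncCount();
  manager.markUnsynchronized(ExecutionContext::Device);
  for (int i = 0; i < 5; ++i) {
    manager.synchronizeIfNeeded(ExecutionContext::Device);
  }
  const int performed = manager.syncCount() - before;
  assert(performed <= 5);
  return performed;
}
```

Only the first request finds the device context marked as unsynchronized, so the remaining four return immediately.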
This class provides a convenient way of setting the execution context on construction, and restoring the previous execution context on destruction. It is a similar idea to std::lock_guard and would be used the same way, though it does not provide thread safety. Another type of guard could potentially be added to provide thread safety if needed. This class would not be used by the RAJA plugin, but could be helpful in application code when raw data needs to be extracted from a ManagedArray (such as passing a raw pointer to a function).
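A guard of this kind might look like the sketch below. The name `ExecutionContextGuard`, the `getContext`/`setContext` accessors, and the `ExecutionContext` values are hypothetical; the manager is reduced to the minimum needed to show the set-on-construction, restore-on-destruction behavior.

```cpp
#include <cassert>

// Hypothetical stand-in for the execution context described in this design.
enum class ExecutionContext { None, Host, Device };

// Hypothetical, minimal manager that only stores the current context.
class ExecutionContextManager {
public:
  static ExecutionContextManager& getInstance() {
    static ExecutionContextManager instance;
    return instance;
  }

  ExecutionContext getContext() const { return m_context; }
  void setContext(ExecutionContext context) { m_context = context; }

private:
  ExecutionContextManager() = default;

  ExecutionContext m_context = ExecutionContext::None;
};

// Hypothetical RAII guard: sets the context on construction and restores the
// previous one on destruction. Like the class described above, it is not
// thread safe.
class ExecutionContextGuard {
public:
  explicit ExecutionContextGuard(ExecutionContext context)
    : m_previous(ExecutionContextManager::getInstance().getContext()) {
    ExecutionContextManager::getInstance().setContext(context);
  }

  ~ExecutionContextGuard() {
    ExecutionContextManager::getInstance().setContext(m_previous);
  }

  ExecutionContextGuard(const ExecutionContextGuard&) = delete;
  ExecutionContextGuard& operator=(const ExecutionContextGuard&) = delete;

private:
  ExecutionContext m_previous;
};

// Returns true if the previous context is restored when the guard goes out of scope.
inline bool guardRestoresPreviousContext() {
  ExecutionContextManager& manager = ExecutionContextManager::getInstance();
  manager.setContext(ExecutionContext::None);
  {
    ExecutionContextGuard guard(ExecutionContext::Host);
    assert(manager.getContext() == ExecutionContext::Host);
    // Application code could extract a raw pointer from a ManagedArray here
    // and pass it to a host function.
  }
  return manager.getContext() == ExecutionContext::None;
}
```

Deleting the copy operations keeps a guard from being duplicated, which would otherwise restore the previous context more than once.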
template <typename ElementType, typename ArrayManagerType>
class ManagedArray {
  public:
    ManagedArray();
    explicit ManagedArray(const ArrayManagerType& manager);
    CHAI_HOST_DEVICE ManagedArray(const ManagedArray& other);
    // TODO: Add a constructor from a non-const ElementType

    void resize(size_t newSize);
    void free();
    void update();

    CHAI_HOST_DEVICE size_t size() const;
    CHAI_HOST_DEVICE ElementType* data() const;
    CHAI_HOST_DEVICE const ElementType* cdata() const;
    CHAI_HOST_DEVICE ElementType& operator[](size_t i) const;

    ElementType getElement(size_t i) const;
    void setElement(size_t i, const ElementType& value);

  private:
    ElementType* m_data;         // Active pointer
    size_t m_size;               // Size of the array
    ArrayManagerType m_manager;  // Handle to the manager that memory management is delegated to
};