NMF-mGPU implements the Non-negative Matrix Factorization (NMF) algorithm on Graphics Processing Units (GPUs). NMF takes a non-negative input matrix V and returns two non-negative matrices, W and H, whose product approximates it (i.e., V ≈ W ∗ H). If V has n rows and m columns, then W and H have dimensions n × k and k × m, respectively. The factorization rank k, specified by the user, is usually much smaller than both n and m.
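To make the shapes concrete, here is a minimal CPU-only sketch of NMF in plain Python. It assumes the classic Lee and Seung multiplicative update rules for the Frobenius norm. The text above does not say which update rule NMF-mGPU uses, so treat this as an illustration of the factorization itself, not of the library's internals. The names `nmf`, `matmul`, `transpose` and `frobenius_error` are hypothetical helpers, not part of NMF-mGPU.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorize a non-negative V (n x m) into W (n x k) and H (k x m).

    Assumption: Lee-Seung multiplicative updates; eps guards against
    division by zero.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Return ||V - W H|| in the Frobenius norm."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5
```

For a 4 × 3 input with k = 2, `nmf(v, 2)` returns a 4 × 2 matrix W and a 2 × 3 matrix H. Calling `frobenius_error(v, w, h)` shows how closely W ∗ H approximates V.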
This software has been developed using NVIDIA's CUDA (Compute Unified Device Architecture) framework for GPU computing. CUDA exposes a GPU device as a programmable general-purpose coprocessor able to perform linear-algebra operations.
On detached devices with little on-board memory available, large datasets can be transferred in blocks from the CPU's main memory to the GPU's memory and processed block by block. In addition, NMF-mGPU has been explicitly optimized for the different existing CUDA architectures.
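The idea of block-wise processing can be sketched on the CPU. The example below assumes that V is split into blocks of rows and that a product such as Wᵀ ∗ V is accumulated one block at a time, so that only one block needs to be resident in device memory. The actual blocking scheme used by NMF-mGPU is not described here, and `row_blocks` and `wt_v_blockwise` are hypothetical names chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def row_blocks(n_rows, block_size):
    """Yield (start, stop) index pairs that together cover range(n_rows)."""
    for start in range(0, n_rows, block_size):
        yield start, min(start + block_size, n_rows)


def wt_v_blockwise(w, v, block_size):
    """Accumulate W^T V (k x m) from row blocks of W (n x k) and V (n x m)."""
    k, m = len(w[0]), len(v[0])
    acc = [[0.0] * m for _ in range(k)]
    for start, stop in row_blocks(len(v), block_size):
        # In a GPU setting, only these rows would be copied to the device.
        partial = matmul(transpose(w[start:stop]), v[start:stop])
        for i in range(k):
            for j in range(m):
                acc[i][j] += partial[i][j]
    return acc
```

Because Wᵀ ∗ V is a sum over rows, the block-wise result matches the product computed on the full matrices, up to floating-point rounding. The block size only trades memory use against the number of transfers.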
Finally, NMF-mGPU also provides a multi-GPU version that makes use of multiple GPU devices through the MPI (Message Passing Interface) standard.
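One common way to spread such a workload over several devices is to give each process a contiguous range of rows of V. The sketch below shows only that partitioning step, under the assumption of an even row-wise split. It makes no MPI calls and does not claim to reproduce how NMF-mGPU distributes its data. `partition_rows` is a hypothetical helper.

```python
def partition_rows(n_rows, n_procs):
    """Return one (start, stop) row range per process.

    Range sizes differ by at most one row; the first `n_rows % n_procs`
    processes receive the extra rows.
    """
    base, extra = divmod(n_rows, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

With 10 rows and 3 processes, for example, the ranges are `(0, 4)`, `(4, 7)` and `(7, 10)`. Each process would then work on its own rows and exchange partial results with the others.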