FEAT 3
Finite Element Analysis Toolbox
Loading...
Searching...
No Matches
LAFEM design description

This page describes ideas and concepts used throughout the whole lafem subsystem.

Overview

LAFEM comprises a wide range of different linear algebra containers and their corresponding operations.

All Containers are derived vom FEAT::LAFEM::Container, thus its page contains a list of all currently available containers.

Although many operations are different for each container, many concepts are common to all containers and shall be described in this document.

Container properties

All Containers have at least the first three template parameters in common, describing memory location, floating point format and indexing data type.

The structs in FEAT::Mem are used to control, if the containers contents will be stored in the cpus main memory or in gpu memory, allocated by the cuda runtime.

The second template parameter is used as the datatype for the allocation of all floating point based arrays.

The third template parameter is used as the datatype for the allocation of all indexing related arrays, e.g. the column index array in a FEAT::LAFEM::SparseMatrixCSR.

LA Operations

All operations are member functions of the related containers. With little exceptions, the object which hosts the function call will contain the calculation result.

Note
Namely the BLAS2-like operations of axpy and apply break this rule by being a member of the corresponding matrix. The BLAS1 reduction operations, which return a single scalar, break this rule, too.

A typical call may look like these:

// compute component product
vector1.component_product(vector1, vector2);
// compute dot product
double d = vector1.dot(vector2);
// compute matrix vector product
system_matrix.apply(y, rhs);

Data handling

All containers provide at least three different member functions for proper conversion and data movement.

  • clone: creates a copy of a container with user defined depth. See FEAT::LAFEM::CloneMode for details.
  • convert: try to reuse existing arrays for shallow copy, convert and transfer where nessesary.
  • copy: copy container contents only, floating point format and indexing datatype must be identical.
Note
There does not exist a copy constructor nor copy assignment operator. This prevents the user from triggering unwanted copy operations and memory transfers.

Container reset

All containers offer the format() and clear() operations:

  • clear: simply frees all allocated arrays and leaves the container in an undefined state.
  • format: resets all non values to zero or any other provided value, but does not free any arrays.

Sparse Layout

Every sparse matrix consists of its non-zero data arrays and a data layout (stored in various different arrays).

A FEAT::LAFEM::SparseLayout object can be retrieved by any matrix via the corresponding layout() member function to create another matrix with the same shared (except FEAT::LAFEM::SparseMatrixCOO and FEAT::LAFEM::SparseVector, which enforce deep copies) data layout.

Thus empty matrices with identical layout can be created easily and memory consumption can be reduced by reusing existing data layout arrays.

Common container typedefs

  • DataType
  • IndexType
  • MemType
  • ContainerType, the base container type
  • ValueType (mainly for blocked containers) <–! maybe add ValueType for all containers (set to DataType) -->

Usage:

typedef DenseVector<Mem::Main, float, Index> VT_main;
typedef typename VT_::template ContainerType<Mem::CUDA, typename VT_main::DataType, unsigned int> VT_cuda;

Common typename abbreviations

  • DT_ - Datatype (float, double)
  • IT_ - Indextype (Index, unsigned long)
  • Mem_ - Memorytype (Mem::Main, Mem::CUDA)
  • VT_ - Vectortype (DenseVector<...>), not to be confused with ValueType typedefs
  • MT_ - Matrixtype, not to be confused with Mem_
  • FT_ - Filtertype (UnitFilter, NoneFilter)

Common Backends

Currently, two memory backends are supported:

  • FEAT::Mem::Main
  • FEAT::Mem::CUDA

To gain optimal performance from external libraries like MKL or CuBLAS, which automagically replace the generic algorithms if available, one needs to match the supported container configurations:

  • MKL: IT_ = unsigned long
  • CUDA: IT_ = unsigned int