Dune Core Modules (2.9.0)
How to support vectorization in Dune classes. More...
Files  
file  interface.hh 
User interface of the SIMD abstraction.  
file  io.hh 
IO interface of the SIMD abstraction.  
Basic interface  
template<class V >  
using  Dune::Simd::Scalar = typename Overloads::ScalarType< std::decay_t< V > >::type 
Element type of some SIMD type. More...  
template<class S , class V >  
using  Dune::Simd::Rebind = typename Overloads::RebindType< std::decay_t< S >, std::decay_t< V > >::type 
Construct SIMD type with different scalar type. More...  
template<class V >  
constexpr std::size_t  Dune::Simd::lanes () 
Number of lanes in a SIMD type. More...  
template<class V >  
decltype(auto)  Dune::Simd::lane (std::size_t l, V &&v) 
Extract an element of a SIMD type. More...  
template<class V , class U >  
constexpr V  Dune::Simd::implCast (U &&u) 
Cast an expression from one implementation to another. More...  
template<class V , class S >  
constexpr V  Dune::Simd::broadcast (S s) 
Broadcast a scalar to a vector explicitly. More...  
template<class M , class V >  
V  Dune::Simd::cond (M &&mask, const V &ifTrue, const V &ifFalse) 
Like the ?: operator. More...  
template<class V >  
V  Dune::Simd::cond (bool mask, const V &ifTrue, const V &ifFalse) 
Like the ?: operator. More...  
template<class V >  
auto  Dune::Simd::max (const V &v1, const V &v2) 
The binary maximum value over two simd objects. More...  
template<class V >  
auto  Dune::Simd::min (const V &v1, const V &v2) 
The binary minimum value over two simd objects. More...  
template<class Mask >  
bool  Dune::Simd::anyTrue (const Mask &mask) 
Whether any entry is true More...  
template<class Mask >  
bool  Dune::Simd::allTrue (const Mask &mask) 
Whether all entries are true More...  
template<class Mask >  
bool  Dune::Simd::anyFalse (const Mask &mask) 
Whether any entry is false More...  
template<class Mask >  
bool  Dune::Simd::allFalse (const Mask &mask) 
Whether all entries are false More...  
template<class V >  
Scalar< V >  Dune::Simd::max (const V &v) 
The horizontal maximum value over all lanes. More...  
template<class V >  
Scalar< V >  Dune::Simd::min (const V &v) 
The horizontal minimum value over all lanes. More...  
template<class V >  
auto  Dune::Simd::mask (const V &v) 
Convert to mask, analogue of bool(s) for scalars. More...  
template<class V1 , class V2 >  
auto  Dune::Simd::maskOr (const V1 &v1, const V2 &v2) 
Logic or of masks. More...  
template<class V1 , class V2 >  
auto  Dune::Simd::maskAnd (const V1 &v1, const V2 &v2) 
Logic and of masks. More...  
Syntactic Sugar  
Templates and functions in this group provide syntactic sugar, they are implemented using the functionality from SimdInterfaceBase, and are not customizable by implementations.  
template<class V >  
using  Dune::Simd::Mask = Rebind< bool, V > 
Mask type type of some SIMD type. More...  
template<class V >  
std::size_t  Dune::Simd::lanes (const V &) 
Number of lanes in a SIMD type. More...  
IO interface  
Templates and functions in this group provide syntactic sugar for IO. They are implemented using the functionality from SimdInterfaceBase, and are not customizable by implementations.  
template<class V >  
auto  Dune::Simd::vio (const V &v) 
construct a stream inserter More...  
template<class V >  
auto  Dune::Simd::io (const V &v) 
construct a stream inserter More...  
Detailed Description
How to support vectorization in Dune classes.
This module describes how a Dune library developer can add support for vectorization to library facilities.
Understanding SIMD types
The (idealized) model of a SIMD type V
used in this abstraction layer is that they are fixedlength vectors of some scalar type S
. Operations and operators that take values of type S
as arguments, except for operator,()
, should be overloaded to support values of type V
too. These operations should apply elementwise. If the operation takes more than one argument, it should accept arbitrary combinations of V
and S
. The exception is the combination of S
on the left hand side and V
on the right hand side of one of the assignment operators, which does not make sense.
The result of a boolean operation is a mask type M
, which is a SIMD type with scalar type bool
with the same number of elements as V
. The result of all other operations is again of type V
, or of some type convertible to V
.
This is very similar to std::valarray
, with the main difference being that std::valarray
is dynamic in size, while for this abstraction the size is static.
Type promotion issues
True SIMD types have an issue with type promotion, which means they cannot behave completely analogous to builtin integral types (this is a nonissue with floating point types). Essentially, operations on true SIMD types cannot promote their arguments, because the promoted types typically require more storage than the original types, meaning an argument that was passed in a single vector register would need multiple vector registers after promotion, which would mean greater register pressure. Also, there would be conversion operations required, which (at least on x86) is not typically the case for promotions of the builtin types. Lastly, with larger types the vector units can typically operate on fewer lanes at a time.
Omitting integral promotions has in many cases no negative impact, because many programmers do not really expect them anyway. There are however cases where they matter, and for illustration I want to explain one that crept up during unit testing.
Here is a simplified (and somewhat pseudocode) version of the test. The test checks the result of unary 
on Vc::Vector<unsigned short>
by comparing the result of unary 
when applied to the complete vector to the result of unary 
when applied to each lane individually.
The test fails in lane 0. On the left side of the ==
, lane(0, vresult)
is (unsigned short)65535
, which is the same as (unsigned short)1
, as it should be. On the right side, lane(0, varg)
is (unsigned short)1
. 
promotes its argument, so that becomes (int)1
, and the result of the negation is (int)1
.
Now the comparison is (unsigned short)65535 == (int)1
. The comparison operator applies the usual arithmetic conversions to bring both operands to the same type. In this case this boils down to converting the left side to int
via integral promotions and the comparison becomes (int)65535 == (int)1
. The result is of course false
and the assertion triggers.
The only way to thoroughly prevent this kind of problem is to convert the result of any operation back to the expected type. In the above example, the assertion would need to be written as assert(lane(l, vresult) == static_cast<unsigned short>(lane(l, varg)));
. In practice, this should only be a problem with operations on unsigned types where the result may be "negative". Most code in Dune will want to operate on floating point types, where this is a nonissue.
(Of course, this is also a problem for code that operates on untrusted input, but you should not be doing that with Dune anyway).
Still, when writing code using the SIMD abstractions, you should be aware that in the following snippet
the exact types of var1
and var2
may be somewhat surprising.
Limitations of the Abstraction Layer
Since the abstraction layer cannot overload operators of SIMD types (that would be meddling with the domain of the library that provides the SIMD types), nor provide its own constructors, there are severe limitations in what the abstraction layer guarantees. Besides the standard types, the first SIMD library supported is Vc, so that is where most of the limitations stem from; see Restrictions in SIMD Abstraction Implementation for Vc.
The biggest limitations are with masks. In Vc masks support a very restricted set of operations compared to other SIMD types, so in what follows we will distinguish between masks with a very small set of operations and between vectors with a larger set of operations.
Here is a compact table of the limitations as a quick reference, together with suggested workarounds for the constructs that don't work. s
denotes a scalar object/expression (i.e. of type double
or in the case of masks bool
). v
denotes a vector/mask object/expression. sv
means that both scalar and vector arguments are accepted. V
denotes a vector/mask type. @
means any applicable operator that is not otherwise listed.
Notes:
 [1] In Vc, maskmask
==
and!=
operations exist, but the result is of typebool
, i.e. a scalar.  [2]
++
(either kind) on bools is deprecated by the standard. Our test suite does not check for it on masks, but it was supported by Vc masks at some point.  [3] Contrary to the other operators, the expected result for
(sv1, sv2)
is exactlysv2
, no broadcasting applied.  [4] Try to avoid the use of
operator,
unless both operands are builtin types if possible. Libraries had a tendency to overloadoperator,
to provide for things like container initialization before C++11, and these overloads may still be present in the library you are using and replace the default meaning ofoperator,
.
Support levels:
y
: operation generally works; some instances of the operation may not apply*N*
: operation generally does not work; some instances of the operation may not applyn/a
: operation does not apply (i.e. bitwise operations to floatingpoint operands,
(and in the future possibly++
) to boolean operands, assignment operators to scalar left hand sides)
Typedef Documentation
◆ Mask
using Dune::Simd::Mask = typedef Rebind<bool, V> 
Mask type type of some SIMD type.
 Template Parameters

V The SIMD (mask or vector) type. const
,volatile
or reference qualifiers are automatically ignored.
The mask type is kind of a SIMD vector of bool
with the same number of lanes as V
. It results from comparison operations between values of type V
. It is only "kind of" a SIMD vector, because the guaranteed supported operations are extremely limited. At the moment only the logical operators &&
, 
and !
and the "bitwise" operators &
, ^
and 
between masks are supported, and even with those operators you cannot rely on automatic broadcasting of bool
values.
 Note
 In particular, masks do not support comparison. As a workaround you can use
^
instead of!=
and!(m1 ^ m2)
instead ofm1 == m2
. (The reason why comparison is not supported is because in Vc==
and!=
between masks yield a singlebool
result and not a mask.)
This is an alias for Rebind<bool, V>
.
◆ Rebind
using Dune::Simd::Rebind = typedef typename Overloads::RebindType<std::decay_t<S>, std::decay_t<V> >::type 
Construct SIMD type with different scalar type.
 Template Parameters

S The new scalar type V The SIMD (mask or vector) type.
The resulting type a SIMD vector of S
with the same number of lanes as V
. const
, volatile
or reference qualifiers in S
and V
are automatically ignored, and the result will have no such qualifiers.
Implementations shall rebind to LoopSIMD<S, lanes<V>()>
if they can't support a particular rebind natively.
Implemented by Overloads::RebindType
.
◆ Scalar
using Dune::Simd::Scalar = typedef typename Overloads::ScalarType<std::decay_t<V> >::type 
Element type of some SIMD type.
 Template Parameters

V The SIMD (mask or vector) type. const
,volatile
or reference qualifiers are automatically ignored.
Not all operations that access the element of a vector return (a reference to) the scalar type – some may return proxy objects instead. Use autoCopy()
to make sure you are getting a prvalue of the scalar type.
Implemented by Overloads::ScalarType
.
Function Documentation
◆ allFalse()
bool Dune::Simd::allFalse  (  const Mask &  mask  ) 
◆ allTrue()
bool Dune::Simd::allTrue  (  const Mask &  mask  ) 
Whether all entries are true
Implemented by Overloads::allTrue()
.
References Dune::Simd::mask().
Referenced by Dune::BiCGSTABSolver< X >::apply(), Dune::RestartedGMResSolver< X, Y, F >::apply(), Dune::RestartedFlexibleGMResSolver< X, Y, F >::apply(), and Dune::IterativeSolver< X, Y >::Iteration< CountType >::step().
◆ anyFalse()
bool Dune::Simd::anyFalse  (  const Mask &  mask  ) 
◆ anyTrue()
bool Dune::Simd::anyTrue  (  const Mask &  mask  ) 
Whether any entry is true
Implemented by Overloads::anyTrue()
.
References Dune::Simd::mask().
Referenced by Dune::Simd::Overloads::allFalse(), Dune::Simd::Overloads::allTrue(), and Dune::Simd::Overloads::anyFalse().
◆ broadcast()

constexpr 
Broadcast a scalar to a vector explicitly.
Implemented by Overloads::broadcast()
This is useful because the syntax for broadcasting can vary wildly between implementations.
 Note
 One of the few functions that explicitly take a template argument (
V
in this case).
◆ cond() [1/2]
V Dune::Simd::cond  (  bool  mask, 
const V &  ifTrue,  
const V &  ifFalse  
) 
Like the ?: operator.
Overload for plain bool masks, accepting any simd type
Implemented by Overloads::cond()
.
References Dune::Simd::mask().
◆ cond() [2/2]
V Dune::Simd::cond  (  M &&  mask, 
const V &  ifTrue,  
const V &  ifFalse  
) 
Like the ?: operator.
Equivalent to
Implemented by Overloads::cond()
.
References Dune::Simd::mask().
Referenced by Dune::GradientSolver< X >::apply(), Dune::CGSolver< X >::apply(), Dune::BiCGSTABSolver< X >::apply(), Dune::MINRESSolver< X >::apply(), Dune::RestartedGMResSolver< X, Y, F >::apply(), Dune::Simd::Overloads::max(), and Dune::Simd::Overloads::min().
◆ implCast()

constexpr 
Cast an expression from one implementation to another.
Implemented by Overloads::implCast()
Requires the scalar type and the number of lanes to match exactly.
This is particularly useful for masks, which often know the type they were derived from. This can become a problem when doing a conditional operation e.g. on some floating point vector type, but with a mask derived from some index vector type.
 Note
 One of the few functions that explicitly take a template argument (
V
in this case).
◆ io()
auto Dune::Simd::io  (  const V &  v  ) 
construct a stream inserter
 Template Parameters

V The SIMD (mask or vector) type.
Construct an object that can be inserted into an output stream. For onelane vectors, behaves the same as scalar insertion. For multilane vectors, behaves as the inserter returned by vio()
: insertion prints the vector values separated by a comma and a space, and surrounded by angular brackets.
Referenced by Dune::BiCGSTABSolver< X >::apply(), and Dune::InverseOperator< X, Y >::printOutput().
◆ lane()
decltype(auto) Dune::Simd::lane  (  std::size_t  l, 
V &&  v  
) 
Extract an element of a SIMD type.
 Parameters

l Number of lane to extract v SIMD object to extract from
 Returns
 If
v
is a nonconst
lvalue, a referenceScalar<decay_t<V>>&
, or a proxy object through which the element ofv
may be modified. Otherwise,v
is aconst
lvalue or an rvalue, and the result is a prvalue (a temporary) of typeScalar<decay_t<V>>
.
Implemented by Overloads::lane()
.
Referenced by Dune::Simd::Overloads::implCast(), Dune::Simd::Overloads::max(), and Dune::Simd::Overloads::min().
◆ lanes() [1/2]

constexpr 
Number of lanes in a SIMD type.
 Template Parameters

V The SIMD (mask or vector) type. const
,volatile
or reference qualifiers are automatically ignored.
Implemented by Overloads::LaneCount
.
Referenced by Dune::Simd::Overloads::implCast(), Dune::Simd::Overloads::max(), and Dune::Simd::Overloads::min().
◆ lanes() [2/2]
std::size_t Dune::Simd::lanes  (  const V &  ) 
Number of lanes in a SIMD type.
 Template Parameters

V The SIMD (mask or vector) type.
The value of the parameter is ignored; the call is simply forwarded to lanes<V>()
.
◆ mask()
auto Dune::Simd::mask  (  const V &  v  ) 
Convert to mask, analogue of bool(s) for scalars.
Implemented by Overloads::mask()
.
Referenced by Dune::Simd::allFalse(), Dune::Simd::allTrue(), Dune::Simd::anyFalse(), Dune::Simd::anyTrue(), Dune::Simd::cond(), Dune::Simd::Overloads::maskAnd(), and Dune::Simd::Overloads::maskOr().
◆ maskAnd()
auto Dune::Simd::maskAnd  (  const V1 &  v1, 
const V2 &  v2  
) 
Logic and of masks.
Implemented by Overloads::maskAnd()
.
◆ maskOr()
auto Dune::Simd::maskOr  (  const V1 &  v1, 
const V2 &  v2  
) 
Logic or of masks.
Implemented by Overloads::maskOr()
.
◆ max() [1/2]
Scalar<V> Dune::Simd::max  (  const V &  v  ) 
The horizontal maximum value over all lanes.
Implemented by Overloads::max()
.
References Dune::Simd::max().
◆ max() [2/2]
auto Dune::Simd::max  (  const V &  v1, 
const V &  v2  
) 
The binary maximum value over two simd objects.
Implemented by Overloads::max()
.
Referenced by Dune::RestartedGMResSolver< X, Y, F >::apply(), and Dune::Simd::max().
◆ min() [1/2]
Scalar<V> Dune::Simd::min  (  const V &  v  ) 
The horizontal minimum value over all lanes.
Implemented by Overloads::min()
.
References Dune::Simd::min().
◆ min() [2/2]
auto Dune::Simd::min  (  const V &  v1, 
const V &  v2  
) 
The binary minimum value over two simd objects.
Implemented by Overloads::min()
.
Referenced by Dune::Simd::min().
◆ vio()
auto Dune::Simd::vio  (  const V &  v  ) 
construct a stream inserter
 Template Parameters

V The SIMD (mask or vector) type.
Construct an object that can be inserted into an output stream. Insertion prints the vector values separated by a comma and a space, and surrounded by angular brackets.