File openmp.cpp
-
namespace wasm
SYSCALL NUMBERING
Have a look in the sysroot at include/bits/syscall.h to determine the system call numbering.
Enums
Functions
-
static std::shared_ptr<faabric::transport::PointToPointGroup> getExecutingPointToPointGroup()
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_thread_num", I32, omp_get_thread_num)
- Returns
the thread number, within its team, of the thread executing the function.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_num_threads", I32, omp_get_num_threads)
- Returns
the number of threads currently in the team executing the parallel region from which it is called.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_max_threads", I32, omp_get_max_threads)
This function returns the max number of threads that can be used in a new team if no num_threads value is provided.
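The interplay between a persistent default and a one-off num_threads value can be sketched as below. This is only an illustration: the ThreadSettings struct and all of its members are hypothetical names, and the one-shot behaviour of the pushed value follows the usual OpenMP convention rather than anything confirmed in this file.

```cpp
#include <cassert>

// Hypothetical per-level thread-count state; names are illustrative only
struct ThreadSettings
{
    int maxThreads = 1;     // persistent default, as set via omp_set_num_threads
    int pushedThreads = -1; // one-shot override from a num_threads clause, -1 if unset

    void setNumThreads(int n)
    {
        assert(n > 0);
        maxThreads = n;
    }

    void pushNumThreads(int n)
    {
        assert(n > 0);
        pushedThreads = n;
    }

    // What omp_get_max_threads would report in this sketch
    int getMaxThreads() const { return maxThreads; }

    // Size of the next team; consumes the one-shot override if present
    int nextTeamSize()
    {
        int n = pushedThreads > 0 ? pushedThreads : maxThreads;
        pushedThreads = -1;
        return n;
    }
};
```

In this sketch a pushed value affects only the next team, after which the persistent default applies again.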
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_level", I32, omp_get_level)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_max_active_levels", I32, omp_get_max_active_levels)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_set_max_active_levels", void, omp_set_max_active_levels, I32 maxLevels)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_push_num_threads", void, __kmpc_push_num_threads, I32 loc, I32 globalTid, I32 numThreads)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_set_num_threads", void, omp_set_num_threads, I32 numThreads)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_global_thread_num", I32, __kmpc_global_thread_num, I32 loc)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_wtime", F64, omp_get_wtime)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_barrier", void, __kmpc_barrier, I32 loc, I32 globalTid)
Synchronization point: no thread in a parallel region executes beyond the omp barrier until all other threads in the team have completed all explicit tasks in the region. The same concept is used for reductions and split barriers.
- Parameters
loc – source location information.
global_tid – global thread number.
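The barrier semantics described above can be illustrated with a small in-process sketch. The TeamBarrier class and runBarrierDemo function are hypothetical and use only the C++ standard library; Faasm's own barrier presumably goes through its point-to-point group and is not shown here.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier for a fixed-size team
class TeamBarrier
{
  public:
    explicit TeamBarrier(int teamSizeIn)
      : teamSize(teamSizeIn)
    {
        assert(teamSizeIn > 0);
    }

    // Blocks until teamSize threads have called wait()
    void wait()
    {
        std::unique_lock<std::mutex> lock(mx);
        const int myGeneration = generation;
        waiting++;
        if (waiting == teamSize) {
            // Last thread to arrive resets the barrier and releases the rest
            waiting = 0;
            generation++;
            cv.notify_all();
        } else {
            cv.wait(lock, [this, myGeneration] {
                return generation != myGeneration;
            });
        }
    }

  private:
    const int teamSize;
    int waiting = 0;
    int generation = 0;
    std::mutex mx;
    std::condition_variable cv;
};

// Returns how many threads saw the whole team's work after the barrier
int runBarrierDemo(int teamSize)
{
    TeamBarrier barrier(teamSize);
    std::atomic<int> arrived{ 0 };
    std::atomic<int> sawFullTeam{ 0 };

    std::vector<std::thread> threads;
    for (int t = 0; t < teamSize; t++) {
        threads.emplace_back([&] {
            arrived++;
            barrier.wait();
            if (arrived.load() == teamSize) {
                sawFullTeam++;
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return sawFullTeam.load();
}
```

Every thread should observe the full count after the barrier, so runBarrierDemo(n) is expected to return n.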
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_critical", void, __kmpc_critical, I32 loc, I32 globalTid, I32 crit)
Enter code protected by a critical construct. This function blocks until the thread can enter the critical section.
- Parameters
loc – source location information.
global_tid – global thread number.
crit – identity of the critical section. This could be a pointer to a lock associated with the critical section, or some other suitably unique value. The lock is not used because Faasm needs to control the locking mechanism for the team.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_end_critical", void, __kmpc_end_critical, I32 loc, I32 globalTid, I32 crit)
Exits code protected by a critical construct, releasing the held lock.
- Parameters
loc – source location information.
global_tid – global thread number.
crit – compiler lock. See __kmpc_critical for more information
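A minimal sketch of the enter/exit pairing, assuming a single team-wide lock that ignores the crit argument (in line with the note that the compiler-provided lock is not used). TeamLock and runCriticalDemo are hypothetical names, and a std::mutex stands in for whatever locking mechanism the team actually uses.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical team-wide lock; crit is accepted but deliberately ignored
class TeamLock
{
  public:
    void enterCritical(int crit)
    {
        (void)crit;
        mx.lock(); // blocks until the calling thread may enter
    }

    void endCritical(int crit)
    {
        (void)crit;
        mx.unlock(); // never blocks, only releases
    }

  private:
    std::mutex mx;
};

// Increments a plain (non-atomic) counter from several threads,
// relying only on the critical section for correctness
int runCriticalDemo(int nThreads, int itersPerThread)
{
    assert(nThreads > 0 && itersPerThread >= 0);
    TeamLock lock;
    int total = 0;

    std::vector<std::thread> threads;
    for (int t = 0; t < nThreads; t++) {
        threads.emplace_back([&] {
            for (int i = 0; i < itersPerThread; i++) {
                lock.enterCritical(0);
                total++;
                lock.endCritical(0);
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return total;
}
```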
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_flush", void, __kmpc_flush, I32 loc)
The omp flush directive identifies a point at which the compiler ensures that all threads in a parallel region have the same view of specified objects in memory. As in Clang, we use a fence here, but these semantics might not be suited to distributed work. Distributed shared memory (DSM) implementations of OpenMP synchronise memory pages at this point.
- Parameters
loc – Source location info
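The fence-based approach mentioned above can be sketched for the single-host case as follows. The runFlushDemo function is hypothetical; it only shows how a full fence on each side makes a plain write visible to another thread, and says nothing about the distributed case.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Publishes a plain int from a writer to a reader using relaxed flag
// accesses plus full fences, which play the role of the flush
int runFlushDemo(int value)
{
    int payload = 0;
    std::atomic<bool> ready{ false };
    int seen = -1;

    std::thread writer([&] {
        payload = value;
        // "flush": prior writes are ordered before the flag store
        std::atomic_thread_fence(std::memory_order_seq_cst);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread reader([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // "flush": later reads are ordered after the flag load
        std::atomic_thread_fence(std::memory_order_seq_cst);
        seen = payload;
    });

    writer.join();
    reader.join();
    return seen;
}
```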
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_master", I32, __kmpc_master, I32 loc, I32 globalTid)
Note: we only ensure the master section is run once, but do not handle assigning to the master section.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_end_master", void, __kmpc_end_master, I32 loc, I32 globalTid)
Only called by the thread executing the master region.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_single", I32, __kmpc_single, I32 loc, I32 globalTid)
Test whether to execute a single construct. There are no implicit barriers in the two “single” calls; instead, the compiler should introduce an explicit barrier if one is required.
- Parameters
loc – source location information.
globalTid – global thread number.
- Returns
1 if this thread should execute the single construct, zero otherwise.
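One possible way to produce this return value is to let the first thread to arrive claim the construct. The sketch below assumes that policy; SingleRegion and runSingleDemo are hypothetical names, and the actual implementation may instead pick a fixed thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical state for one single construct
class SingleRegion
{
  public:
    // Returns 1 for exactly one caller and 0 for all others
    int trySingle() { return claimed.exchange(true) ? 0 : 1; }

  private:
    std::atomic<bool> claimed{ false };
};

// Returns how many threads ended up executing the single body
int runSingleDemo(int nThreads)
{
    assert(nThreads > 0);
    SingleRegion region;
    std::atomic<int> executions{ 0 };

    std::vector<std::thread> threads;
    for (int t = 0; t < nThreads; t++) {
        threads.emplace_back([&] {
            if (region.trySingle() == 1) {
                executions++; // body of the single construct
            }
            // No barrier here: the compiler adds one if it is required
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return executions.load();
}
```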
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_end_single", void, __kmpc_end_single, I32 loc, I32 globalTid)
See comment on __kmpc_single
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_fork_call", void, __kmpc_fork_call, I32 locPtr, I32 nSharedVars, I32 microtaskPtr, I32 sharedVarPtrs)
The LLVM version of this function is implemented in the openmp source at: https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/kmp_csupport.cpp
It calls into __kmp_fork_call to do most of the work, which is here: https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/kmp_runtime.cpp
The structs passed in are defined in this file: https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/kmp.h
Arguments:
locPtr = pointer to the source location info (type ident_t)
nSharedVars = number of non-global shared variables
microtaskPtr = function pointer for the microtask itself (microtask_t)
sharedVarPtrs = pointer to an array of pointers to the non-global shared variables
NOTE: the non-global shared variables include:
those listed in a shared() clause
those listed in a reduction() clause
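The way a microtask receives the non-global shared variables can be sketched as below. The Microtask signature, forkCall, demoMicrotask and runForkDemo are simplified, hypothetical stand-ins: the real microtask type and the WebAssembly pointer handling in this file differ.

```cpp
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical, simplified microtask signature
using Microtask = void (*)(int threadNum, int numThreads, void** sharedVars);

// Runs the microtask once per team member, passing every thread the
// same array of pointers to the shared variables
void forkCall(int numThreads, Microtask microtask, void** sharedVarPtrs)
{
    assert(numThreads > 0);
    std::vector<std::thread> team;
    for (int t = 0; t < numThreads; t++) {
        team.emplace_back(microtask, t, numThreads, sharedVarPtrs);
    }
    for (auto& t : team) {
        t.join();
    }
}

// Example microtask: each thread writes to its own slot of a shared vector
static void demoMicrotask(int threadNum, int numThreads, void** sharedVars)
{
    (void)numThreads;
    auto* slots = static_cast<std::vector<int>*>(sharedVars[0]);
    int scale = *static_cast<int*>(sharedVars[1]);
    (*slots)[threadNum] = threadNum * scale;
}

int runForkDemo(int numThreads)
{
    std::vector<int> slots(numThreads, 0);
    int scale = 10;
    void* shared[] = { &slots, &scale }; // nSharedVars == 2
    forkCall(numThreads, demoMicrotask, shared);
    return std::accumulate(slots.begin(), slots.end(), 0);
}
```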
-
template<typename T>
void for_static_init(I32 schedule, I32 *lastIter, T *lower, T *upper, T *stride, T incr, T chunk)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_for_static_init_4", void, __kmpc_for_static_init_4, I32 loc, I32 gtid, I32 schedule, I32 lastIterPtr, I32 lowerPtr, I32 upperPtr, I32 stridePtr, I32 incr, I32 chunk)
The functions compute the upper and lower bounds and strides to be used for the set of iterations to be executed by the current thread.
The guts of the implementation in openmp can be found in __kmp_for_static_init in runtime/src/kmp_sched.cpp
See sched_type for supported scheduling.
- Parameters
loc – Source code location
gtid – Global thread id of this thread
schedule – Scheduling type for the parallel loop
lastIterPtr – Pointer to the “last iteration” flag (boolean)
lowerPtr – Pointer to the lower bound
upperPtr – Pointer to the upper bound of loop chunk
stridePtr – Pointer to the stride for parallel loop
incr – Loop increment
chunk – The chunk size for the parallel loop
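As a rough illustration of the bounds computation, the sketch below handles only the unchunked ("balanced") static schedule with a positive increment and inclusive bounds. StaticChunk and staticBalanced are hypothetical names, and the arithmetic is a simplification modelled on the general approach of __kmp_for_static_init rather than a copy of it.

```cpp
#include <cassert>

// Hypothetical result type: inclusive bounds plus the last-iteration flag
template<typename T>
struct StaticChunk
{
    T lower;
    T upper;
    bool lastIter;
};

// Splits the inclusive range [lower, upper] as evenly as possible
// between numThreads threads, for a positive increment only
template<typename T>
StaticChunk<T> staticBalanced(T lower, T upper, T incr, int threadNum, int numThreads)
{
    assert(numThreads > 0 && threadNum >= 0 && threadNum < numThreads);
    assert(incr > 0 && upper >= lower);

    const T tripCount = (upper - lower) / incr + 1;
    const T nth = static_cast<T>(numThreads);
    const T tid = static_cast<T>(threadNum);

    StaticChunk<T> out{};
    if (tripCount < nth) {
        // Fewer iterations than threads: one each, the rest get nothing
        if (tid < tripCount) {
            out.lower = lower + tid * incr;
            out.upper = out.lower;
        } else {
            out.lower = upper + incr; // lower > upper marks an empty range
            out.upper = upper;
        }
        out.lastIter = (tid == tripCount - 1);
    } else {
        // The first `extras` threads take one extra iteration each
        const T smallChunk = tripCount / nth;
        const T extras = tripCount % nth;
        const T offset = tid * smallChunk + (tid < extras ? tid : extras);
        out.lower = lower + incr * offset;
        out.upper = out.lower + smallChunk * incr - (tid < extras ? T(0) : incr);
        out.lastIter = (tid == nth - 1);
    }
    return out;
}
```

For example, ten iterations (0 to 9) over four threads are split as 0-2, 3-5, 6-7 and 8-9, with only the last thread's flag set.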
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_for_static_init_8", void, __kmpc_for_static_init_8, I32 loc, I32 gtid, I32 schedule, I32 lastIterPtr, I32 lowerPtr, I32 upperPtr, I32 stridePtr, I64 incr, I64 chunk)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_for_static_fini", void, __kmpc_for_static_fini, I32 loc, I32 gtid)
Called to start a reduction.
-
void endReduceCritical(faabric::Message *msg, bool barrier)
Called to finish off a reduction.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_reduce", I32, __kmpc_reduce, I32 loc, I32 gtid, I32 numReduceVars, I32 reduceVarsSize, I32 reduceVarPtrs, I32 reduceFunc, I32 lockPtr)
This function is called by each thread to start the critical section required to perform the reduction operation. Each thread then calls __kmpc_end_reduce (or its nowait equivalent) when it has finished.
It seems that in our case, always returning 1 for both kmpc_reduce and kmpc_reduce_nowait gets the right result.
In the OpenMP source we can see a more varied set of return values, but these are for cases we don’t yet support (notably teams): https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/kmp_csupport.cpp
Note that the reduce vars passed into this function are the LOCAL copies on the thread’s own stack used to hold intermediate results. There is apparently no way to get a reference to the final destination of the reduction result in this function; that is only known in kmpc_fork_call.
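The calling pattern described above can be sketched as follows, assuming that a return value of 1 means the calling thread should combine its own local copy into the result while holding the lock. ReduceLock and sumWithReduction are hypothetical names, and the combining step here stands in for code that the compiler would generate.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the start/end reduce pair
class ReduceLock
{
  public:
    // Enters the critical section and always returns 1
    int startReduce()
    {
        mx.lock();
        return 1;
    }

    void endReduce() { mx.unlock(); }

  private:
    std::mutex mx;
};

// Sums 1..n across nThreads threads using thread-local partial sums
long sumWithReduction(int nThreads, int n)
{
    assert(nThreads > 0 && n >= 0);
    ReduceLock lock;
    long result = 0; // final destination, visible only to generated code

    std::vector<std::thread> threads;
    for (int t = 0; t < nThreads; t++) {
        threads.emplace_back([&, t] {
            long local = 0; // the thread's LOCAL reduction copy
            for (int i = t + 1; i <= n; i += nThreads) {
                local += i;
            }
            if (lock.startReduce() == 1) {
                result += local;
                lock.endReduce();
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return result;
}
```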
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_reduce_nowait", I32, __kmpc_reduce_nowait, I32 loc, I32 gtid, I32 numReduceVars, I32 reduceVarsSize, I32 reduceVarPtrs, I32 reduceFunc, I32 lockPtr)
See __kmpc_reduce
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_end_reduce", void, __kmpc_end_reduce, I32 loc, I32 gtid, I32 lck)
Finalises a blocking reduce, called by all threads.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__kmpc_end_reduce_nowait", void, __kmpc_end_reduce_nowait, I32 loc, I32 gtid, I32 lck)
Finalises a non-blocking reduce, called by all threads.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_get_num_devices", int, omp_get_num_devices)
Get the number of devices (different CPU sockets or machines) available to the user.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "omp_set_default_device", void, omp_set_default_device, int defaultDeviceNumber)
Switches between local and remote threads.
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__atomic_load", void, __atomic_load, I32 a, I32 b, I32 c, I32 d)
- WAVM_DEFINE_INTRINSIC_FUNCTION (env, "__atomic_compare_exchange", I32, ___atomic_compare_exchange, I32 a, I32 b, I32 c, I32 d, I32 e, I32 f)
-
void ompLink()