Profiling

Microbenchmarks

Microbenchmarks are useful for assessing the raw function throughput or isolation overhead of Faasm, and can be run using the microbenchmark_runner target.

Once built, usage is:

microbenchmark_runner <spec_file> <out_file>

where spec_file specifies which functions to run and how many iterations to run each for. This is a CSV of the format:

<user_a>,<func_a>,<n_runs>,<input_data>
<user_b>,<func_b>,<n_runs>,<input_data>

For example,

demo,hello,100,
demo,echo,200,this is input data
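
Since the spec file is plain CSV, it can also be generated programmatically. A minimal sketch (the write_spec helper and the bench.csv filename are illustrative, not part of Faasm):

```python
import csv

# Illustrative spec rows: user, function, number of runs, input data
SPEC_ROWS = [
    ("demo", "hello", 100, ""),
    ("demo", "echo", 200, "this is input data"),
]

def write_spec(path, rows=SPEC_ROWS):
    # One function spec per line, fields separated by commas
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for user, func, n_runs, input_data in rows:
            writer.writerow([user, func, n_runs, input_data])
```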

The runner will write the results to the output file in the form:

<user>,<function>,<return_value>,<run_time_us>,<reset_time_us>

E.g.

demo,hello,0,25.4,0.12
demo,hello,0,29.0,0.18

These can then be parsed and plotted, as is done in the experiment-microbench repo.
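
As a minimal parsing sketch (the mean_runtimes helper is illustrative, not part of the repo), the output file can be read with Python’s csv module to compute the mean run time per function:

```python
import csv
from collections import defaultdict

def mean_runtimes(out_file):
    """Return {(user, function): mean run time in microseconds}."""
    times = defaultdict(list)
    with open(out_file, newline="") as fh:
        for user, func, ret_val, run_us, reset_us in csv.reader(fh):
            times[(user, func)].append(float(run_us))
    return {k: sum(v) / len(v) for k, v in times.items()}
```

For the two demo,hello rows above this gives a mean run time of 27.2 microseconds.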

Using Vector

To get a quick overview of how things are performing you can use Vector and Performance Copilot.

Note that at the time of writing (04/2021) the Vector docs talk about pcp-webapi, but it’s been replaced by pmproxy which is part of the main pcp bundle.

To set up:

  • Install PCP (apt install pcp -y)

  • Check that pmproxy is running and listening on port 44323 (e.g. tail /var/log/pcp/pmproxy/pmproxy.log); this may require a restart of the service

  • Run the Vector container in our dev set-up, e.g. docker compose -f docker-compose-dev.yml up vector

  • Go to http://localhost:80 in your browser

  • Add a connection to host localhost on port 44323

  • Select the faasm-cli container

You should then be able to use Vector to get some high-level performance metrics related to whatever you’re running in the faasm-cli container (e.g. some stress-testing script).

Using perf

It’s easiest to run perf on an out-of-container build (see the dev docs for how to set this up).

Then you can set perf to run without sudo:

sudo sysctl -w kernel.perf_event_paranoid=-1
sudo sysctl -w kernel.kptr_restrict=0

And a standard profiling run:

perf record --call-graph dwarf func_runner demo echo

Off-CPU profiling

Off-CPU profiling can be done with Hotspot.

See their docs.

Flame graphs

There is a task in the Faasm CLI for creating flame graphs that automatically includes the disassembled WebAssembly function names.

Note that this requires the custom LLVM build described below.

# Make sure you can run and disassemble the functions
inv dev.cc func_runner
inv dev.cc func_sym

# Run the flame graph task (which will run perf, replace symbols etc.)
inv flame demo echo --reps=5000 --data="foobar"

# Open the flame graph in your browser
firefox flame.svg

You can use the search feature in the flame graph (Ctrl+F) to find wasm-related entries by searching for wasm.

If you want to do custom set-up for a specific function, you can write an adapted version of func_runner to run your function, then pass it as a command to inv flame:

inv dev.cc my_runner

inv flame <user> <func> --cmd="my_runner <args>"

firefox flame.svg

Profiling WebAssembly code with perf

You can use perf with a standard Faasm build, but the resulting profile may have large gaps attributed to perf.<PID>.map. This is because the wasm code will have been JIT-compiled with the stock LLVM JIT libraries, which don’t emit the perf events we need.

To rebuild LLVM with the right flags, you can run:

./bin/build_llvm_perf.sh

Now you can rebuild the parts of Faasm you’re profiling, e.g.

# The --perf flag switches on the build against the custom LLVM
inv dev.cmake --perf --clean
inv dev.cc func_runner

Then do a profiling run with:

# Standard CPU profiling of demo/hello (too short for a meaningful profile)
perf record -k 1 -F 99 -g func_runner demo hello

# Inject the JIT dumps into perf data
perf inject -i perf.data -j -o perf.data.jit

# View the report
perf report -i perf.data.jit

WebAssembly functions will be output with names like functionDef123; see the development docs for how to map these back to names in the source (using inv disas).
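
As an illustration of that final mapping step, a perf report can be post-processed to substitute these names once you have an index-to-name map from the disassembly. The rename_wasm_symbols helper and the map format below are hypothetical; the real inv disas output format is covered in the development docs:

```python
import re

def rename_wasm_symbols(report_text, symbol_map):
    # Replace functionDef<N> with symbol_map[N] where known,
    # leaving unknown indices untouched
    def _sub(match):
        idx = int(match.group(1))
        return symbol_map.get(idx, match.group(0))
    return re.sub(r"functionDef(\d+)", _sub, report_text)
```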

Note that if the perf notifier isn’t working, check that the relevant code isn’t being excluded by the preprocessor: look at the WAVM LLVMModule.cpp file and grep for WAVM_PERF_EVENTS.

You can also check the diff of the Faasm WAVM fork to see the changes that were made.

Once this is done you can use perf with JIT symbols as described here.

Execution graphs

Faasm supports generating execution graphs with details such as how many instances of each function ran, which host each one was scheduled on, and how long it ran for.

To generate an execution graph, first invoke the function whose execution you want to plot with the asynch and graph flags set:

inv invoke demo hello --asynch --graph

The call will return a call ID, which you can use to query the function’s status:

inv invoke.status --call-id <CALL_ID>

Once the call has SUCCEEDED, you can generate the execution graph with:

inv invoke.exec-graph --call-id <CALL_ID>