Running PREEMPT-RT on the Bubblegum-96

Introduction

Real-time performance is crucial in motion control applications, which are one of my main research interests. To achieve it on Linux, the PREEMPT-RT patch set can be used to make almost the whole kernel preemptible, meaning kernel code can be interrupted to yield the CPU to time-critical tasks.

The Bubblegum-96 is a high-performance 64-bit ARM development board based on an Actions Semiconductor SoC. However, support from the board's manufacturer, uCRobotics, is terrible, and the user community is inactive too. So I am on my own when it comes to implementing functionality that the vendor would normally provide.

Method and Issues

The vendor kernel is a heavily patched 3.10.99, and none of the vendor patches have been mainlined. To apply the RT patches I need at least 3.10.100, which means two patches have to be applied in sequence: first the 3.10.99-100 patch, then the 3.10.100-rt110 patch.
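Since each patch only applies cleanly to one exact base version, it can help to confirm the tree's version between the two steps. The following is a minimal sketch, assuming the standard top-level kernel Makefile layout (VERSION, PATCHLEVEL and SUBLEVEL lines); kernel_version is a hypothetical helper name, not something from the vendor tree.

```shell
# Print the kernel version (for example 3.10.99) recorded in a top-level
# kernel Makefile. Usage: kernel_version [path/to/Makefile]
# Assumption: the Makefile uses the usual "NAME = value" lines.
kernel_version() {
    awk '
        $1 == "VERSION"    && $2 == "=" && v == "" { v = $3 }
        $1 == "PATCHLEVEL" && $2 == "=" && p == "" { p = $3 }
        $1 == "SUBLEVEL"   && $2 == "=" && s == "" { s = $3 }
        END { print v "." p "." s }
    ' "${1:-Makefile}"
}
```

Run from the kernel source root, kernel_version should report 3.10.99 before the first patch and 3.10.100 after it.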

Moreover, the PREEMPT_RT_FULL option is not available on ARM64 platforms in the 3.10 tree, so the highest preemption model I can select is level 4, "Basic RT".
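To see which preemption model a configured tree actually ended up with, the enabled CONFIG_PREEMPT* symbols can be listed from the .config file. This is only a sketch: preempt_options is a hypothetical helper, and the exact symbol names you will see (I would expect something like CONFIG_PREEMPT_RTB for Basic RT) are an assumption to check against your own tree.

```shell
# List the enabled CONFIG_PREEMPT* options in a kernel .config file.
# Usage: preempt_options [path/to/.config]
# Lines such as "# CONFIG_PREEMPT_RT_FULL is not set" are ignored.
preempt_options() {
    grep -E '^CONFIG_PREEMPT[A-Z_]*=y' "${1:-.config}" | sed 's/=y$//'
}
```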

The process is not straightforward. In short, I first run git apply --check -v <patch> to see which files conflict. The --check flag makes this a dry run; without it, git apply modifies the tree as soon as every hunk applies. After that, git blame on a conflicting file points to the culprit. If the culprit is a vendor patch, it needs a git revert (we will reapply these patches later).

After reverting the vendor patches, patching becomes a tedious process of manually editing the patch files. Conflicts have to be reviewed for logical correctness, especially in the scheduler (the *sched*.c files). Emacs's diff mode makes this much easier because it automatically fixes the hunk offsets in the patch file. If you edit the patch in an ordinary text editor instead, git apply will reject it with a "corrupt patch" error.

When git apply --check -v reports no failed hunks, drop the --check argument and apply the patch for real.
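The dry run prints a lot of output for a patch of this size, so a small filter that reduces it to the list of conflicting files can be handy. This is a sketch under the assumption that git reports failed hunks as "error: patch failed: <file>:<line>"; failed_files is a hypothetical helper name.

```shell
# Read the output of a git apply dry run on stdin and print each file that
# has at least one failed hunk, once, in sorted order.
failed_files() {
    sed -n 's/^error: patch failed: \(.*\):[0-9][0-9]*$/\1/p' | sort -u
}

# Typical use (not executed here; "$patch" is a placeholder):
#   git apply --check -v "$patch" 2>&1 | failed_files
#   git blame -L <start>,<end> <file>    # find the commit behind a conflict
```

The 2>&1 matters because git writes these error lines to stderr.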

After the two patches are applied, we can reapply the reverted vendor commits using git cherry-pick. Any conflicts that come up here also have to be resolved manually.
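If many vendor commits were reverted, reapplying them can be scripted. The sketch below assumes you kept a text file with the original commit hashes, one per line and oldest first; the file and the reapply_commits helper are hypothetical, not part of my repository.

```shell
# Re-apply the commits listed in file "$1" (one hash per line, oldest first).
# Set GIT=echo to preview the commands without touching the tree.
reapply_commits() {
    while read -r sha || [ -n "$sha" ]; do
        [ -n "$sha" ] || continue    # skip blank lines
        ${GIT:-git} cherry-pick "$sha" < /dev/null || {
            echo "stopped at $sha: fix conflicts, then git cherry-pick --continue" >&2
            return 1
        }
    done < "$1"
}
```

The loop stops at the first conflict on purpose, since every conflict needs a manual look before continuing with the remaining hashes.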

That's it! Now you have a fully baked RT-kernel :)

Benchmarks

Before testing the latency, be sure to turn off DVFS! Dynamic frequency scaling will totally ruin your real-time performance. This is done by:

sudo cpufreq-set -g performance
sudo cpufreq-set -f 1.6GHz
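It is worth verifying that the setting took effect on every core before measuring anything. A minimal sketch, assuming the standard cpufreq sysfs layout (cpuN/cpufreq/scaling_governor); show_governors is a hypothetical helper, and the optional argument exists only so the function can be pointed at a different root directory.

```shell
# Print "cpuN <governor>" for every CPU that exposes a cpufreq policy.
# Usage: show_governors [sysfs-cpu-root]
show_governors() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue      # no cpufreq support or no match
        cpu="${f#"$root"/}"          # strip the root prefix
        cpu="${cpu%%/*}"             # keep only the cpuN component
        echo "$cpu $(cat "$f")"
    done
}
```

Every line should end in performance; if a core still shows another governor, repeat the cpufreq-set call for that core.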

Now we can start the benchmarks.

Without background stress:

➜  rt-tests git:(master) ✗ sudo ./cyclictest -a -t -n -p99
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.38 0.64 0.81 1/145 19377

T: 0 (17893) P:99 I:1000 C: 125668 Min:      0 Act:   19 Avg:   14 Max:     158
T: 1 (17894) P:99 I:1500 C:  83778 Min:      0 Act:   19 Avg:   14 Max:     150
T: 2 (17895) P:99 I:2000 C:  62834 Min:      0 Act:   19 Avg:   14 Max:     117
T: 3 (17896) P:99 I:2500 C:  50267 Min:      0 Act:   19 Avg:   14 Max:      86

With the system under heavy load, generated by stress -c 4 -i 1 -m 1 --vm-bytes 128M -t 100s:

➜  rt-tests git:(master) ✗ sudo ./cyclictest -p 99 -t 4 -n -a
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 7.77 3.95 2.01 9/160 23279

T: 0 (21233) P:99 I:1000 C: 285193 Min:      0 Act:   16 Avg:   15 Max:     342
T: 1 (21234) P:99 I:1500 C: 190122 Min:      0 Act:   16 Avg:   16 Max:     320
T: 2 (21235) P:99 I:2000 C: 142596 Min:      0 Act:   16 Avg:   17 Max:     254
T: 3 (21236) P:99 I:2500 C: 114077 Min:      0 Act:   16 Avg:   15 Max:     190

The worst-case latency stays below 400 us even under load (342 us at most, against 158 us on the idle system), which is good enough for typical control purposes.
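For longer runs it is easier to extract the worst case from a saved log than to read it off the screen. This sketch assumes the cyclictest output format shown above, with per-thread lines starting with "T:" and a "Max:" field; worst_latency is a hypothetical helper name.

```shell
# Read cyclictest output on stdin and print the largest "Max:" value (in us)
# found across all per-thread lines.
worst_latency() {
    awk '
        /^T:/ {
            for (i = 1; i < NF; i++)
                if ($i == "Max:" && $(i + 1) + 0 > max)
                    max = $(i + 1) + 0
        }
        END { print max + 0 }
    '
}
```

Fed the four thread lines from the loaded run above, it prints 342.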

Code

See https://github.com/ProfFan/bubblegum96-linux-rt.