Machinekit

Machinekit

Threads

TODO: newthread documentation

Latency Tuning

newthread cgname=/rt

On some SMP systems, a HAL thread’s latency measurements (using latency-test) have been seen to improve by an order of magnitude when it is run by itself on its own CPU and L2 cache.

While the author has data on few CPUs, a well-tested example is the Intel J1900.
This CPU has four cores, with the first two and the second two each sharing a L2 cache.
Running an RT_PREEMPT HAL thread normally, without dedicated CPU, latency reached a maximum of about about 40us; dedicating only CPU 2 or 3 yielded the same results.

Dedicating both CPUs 2 and 3, which effectively dedicated one of the L2 caches to the thread, maximum latency reached about 6us, with similar results for POSIX (non-RT) threads.

Running a HAL thread on particular CPUs is done with a Linux cgroup cpuset in two parts.

The cpuset must first be set up outside of HAL, assigning CPUs to it and preventing Linux from scheduling processes in the cpuset.
Then in the HAL configuration, the thread must be configured to run on the cpuset.

Setting up a cpuset

The set of CPUs to dedicate to HAL threads must be decided. Ideally, just one CPU should be dedicated to one HAL thread; however, sharing cache may hurt latency (as in the above Intel J1900 example).

Consult CPU documentation and run tests to determine whether additional CPUs must be dedicated to the cpuset.

Determine the string specifying the CPU list. CPU numbers start from zero, and group in ranges and lists. Valid examples include 3, 2-3, 2,3, 5-7,2.

Determine the cgroup name. A good choice is /rt.

The following examples assume a 4-core CPU with CPUs 2,3 in the /rt cpuset.

The dedicated cpuset may be initialized outside of HAL using one of the following two known methods.

Dedicated cpuset using partrt script

The /rt cpuset with dedicated CPUs may be set up manually, in one shot, with the partrt script from the rt-tools project.

Build and install that project, then run partrt create \#2,3.

If this succeeds, partrt list should show two groups, including the rt group with configured CPUs, and an nrt group with all other CPUs.

This is easily undone with partrt undo.

Dedicated cpuset using kernel isolcpus cmdline

CPUs may be isolated from the Linux scheduler at boot time using the isolcpus kernel command line argument, and setting up the cpuset during system init.

This approach is more complex, but is automatic and may do a better job of isolating CPUs than partrt (this claim needs verification).

On Debian, edit /etc/default/grub and add the CPU list to the grub command line:

GRUB_CMDLINE_LINUX_DEFAULT="[...] isolcpus=2,3"

Update the grub configuration with sudo update-grub. After a reboot, the argument should be visible in /proc/cmdline.

The /rt cgroup may be set up in system init scripts. In Debian, install the cgroup-tools package. Create /etc/cgconfig.conf with the following contents:

group rt {
    perm {
        task {
            uid = root;
            gid = root;
        }
        admin {
            uid = root;
            gid = root;
        }
    }
    cpuset {
        cpuset.cpus = 2,3;
        cpuset.mems = 0;
        cpuset.cpu_exclusive = 1;
        cpuset.sched_load_balance = 0;
    }
}

Create the /lib/systemd/system/cgred.service file:

[Unit]
Description=Control Group configuration service

# The service should be able to start as soon as possible,
# before any 'normal' services:
DefaultDependencies=no
Conflicts=shutdown.target
Before=basic.target shutdown.target

[Service]
Type=oneshot
RemainAfterExit=yes
Delegate=yes
ExecStart=/usr/sbin/cgconfigparser -l /etc/cgconfig.conf -s 1664
ExecStop=/usr/sbin/cgclear -l /etc/cgconfig.conf -e

[Install]
WantedBy=sysinit.target

Then configure to start automatically with

sudo systemctl enable cgred.service

After reboot, lscgroup cpuset:/ should show the /rt cgroup, and cgget -g cpuset /rt should show the configured cpuset.cpus.

Disable these by removing isolcpus from /etc/default/grub and re-running sudo update-grub, running

sudo systemctl disable cgred.service

and rebooting.

Verify that isolcpus is absent from /proc/cmdline and running lscgroup cpuset:/ shows no cpuset:/rt cgroup.

Running a HAL thread

Once the cpuset is configured, run a HAL thread in the cpuset in the HAL configuration with newthread cgname=/rt.

Testing

Run the latency-test script in the cgroup with

CGNAME=/rt latency-test

Run ps -Leo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,comm,args, checking the PSR column.

The only threads running on the /rt cpuset CPUs should be some number of kernel threads (surrounded by square braces in the COMMAND column, e.g. [kworker/2:0]) and the RT HAL threads (rtapi:0 in the COMMAND column).

newthread cpus=3

TODO: Document the cpus thread affinity argument, how it compares with cpusets, and how it interacts with cpusets.