This subsection contains a slightly cleaned up version of a message from John Morris to the emc-developers list, 20130206, part of an ongoing dialogue under the subject-line: Future directions of LinuxCNC development [and RTAPI restructuring tutorial]. Any errors in transcription can be blamed on me, Kent Reed aka CNCDreamer.
This email is a bit of a deep dive into what the RTAPI 'cleanups' are
about. Apologies for the great length, but I implore
anyone both suspicious of the RTAPI work and also influential in setting
the direction of LinuxCNC's development to at least scan through so it
gets a fair shake.
I hope to show that the 'cleanups' really are just cleanups: they are
well-contained, introduce almost no new code and make almost no semantic
changes to existing code, do not touch the API, preserve the design and
flow (as Michael said), and are conceptually simple.
At the same time, there are a number of benefits that I hope will be
evident. Adding new RT technologies is greatly simplified. Bugs in
what was highly duplicated boilerplate now need only be fixed in a
single place. And now there is the unexpected opportunity to create a
'unified binary' that will allow a single LinuxCNC binary to run on
multiple [ed: in the sense of different] thread systems without recompilation.
Idea behind the cleanups:
The current mainline code supports three thread systems: RTAI, sim
('posix' after the cleanups) and RTLinux. Any thread system
implementation must define several functions declared, documented and
grouped by function in src/rtapi/rtapi.h:
Looking inside the old RT implementations, it is clear that one was
written first, and following implementations simply copied
<threads>_rtapi.c and <threads>_ulapi.c into new files and replaced the
RT system-specific code, a small portion of the overall code. A 'diff
-u rtl_rtapi.c rtai_rtapi.c | diffstat' shows that 590 of ~1750 lines
differ between the two rtapi implementations, and ~200 of ~840 lines in
ulapi (the important differences are actually much smaller).
This same pattern holds for sim and the new PREEMPT_RT, Xenomai user and
Xenomai kernel thread systems. Furthermore, a close look back at
rtapi.h shows that many functions like shmem, timing and messaging
functions are duplicated yet again between RTAPI (kernel side) and ULAPI
(userland) within a single thread system. That's a lot of duplication
in two dimensions.
When Drill Sergeant Haberler barked the order to start cleaning up our
thread implementations (he held me responsible for the PREEMPT_RT code
after I brought it up to date last May), I couldn't see how to clean up
'just a little' while leaving this mess in place. The initial example
seemed easy (wrong), so I committed to cleaning the whole RTAPI system.
Basic organization of cleanups
Examine these two trees in Michael's git repo representing before and
after snapshots of the work, almost all done within the src/rtapi directory.
Before cleanups (with Xenomai and PREEMPT_RT threads):
After cleanups (today):
You'll first notice new files named after the groupings in rtapi.h. The
main rtapi function definitions, stripped of thread-specific code, have
been removed from the respective thread system files and merged into
generic files of more manageable size; e.g. messaging functions are now
in rtapi_msg.c and shared memory functions in rtapi_shmem.c.
Each of the thread systems has its own .c and its own .h file, e.g.
rtai-kernel.c and rtai-kernel.h (posix threads, formerly 'sim', are now
merged with rt-preempt-user). These source files retain only the thread
system-specific code, most in what I call 'hook functions' that are
called from the generic code. Formerly duplicated boilerplate is left
in the generic rtapi_<function>.c files.
Places where thread-specific code is stripped out of the common code
instead contain a call to the appropriate thread-specific 'hook
function'. For example, the generic rtapi_clock_set_period() function
(declared in rtapi.h and defined in rtapi_time.c) contains a call to
rtapi_clock_set_period_hook() that each kernel thread system must
define in its respective .c file. Because the boilerplate was virtually
identical, in every thread system the RT system-specific code was always
in the same place in each function, so this method of using hooks is
very clean, with zero or one hook per function, and one exception
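
[ed: The hook pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the tree: the hook's signature, the 1000 ns rounding, and the timer_period_ns variable are assumptions made for the example, and the real function has more checks.]

```c
#include <errno.h>

/* Hypothetical stand-in for the shared timer state kept by the generic code. */
static long timer_period_ns = 0;

/*
 * Thread-system-specific part. In the cleaned-up layout each thread system
 * would supply its own version in its own .c file. This fake one just rounds
 * the request to a 1000 ns tick (an assumption for illustration only).
 */
static long rtapi_clock_set_period_hook(long nsecs)
{
    const long tick = 1000;
    return ((nsecs + tick / 2) / tick) * tick;
}

/*
 * Generic part: the boilerplate that used to be copied into every thread
 * system now lives in one place and delegates a single step to the hook.
 */
long rtapi_clock_set_period(long nsecs)
{
    if (nsecs == 0)
        return timer_period_ns;      /* query only, nothing to set */
    if (nsecs < 0 || timer_period_ns != 0)
        return -EINVAL;              /* bad request, or period already set */

    timer_period_ns = rtapi_clock_set_period_hook(nsecs);
    return timer_period_ns;
}
```

[ed: Only rtapi_clock_set_period_hook() would change from one thread system to the next; the checks around it are written once.]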
And that's the basic idea. In many functional groups nearly all code
was the same and hooks were not needed at all. The messaging functions,
for instance, were identical for all systems except RTAI, which has its
own printk() function. In this case where a hook overcomplicates, the
build is controlled by generic macros that the exceptional thread system
header file may set ('#define RTAPI_PRINTK rt_printk' in this case);
otherwise a suitable default is used.
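
[ed: The macro-with-default approach might look something like the sketch below. Only the RTAPI_PRINTK name and the RTAI '#define' come from the message; the function name sketch_print(), the buffer size, and the use of printf as the default are assumptions for a userland illustration.]

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * A thread system header may define RTAPI_PRINTK before this point, for
 * example '#define RTAPI_PRINTK rt_printk' for RTAI. If it does not, fall
 * back to a default. printf is a userland stand-in chosen for this sketch.
 */
#ifndef RTAPI_PRINTK
#define RTAPI_PRINTK printf
#endif

#define MSG_BUF_LEN 256   /* hypothetical buffer size */

/*
 * Hypothetical generic message function: format once into a buffer, then
 * hand the result to whichever print routine the build selected.
 * Returns the length vsnprintf reports for the formatted message.
 */
int sketch_print(const char *fmt, ...)
{
    char buf[MSG_BUF_LEN];
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);

    RTAPI_PRINTK("%s", buf);
    return n;
}
```

[ed: No hook function is needed here; the one exceptional thread system overrides a single macro and everything else stays shared.]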
Here's a quick walk-through of how it works using rtapi_task_start() as
an example. This function, which all thread systems must implement, is
declared in rtapi.h (see line 477, 'After' version). In all three kernel
thread styles (RTAI, RTLinux and Xenomai-kernel), the function starts
with nearly identical boilerplate that validates task ID and state,
checks for a running timer, and sets up pointers. RT system-specific
code then starts the task, and is finally followed by more almost
identical boilerplate. A quick eyeball makes it instantly obvious that
most of the code is generic: see xenomai_kernel_rtapi.c:691,
rtai_rtapi.c:764, and rtl_rtapi.c:837.
In the 'after' tree, the rtapi_task_start() definition has been moved to
rtapi_task.c:325. The boilerplate is intact, but on line 346 the thread
system code is replaced with a call to rtapi_task_start_hook(). This is
defined in rtai-kernel.c:156 and xenomai-kernel.c:256. RTLinux was left
out since I don't have that kernel in my menagerie, but it would be a
few hours' work to sift new rtlinux-kernel.[ch] files.
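
[ed: The shape of the walk-through above, boilerplate then hook then more boilerplate, can be sketched like this. It is a simplified illustration and not the contents of rtapi_task.c: the task_data struct, the state names, the hook's argument list, and the hook_calls counter are all assumptions made for the example.]

```c
#include <errno.h>

#define MAX_TASKS 4   /* hypothetical limit for the sketch */

/* Hypothetical subset of task states. */
enum task_state { TASK_EMPTY, TASK_PAUSED, TASK_PERIODIC };

typedef struct {
    enum task_state state;
    unsigned long period_nsec;
} task_data;

static task_data task_array[MAX_TASKS + 1];
static unsigned long timer_period = 1000000;  /* pretend a timer is running */
static int hook_calls;                        /* lets the sketch observe the hook */

/* Thread-system-specific step: a stand-in that only counts its calls. */
static int rtapi_task_start_hook(task_data *task, int task_id,
                                 unsigned long period_nsec)
{
    (void)task; (void)task_id; (void)period_nsec;
    hook_calls++;
    return 0;
}

/* Generic function: shared boilerplate wrapped around one hook call. */
int rtapi_task_start(int task_id, unsigned long period_nsec)
{
    task_data *task;
    int retval;

    /* Leading boilerplate: validate task ID and state, check the timer. */
    if (task_id < 1 || task_id > MAX_TASKS)
        return -EINVAL;
    task = &task_array[task_id];
    if (task->state != TASK_PAUSED)
        return -EINVAL;
    if (timer_period == 0)
        return -EINVAL;

    /* The only thread-system-specific step. */
    retval = rtapi_task_start_hook(task, task_id, period_nsec);
    if (retval != 0)
        return retval;

    /* Trailing boilerplate: record the new state. */
    task->period_nsec = period_nsec;
    task->state = TASK_PERIODIC;
    return 0;
}
```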
Chopping out 4k lines of duplicated code doesn't merely please my
aesthetic senses! The resulting merged boilerplate and separated thread
system-specific code naturally form a clean break that makes it easy to
see what minimal code is needed to add a new RT system. The RTAI code
is only 275 lines, compared to about 1700 lines of more dense and
daunting pre-cleanup code.
This clean split is also what made it apparent that a 'universal binary'
is just a small step away. All that's needed, as Michael wrote, is RT
kernel detection code (done), the means to load the right shared lib or
kernel module (done), and an interface struct that replaces #ifdefs
with runtime conditionals and provides pointers to the hook functions
instead of calling them directly. I am particularly fond of
the unified binary feature because it simplifies packaging and doesn't
require the user to choose the correct package that matches the
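
[ed: One way to picture the interface struct mentioned above is a table of hook pointers chosen at runtime. The sketch below is only an assumption about what such a struct could look like: the type name rtapi_flavor_ops, its fields, the stand-in hooks, and select_flavor() are all hypothetical, and kernel detection is reduced to a string passed in by the caller.]

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interface struct: one set of hook pointers per thread system. */
typedef struct {
    const char *name;
    int  (*task_start_hook)(int task_id, unsigned long period_nsec);
    long (*clock_set_period_hook)(long nsecs);
} rtapi_flavor_ops;

/* Stand-in hooks; real ones would call into the RT system. */
static int posix_task_start(int task_id, unsigned long period_nsec)
{
    (void)task_id; (void)period_nsec;
    return 0;
}

static long posix_clock_set_period(long nsecs)
{
    return nsecs;
}

static int xenomai_task_start(int task_id, unsigned long period_nsec)
{
    (void)task_id; (void)period_nsec;
    return 0;
}

static long xenomai_clock_set_period(long nsecs)
{
    return nsecs;
}

static const rtapi_flavor_ops flavors[] = {
    { "posix",   posix_task_start,   posix_clock_set_period   },
    { "xenomai", xenomai_task_start, xenomai_clock_set_period },
};

/*
 * Pick the hook table for the detected thread system at runtime instead of
 * at compile time with #ifdef. Returns NULL if the name is not known.
 */
const rtapi_flavor_ops *select_flavor(const char *detected)
{
    size_t i;

    for (i = 0; i < sizeof(flavors) / sizeof(flavors[0]); i++) {
        if (strcmp(flavors[i].name, detected) == 0)
            return &flavors[i];
    }
    return NULL;
}
```

[ed: Generic code would then call through the selected table, for example ops->task_start_hook(task_id, period_nsec), rather than calling a hook that was fixed when the binary was built.]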
One extra benefit: while debugging my initial mistakes, it became
clear that any bugs found in the common code can now be fixed in one
place. No more need to manually apply fixes across several copies of
boilerplate. Relatedly, a great many useful debugging statements and
comments that existed in some copies but not others were integrated into
the common code.