LinuxCNCKnowledgeBase | RecentChanges | PageIndex | Preferences | LinuxCNC.org

Troubleshooting

1. LinuxCNC doesn't run - wrong kernel
2. LinuxCNC doesn't run - missing lapic
3. Checking the RealTime subsystem
3.1. RT device files
4. Unexpected realtime delay; check dmesg for details
4.1. RTAI Latency test
4.1.1. On-board video
4.1.2. On-board audio
4.1.3. APM and ACPI bios settings
4.1.4. AMD APU UEFI settings
4.1.5. AMD APU kernel command line parameters
4.1.6. NVidia and ATI graphics cards
5. Display Issues
5.1. Using Vesa Drivers
5.2. Installing Software-based OpenGL
6. Also take note of hardware that isn't plugged in all the time...
7. Some Intel boards have issues with the SMI (System Management Interrupt).
8. PC speaker module (pcspkr)
9. Additional potential source of latency problems
10. Parallel port no longer works in EMC 2.0.1 or later (hal_parport: Device or resource busy)
11. Parallel port no longer works in EMC 2.0.1 or later (emc starts but motors don't turn)
12. Stepper motors lose steps
13. Mesa 5i20 FPGA firmware and/or driver won't load
14. Printing to parallel-port printers does not work
15. Testing Parallel Port Outputs
16. USB thumb drive (flash drive) FAT corrupted
17. Dedicating a CPU Core via the isolcpus Boot parameter
18. Other tricks to improve Real Time performance documented elsewhere in this wiki

1. LinuxCNC doesn't run - wrong kernel

Error messages like this:
....
Realtime system did not load
....
Debug file information:
insmod: can't read
'/usr/realtime-2.6.24-19-generic/modules/rtai_hal.ko': No such file or directory
This error means the wrong kernel has been booted (in this case 2.6.24-19-generic). Always make sure you have an RTAI-patched kernel selected at boot time (2.6.32-122-rtai for Lucid/10.04, 2.6.24-16-rtai for Hardy/8.04, 2.6.15-magma for Dapper/6.06).
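
A quick way to confirm which kernel is running before starting LinuxCNC is to look at the release string from uname -r. The sketch below is only an illustration: the helper name is_rtai_kernel is made up for this page, and it assumes, as in the kernel names listed above, that a realtime kernel's release string ends in -rtai or -magma.

```shell
# Hypothetical helper: succeed if a kernel release string looks like one of
# the RTAI-patched kernels named above (suffix -rtai or -magma).
is_rtai_kernel() {
    case "$1" in
        *-rtai|*-magma) return 0 ;;
        *)              return 1 ;;
    esac
}

# Check the kernel that is running right now.
if is_rtai_kernel "$(uname -r)"; then
    echo "realtime kernel booted: $(uname -r)"
else
    echo "not a realtime kernel: $(uname -r)"
fi
```

If the second message appears, reboot and pick the realtime kernel from the boot menu.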

2. LinuxCNC doesn't run - missing lapic

This most often happens on Ubuntu Lucid/10.04. Error messages like this:
....
Realtime system did not load
....
Debug file information:
insmod: error inserting '/usr/realtime-2.6.32-122-rtai/modules/rtai_hal.ko': -1 Operation not permitted

Additionally the following lines are found in dmesg:

....
[    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic" 
....
[   54.798391] RTAI[hal]: ERROR, LOCAL APIC CONFIGURED BUT NOT AVAILABLE/ENABLED.

This means your computer has a Local APIC, but it is not enabled. RTAI expects it to be present and enabled in order to work. You can force it on with a kernel boot parameter.

In /etc/default/grub, change

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

to 

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash lapic"

Afterwards, run "sudo update-grub" and restart.
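
For those who prefer to script the edit, here is a minimal sketch. The function name add_kernel_param is hypothetical, and it assumes the file contains a single GRUB_CMDLINE_LINUX_DEFAULT="..." line in the usual double-quoted form. It writes the modified file to standard output rather than changing anything in place, so the result can be reviewed first.

```shell
# Hypothetical helper: print FILE ($2) with PARAM ($1) appended to the
# GRUB_CMDLINE_LINUX_DEFAULT line, unless PARAM is already there.
add_kernel_param() {
    awk -v p="$1" '
        /^GRUB_CMDLINE_LINUX_DEFAULT="/ {
            line = $0
            sub(/^GRUB_CMDLINE_LINUX_DEFAULT="/, "", line)  # strip the prefix
            sub(/"[ \t]*$/, "", line)                       # strip the closing quote
            n = split(line, words)                          # split on whitespace
            found = 0
            for (i = 1; i <= n; i++) if (words[i] == p) found = 1
            if (!found) line = (line == "" ? p : line " " p)
            print "GRUB_CMDLINE_LINUX_DEFAULT=\"" line "\""
            next
        }
        { print }
    ' "$2"
}

# Example (review /tmp/grub.new before copying it over the original):
#   add_kernel_param lapic /etc/default/grub > /tmp/grub.new
```

Running the function a second time on its own output leaves the line unchanged.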

3. Checking the RealTime subsystem

LinuxCNC uses a RealTime operating system in order to ensure precise timing of I/O signals and trajectory calculations. Error messages like one of the following may indicate that the realtime operating system is unreachable:
Can't write to /dev/rtai_shm - aborting
RTAPI: ERROR: could not open shared memory
ERROR: Could not load 'rtapi'
insmod: error inserting '/lib/modules/2.6.12.6-magma/rtai/rtai_up.ko': 
-1 Operation not permitted
A script is provided with LinuxCNC to load and unload all the realtime modules, and is much quicker for troubleshooting than starting and stopping all of them individually. To use it, type:
 /etc/init.d/realtime 

3.1. RT device files

There are several device files required for Linux applications such as LinuxCNC to talk to the RealTime OS. These device files have a reputation for getting deleted by udev. Make sure that these files exist:

ubuntu:$ ls -l /dev/rt*
lrwxrwxrwx  1 root root 8 2006-05-23 23:34 /dev/rtai_shm -> RTAI_SHM
crw-rw----  1 root audio 10, 135 2006-05-14 16:12 /dev/rtc

ubuntu:$ ls -l /dev/RTAI_SHM 
crw-rw-rw-  1 root root 10, 254 2006-05-23 23:34 /dev/RTAI_SHM

Or, your system may look something like this:

fedora]$ ls -l /dev/rt*
crw-rw-rw-  1 root root  10, 254 Jul 18  2005 /dev/rtai_shm
crw-r--r--  1 root root  10, 135 Jun  4 16:21 /dev/rtc
crw-rw-rw-  1 root root 150,   0 Jul 18  2005 /dev/rtf0
crw-rw-rw-  1 root root 150,   1 Jul 18  2005 /dev/rtf1
crw-rw-rw-  1 root root 150,   2 Jul 18  2005 /dev/rtf2
crw-rw-rw-  1 root root 150,   3 Jul 18  2005 /dev/rtf3
crw-rw-rw-  1 root root 150,   4 Jul 18  2005 /dev/rtf4
crw-rw-rw-  1 root root 150,   5 Jul 18  2005 /dev/rtf5
crw-rw-rw-  1 root root 150,   6 Jul 18  2005 /dev/rtf6
crw-rw-rw-  1 root root 150,   7 Jul 18  2005 /dev/rtf7
crw-rw-rw-  1 root root 150,   8 Jul 18  2005 /dev/rtf8
crw-rw-rw-  1 root root 150,   9 Jul 18  2005 /dev/rtf9
If these files do not exist, you can create them, for example: sudo mknod /dev/RTAI_SHM c 10 254
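
The check can also be scripted. The sketch below prints a mknod command for each node that is missing. The function name missing_rt_nodes is hypothetical, the major and minor numbers are copied from the listings above, and not every installation needs the rtf FIFO nodes (the Ubuntu listing above does not show them), so treat the output as a suggestion rather than a requirement.

```shell
# Hypothetical helper: print the mknod command for every RTAI device node
# that is missing from the given directory (default /dev).
missing_rt_nodes() {
    dev=${1:-/dev}
    # The shared-memory node appears as RTAI_SHM or rtai_shm in the listings.
    if [ ! -c "$dev/RTAI_SHM" ] && [ ! -c "$dev/rtai_shm" ]; then
        echo "mknod $dev/RTAI_SHM c 10 254"
    fi
    # FIFO nodes rtf0 .. rtf9 use major 150 and minor 0 .. 9.
    i=0
    while [ "$i" -le 9 ]; do
        [ -c "$dev/rtf$i" ] || echo "mknod $dev/rtf$i c 150 $i"
        i=$((i + 1))
    done
    return 0
}

missing_rt_nodes /dev
```

The printed commands need to be run with sudo. No output means all the nodes were found.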

4. Unexpected realtime delay; check dmesg for details

Starting in LinuxCNC (then EMC2) version 2.0.4, the number of CPU cycles between invocations of the real-time motion thread is tracked. If you see this message, it usually indicates that some element of your hardware is incompatible with the real-time software. See the next section for more information about possible causes and remedies.

4.1. RTAI Latency test

To specifically test the operation of the RealTime kernel, use the kernel latency test supplied with RTAI.
DO NOT TRY TO RUN LinuxCNC WHILE THE TEST IS RUNNING

Newer versions include a graphical latency test that you can start with a click. On Ubuntu Dapper/6.06, run the test like this:

sudo mkdir /dev/rtf; sudo mknod /dev/rtf/3 c 150 3;
sudo mknod /dev/rtf3 c 150 3; 
cd /usr/realtime*/testsuite/kern/latency; ./run

and then you should see something like this:

ubuntu:/usr/realtime-2.6.12-magma/testsuite/kern/latency$ ./run 
*
*
* Type ^C to stop this application.
*
*

## RTAI latency calibration tool ##
# period = 100000 (ns) 
# avrgtime = 1 (s)
# do not use the FPU
# start the timer
# timer_mode is oneshot

RTAI Testsuite - KERNEL latency (all data in nanoseconds)
RTH|    lat min|    ovl min|    lat avg|    lat max|    ovl max|   overruns
RTD|      -1571|      -1571|       1622|       8446|       8446|          0
RTD|      -1558|      -1571|       1607|       7704|       8446|          0
RTD|      -1568|      -1571|       1640|       7359|       8446|          0
RTD|      -1568|      -1571|       1653|       7594|       8446|          0
RTD|      -1568|      -1571|       1640|      10636|      10636|          0
RTD|      -1568|      -1571|       1640|      10636|      10636|          0

There should be no overruns, and "lat max" should probably be below your BASE_PERIOD setting.
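
If the test output has been saved to a file, the two figures that matter can be pulled out with a short script. This is a sketch under stated assumptions: the function names latency_summary and latency_ok are hypothetical, the column order is assumed to match the RTH header shown above, and the 50000 ns used in the example is only a placeholder for your own BASE_PERIOD.

```shell
# Hypothetical helper: read latency test output on stdin and print the
# largest "lat max" and the largest "overruns" value found in the RTD lines.
latency_summary() {
    awk -F'|' '
        BEGIN { worst = 0; over = 0; seen = 0 }
        $1 == "RTD" {
            if ($5 + 0 > worst) worst = $5 + 0   # column 5: lat max
            if ($7 + 0 > over)  over  = $7 + 0   # column 7: overruns
            seen = 1
        }
        END {
            if (!seen) exit 2                    # no RTD lines at all
            printf "lat_max=%d overruns=%d\n", worst, over
        }
    '
}

# Hypothetical helper: succeed if there were no overruns and the worst
# "lat max" is below the period (in ns) given as $1.
latency_ok() {
    summary=$(latency_summary) || return 2
    lat=${summary#lat_max=}
    lat=${lat%% *}
    over=${summary##*overruns=}
    [ "$over" -eq 0 ] && [ "$lat" -lt "$1" ]
}

# Example, with the test output saved in latency.log:
#   latency_summary < latency.log
#   latency_ok 50000 < latency.log && echo "latency looks acceptable"
```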

Some issues with hardware that cause the RT kernel to not work correctly:

4.1.1. On-board video

On-board video on some systems can sometimes cause latency problems. However, on many systems it is not a problem. If you have an alternative graphics card available then it might well be worth experimenting with using it.

Avoid anything NVIDIA. Old Matrox cards (Millennium, G400, G450 era) work great. Some older ATI cards work great; not so sure about recent ones.

4.1.2. On-board audio

Disable it if you have no plans to use audio. Even if you don't use it by playing music or videos, system noises such as bells or notifications can invoke the audio drivers.

4.1.3. APM and ACPI bios settings

Turn off everything you can: any power saving, anything related to suspending, CPU frequency scaling, and so on. However, enabling ACPI may be the only way to get access to the local APIC timer (which has much lower latency than the PIC), because new mainboards no longer come with legacy MPTABLE support. Disable the C1E power-saving feature in the BIOS (this could save about 10-15ms on recent CPUs). This feature is active regardless of ACPI or APM, so it needs to be disabled independently. See [Wikipedia] for more information on this.

4.1.4. AMD APU UEFI settings

CPU Frequency - Set to Highest Stable or Lower

Disable - Core Performance Boost, Turbo CPB, Cool&Quiet, SVM Mode, C6 Mode, APM

All power management and core speed stepping should be off. The CPU cores should be kept running at their maximum frequency; otherwise latency will suffer when cores are slowed or stopped because idle time is detected.

4.1.5. AMD APU kernel command line parameters

As of this writing (Aug. 2014), the hardware-accelerated driver for AMD APUs with Northern Islands (HD 6XXX) and newer graphics cores causes large latency spikes. The fix for the time being is to disable the hardware-accelerated driver and use llvmpipe instead. Disabling kernel mode setting on the kernel command line makes the system fall back to llvmpipe.

The kernel command line parameter to disable the hardware driver is:

radeon.modeset=0

Another fix for poor latency on AMD APUs is to add the isolcpus=x kernel parameter with RTAI kernels. For more information, see: The Isolcpus Boot Parameter And GRUB2.
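
After changing the boot configuration and rebooting, it can be worth confirming that a parameter actually reached the kernel. The running kernel's command line is available in /proc/cmdline. The helper name has_kernel_param below is hypothetical; this is a sketch, not part of LinuxCNC.

```shell
# Hypothetical helper: succeed if the exact word $1 appears in the
# space-separated kernel command line given as $2.
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Check the running kernel for the parameter discussed above.
if has_kernel_param radeon.modeset=0 "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "radeon.modeset=0 is active"
else
    echo "radeon.modeset=0 is not on the kernel command line"
fi
```

The match is on whole words, so a parameter such as isolcpus=3 has to be given with its value.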

4.1.6. NVidia and ATI graphics cards

Your machine should have smooth movement while opening and closing windows on Linux. If not, there may be an issue with the RT kernel or with the way it works together with your setup. First try using the open-source driver (e.g., "nouveau" (formerly "nv") instead of "nvidia") and then try using the unaccelerated driver ("vesa"). You can also try the latest proprietary driver appropriate for your system (if one is offered), but be sure to check it for acceptable performance on the latency test.

5. Display Issues

If you see part of the Axis tool path display missing, or just plain weird, or if the mouse cursor is a blob, or even if the whole machine freezes while showing video problems, you may be seeing the effects of incompatibility between your display driver and LinuxCNC.

There are two possible solutions. Either edit the xorg.conf file to use the generic, slow but reliable VESA driver (LinuxCNC is not graphics intensive, so this won't matter), or install an alternative software-only OpenGL driver (see Installing Software-based OpenGL, below).

5.1. Using Vesa Drivers

To edit xorg.conf you need to have one. This file is optional in Ubuntu Lucid/10.04. If you cannot see an xorg.conf in /etc/X11/, one can be created with the command:
sudo Xorg -configure

Note: You cannot use this command while an X server is running. To generate xorg.conf, switch to a non-graphical console using the key combination CTRL + ALT + F1 (or F2, F3, etc.). Use CTRL + ALT + F7 to switch back to the graphical console once the X server is running again.

Now execute the following commands:

This command will stop the X server.

sudo service gdm stop

Now we need to generate the xorg.conf file:

sudo Xorg -configure

This has generated the file in ~/xorg.conf.new. We need to tell the X server to use it, so we have to put this file inside /etc/X11/

sudo mv ~/xorg.conf.new /etc/X11/xorg.conf

Now try to restart the X server and see what happens:

sudo service gdm start

So long as X starts again you can now edit /etc/X11/xorg.conf and then re-boot to see the changes. If X fails to start, changing the name will allow you to re-start the X server on the old settings.

sudo mv /etc/X11/xorg.conf /etc/X11/xorg.conf.failed

If you do have an existing xorg.conf then make a copy of it so that you can undo the changes:

sudo cp /etc/X11/xorg.conf /etc/X11/xorg.conf.bak

Then open xorg.conf using this terminal command

sudo gedit /etc/X11/xorg.conf
and look down until you find the following lines.

     Section "Device"
            Identifier   "Configured Video Device"

Add the following line immediately below this.

            Driver "vesa" 

Alternatively, you might find a similar line that already names a video driver (for example "r128" for a Rage 128). In that case, edit the line and change the driver name to "vesa".
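
The same edit can be sketched as a small filter. The function name set_vesa_driver is hypothetical, and the script assumes the keywords are capitalized as in an Xorg-generated file (Section, Driver, EndSection). It replaces an existing Driver line inside the "Device" section, or adds one just before EndSection if there is none, and prints the result to standard output so the original file is left untouched.

```shell
# Hypothetical helper: print xorg.conf ($1) with the Device section's
# driver set to "vesa". Other sections are passed through unchanged.
set_vesa_driver() {
    awk '
        /^[ \t]*Section[ \t]+"Device"/ { in_device = 1; has_driver = 0 }
        in_device && /^[ \t]*Driver[ \t]/ {
            print "\tDriver\t\"vesa\""       # replace the existing driver
            has_driver = 1
            next
        }
        in_device && /^[ \t]*EndSection/ {
            if (!has_driver) print "\tDriver\t\"vesa\""   # none found: add one
            in_device = 0
        }
        { print }
    ' "$1"
}

# Example (make the backup described above first, then review the output):
#   set_vesa_driver /etc/X11/xorg.conf > /tmp/xorg.conf.vesa
```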

Save the file and restart the X system by holding down the keyboard keys <control> and <alt> and tapping <Backspace>. You'll have to log in again when X restarts.

If the display does not come up properly and you are dropped into text mode you should be able to log in using your name and password and issue the following commands.

sudo rm /etc/X11/xorg.conf
sudo cp /etc/X11/xorg.conf.bak /etc/X11/xorg.conf
You may be able to restart the X server using this command.
sudo gdm
If not, you'll have to reboot and try the edit again. As always, YMMV.

Note: If the default resolution of the vesa driver is too low, users have found that explicitly specifying the refresh rates allowed them to achieve the desired resolution, e.g.,

....
Section "Monitor"
	Identifier	"Configured Monitor"
	HorizSync	30-60
	VertRefresh	60-75
EndSection
...

Important: do not guess these values, or you may damage your monitor. Use xrandr to get the exact figures.

5.2. Installing Software-based OpenGL

Sometimes Axis locks up on startup. In most cases this is caused by interference between RTAI and hardware-accelerated OpenGL rendering. One workaround is to use the VESA video driver as mentioned above, but some TFT monitors refuse to work with the default refresh rate of the VESA driver. An alternative to fiddling with the settings of the VESA driver is to install the software-only OpenGL rendering package libgl1-mesa-swx11. It replaces libgl1-mesa-glx. Because the ubuntu-desktop metapackage depends on libgl1-mesa-glx (among other packages), ubuntu-desktop will be removed as well. This does not matter for emc2.
sudo apt-get install libgl1-mesa-swx11

6. Also take note of hardware that isn't plugged in all the time...

There have been issues with USB keychain drives not playing well. Recent versions of LinuxCNC do latency tests on the fly, which will help in finding uncooperative hardware.

7. Some Intel boards have issues with the SMI (System Management Interrupt).

The way to address this is described in FixingSMIIssues.

8. PC speaker module (pcspkr)

The PC speaker module causes problems on some systems. To prevent pcspkr from loading on Debian Etch (and maybe on Ubuntu):
su -
echo install pcspkr /bin/true >/etc/modprobe.d/rtai
rmmod pcspkr

9. Additional potential source of latency problems

Discussed [here] .

10. Parallel port no longer works in EMC 2.0.1 or later (hal_parport: Device or resource busy)

Those who are not using the official emc2 packages for Ubuntu may encounter this error when starting EMC:
 insmod: error inserting '/home/jepler/src/emc2/rtlib/hal_parport.ko': -1 Device or resource busy

If you encounter this problem, you must make sure that the Linux kernel module parport_pc is not loaded at boot time. On Ubuntu systems, you can do this by creating a file in /etc/modprobe.d/ with the one line

 install parport_pc /bin/true

The official packages put this line in the file /etc/modprobe.d/emc2. Different Linux distributions may have a different method to stop parport_pc from loading.
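
To see whether parport_pc is currently loaded, look for it in /proc/modules, where the module name is the first field on each line. A minimal sketch follows; the helper name module_loaded is hypothetical.

```shell
# Hypothetical helper: succeed if module $1 is listed in a
# /proc/modules-style file ($2, default /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded parport_pc; then
    echo "parport_pc is loaded and may be holding the port"
else
    echo "parport_pc is not loaded"
fi
```

The same check works for other modules mentioned on this page, such as pcspkr.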

11. Parallel port no longer works in EMC 2.0.1 or later (emc starts but motors don't turn)

Some users have reported that in EMC 2.0.1, EMC appears to start, but no signals ever appear on the parallel port (the motors don't turn).

To fix this problem, upgrade to EMC 2.0.3 or newer. Then, add the command

 loadrt probe_parport
before the line that loads hal_parport in your HAL file.

In EMC 2.0.1, the Linux parallel port driver must be completely disabled for EMC's hal_parport driver to load. However, as we learned after the release of 2.0.1, this disabled certain parallel ports that are "PNP" (plug and play) devices. The new probe_parport realtime module performs the probing for one type of PNP port. If probe_parport doesn't allow your card to work, please contact the EMC developers.

12. Stepper motors lose steps

See TweakingSoftwareStepGeneration.

13. Mesa 5i20 FPGA firmware and/or driver won't load

First, you gotta have a 5i20 board ;-)

Assuming that's not the issue, one user reported that his board wasn't enabled by the BIOS at boot time. Running "lspci -v" revealed this (info for other devices has been snipped):

0000:00:0e.0 Bridge: PLX Technology, Inc. PCI <-> IOBus Bridge Hot Swap
	Subsystem: PLX Technology, Inc.: Unknown device 3131
	Flags: medium devsel, IRQ 9
	Memory at feddfc00 (32-bit, non-prefetchable) [disabled] [size=128]
	I/O ports at fc00 [disabled] [size=128]
	I/O ports at f400 [disabled] [size=256]
	I/O ports at f800 [disabled] [size=256]
	Memory at fede0000 (32-bit, non-prefetchable) [disabled] [size=64K]
	Memory at fedf0000 (32-bit, non-prefetchable) [disabled] [size=64K]
	Capabilities: <available only to root>

The important item is "[disabled]" in the "I/O ports" and "Memory" lines. The user was able to solve the problem by setting "Plug-N-Play OS" in the BIOS setup screen to "NO" and rebooting. Googling finds that "NO" is probably the correct answer to that BIOS question anyway, see [this].
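
Counting the "[disabled]" regions makes it easy to compare the state before and after changing the BIOS setting. This is only a sketch: the function name disabled_regions is hypothetical, and the slot address 00:0e.0 in the usage comment is the one from the listing above, so yours will probably differ.

```shell
# Hypothetical helper: count the lines of lspci -v output (read on stdin)
# that are marked [disabled]. Prints 0 when nothing is disabled.
disabled_regions() {
    grep -c -F '[disabled]'
}

# Example, limited to one device with lspci's -s option:
#   lspci -v -s 00:0e.0 | disabled_regions
```

Note that grep returns a non-zero exit status when the count is 0, which is the healthy case here.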

14. Printing to parallel-port printers does not work

By default, the LinuxCNC/EMC2 package for Ubuntu disables the Linux parport driver entirely, because the Linux parport driver interferes with LinuxCNC/EMC2 hardware drivers that use the parallel port.

To re-enable the Linux parport driver and disable LinuxCNC/EMC2 hardware drivers that use the parallel port, remove the line "install parport_pc /bin/true" from the file /etc/modprobe.d/emc2, and reboot or otherwise cause the parport-related Linux kernel modules to be loaded.

If your system has multiple parallel ports, you can also use some for Linux and the others for LinuxCNC/EMC2. In this case, remove the "install" line as above and add an "options parport_pc" line to /etc/modprobe.d/emc2 which gives the correct I/O address(es) for the ports to be used by Linux. For the format of the options line, see the [linux kernel documentation].

15. Testing Parallel Port Outputs

Not sure if a parallel port output is working on the hardware level?

Turn off power to the steppers or unplug the motors first.

Either use the Parallel Port Tester config files or:

From the HAL Configuration screen, drill down to Pins/parport/0. You will see all the pins for your parallel port. If a pin is linked, you must unlink it before you can turn it on and off manually. In the Test HAL Command field, enter:

unlinkp parport.0.pin-nn-out
Here nn is the two-digit pin number (01, 02, etc.). Do this for each pin you want to test. unlinkp means "unlink pin". To toggle each pin on and off, use the command:
setp parport.0.pin-nn-out 1

The setp is set pin. The 1 turns it on, a 0 turns it off.
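
When several pins need testing, typing the names by hand gets tedious. The sketch below only prints the three commands for a given pin number so they can be pasted in one at a time. The function name parport_test_cmds is hypothetical, and it assumes the first port (parport.0) as in the commands above.

```shell
# Hypothetical helper: print the unlink, on and off commands for output
# pin number $1 of parport.0, with the pin number padded to two digits.
parport_test_cmds() {
    pin=${1#0}    # drop one leading zero so that 08 is not read as octal
    name=$(printf 'parport.0.pin-%02d-out' "$pin")
    echo "unlinkp $name"
    echo "setp $name 1"
    echo "setp $name 0"
}

parport_test_cmds 2
```

For pin 2 this prints the unlinkp line followed by the two setp lines for parport.0.pin-02-out.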

16. USB thumb drive (flash drive) FAT corrupted

I fixed 'unexpected realtime delays' caused by a USB thumb drive.

The error was repeatable during 3 reboots, dmesg showed no problems, but messages reported that when the thumb drive was mounted there were errors detected, and suggested running e2fsck.

This thumb drive had 2 partitions, one was ext2 and the other was vfat.

I ran e2fsck on the ext2 partition, then inspected (but didn't change) the vfat partition. I used fsck.vfat to look at the vfat partition, but didn't let it change anything. (Reason: some old memory of 'always let the native OS fix drive errors'.) Then I let Vista check & repair the vfat partition.

Now I can mount and unmount the device, read & write to it, play music from it, read pdfs from it, and no unexpected realtime delays occur.

tom3p 11jan2010

17. Dedicating a CPU Core via the isolcpus Boot parameter

It has been observed that on systems with more than one CPU or core, RTAI uses only the highest-numbered CPU (numbering starts at 0). Setting the kernel parameter isolcpus=highest_cpu_number reserves this CPU for RTAI real-time processes. This can improve real-time performance in some cases. The Isolcpus Boot Parameter And GRUB2 explains more about this.
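
Because CPU numbering starts at 0, the highest CPU number is the CPU count minus one. A minimal sketch of that arithmetic follows. The function name isolcpus_param is hypothetical, and using getconf _NPROCESSORS_CONF to obtain the count is an assumption that holds on typical glibc-based Linux systems.

```shell
# Hypothetical helper: given the number of CPUs ($1), print the isolcpus
# parameter that reserves the highest-numbered one.
isolcpus_param() {
    case "$1" in
        ''|*[!0-9]*) echo "CPU count must be a number" >&2; return 1 ;;
    esac
    if [ "$1" -lt 2 ]; then
        echo "isolating a CPU needs at least two CPUs" >&2
        return 1
    fi
    echo "isolcpus=$(($1 - 1))"
}

# Example:
#   isolcpus_param "$(getconf _NPROCESSORS_CONF)"
```

On a four-core machine this gives isolcpus=3, which then goes on the kernel command line.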

18. Other tricks to improve Real Time performance documented elsewhere in this wiki

Read "How to improve RealTime performance" here


Last edited January 16, 2016 3:01 am by Andyough
Published under a Creative Commons License