Over most of 2011 I've worked on remapping codes in the interpreter, and several other issues. This has now reached steady state and a merge could be considered, barring outside review (done-merged into master; dust still settling). Note I'm not saying it has no bugs - but I'm taking on a maintainer promise to see it through. It is a large number of commits - currently some 763 commits on top of master. Many of those are cleanups; a few times backing out a dead end, and some commit messages/reasons I'm simply embarrassed by. I'm hesitant to put a better spin on history by rewriting history though - I'll take a ridiculous history over a fudged history any time.
The number of regression tests has been greatly expanded, to currently 69 or so; most of them exercise the new features, some verify bug fixes. They might be educational reading.
Configurations to explore are under configs/sim/remap. Those aren't listed in the configuration selector - cd to the directory of the demo and run 'emc <the ini file>'.
A simulator build is good enough to exercise them.
If you are building yourself by pulling from git.linuxcnc.org, you need to 'sudo apt-get install libboost-python-dev'.
This is what the example configs do:
Since this work contains several areas of changes, here's a rundown and their status. It's also meant as an aide to reviewers (hint, hint, handwave!):
Remapping basically means 'you can define the behaviour of a code by means of an NGC Oword procedure'.
This entails two features:
The second item was the original motivation - namely to make toolchanging more configurable. It turned out that solving the general problem was more useful (and at times simpler) than just patching up Tx and M6, so I went for that.
Please note that the current support for user-defined toolchange in master (M6_COMMAND, T_COMMAND in the ini file) is badly broken and must be replaced anyway. It does not deal with queuebusters properly, which I didn't sufficiently understand at the time.
Remapping calls for glue code between the interpreter and the NGC procedure, which is handled by Embedded Python. This is new in EMC2 - so far there have only been several Python *extension* modules (Python is the main program, and some C code is called through a module); Embedded Python is the other way around: a C++ program is extended by Python methods. It is a very powerful extension method (apt-get would say: it has 'Super Cow Powers' ;-) but it isn't for the faint of heart. Towards that end I'm looking into some 'standard Python glue' and establishing standard calling conventions at least for the builtin codes, to make that part easier to use. I think in most cases folks wouldn't need to touch Python glue code even if it's used.
The most massive changes wrt remapping are in interp_oword.cc and rs274ngc_pre.cc, plus some new files (interp_remap.cc).
Embedded Python brings in a new build dependency, libboost-python1.40-dev. The code which exposes the interpreter and canon internals to Python is interpmodule.cc and canonmodule.cc respectively. Compiling interpmodule.cc takes very long - this is not an error; the massive use of C++ templates by Boost.Python makes g++ breathe heavily.
The homebrew data structures, in particular named params, Oword label offsets etc have been replaced by STL containers. That means that some fixed-size arrays are now gone; there is, for instance, no more hard limit on the number of Oword labels in a program. As a side effect struct _setup is quite a bit smaller now.
Banging on malloc() and free() is asking for trouble, or memory leaks for that matter. Towards that end all immutable strings are treated differently now: they are stored in a unique string table (see strstore()), which means string compare of immutable strings becomes a pointer comparison.
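The idea behind a unique string table can be sketched as follows. This is a toy illustration in Python, not the strstore() code from the branch; the StringStore class and its store() method are hypothetical names. Each distinct string value is kept once, so two stored strings are equal exactly when they are the same object, and the comparison reduces to an identity check.

```python
class StringStore:
    """Toy unique-string table (hypothetical; not the branch's strstore())."""

    def __init__(self):
        self._table = {}

    def store(self, text):
        # Return the one canonical object for this string value,
        # adding it to the table the first time it is seen.
        return self._table.setdefault(text, text)


store = StringStore()
# Build two equal strings at run time so they start out as separate values.
a = store.store("".join(["end", "sub"]))
b = store.store("".join(["end", "sub"]))
same_object = a is b  # identity check, the analogue of a pointer compare
```

Since stored strings are never modified or freed individually, there is no per-string malloc()/free() bookkeeping left to get wrong.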
I would have liked to use an unordered (hash) container such as std::unordered_map for symbols - std::map is a bit of an overkill because sorted access isn't needed. But that requires compiling with the -std=c++0x flag, and that I was hesitant to do. However, changing to a different container is now trivial: an extra include, adapt declaration, done.
SELECT_POCKET() has been extended to convey both pocket AND tool number. This is based on my view that passing pockets around as a primary key into the tool table is a design defect which should be overcome eventually.
PLUGIN_CALL() and IO_PLUGIN_CALL() are experimental. INTERP_ABORT() probably will be deleted (done). This is superseded by the '(abort, msg)' feature (great idea, Chris!).
INI file variables can now be referenced read-only by the interpreter directly, without going through some pathetic HAL kludge (like setting a pin from an INI variable in a HAL file, and reading that pin with M66). The syntax is #<_ini[section]name>. Since this is potentially backwards incompatible (remote chance, but present) this has been made configurable. See the FEATURES ini variable. The same goes for reading HAL pins with #<_hal[component.pin]>, using names similar to those used by halscope, halshow and halcmd.
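To make the #<_ini[section]name> form concrete, here is a minimal sketch of how such a name could be split into section and variable and resolved against an INI file. It is written in Python with the standard configparser module and is not the interpreter's actual code; the regular expression, the function names, and the assumption that values are plain numbers are all mine.

```python
import configparser
import re

# Hypothetical pattern for the '#<_ini[section]name>' form described above.
INI_PARAM = re.compile(r"^#<_ini\[(?P<section>[^\]]+)\](?P<name>[^>]+)>$")


def load_ini(text):
    """Parse INI text, keeping variable names case-sensitive."""
    ini = configparser.ConfigParser(interpolation=None)
    ini.optionxform = str  # configparser lowercases names by default
    ini.read_string(text)
    return ini


def lookup_ini_param(param, ini):
    """Resolve '#<_ini[section]name>' to a number (read-only lookup)."""
    match = INI_PARAM.match(param.strip())
    if match is None:
        raise ValueError("not an _ini parameter: %r" % param)
    # Assumption: the value is numeric, as G-code parameters are.
    return float(ini.get(match.group("section"), match.group("name")))


# Made-up section and variable names, for illustration only.
example_ini = load_ini("[SETUP]\nXPOS = 3.5\n")
xpos = lookup_ini_param("#<_ini[SETUP]XPOS>", example_ini)
```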
The generic embedded Python support is in src/emc/pythonplugin. This is used by the interpreter, and task (experimentally).
There have been long discussions on the role of iocontrol. My take is that it is on the way out and should be replaced by Embedded Python as outlined in this branch (see the configs/sim/remap/iocontrol-removed config for how that works).
However, this only really makes sense in the context of further work items, which I plan to work on (see below). Since it is configurable and deactivated by default, it does no harm (ha!), which is why I suggest leaving it in.
Some of the features I wasn't sure about, some are potentially backwards incompatible. So I made them configurable - see http://emc.mah.priv.at/docs/remap/html/remap/structure.html#_optional_interpreter_features_ini_file_configuration.
The standalone interpreter, aka 'sai' (src/emc/sai), which is used mostly for regression tests, now sports a '-i <inifile>' option. That was essential for remap regression tests, and reduces the excuse potential for not writing regression tests 'because those can't be tested with rs274'.
All of the remapping stuff is documented, but I think it's not in final shape - I guess this needs to be split into a 'Configuring Toolchange' and 'Remapping reference' section. The current single file mixes too many themes at too widely varying levels of detail. For the milltask plugin currently only headlines exist.
The way interpreter internals and canon are exposed currently requires reading of interpmodule.cc and canonmodule.cc. But then I'm in good company wrt the level of documentation of emcmodule.cc and gcodemodule.cc ;-)
I keep formatted documentation around at http://emc.mah.priv.at/docs/remap/html/remap/structure.html .
Working on this for a while I've tripped over a few fundamental issues with EMC2 which I think should be addressed, and which I plan to address. Warning: longish flames follow. This dives a bit into internals.
The interpreter currently is used in a 'read a line, then execute that line' fashion, and the return code is inspected to decide what to do - in particular, how to handle errors and queuebuster operations. Then task 'deduces' what the state of the interpreter is.
This was a great idea when the interpreter had no o-word control structures (if/then/else, subs etc) because from a language perspective each line (aka block) was a self-contained item. In language terms (barring expressions, but those are intra-line), RS274NGC was pretty much a regular language.
That changed with the introduction of context-free features, like Oword control structures, which require a stack automaton as a recognizer. Suddenly, you couldn't read()/execute() just by looking at an isolated block; you had to look at the stack as well (and task needs to peep into the interpreter internals, like _setup.call_level).
That means that the read-block/execute-block usage model became inadequate at that point. However, that issue either wasn't recognized or was ignored - the consequence was a set of kludges around it. To see why this was a bunch of kludges, just search the bugtracker for Oword and MDI bugs - most of those bugs derive directly from this deficiency.
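A small sketch may help show why a stack is needed once blocks can nest. The checker below is a toy in Python, not interpreter code: it handles only numeric O-word labels and a subset of keywords (sub, if, while, repeat and their end forms), and all names in it are hypothetical. Whether a closing line is legal cannot be decided from that line alone; it depends on what is currently on the stack.

```python
import re

# Toy subset of O-word keywords; do/while and named labels are left out.
OPENERS = {"sub": "endsub", "if": "endif",
           "while": "endwhile", "repeat": "endrepeat"}
CLOSERS = set(OPENERS.values())
OWORD = re.compile(r"^\s*o(\d+)\s+(\w+)", re.IGNORECASE)


def check_nesting(lines):
    """Return True if every opener is closed in the right order."""
    stack = []  # pending (label, expected closing keyword) pairs
    for line in lines:
        match = OWORD.match(line)
        if match is None:
            continue  # ordinary block: no effect on nesting
        label, keyword = match.group(1), match.group(2).lower()
        if keyword in OPENERS:
            stack.append((label, OPENERS[keyword]))
        elif keyword in CLOSERS:
            # Only the innermost open structure may be closed.
            if not stack or stack[-1] != (label, keyword):
                return False
            stack.pop()
    return not stack  # anything still open is an error


good = ["o100 sub", "o101 if [#1 GT 0]", "G0 X1", "o101 endif", "o100 endsub"]
bad = ["o100 sub", "o101 if [#1 GT 0]", "o100 endsub", "o101 endif"]
```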
Fact is: the read-block/execute-block usage model is completely broken - it is upside down. Instead of reading a block, and trying to figure out what to do next and what the interpreter's state could be at that point, it should be inverted:
The interpreter should be given a file, or a string to execute, and 'execute it on its own' until done, or until some action by the code using it is needed. Properly done, this could be either a coroutine-like approach, or making the interpreter a thread in the first place - the approach I favor.
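The coroutine-like variant can be sketched with a Python generator. This is only an illustration of the inverted control flow, not a proposal for the actual interface: the function names are hypothetical, 'M66' is used as a stand-in for any queuebuster, a leading '?' as a stand-in for an error, and 1.0 as a made-up input value.

```python
def interpret(blocks, executed):
    """Toy interpreter loop written as a generator (coroutine-like)."""
    for lineno, block in enumerate(blocks, start=1):
        if block.startswith("M66"):
            # Stand-in for a queuebuster: hand control to the caller and
            # wait until it sends back the value that was needed.
            value = yield ("queuebuster", lineno)
            executed.append((block, value))
        elif block.startswith("?"):
            # Stand-in for an error: report it, then stop.
            yield ("error", lineno)
            return
        else:
            executed.append((block, None))


def drive(blocks):
    """Caller side: it only gets involved when the interpreter asks."""
    executed, events = [], []
    interp = interpret(blocks, executed)
    try:
        event = next(interp)  # run until the first event
        while True:
            events.append(event)
            event = interp.send(1.0)  # supply the input, then resume
    except StopIteration:
        pass  # program finished
    return executed, events
```

The caller never has to guess the interpreter's state: every event names the reason for stopping and the line it happened on.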
When the interpreter hits a queuebuster, or terminates on an error, it should signal that by setting a condition variable like in pthreads, describe its state, and wait. This would in effect completely linearize, and simplify interpreter execution.
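The threaded variant with a condition variable might look roughly like this. It is a sketch using Python's threading.Condition as a stand-in for a pthreads condition variable; the class, its state names and the use of 'M66' as the queuebuster are assumptions for illustration, and error handling is left out.

```python
import threading


class ThreadedInterp:
    """Toy interpreter thread that signals and waits on a condition."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.cond = threading.Condition()
        self.state = "running"  # running | waiting | done
        self.pending = None     # block the interpreter is waiting on
        self.executed = []
        self.thread = threading.Thread(target=self._run)

    def _run(self):
        for block in self.blocks:
            if block.startswith("M66"):  # stand-in for a queuebuster
                with self.cond:
                    self.state, self.pending = "waiting", block
                    self.cond.notify_all()  # describe state, then wait
                    self.cond.wait_for(lambda: self.state == "running")
            self.executed.append(block)
        with self.cond:
            self.state = "done"
            self.cond.notify_all()

    def wait_for_event(self):
        with self.cond:
            self.cond.wait_for(lambda: self.state != "running")
            return self.state, self.pending

    def resume(self):
        with self.cond:
            self.state, self.pending = "running", None
            self.cond.notify_all()


def run_program(blocks):
    interp = ThreadedInterp(blocks)
    interp.thread.start()
    handled = []
    while True:
        state, pending = interp.wait_for_event()
        if state == "done":
            break
        handled.append(pending)  # the caller would service it here
        interp.resume()
    interp.thread.join()
    return handled, interp.executed
```

Inside _run() the program is executed in one straight loop; the only place where the outside world is involved is the explicit wait.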
Introducing threads in a non-threaded program is always problematic, and at the least the interplist needs to be protected. However, this would dramatically simplify the handling of control structures and queuebusters in the interpreter, and with it reduce the potential for bugs.
Introducing remapping just added to the extent of the problem. I think I have worked around the resulting issues. Still, I think this should be remedied.
The interpreter is a C++ class. It would be useful to have more than one instance at hand at times; for instance, have a separate instance for handling MDI commands, or executing a remapped code for that matter.
The way interpreter state is structured is - well - 'less than enlightened' with its (mostly static!) struct _setup, which pretty much defeats useful instantiation (for instance, sharing O-word execution state between instances is a complete showstopper for instantiation - execution of code in one instance tramples upon the state of the other instance).
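The effect of mostly static state on instantiation can be shown with a deliberately small Python analogy. Neither class reflects the real layout of struct _setup; the class names and the single call_level field are placeholders chosen for illustration.

```python
class SharedStateInterp:
    # Class-level dictionary: one copy for all instances, comparable
    # to a static struct shared by every interpreter object.
    _setup = {"call_level": 0}

    def enter_sub(self):
        self._setup["call_level"] += 1


class PerInstanceInterp:
    def __init__(self):
        # Each instance owns its state, so instances are independent.
        self._setup = {"call_level": 0}

    def enter_sub(self):
        self._setup["call_level"] += 1


main, mdi = SharedStateInterp(), SharedStateInterp()
main.enter_sub()
trampled = mdi._setup["call_level"]  # changed although mdi did nothing

main2, mdi2 = PerInstanceInterp(), PerInstanceInterp()
main2.enter_sub()
untouched = mdi2._setup["call_level"]  # still at its initial value
```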
This is intertwined with the next issue:
emcStatus is really the peek into the EMC2 world model. It is suitable for hard realtime, but it is severely limited by two issues:
This is why several items of what would properly belong into the world model are left out and are therefore per-instance variables in different processes, although they should have been shared to start with.
Impact examples are:
The work items I see here are:
The second to last item sounds complex, but isn't, in my view. There is a good candidate, which is redis (see www.redis.io). It retains the distributed setup property and could be even made to work over RCS channels. Btw this doesn't necessarily mean a lot of code changes to the interpreter; a decent class design with appropriate assignment operators will go a long way. And it's really fast and low overhead. Here's a 5-slide introduction: http://www.slideshare.net/jasoncbooth/exploring-redis - a more thorough intro (61 slides) is here http://www.slideshare.net/phpguru/redis-101-10043219 .
I've done an experimental stab at refactoring _setup along the lines of the top two items above. It is nasty work. However, the good news is that a) quite a bit of it can be automated by putting some thought into refactoring scripts, eg. for Eclipse C++ refactoring, and b) it lends itself very much to regression tests. I would guess the work for items 1+2 to be 1-2 weeks full time.
I am entertaining opinions (especially by the EMC2 Board of Directors), in particular on the work items.