** a desired property of some state does not yet hold (e.g., probe result) |
** a desired property of some state does not yet hold (e.g., probe operation not yet finished, hence probe position still invalid) |
|
categories are currently in sync. At program start, there's an initial sync, so we have |
categories are currently in sync. At program start or interpreter initialization, the 'current cache condition is set to 'all state references invalid: valid = {} This will cause an initial sync, so post-sync we have |
initial sync. # start cache condition now: s = {'byprobe','bytoolchange'}. |
initial sync. # start cache condition now: s = {'byprobe','bytoolchange', 'currentpoint'}. |
inspect current block and determine set of preconditions p = {...} |
inspect current block and determine set of preconditions p = {...} |
Post-sync we have: s = {'currentpoint','byprobe','bytoolchange'} so the execution succeeds |
Post-sync we again have s = {'byprobe','bytoolchange','currentpoint'} so the execution succeeds since 'currentpoint' is in the valid set. |
* expressions: for each parameter referenced in the expression, add the taint category to the precondition set. (above 'o100' example: the reference to #5070 adds 'byprobe' to the precondition set) * The precondition of a block is the union set of the preconditions of all words in the block, and all expressions. |
* expressions: for each parameter referenced in the expression, add the taint category to the precondition set. (above 'o100' example: the reference to #5070 adds 'byprobe' to the precondition set. |
|
* ANY SYNC OPERATION HAPPENS ONLY IN A STATE 'getter', and as such is in principle language-independent |
* ANY SYNC OPERATION HAPPENS ONLY IN A STATE 'getter', and as such is in principle language-independent. ** Assuming the state 'getter' can block until sync complete, the interpreter has no need to know what a sync() is. It can be coded in a completely linear fashion. ** This would suggest that state management would be done with a task of its own, interacting with the world model as needed. ** The preview instance of the interpreter has no need to sync with the world model, and as such call the state getter with a parameter indicating 'no sync needed', or link to a simplified state getter which chooses to ignore taint sets and syncing. ** an interesting side effect of accumulating the current taint set in the interpreter is the fact that the resulting taint set at the end of the preview run will allow to tell whether the preview path will coincide with the path generated by the actual program run, or may deviate from it - if the result taint set is empty, they must coincide Memory requirementsset size: given that all features including save/restore are recorded here an unsigned32 is fine with lots of reserve. * the current-cache-validity descriptor: a set * per parameter: a set; Note named parameters already have an attribute mask; numbered params dont. * per M/G-code: two sets: the precondition, and the taint set. probably best held in interp_array.cc . * per remapped code: as for M/G-code. An ini parameter on the REMAP= line. * per block: the union of preconditions and taint sets of all codes in the block. Processing requirements* during parsing a block: on a per-block basis, accumulate the preconditions and taint sets in the corresponding per-block sets. An unsigned32 'or' operation. * during parsing an expression: collect the taint set of the expression. An unsigned32 or operation. * modal state: depending on modal state, preconditons might need to be added to the current block, for instance 'G91 relative mode' adds the 'current point valid' preconditon. * during block evaluation: if a precondition is missing, a sync MAY be generated in advance. See 'Processing tradeoffs' below. * parameter reference: if sync is done at the block level: no change - direct reference to underlying float possible. If reference-driven, a parameter reference can cause a sync and hence must go through getter method. * parameter assignment: no change, I dont see any tainted r/w parameters around, so at this point a setter isnt needed. It might make sense long term. Since it can be inlined it could be very lightweight. * interpreter init: clear the current-cache-validity set. Processing tradeoffsI see the following ways to deal with syncing tainted values: # automatic minimal sync (only sync whats needed right now - more syncs but less information synced per operation) # automatic full sync (less syncs but all possibly tainted parameters synced) # sync explicitly triggered by interpreter when a block is evaluated Options 1 and 2 mean 'the interpreter has no need to know about tainted parameters'; this would be the future as any new interpreter could choose to just ignore the problem, but it needs to go through getter methods. Option 3 is likely the easiest to do in the current interpreter code. In this case, getters are optional. In terms of number of syncs generated, 2 and 3 are identical. not sure whether the difference between 1 and 2 is worth making configurable. SteppingThe current interpreter doesnt have a lot of state introspection capabilites, and none are used during stepping, but other interpreter code bases might. Since performance is not an issue while stepping the interpreter, it could just as well sync at every step. The only change needed to achieve this is to set the current-cache-validity set to 'all values tainted' before a block is evaluated, which will cause a sync before every block evalutaion. There is no point in doing that when using the automatic scheme (options 1 or 2). |
* the implementation cost is low: two bit masks per block, one bit mask per parameter |
Note added 8-8-2012It just occurred to me that in the algorithm outline above, I implicitly assumed a strict separation of parse and execute phases for a block; during parse one would determine precondition and taint sets, sync if required, only then execute the block. This assumed separation in the current rs274ngc code doesnt exist, in particular wrt expressions: expressions and assignments are evaluated at parse time. This implies that either syncing must happen on the fly during reference to stale parameters during expression evaluation, or expression handling is separated into a parse tree generation during read() and parse tree evaluation during execute(). |
Thinking about the future of linuxcnc, it is helpful to revisit the term 'queuebuster operation'. The result is a bit surprising - there are no 'queuebuster operations', there are only 'queuebuster state references'. I sketch a different, more flexible, less error prone model which is data-driven, not operation-driven.
It turns out this model supports any language in the frontend, without requiring that yet-to-be-defined language having to explicitly handle 'queuebusters'.
Standing back a bit we see:
So this is really a cache consistency problem. A cache always brings a tradeoff between consistency and performance. Under this angle 'readahead' really means 'generate blocks as long as the state cache is valid'.
The key term here is 'reference': that is, when state is referred to, either implicitly or explicitly.
Some machine operations 'taint' certain parameters. For instance, 'probe' taints the current point, and parameters #5061 up to #5070.
Tainting here means: 'the result of the affected parameter cannot be determined at this point (readahead time)'. Not every change to a parameter taints it: for instance, for endpoint calculation, if the starting point is known, then the endpoint will be known to match between interp and machine, so for instance a move does not taint the current point..
Lets look at an example program under the angle 'what information gets tainted by which operation':
N10 G90 N20 G38.2 (taints current point, i.e. #5420-#5428, plus #5061 and #5070) N30 G91 N40 G0 X5 (implicit state reference to current point, #5420-#5428) N50 o100 if [#5070 EQ 1] (explicit state reference to #5070)
By tracking the tainting behavior of an instruction, it is possible to automatically determine that a synchronization of the tainted state is required, and where that must happen the latest. In the above example, syncing #5420-#5428 must happen before N40 is executed, and syncing #5070 must happen before N50 is executed.
If we classify interpreter state with respect to tainting we find:
Going through the #5xxx parameters at http://www.linuxcnc.org/docs/devel/html/gcode/overview.html#_numbered_parameters_a_id_sub_numbered_parameters_a we find:
Since the current point is often used, we define its own category for it:
So we really have only 4 tainting categories: never, by probe, by input read, by toolchange. For further discussion, we better rename 'by input read' to mean 'assume as always tainted' . Because it is so common, we add a fifth category: 'current point' (current point valid)
When a category gets tainted, all parameters which are in that category become stale cache values. When any stale parameter is needed as reference, it needs to be refreshed 'by sync'.
Let's assume we want to introduce a new parameter, 'execution time', and give it #5500 for now. We want it to reflect the execution time, in seconds, when the referencing statement is executed. So a use case would be:
(debug, we are now #5500 seconds into the program)
Clearly, this is a parameter which must be considered 'always tainted' - it is a queuebuster parameter. So to create meaningful output, readahead must stop before this block, wait until the machine 'executes' this statement, signaling that point in execution has been reached. Then it would be ok to determine the elapsed time, set #5500, and continue.
So we'd add #5500 to the 'always tainted' category above.
I chose the 'execution time' example on purpose - there is no corresponding G-code 'operation', there is just a parameter reference, and *still* we have a queuebuster?
The important conclusion is: readahead needs stopping *when stale state is referenced*, not when 'some queuebuster operation is executed'.
The inverse is also true: Not all programs containing our former 'queuebuster operations' actually need syncing - only those referencing stale state do.
That realization suggests that 'queuebusting' better be handled at the state reference level, not at the operation model.
For each block, we can determine the static pre- and postconditions with respect to state references at parse time:
Then we have a 'current cache condition' which describes which state categories are currently in sync. At program start or interpreter initialization, the 'current cache condition is set to 'all state references invalid:
valid = {}
This will cause an initial sync, so post-sync we have
valid = {'byprobe','bytoolchange', 'currentpoint'}.
We do not record the 'never' category. Also, we do not add the 'always' category.
This would be executed as the blocks list is generated, which could be parse time or later.
initial sync. # start cache condition now: s = {'byprobe','bytoolchange', 'currentpoint'}. while not end of program inspect current block and determine set of preconditions p = {...} redo: i = intersect( p, s) m = i - p # determine missing categories if m != {} # current block references some tainted state sync(m) wait for sync to succeed s = union(s,m) goto redo: q = {...} # set of categories tainted by current block s = s - p # invalidate these in current valid set
;s = {'byprobe','bytoolchange','currentpoint'} through initial sync G90 ;p = {}, q = {} # G90 has no implicit state references, no categories tainted
;s = {'byprobe','bytoolchange','currentpoint'} G38.2 ;p = {}, q = {'byprobe','currentpoint'}
;s = {'bytoolchange'} G91 ;p = {}, q = {} # G91 has no implicit state references, no categories tainted
;s = {'bytoolchange'} ; since in relative mode, current point needs to be valid to determine endpoint G0 X5 ;p = {'currentpoint'}, q = {}
--> s is missing the 'currentpoint' category here, so a sync is needed. Post-sync we again have s = {'byprobe','bytoolchange','currentpoint'} so the execution succeeds since 'currentpoint' is in the valid set.
; s = {'currentpoint','byprobe','bytoolchange'} o100 if [#5070 EQ 1] ;p = {'byprobe'}, q = {} ; no sync needed since 'byprobe' category valid
(not fully thought through, but straightforward and really simple):
set size: given that all features including save/restore are recorded here an unsigned32 is fine with lots of reserve.
I see the following ways to deal with syncing tainted values:
Options 1 and 2 mean 'the interpreter has no need to know about tainted parameters'; this would be the future as any new interpreter could choose to just ignore the problem, but it needs to go through getter methods.
Option 3 is likely the easiest to do in the current interpreter code. In this case, getters are optional.
In terms of number of syncs generated, 2 and 3 are identical. not sure whether the difference between 1 and 2 is worth making configurable.
The current interpreter doesnt have a lot of state introspection capabilites, and none are used during stepping, but other interpreter code bases might.
Since performance is not an issue while stepping the interpreter, it could just as well sync at every step. The only change needed to achieve this is to set the current-cache-validity set to 'all values tainted' before a block is evaluated, which will cause a sync before every block evalutaion. There is no point in doing that when using the automatic scheme (options 1 or 2).
It just occurred to me that in the algorithm outline above, I implicitly assumed a strict separation of parse and execute phases for a block; during parse one would determine precondition and taint sets, sync if required, only then execute the block.
This assumed separation in the current rs274ngc code doesnt exist, in particular wrt expressions: expressions and assignments are evaluated at parse time. This implies that either syncing must happen on the fly during reference to stale parameters during expression evaluation, or expression handling is separated into a parse tree generation during read() and parse tree evaluation during execute().