for December 2014
I would like to share some notes about how some of the object features were implemented in the Red compiler. As these are probably the most complex parts in the Red toolchain right now, I thought that it would be worth documenting them a bit, as it can be useful to current and future code contributors.
Reminder: the Red toolchain is currently written entirely in Rebol.
During the work on object support in the Red compiler, I realized that I could leverage the proximity of Red with Rebol much deeper than before, in order to more easily map some Red constructs directly to Rebol ones. That’s how I came up with the “shadow” objects concept (later extended to functions too).
It is pretty simple, in fact: each time a Red object is processed from the source code, an equivalent, minimized object is created by the compiler in memory and connected to a tree of existing objects in order to match the definitional scoping used in the Red code.
Here is an example Red source code with two nested objects:
    Red []

    a: object [
        n: 123
        b: object [
            inc: func [value][value + n]
        ]
    ]
Once processed by the compiler, the following shadow objects are created in memory:
    objects: [
        a object [
            n: integer!
            b: object [
                inc: function!
            ]
        ]
    ]
But it does not stop there: the body of the Red object is bound (using Rebol’s bind native) to the internal Rebol object, in such a way that the definitional scoping order is preserved. So the Red code is directly linked to the Rebol shadow objects in memory. The same procedure (including the binding of the Red code to Rebol objects) is applied to all compiled functions, whose context is represented as a nested Rebol object in the compiler’s memory.
If you get where I am heading, yes, that means that resolving the context of any of the words contained in a Red object/function body becomes as simple as calling Rebol’s bind? native on the word value. (Remember that Red source code is converted to a tree of blocks before being compiled.) The bind? native will return one of the Rebol objects, which can then be used as a key in a hashtable to retrieve all the associated metadata.
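The binding-then-lookup idea can be tried from the user side as well. The following is only a rough sketch in Red, not the compiler's actual Rebol code: `shadow` and `body` are hypothetical names chosen for illustration, and Red's `context?` native is used here as a stand-in for Rebol's `bind?`.

```red
Red []

;-- hypothetical stand-in for a compiler-side shadow object
shadow: object [n: none]

;-- body of a function defined inside the object, held as a block of words
body: [value + n]

;-- link the words of the body to the shadow object (only `n` matches a field)
bind body shadow

;-- ask which context the word `n` is bound to, and compare it with `shadow`
probe same? shadow context? third body
```

In this sketch only `n` is re-bound, because `shadow` has no `value` field; the context returned for `n` is the kind of value that could then serve as a lookup key for metadata.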
I wish I had come up with that simple method when I was implementing namespaces support for Red/System. I think that I will rework that part in Red/System in the future, reusing the same approach in order to reduce compilation times (namespaces compilation overhead is significant in Red/System, roughly taking 20% of the compilation time).
Choosing Rebol as the bootstrapping language for Red shows its unique advantages here.
Processing path values is really difficult in Red (as it would be in Rebol if it had a compiler). The main issue can be visualized easily in this simple example:
foo: func [o][o/a 123]
Now if you put yourself in the shoes of the compiler, what code would you generate for o/a? It could be a block access, a function call with /a as a refinement, an object path accessing a field, or an object path calling the function a defined in the object. All these cases would require a different code output, and the compiler has no way to accurately guess which one it is in the general case. Moreover, foo can be called with different datatypes as argument, and the compiled code still needs to account for that.
One method could be to generate different code paths for each case listed above. As you can guess, this would quickly become very expensive to manage for expressions with multiple paths, as the number of possible combinations would explode.
Another, very simple solution, would be to defer that code evaluation to the interpreter, but as you cannot know where the expression ends, the whole function (or at least the root level of the function) would need to be passed to the interpreter. Not a satisfying solution performance-wise.
The solution currently implemented in the Red compiler for such cases is a form of “dynamic invocation”. If you go through all the cases, they can actually be sorted into only two categories:
a) access to a passive value
b) function invocation
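Seen from the user side, the two categories can be illustrated with the same foo function called with different arguments. This is a small sketch meant for the interpreter; `bar` is a hypothetical helper introduced only for this illustration.

```red
Red []

foo: func [o][o/a 123]

;-- a) passive values: o/a only fetches a value, 123 is a separate expression
probe foo object [a: 1]                 ;-- object field access, foo returns 123
probe foo [a 1]                         ;-- block access, foo returns 123

;-- b) function invocations: o/a consumes 123 as its argument
probe foo object [a: func [x][x * 2]]   ;-- function stored in an object, returns 246

bar: func [x /a][either a [x * 2][x]]   ;-- hypothetical function with an /a refinement
probe foo :bar                          ;-- o is a function, /a is a refinement, returns 246
```

The body of foo is identical in all four calls, yet the value 123 is either a standalone expression or an argument, depending only on what o refers to at runtime.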
Only at runtime can you know which category the o/a path belongs to (even worse, the category can change at each foo function call!). The issue is that the compiler generates code that evaluates Red expressions as stack manipulations (not the native stack, but a high-level Red stack), so the compiler needs to know which category it is in order to lay out the stack frames correctly.
Basically, the generated Red/System code for the foo function would be (omitting prolog/epilog parts for clarity):
For a) case:
    stack/mark-native ~eval-path
    stack/push ~o
    word/push ~a
    actions/eval-path* false
    stack/unwind
    integer/push 123
    stack/reset
For b) case (with /a being a refinement):
    stack/mark-func ~o
    integer/push 123
    logic/push true             ;-- refinement /a set to TRUE
    f_o
    stack/unwind
As you can see, the moment when the integer value 123 is pushed on the stack for processing is very different in the two cases. In case a), it is outside of the o/a stack frame; in case b), it is part of it. So what should the compiler do then... looks unsolvable?
Actually some stack tricks can help solve it. This is how the compiler handles it now:
This is the code currently produced by the Red compiler for the foo function body:
    stack/push ~o
    either stack/func? [stack/push-call path388 0 0 null] [
        stack/mark-native ~eval-path
        stack/push stack/arguments - 1      ;-- pushes ~o
        word/push ~a
        actions/eval-path* false
        stack/unwind-part
        either stack/func? [
            stack/push-call path388 1 0 null
        ][
            stack/adjust
        ]
    ]
    integer/push 123
    stack/reset
This generated code, with the help of the dual-mode stack, can support evaluation of o/a whatever value the path refers to (passive or function).
stack/func? here checks if the stack top entry is a function or not. There is a little performance impact, but it is not significant, especially with respect to the high flexibility it brings.
So far so good. What happens now if the path is used as argument of a function call:
foo: func [o][probe o/a 123]
The outer stack frame that probe will create then becomes problematic, because it will close just after o/a, preventing o/a from fetching its arguments when it is a function call. So, back to the drawing board? Fortunately not: we can apply the same transformation to the wrapping call and defer it until its arguments have been fully evaluated. This is the resulting code:
    f_~path389: func [/local pos] [
        pos: stack/arguments
        stack/mark-func ~probe
        stack/push pos
        f_probe
        stack/unwind
    ]

    stack/defer-call ~probe as-integer :f_~path389 1 null
    stack/push ~o
    either stack/func? [stack/push-call path388 0 0 null] [
        stack/mark-native ~eval-path
        stack/push stack/arguments - 1
        word/push ~a
        actions/eval-path* false
        stack/unwind-part
        either stack/func? [
            stack/push-call path388 1 0 null
        ][
            stack/adjust
        ]
    ]
    ------------| "probe o/a"
    integer/push 123
    stack/reset
As you can see, it gets more hairy, but still manageable. The outer stack frame is externalized (into another Red/System function), so it can be called later, once the nested expressions are evaluated.
That said, dynamic calls still need a bit more work in order to support routine! calls and refinements for wrapping calls. Those features will be added in the next releases. Also, the gain in flexibility makes the compiler more short-sighted when a particular structure is expected, like for control flow keywords requiring blocks. I don’t see yet how this dynamic call approach could support such use-cases in a more user-friendly way.
But another feature can come to the rescue: the upcoming #alias directive proposed in the previous blog post. As long as the user is willing to use this new directive, it would simply avoid these dynamic constructions by providing enough information for the compiler to statically determine what kind of value paths are referring to, resulting in much shorter and faster code, without the short-sightedness issue.
This is the kind of problem I had to solve during object implementation and why it took much longer than planned initially.
Hope this deeper look inside the compiler’s guts is not too scary. ;-) Now, back to coding for next release!
And, by the way, Merry Christmas to all Red followers. :-)
We are bumping the version number up higher as we are bringing a new foundational layer and important construct to Red: object! datatype and contexts support.
Supporting objects in the Red interpreter is relatively easy and straightforward. But adding those features in the compiler has proven to be more complex than expected, especially for access-path support, paths being especially tricky to process, given their highly dynamic nature. Though, I have pushed Red beyond the edges I was planning to stop at for objects support, and the result so far is really exciting!
Just a short reminder mainly intended for newcomers. Red implements the same object concept as Rebol, called prototype-based objects. Creating new objects is done by cloning existing objects or the base object! value. During the creation process, existing field values can be modified and new fields can be added. It is a very simple and efficient model to encapsulate your Red code. There is also a lot to say about words binding and contexts, but that topic is too long for this blog entry; we will address it in the future documentation.
The syntax for creating a new object is:
    make object! <spec>

    <spec>: specification block
Shorter alternative syntaxes (just handy shortcuts):
    object <spec>
    context <spec>
The specification block can contain any valid Red code. Words set at the root level of that block will be collected and will constitute the new object’s fields.
    make object! [a: 123]

    object [a: 123 b: "hello"]

    c: context [
        list: []
        push: func [value][append list value]
    ]
You can put any valid code into a specification block; it will be evaluated during the object construction, and only then.
    probe obj: object [
        a: 123
        print b: "hello"
        c: mold 3 + 4
    ]

    hello
    make object! [
        a: 123
        b: "hello"
        c: "7"
    ]
Objects can also be nested easily:
    obj: object [
        a: 123
        b: object [
            c: "hello"
            d: object [
                data: none
            ]
        ]
    ]
Another way to create an object is to use the copy action, which does not require a specification block and therefore performs just a simple cloning of the object. Existing functions will be re-bound to the new object.
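A minimal sketch of cloning with copy, using hypothetical objects a and b. The comments describe what the re-binding rule stated above implies for each call.

```red
Red []

a: object [
    value: 123
    show: does [print value]
]

b: copy a       ;-- plain clone, no specification block involved
b/value: 99     ;-- only the clone is modified

a/show          ;-- show in `a` still refers to a/value (123)
b/show          ;-- show in `b` was re-bound to the clone, so it refers to b/value (99)
```

Because the function is re-bound during the copy, the clone does not keep reading or writing the fields of the object it was copied from.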
In order to access object fields, the common path syntax is used (words separated by a slash character). Each word (or expression) in a path is evaluated in the context given by the left side of the path. Evaluation of a word referring to a function will result in invoking the function, with its optional refinements.
    book: object [
        title: author: none
        show: does [print [mold title "was written by" author]]
    ]

    book/title: "The Time Machine"
    book/author: "H.G.Wells"
    print book/title
    book/show

    The Time Machine
    "The Time Machine" was written by H.G.Wells
A special keyword named self is reserved for cases where the object needs to reference itself.
    book: object [
        title: author: none
        list-fields: does [words-of self]
    ]

    book/list-fields

    [title author list-fields]
While cloning produces exact replicas of the prototype object, it is also possible to extend it in the process, using:
    make <prototype> <spec>

    <prototype> : object that will be cloned and extended
    <spec>      : specification block
    a: object [value: 123]

    c: make a [
        increment: does [value: value + 1]
    ]

    print c/increment
    print c/increment
It is also possible to use another object as the <spec> argument. In such a case, both objects are merged to form a new one. The second object takes priority when both objects share the same field names.
    a: object [
        value: 123
        show: does [print value]
    ]
    b: object [value: 99]

    c: make a b
    c/show
Sometimes, it can be very useful to detect changes in an object. Red allows you to achieve that by defining a function in the object that will be called just after a word is set. This event is generated only when words are set using a path access (so inside the object, you can set words safely). This is just a first incursion in the realm of metaobject protocols, we will extend that support in the future.
In order to catch the changes, you just need to implement the following function in your object:
    on-change*: func [word [word!] old new][...]

    word : field name that was just affected by a change
    old  : value referred by the word just before the change
    new  : new value referred by the word
It is allowed to overwrite the word just changed if required. You can directly set the field name or use set word <value>.
    book: object [
        title: author: year: none

        on-change*: func [word old new /local msg][
            if all [
                word = 'year
                msg: case [
                    new >  2014 ["space-time anomaly detected!"]
                    new < -3000 ["papyrus scrolls not allowed!"]
                ]
            ][
                print ["Error:" msg]
            ]
        ]
    ]

    book/title: "Moby-Dick"
    book/year: -4000

    Error: papyrus scrolls not allowed!
You can use set on an object to set all fields at the same time. get on an object will return a block of all the field values. get can also be used on a get-path!.
    obj: object [a: 123 b: "hello"]

    probe get obj
    set obj none
    ?? obj
    set obj [hello 0]
    ?? obj
    probe :obj/a

    [123 "hello"]
    obj: make object! [
        a: none
        b: none
    ]
    obj: make object! [
        a: 'hello
        b: 0
    ]
    hello
The find action gives you a simple way to check for a field name in an object. If found, it will return true, otherwise none. The select action does the same check as find, but returns the field value for the matched word.
    obj: object [a: 123]

    probe find obj 'a
    probe select obj 'a
    probe find obj 'hello

    true
    123
    none
in native will allow you to bind a word to a target context:
    a: 0
    obj: object [a: 123]

    probe a
    probe get in obj 'a
The bind native is also available, but neither completely finished nor tested.
Some reflective functions are provided to more easily access an object’s internal structure.
words-of returns a block of field names.
values-of returns a block of field values.
body-of returns the object’s content in a block form.
    a: object [a: 123 b: "hello"]

    probe words-of a
    probe values-of a
    probe body-of a

    [a b]
    [123 "hello"]
    [a: 123 b: "hello"]
The system object is a special object used to hold many values required by the runtime library. You can explore it using the new extended help function, which now accepts object paths.
    red>> help system
    `system` is an object! of value:
        version          string!   0.5.0
        build            string!   21-Dec-2014/19:27:05+8:00
        words            function! Return a block of global words available
        platform         function! Return a word identifying the operating system
        catalog          object!   [datatypes actions natives errors]
        state            object!   [interpreted? last-error]
        modules          block!    []
        codecs           object!   []
        schemes          object!   []
        ports            object!   []
        locale           object!   [language language* locale locale* months da...
        options          object!   [boot home path script args do-arg debug sec...
        script           object!   [title header parent path args]
        standard         object!   [header]
        view             object!   [screen event-port metrics]
        lexer            object!   [make-number make-float make-hexa make-char ...
Note: not all system fields are yet defined or used.
As this release already took a lot of time, some of the planned features are postponed to future releases. Here are a few of them.
Sometimes, it is convenient to be able to add fields to an object in-place, without having to recreate it and lose lexical binding information in the process. To achieve that, a new extend native will be added, working as originally intended in Rebol3.
In order to help the Red compiler produce shorter and faster code, a new #alias compilation directive will be introduced. This directive will allow users to turn an object definition into a “virtual” type that can be used in type spec blocks. For example:
    #alias book!: object [
        title: author: year: none
        banner: does [form reduce [author "wrote" title "in" year]]
    ]

    display: func [b [book!]][
        print b/banner
    ]
This addition would not only permit finer-grained type checking for arguments, but also help the user better document their code.
Another possible change will be in the output mold produces for an object. Currently such output starts with “make object!”; this might be changed to just “object”, in order to be shorter and easier to read, in addition to being more consistent with the way function! values are molded.
In order to make this release happen as quickly as possible, we have not fixed all the open tickets that were planned to be fixed in this release, but we still managed to fix a few of them. The other pending tickets will be fixed in the upcoming minor releases.
I should also mention that 537 new tests were added to cover objects features. The coverage is already good, but we probably need more of them to cover edge cases.
That’s all for this blog article! :-)
I will publish another blog entry about additional information regarding the implementation strategy used by the compiler for supporting contexts and object paths.
As we have almost completed other significant features during the last months, you should expect new minor releases happening very quickly in the next weeks. They will include:
Also, the work for 0.6.0 has started already (GUI support), even if it’s at the prototype stage right now. I plan to release a first minimal version in the next few weeks (we will extend it step by step until 1.0).
Hope the waiting for the new release was worth it. ;-)