Objects, Buffers, References and the Dot Syntax


Primary and Secondary (cache) Buffers
Each object uses two internal buffers: a primary buffer, which holds the string's data, and a secondary buffer, which is used during SDS<>DDS conversions. For example, the secondary buffer of a strD object holds SDS data. This can significantly increase performance when working with different types of objects (strDs and strSs): after the first conversion, subsequent requests do not need to convert the data again. The secondary buffer is cleared automatically as required (e.g. when the original string is modified, or its codepage is changed). It is used only when required; if you don't mix strDs with strSs, there will be nothing to cache.
The only case (other than changing the NMC) in which the cache has to be cleared explicitly is after binary or digit-level modifications. Usually, when you go binary, you need speed. Checking whether the cache needs to be cleared is a very fast internal procedure but, for maximum speed, it is disabled when bytes/digits are set directly: no matter how fast the check is, the delay would be measurable in lengthy repeat loops.
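
The binary case can be sketched as follows, using only commands documented further down this page (.byte[] and .cacheSz). The handler name fillBytes is hypothetical, and the sketch assumes the string will later be mixed with strD objects, which is the only situation in which the explicit clear matters:

```lingo
on fillBytes
  s = _s("abc")
  -- direct byte writes skip the cache check, so the loop stays fast
  repeat with i = 0 to 2
    s.byte[i] = 65 + i   -- byte offsets are zero-based
  end repeat
  s.cacheSz(0)           -- clear the secondary buffer once, after the loop
  return s
end
```

Clearing once after the loop, rather than on every write, keeps the speed benefit that disabling the check was meant to provide.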

Lazy Developer's note:
To be always on the safe side, instead of:
put _s("my data").char[10..20].word[1]
you should use:
s=_s("my data")
put s.char[10..20].word[1]
Both of the above forms are safe, but with the second approach the data source object stays locked, since it is assigned to a variable. You can therefore safely write lines of code that combine multiple property selections with regular commands on objects.
If you follow this method, and you don't care that much about speed and string optimizations, you can skip all the reference / data source / top object material below.

Or you can forget about combining multiple commands in a single line altogether, and work the traditional Director way.




Reference Objects
When a <str> object is created using _s, _d, or another creation command, its buffer is filled with the appropriate value. E.g. the command _s("abc") will create a new object, copy "abc" to the object's buffer, and return the object. To be exact, a reference to the object is returned, but we'll use the term 'object' for the values returned to the command line, to avoid confusion with the 'reference objects' described below.
So far, everything works as with regular Director strings. However, Xtrema also supports objects that, instead of using their own primary buffer for data retrieval, pull data from another object's buffer (the data source object).
Such dependent objects will be called 'reference objects', or references. Using references can significantly increase performance, but they must be used with caution, since improper use can even crash Director.

There is just one rule for working safely with reference objects:
Never release/modify the data source object until your work on all references to it is finished.
A reference to an object can be created with the .ref command.
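
A minimal sketch of a reference's lifetime that respects this rule. It uses only commands documented on this page (.ref, .deRef()); the variable names are arbitrary, and no output is shown because the sketch has not been run:

```lingo
s = _s("my data")        -- solid object, the data source
r = s.word[2].ref        -- reference into s's buffer; no data is copied
put r                    -- work with r while s is alive and unmodified
r.deRef()                -- copy the data into r's own buffer
s = 0                    -- r no longer depends on s, so releasing s is safe
```

If the order of the last two lines were swapped, r would be left pointing at a released buffer.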

Definitions
solid object: an object that is keeping its data in its own buffer.
reference object: an object that pulls its data from another object's buffer.
data source of a reference object: the solid object from which the reference object pulls its data.
parent object : the object from which a reference object (offspring) was created. May be a solid object or a reference.

offspring : the reference object created from a parent object. As soon as the reference is returned, no parent-child relation exists between the parent and the offspring.
The offspring will depend only on its data source (which it inherits from its parent). Offspring are always references, but they are often converted to solid objects before being returned to the command line - see 'top object' below.

top object: the first object in a dot sequence. A top object will not be released until the final property of the dot syntax is returned. All objects created during property access will be references to the top object. As soon as property access is finished, and if the last property returns an offspring (e.g. myStr.word[2].char[1]), the top object is examined. If the top object is a solid object, the final offspring is dereferenced to a solid object as well, unless the dot syntax is terminated with the .ref command. Otherwise, if the top object is a reference, the final offspring is not dereferenced: it remains a reference whose data source is the top object's data source. In this case, the final offspring can be converted to a solid object by terminating the dot syntax with the .deref() command.

s=_s("data source object") --solid object, will be the data source of the objects created below.
x=s.char[1..2] --solid. 's' is solid, so the result of the dot syntax will be dereferenced before being returned to the command line.
x=s.char[1..2].ref --reference. By using .ref, we instruct the Xtra to not dereference the result.
x=s.char[1..2].word[1] --solid object, created from the parent object s.char[1..2] (internal reference, temporary object since more dots follow)
x=s.char[1..2].word[1].char[1] --solid. The result of s.char[1..2] is the parent of .word[1], the result of which is the parent of the final object.
x=s.ref.char[1..2] --reference. the top object is the result of s.ref (reference object). Since the top object is a ref, the final object will be a ref - its data source will be s, which is the data source of the (temporary) s.ref top object.

The term 'parent object' is an abstract definition referring to the object other objects are created from.
A reference object is created from a parent, but, right after creation, depends only on the data source.
It follows that even if an intermediate reference object from which offspring have been created is discarded, working with those offspring is safe.

s=_s("abcd")
p=s.char[2..3].ref --reference, offspring of s - solid.
r=p.char[1]          --reference, offspring of p - reference.
p=0   --resetting p is safe. 'r' is still usable, since its data source is s, not p.
s=0   --unsafe!! s is r's data source. r is now unusable - its fixed properties are still accessible, but an attempt to access its data, or properties that rely on data access may even crash your application.


Dot Syntax
One of the primary concerns of all of the Xtra's classes is ease of use. Creating smart commands that can evaluate arguments and act accordingly instead of creating multiple strict commands, and allowing a sequence of commands to be executed in just one line of code, are parts of this approach.
This object's dot syntax was not only designed with that philosophy, but goes even further:
By adding support for references / parent objects / offsprings, it is possible to create code lines like this:

s=_s("One and two and three and four!")
put s.idl["and"].item[-1].upper.exp.iSel["two"].insAft("...")
-- One and two... and three and FOUR!

Dot Syntax & Top Object
When accessing a <str> object's selection property, e.g. str.word[1], a new object is created, and <str> becomes the 'top object' of the dot queue.

At that point, the new object will be a reference, or a 'selection', in the top object - no data is being copied to the new object's buffers, just the memory address and length of the target area in the top object's buffer.
Further property access, e.g. str.word[1].char[2], returns new objects in the same way.

The range that each offspring covers in the top object depends on its immediate parent - in the last example, str.word[1].char[2], the char command will select the 2nd character of the 1st word of str.
If no further properties are requested, the result is dereferenced (its data is copied) and str (the top object) is released, as far as the queue is concerned.
If, however, str (the top object) is itself a reference, the result won't be dereferenced: it will depend on the object that str depends on.

A caution about commands that return a reference to the object they were used on:
It is safe to call:
put _s("abcd").toClip().char[1]
but calling
put _s("abcd").char[1].toClip().char[1] -- *invalid data*
is not, and may even crash Director.
The reason:
The .toClip() command of the _s("abcd").char[1].toClip() sequence is performed on the result of _s("abcd").char[1], which is a reference at this point. As soon as the .toClip() command is finished, the top object will be released. But .toClip() returns a reference to the object it was called on (the result of _s("abcd").char[1]), which becomes, right after the command completes, an orphaned object (a reference whose top object has been disposed). It is safe to request properties like .err from an orphaned object, to check if the command was successful, but ANYTHING that involves data access/manipulation will return invalid data, and may even lead to a crash.
If you need to continue working with objects that are results of property accessing and finalized with commands that return references, you need to use the .deref() command right after accessing the last property.
This sequence is safe:
put _s("abcd").char[1].deref().toClip().char[1]

Still, if you prefer to work with references in order to avoid data copying, you could use:
s=_s("abcd")
put s.char[1].toClip().char[1]
In the above example, s will become the top object for the .char[1] command, but the actual data won't be disposed of at its release: s is a local variable and, though the Xtra does not need it any more after property access is done, it is kept alive by Director (until e.g. some other value is assigned to s).

DevNote: When designing string objects, there is always a speed vs security vs flexibility issue. Director's string objects are built for security, sacrificing both speed and flexibility. This class's strings were designed to offer a rich feature set and flexibility, plus extended multilanguage support, with the minimal speed penalty possible. Director's strings will never crash the program. xStrs may cause a crash if not used with caution.
In practice, though objects created with Xtras are slower than native Director objects, you'll find many commands to be significantly faster, in some cases several times faster, than Director's equivalents, especially when working with large strings.

As for the command set and dot syntax.. Well, hope you enjoy them.


[1]
strX.ref create reference to an object
  Returns <strX> / <err>
     
  Description Returns a reference to the original object.
The new object will seem identical to the original, but its data will be a reference to the original object's data, rather than a copy of it.
The parent object (strX) may itself be a reference to another object.
The top object (the one that all reference objects access binary data from) should not be released or modified by means other than changing byte/digit values.
Intermediate objects can be safely disposed.
Reference objects will automatically be dereferenced if any modification - other than byte/digit - is performed to them.
     
  Examples s=_s("first   second")
r=s.ref
r.byte[6]=45 --45 is the decimal value of the ASCII character '-'
put s        --modifying r affects s
-- first - second


s=_s("my data")
x=s.word[-1].char[1..3] --x = copy of the first 3 chars of the last word of s
put x, x.isRef()
-- dat 0

r=s.word[-1].char[1..3].ref --r = reference to the first 3 chars of the last word of s
put r, r.isRef()
-- dat 1

r.byte[0]=66           --modifying r, will affect s, but not x:
put r, x, s
-- Bat   dat   my Bata  (spaces between results added for readability)


s=_s("my data")
r=s.word[2].ref
put r.insBef("binary ") --when using the dot syntax, r becomes the top object
-- binary data          --the .insBef command, is called on, and returns, the top object.
put r.isRef()           --and since .insBef modifies the object, it dereferences it.
-- 0

s=_s("my data")
r=s.word[2].ref
put s.ref[r].insBef("binary ") --with this syntax, the command is performed on s.
-- my binary data
After the last command, it's unsafe to use 'r' again, since its data source (s) has been modified.

src=_s("my data")
r=src.char[1] --the final object will be dereferenced before being returned to the command line.
put r.isRef()
-- 0

s=src.ref
r=s.char[1]   --if the top object is a reference, the final object won't be dereferenced.
put r.isRef()
-- 1
     
  Notes Be VERY cautious when working with references. That is, *NEVER* release / modify their top object.
A reference object has and utilizes its independent secondary buffer for SDS<>DDS conversions.

[2]
strX.isRef( {srcStrX {, retErr} } )  |  strX.isRefX( srcStrX {, retErr } ) check if an object is a reference
  Returns True / False / <err>
  srcStrX Object to check if strX is a reference of
  retErr Boolean. If srcStrX is defined, and strX is not a reference to srcStrX, returns an <err> instead of false
     
  Description When used with no arguments, returns true if strX is a reference to some other object.
If srcStrX is used, and srcStrX is a valid parent or source object of strX, this command will return true.
Otherwise, depending on the value of retErr, 0 (for retErr false) or <err> will be returned.
isRefX will return true only if srcStrX is a source object (not reference).
     
  Examples s=_s("source object")

r=s.char[1].ref
put r.isRef()  --check if r is a reference object
-- 1
put r.isRef(s) --check if r is a reference to s
-- 1

put s.w[0].del() --when using a command that modifies the source object
--   object
put r.isRef()  --r is still a reference
-- 1
put r.isRef(s) --but s is not a valid source for r. r is now orphaned.
-- 0           --*caution* this command will return true, if s, after modification,
               --happens to be using the memory range that 'r' was pointing to from before.

put r.isRef(s, 1) --adding true to the above command returns the reason.
-- <xErr 66619 OutOfBounds>


s=_s("source object")
rA=s.word[1].ref
rB=rA.char[1] --no need to use the .ref command again, since rA is already a reference.

put rB.isRef(rA), rB.isRefX(rA)--rB is a reference to rA, but rB is itself a reference
-- 1 0
put rB.isRef(s), rB.isRefX(s)  --s is rB's data source.
-- 1 1
     
  Notes srcStrX can be a source object, or a reference.
When srcStrX is defined, this command will only check whether the memory range of strX's primary buffer lies within the memory range of srcStrX's primary buffer. Unlike .isRefX(), it will not check whether srcStrX is itself a reference.

[3]
strX.deRef() de-reference an object
  Returns strX / <err>
     
  Description If strX is a reference object, this command will copy all required data from its source data object's primary buffer to strX's primary buffer. After using the deRef command, strX will have no dependencies, and its original source data object can be safely released.
This command has no effects on solid objects (objects that are not references).
If the command is successful, strX (guaranteed to be solid) is returned to the command line.
If the amount of memory required for this operation cannot be allocated, an <err> is returned.
     
  Examples s=_s("source object")
r=s.w[-1].c[0..2].ref
put r, r.isRef()
-- obj 1
r.deRef()        --convert r to a solid object. releasing/modifying 's' is now safe.
put r.isRef()
-- 0    
     
  Notes Any command that may change the size of the object's buffer, automatically dereferences the object.
Commands such as resize, app and ins, when performed on a reference, will automatically dereference the object prior to their execution.
Commands that return references, such as char and word, do not dereference objects.
Commands that affect data but not the size of the data, such as upper and lower, or setting the values of digits and bytes, do not dereference the object they are performed on. Such commands actually modify the data source object's buffer, and therefore affect both the source object itself and all of its offspring.
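
The difference between the two kinds of commands can be sketched as follows. The sketch assumes that .upper can be called on a reference exactly as it appears in the dot syntax example earlier on this page; the comments restate what the notes above and the .ref examples say, not verified output:

```lingo
s = _s("my data")
r = s.word[2].ref        -- reference to the second word of s
put r.upper              -- size-preserving: per the notes, this writes into s's buffer
put s                    -- so s is expected to show the change as well
put r.isRef()            -- and r is expected to still be a reference
put r.insBef("binary ")  -- size-changing: r is dereferenced before the insert
put r.isRef()            -- r is now solid; s is not affected by the insert
```

When in-place changes to the source are not wanted, calling .deRef() on the reference first avoids them.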

[4]
strX.dupe( {targetStr}) duplicate
  Returns newStr / targetStr / <err>
  targetStr <str> object to dupe strX to.
     
  Description Duplicates the strX object.
If targetStr is a str object, it will become a duplicate of strX.
The result of this command will be a solid object, even if strX is a reference.
The original object will remain intact.
If the operation fails, an <err> will be returned.
Otherwise, the new object will be returned.
     
  Examples s=_s("original object")
c=s.dupe()
put c
-- original object

x=_s("some value")
s.dupe(x)
put x
-- original object
put s
-- original object
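
A short sketch of the reference case described above: duplicating a reference is expected to yield a solid copy, while the reference itself stays as it was. Only commands documented on this page are used, and no output is shown because the sketch has not been run:

```lingo
s = _s("original object")
r = s.word[1].ref        -- reference into s's buffer
c = r.dupe()             -- per the description, c is solid even though r is a reference
put r.isRef()            -- r itself remains a reference to s
put c.isRef()            -- c holds its own copy of the data
```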
     
  Notes  

[5]
strX.dupeX( {targetStr}) dupe type only
  Returns newStr / targetStr / <err>
  targetStr <str> object to dupe strX to.
     
  Description Creates a new empty string whose type and cp are equal to the ones of strX
     
  Examples d = _d("abc", 1251).dupeX()
put d.length, d.cp, d.bpd
-- 0 1251 2
     
  Notes  

[6]
strX.pass( targetStr ) pass
  Returns targetStr / <err>
  targetStr <str> object to pass strX to.
     
  Description Transfers strX to targetStr.
The target object won't be dereferenced - if strX is a ref, targetStr will be a ref.
After the command is finished, strX will be a reference to targetStr.
     
  Examples s=_s("original object")
x=_s("some value")
s.pass(x)
put x, x.isRef()
-- original object 0
put s, s.isRef()
-- original object 1
     
  Notes This command is intended for advanced users who prefer working with reference objects (or are familiar with C++ pointers).

[7]
strX.cacheSz( {clear} ) cache size
  Returns integer (size of the cache)
  clear Pass any integer value to clear the cache
     
  Description

Returns the size of the secondary buffer of the object (which, for strD objects for example, caches SDS data).

If any integer is passed to this command, the cache will be cleared.

Clearing the cache is only required when, after performing binary operations on a string or changing the NMC, the string is to interact with strings of a different type (i.e. when working with both SDS and DDS).

     
  Examples

d=_d("double")
put d.cacheSz()
-- 0
put _s("single").app(d) -- double appended to single, of the same CP. Intermediate SDS string will be cached by d.
-- singledouble
put d.cacheSz()
-- 6
put d.cacheSz(0)
-- -6
put d.cacheSz()
-- 0

s=_s("abc")
_d().app(s) -- single appended to double, of the same CP. Intermediate DDS string will be cached by s.
s.byte[0]=65 -- binary modification will not clear the cache
put s, _d().app(s)
-- Abc abc -- the already cached value "abc" is used, instead of the new "Abc"
s.cacheSz(0)
put s, _d().app(s)
-- Abc Abc


     
  Notes