Single Byte Per Digit Strings


The following commands will create a new Single Byte per Digit String object.
The string's CP will be used when:
- converting the strS to an strD, explicitly or internally
- manipulating string data (e.g. append, insert)
- searching / replacing data (e.g. search, replace, item access)
- accessing object by character position / counting characters

Code pages always installed on 2k/XP
Code pages that can be used as a string's code page, if installed (excluding Unicode)

Note: The values 1200, 1201, 65000 and 65001 can't be used as a string's CP, since they identify Unicode encodings (UTF-16 LE, UTF-16 BE, UTF-7 and UTF-8 respectively) rather than code pages.
Note: Strings containing Asian language or non UCS-2 characters will not appear correctly if an appropriate font is not installed. The existence of the appropriate font affects just the display, and not the string object's binary data.


[1]
_s(anyValue {, CP} ) Single byte per digit string
  Returns <strS> / <err>
  anyValue Any Director value
  CP Integer CodePage.
     
  Description Creates a new single byte per digit string object, by converting the anyValue parameter to an SDS.
If, for example, anyValue is a Director string, its binary data will be copied to the <strS>'s internal buffer.
If anyValue is an integer, anyValue will be converted to a string and stored in the <strS>'s internal buffer.

The code page will be used later, if the string is explicitly converted to a strD, or when this conversion is performed automatically (e.g. when copying the string to the clipboard, or appending it to a strD).
If no CP is defined, the default CP will be used.
If the conversion fails, or CP is invalid, an <err> will be returned.
     
  Examples put _s("my string")
-- my string
s = _s([#a, #b])
put s
-- [#a, #b]  --though the result looks like a list, it actually is a string:
put s.ilk
-- #xStr


put _s("abc def fed cba").sRep("f", "*")
-- abc de* *ed cba

-- The following commands will copy Unicode text to the clipboard:
_s("greek chars: αβγ", 1253).toClip()
The 'αβγ' portion will be displayed in Director with Greek characters only on Greek systems. However, the strS returned by this command is system independent (see below).

_s("cyrillic chars: αβγ", 1251).pop()
The same binary values are used, but with a different code page. No matter what Director displays, if this string is converted to Unicode, it will be "бвг", not "αβγ".
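
The difference between the two examples above can be reproduced outside Director. The following is a minimal Python sketch, not Xtra code: Python's bytes.decode stands in for the strS to strD conversion (an assumption made purely for illustration). It shows that the same three byte values yield different Unicode text depending on the code page attached to them.

```python
# Illustration only: Python's codecs stand in for the Xtra's conversion.
data = bytes([225, 226, 227])        # identical binary data in both cases

as_greek = data.decode("cp1253")     # interpreted with the ANSI Greek code page
as_cyrillic = data.decode("cp1251")  # interpreted with the ANSI Cyrillic code page

# Print the Unicode code points rather than the glyphs, so the output
# does not depend on the console's own code page.
print([hex(ord(c)) for c in as_greek])
print([hex(ord(c)) for c in as_cyrillic])
```

The byte values never change; only the interpretation does, which is why the code page has to travel with the string object.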
     
  Notes If anyValue is an <strS>, a copy of anyValue will be returned.
If anyValue is an <strD>, the result will be the strD converted to a strS.
In both cases, CP will be ignored - the CP of the <strX> object will be used.

The CP, if defined, will be stored in the object for future use (e.g. converting to strD). No CP related conversion is performed on anyValue when creating a new object.

Unlike Director strings, the Xtra's strings are not displayed enclosed in quotes.

On MBCS (e.g. Japanese) systems, a string containing 3 characters does not necessarily consist of 3 bytes: it may take up to 3 characters * 2 bytes per character = 6 bytes.
When running on MBCS systems, Director does not offer any native method to modify binary data: charToNum accesses characters (1 or 2 bytes), not single bytes (digits).
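
The character versus byte arithmetic above can be checked with a short Python sketch. This is an illustration only, not Xtra code: byte_count is a hypothetical helper, and Python's cp932 codec is assumed to be a fair stand-in for a Japanese MBCS system.

```python
# Illustration only: cp932 (Japanese Shift-JIS) is a multi-byte code page.
def byte_count(text: str, code_page: str) -> int:
    """Return the number of bytes (digits) needed to store text in code_page."""
    return len(text.encode(code_page))

# Three Latin characters: one byte each.
ascii_len = byte_count("abc", "cp932")

# Three Hiragana characters (U+3042, U+3044, U+3046): two bytes each.
kana_len = byte_count("\u3042\u3044\u3046", "cp932")

# A mix of single and double byte characters.
mixed_len = byte_count("a\u3042b", "cp932")

print(ascii_len, kana_len, mixed_len)
```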

[2]
_sc(charCode {, CP} ) Single byte per digit Character
  Returns <strS> / <err>
  charCode Integer / Float / HexString character code.
  CP Integer CodePage.
     
  Description SD Character: Creates a new single byte per digit string object, containing a single character.

The character's value charCode may range from 0-255 for Single Byte per Character CodePages and from 0-65535 for MultiByte CodePages.
Note that this command will not return an error if charCode is above 255 for SBC CodePages. In such a case, the resulting string will contain 2 characters.

If no CP is defined, the default CP will be used.
If charCode is out of the 0-65535 range, or CP is invalid, an <err> will be returned.
     
  Examples put _sc(65) --for ANSI CodePages, the decimal value 65 maps to the Latin capital letter 'A'
-- A
put _sc("41") --"41" is the hexadecimal equivalent of the decimal value '65'
-- A
put _sc(90) --the resulting string's CP will be the defaultCP.
-- Z
put _sc(90, 1251) --the resulting string's CP will be 1251 (Cyrillic).
-- Z
put _sc(33446, 932).pop() --create an strS containing the Hiragana Letter 'E' - え
-- え --result as seen on Japanese systems
-- ‚¦ --result as seen on non Japanese systems
The system settings affect the display (Director string) but not the actual strS/D, or any functions performed on it. The Xtra's strings are system independent:

put _sc(33446, 932).length --the resulting string will contain 2 digits that map to 1 character
-- 1
put _sc(33446, 1252).length --the resulting string will contain 2 digits that map to 2 characters
-- 2
Both strings contain the same binary data, but have different CodePages:
put _sc(33446, 932).string, _sc(33446, 1252).string
-- "‚¦" "‚¦"
     
  Notes charCode is a CP independent binary value. It is copied directly to the strS's internal buffer.
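
The relation between a charCode, the stored digits and the character count can be sketched in Python. This is an illustration under stated assumptions, not the Xtra's implementation: char_code_to_bytes is a hypothetical helper, and the high byte first layout is inferred from the _scs(130, 166) example in this document.

```python
def char_code_to_bytes(char_code: int) -> bytes:
    """Hypothetical helper: store a 0-65535 code as 1 or 2 raw bytes, high byte first."""
    if not 0 <= char_code <= 65535:
        raise ValueError("charCode out of the 0-65535 range")
    if char_code <= 255:
        return bytes([char_code])
    return bytes([char_code >> 8, char_code & 0xFF])

raw = char_code_to_bytes(33446)        # 0x82A6 becomes the bytes 0x82, 0xA6

# The same two digits, counted as characters under two different code pages.
len_932 = len(raw.decode("cp932"))     # one double-byte character
len_1252 = len(raw.decode("cp1252"))   # two single-byte characters

print(raw.hex(), len_932, len_1252)
```

The binary data is fixed at creation time. The code page only decides how many characters those digits represent.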

[3]
_scs(charCode {, charCode, charCode} ) Single byte per digit Character Sequence
  Returns <strS> / <err>
  charCode Integer / Float / HexString character code.
  CP Integer CodePage.
     
  Description SDS Character Sequence: Creates a new single byte per digit string object, containing one or more characters.

Each character's value may range from 0-255 for Single Byte per Character CodePages and from 0-65535 for MultiByte per Character CodePages.
Note that this command will not return an error if charCode is above 255 for SBC CodePages. In such a case, 2 characters will be appended to the string.

The default CP will be used as the string's CP.
If charCode is out of the 0-65535 range, an <err> will be returned.
     
  Examples put _scs(65)    --for ANSI CodePages, the decimal value 65 maps to the Latin capital letter 'A'
-- A
put _scs("5A")  --"5A" is the hexadecimal equivalent of the decimal value '90' (letter 'Z')
-- Z
put _scs(65, "5A")
-- AZ

--the following commands produce the same result
put _sc(33446, 932).pop()
put _scs(33446).cp(932).pop()
put _scs(130, 166).cp(932).pop()
     
  Notes charCodes are CP independent binary values. They are written directly to the strS's internal buffer.
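
The equivalence of the last three example commands can be sketched in Python. This is an illustration only: parse_char_code and char_sequence are hypothetical names, hex strings are assumed to carry no prefix (as in "5A"), and truncating floats to integers is an assumption about how float charCodes are handled.

```python
def parse_char_code(value) -> int:
    """Hypothetical: accept an integer, a float, or a hexadecimal string such as "5A"."""
    if isinstance(value, str):
        return int(value, 16)
    return int(value)  # assumption: floats are truncated to integers

def char_sequence(*char_codes) -> bytes:
    """Append each code as 1 byte (0-255) or 2 bytes (256-65535, high byte first)."""
    out = bytearray()
    for value in char_codes:
        code = parse_char_code(value)
        if not 0 <= code <= 65535:
            raise ValueError("charCode out of the 0-65535 range")
        if code > 255:
            out.append(code >> 8)
        out.append(code & 0xFF)
    return bytes(out)

latin = char_sequence(65, "5A")       # the bytes for 'A' and 'Z'
one_code = char_sequence(33446)       # a single double-byte code
two_codes = char_sequence(130, 166)   # the same digits given separately

print(latin, one_code == two_codes)
```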


Quiz:
If myStr is a strS object that contains two characters, and it outputs the following in the message window
put myStr
-- αβ
is it safe to assume that:
1. Since Director's message window can display Greek characters, the system's codepage is 1253 (assuming that 1253 is the system code page for all Greek systems)?
2. myStr's codepage is 1253 (Ansi-Greek)?

Answers:
1. No. We can tell that it's not any western system other than Greek, but it could be an eastern (MBCS) system. E.g., Japanese systems contain the Greek alphabet characters in sub-codepage 83 (i.e. with the hexadecimal lead byte 83):
put myStr.cList(#hex)
-- [83BF, 83C0]
If however, we knew that myStr contains 2 digits (rather than 2 characters), then we could be quite certain that the system is a Greek system, since eastern systems would require 4 digits to display the 2 characters.

2. No. All that we know (assuming that we are not on an eastern system) is that myStr consists of two characters = two digits = two bytes, and that its data is identical to the data of the Greek SDS "αβ" (225, 226). We can't tell myStr's codepage, since any strS consisting of the digits 225 and 226 would be displayed as "αβ" on a Greek system.
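
Both answers can be checked with a short Python sketch. This is an illustration only: Python's cp1253 and cp932 codecs are assumed to match the Greek and Japanese system code pages discussed above.

```python
# Answer 2: on a Greek system the digits 225, 226 are displayed as alpha, beta,
# whatever code page the string object itself carries.
greek_bytes = bytes([225, 226])
shown_on_greek = greek_bytes.decode("cp1253")

# Answer 1: a Japanese (MBCS) system can display the same two characters,
# but it needs four digits to do so.
japanese_bytes = "\u03b1\u03b2".encode("cp932")
hex_codes = [japanese_bytes[i:i + 2].hex().upper()
             for i in range(0, len(japanese_bytes), 2)]

print(len(greek_bytes), len(japanese_bytes), hex_codes)
```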