Inheritance diagram for UString:
Public Types | |
typedef opCat | append |
typedef opIndexAssign | setCharAt |
typedef uint(* | Formatter )(wchar *dst, uint len, inout Error e) |
Public Member Functions | |
this (uint space=0) | |
this (wchar[] content, bool mutable=true) | |
this (UText other, bool mutable=false) | |
this (UString other, bool mutable=true) | |
UString | opCat (UText other) |
UString | opCat (UText other, uint start, uint len=uint.max) |
UString | opCat (wchar chr) |
UString | opCat (wchar[] chars) |
UString | opCat (char[] chars) |
UString | setTo (wchar chr, uint start=0, uint len=uint.max) |
UString | setTo (wchar[] chars, bool mutable=true) |
UString | setTo (UText other, bool mutable=true) |
UString | setTo (UText other, uint start, uint len, bool mutable=true) |
UString | opIndexAssign (wchar chr, uint index) |
UString | remove (uint start, uint length=uint.max) |
UString | truncate (uint length=0) |
UString | padLeading (uint count, wchar padChar=0x0020) |
UString | padTrailing (uint length, wchar padChar=0x0020) |
package void | expand (uint count) |
package UString | format (Formatter format, char[] msg) |
Private Types | |
typedef opIndex | charAt |
enum | CaseOption { Default = 0, SpecialI = 1 } |
Private Member Functions | |
final void | realloc (uint count=0) |
final UString | opCat (wchar *chars, uint count) |
this (wchar[] content) | |
package wchar[] | get () |
override int | opEquals (Object o) |
override int | opCmp (Object o) |
override uint | toHash () |
UString | copy () |
UString | extract (uint start, uint len=uint.max) |
uint | codePoints (uint start=0, uint length=uint.max) |
bool | hasSurrogates (uint start=0, uint length=uint.max) |
wchar | opIndex (uint index) |
uint | length () |
int | compare (UText other, bool codePointOrder=false) |
int | compare (wchar[] other, bool codePointOrder=false) |
int | compareFolded (UText other, CaseOption option=CaseOption.Default) |
int | compareFolded (wchar[] other, CaseOption option=CaseOption.Default) |
private int | compareFolded (wchar[] s1, wchar[] s2, CaseOption option=CaseOption.Default) |
bool | startsWith (UText other) |
bool | startsWith (wchar[] chars) |
bool | endsWith (UText other) |
bool | endsWith (wchar[] chars) |
uint | indexOf (wchar c, uint start=0) |
uint | indexOf (UText other, uint start=0) |
uint | indexOf (wchar[] chars, uint start=0) |
uint | lastIndexOf (wchar c, uint start=uint.max) |
uint | lastIndexOf (UText other, uint start=uint.max) |
uint | lastIndexOf (wchar[] chars, uint start=uint.max) |
UString | toLower (UString dst) |
UString | toLower (UString dst, inout ULocale locale) |
UString | toUpper (UString dst) |
UString | toUpper (UString dst, inout ULocale locale) |
UString | toFolded (UString dst, CaseOption option=CaseOption.Default) |
char[] | toUtf8 (char[] dst=null) |
UText | trim () |
UString | unEscape () |
uint | getCharStart (uint i) |
uint | getCharLimit (uint i) |
private void | pinIndex (inout uint x) |
private void | pinIndices (inout uint start, inout uint length) |
Static Private Member Functions | |
this () | |
bool | isSurrogate (wchar c) |
bool | isLeading (wchar c) |
bool | isTrailing (wchar c) |
Private Attributes | |
package uint | len |
package wchar[] | content |
Static Private Attributes | |
FunctionLoader Bind[] | targets |
In ICU, a Unicode string consists of 16-bit Unicode code units. A Unicode character may be stored with either one code unit — which is the most common case — or with a matched pair of special code units ("surrogates"). The data type for code units is UChar.
For single-character handling, a Unicode character code point is a value in the range 0..0x10ffff. ICU uses the UChar32 type for code points.
Indexes, offsets, and lengths of strings always count code units, not code points. This is the same as with multi-byte char* strings in traditional string handling. Operations on partial strings typically do not test for code point boundaries. If necessary, the user needs to take care of such boundaries by testing the code unit values or by using functions like getCharStart() and getCharLimit() (named getChar32Start() and getChar32Limit() in ICU).
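The difference between counting code units and counting code points can be seen in a small sketch. It is written in Python and does not use the UString API; the helper name utf16_units is hypothetical and exists only for illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    # "surrogatepass" lets unpaired surrogates through instead of raising.
    raw = text.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


# 'a', U+1D11E (a supplementary code point), 'b'
sample = "a\U0001D11Eb"
units = utf16_units(sample)

code_point_count = len(sample)  # Python strings are indexed by code point
code_unit_count = len(units)    # what UTF-16 lengths and offsets count
```

Here the supplementary character occupies two code units (a lead and a trail surrogate), so an offset of 2 falls in the middle of a code point.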
UString methods are more lenient with regard to input parameter values than other ICU APIs. In particular, out-of-range indexes and lengths are pinned to valid positions (see pinIndex() and pinIndices()).
Definition at line 157 of file UString.d.
|
Definition at line 159 of file UString.d. Referenced by UText::unEscape(). |
|
|
|
Internal method to support formatting into this UString. This is used by many of the ICU wrappers to append content into a UString. |
|
Definition at line 654 of file UString.d. Referenced by UText::trim(), and UText::unEscape(). |
|
|
|
Create an empty UString with the specified available space |
|
Create a UString upon the provided content. If said content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified. Definition at line 183 of file UString.d. References setTo(). |
|
Create a UString via the content of a UText. Note that the default is to assume the content is immutable (read-only). |
|
Create a UString via the content of a UString. If said content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified via UString methods. |
|
Append text to this UString. Definition at line 358 of file UString.d. References UText::get(). Referenced by opCat(). |
|
Append partial text to this UString. Definition at line 369 of file UString.d. References UText::content, len, opCat(), and UText::pinIndices(). |
|
Append a single character to this UString. Definition at line 381 of file UString.d. References opCat(). |
|
Append text to this UString. Definition at line 392 of file UString.d. References opCat(). |
|
Converts a sequence of UTF-8 bytes to UChars (UTF-16). |
|
Set a section of this UString to the specified character. Definition at line 423 of file UString.d. References len, UText::pinIndices(), and realloc(). Referenced by padLeading(), padTrailing(), setTo(), and this(). |
|
Set the content to the provided array. Parameter 'mutable' specifies whether the given array is likely to change. If not, the array is aliased until such time this UString is altered. Definition at line 441 of file UString.d. References len. |
|
Replace the content of this UString. If the new content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified via one of these methods. Definition at line 461 of file UString.d. References UText::get(), and setTo(). |
|
Replace the content of this UString. If the new content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified via one of these methods. Definition at line 476 of file UString.d. References UText::content, len, UText::pinIndices(), and setTo(). |
|
Replace the character at the specified location. Definition at line 488 of file UString.d. References len. |
|
Remove a piece of this UString. Definition at line 504 of file UString.d. References len, memmove(), UText::pinIndices(), realloc(), and truncate(). |
|
Truncate the length of this UString. Definition at line 528 of file UString.d. References len. Referenced by remove(), URegex::replaceAll(), URegex::replaceFirst(), and UText::unEscape(). |
|
Insert leading spaces in this UString |
|
Append some trailing spaces to this UString. |
|
Check for available space within the buffer, and expand as necessary. Definition at line 569 of file UString.d. Referenced by format(), opCat(), padLeading(), and padTrailing(). |
|
Allocate memory due to a change in the content. We handle the distinction between mutable and immutable here. Definition at line 582 of file UString.d. References len. |
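The aliasing behaviour described for the 'mutable' parameter is a form of copy-on-write. The following Python sketch illustrates the idea only; the class AliasingBuffer and its methods are hypothetical, and it assumes that mutable=true copies at once while mutable=false aliases until the first modification. It is not the UString implementation.

```python
class AliasingBuffer:
    """Toy buffer that aliases read-only content until it is modified."""

    def __init__(self, content, mutable=True):
        self.set_to(content, mutable)

    def set_to(self, content, mutable=True):
        if mutable:
            # Assumption: content that may change is copied immediately.
            self._data = list(content)
            self._owned = True
        else:
            # Content declared read-only is aliased; no allocation yet.
            self._data = content
            self._owned = False
        self._len = len(content)

    def _ensure_owned(self):
        # Allocate a private copy the first time the content is altered.
        if not self._owned:
            self._data = list(self._data[:self._len])
            self._owned = True

    def set_char_at(self, index, unit):
        self._ensure_owned()
        self._data[index] = unit

    def get(self):
        return self._data[:self._len]


source = [0x61, 0x62, 0x63]
buf = AliasingBuffer(source, mutable=False)
aliased_before_write = buf._data is source
buf.set_char_at(0, 0x7A)
```

After the write, the buffer holds its own copy and the original array is left untouched.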
|
Internal method to support UString appending |
|
|
Construct read-only wrapper around the given content |
|
|
|
|
Is this UText equal to another? Definition at line 729 of file UString.d. References UText::compare(). |
|
Compare this UText to another. Definition at line 744 of file UString.d. References UText::compare(). |
|
Hash this UText |
|
Clone this UText into a UString. Definition at line 773 of file UString.d. References UString. |
|
Clone a section of this UText into a UString. Definition at line 784 of file UString.d. References len, UText::pinIndices(), and UString. |
|
Count the Unicode code points in the 'length' UChar code units of the string, beginning at 'start'. A code point may occupy either one or two UChar code units. Counting code points involves reading all code units. Definition at line 799 of file UString.d. References UText::pinIndices(). Referenced by UText::hasSurrogates(). |
|
Return an indication whether or not there are surrogate pairs within the string. Definition at line 812 of file UString.d. References UText::codePoints(), and UText::pinIndices(). |
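Counting code points over a range of code units, and detecting surrogate pairs from that count, might look like the following Python sketch. The function names are hypothetical, and deriving the surrogate check from the count is an assumption suggested by the cross-reference above, not a statement about the actual source.

```python
def is_lead(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_trail(unit):
    return 0xDC00 <= unit <= 0xDFFF


def pin(total, start, length):
    """Clamp start and length to the valid range (length=None means 'to the end')."""
    start = min(start, total)
    if length is None or length > total - start:
        length = total - start
    return start, length


def count_code_points(units, start=0, length=None):
    start, length = pin(len(units), start, length)
    end = start + length
    count = 0
    i = start
    while i < end:
        # A lead surrogate followed by a trail surrogate is one code point.
        if is_lead(units[i]) and i + 1 < end and is_trail(units[i + 1]):
            i += 2
        else:
            i += 1
        count += 1
    return count


def has_surrogate_pairs(units, start=0, length=None):
    start, length = pin(len(units), start, length)
    # Fewer code points than code units means at least one pair was seen.
    return count_code_points(units, start, length) != length
```

Every code unit in the range is visited once, which matches the note that counting involves reading all code units.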
|
Return the character at the specified position. |
|
Return the length of the valid content. Definition at line 837 of file UString.d. Referenced by URegex::replaceAll(), URegex::replaceFirst(), UTransform::setFilter(), USearch::setPattern(), UDecimalFormat::setPattern(), UDateFormat::setPattern(), USearch::setText(), and UBreakIterator::setText(). |
|
The comparison can be done in code unit order or in code point order. They differ only in UTF-16 when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff). In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff. Definition at line 855 of file UString.d. References UText::get(). Referenced by UText::opCmp(), and UText::opEquals(). |
|
The comparison can be done in code unit order or in code point order. They differ only in UTF-16 when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff). In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff. |
|
The comparison can be done in UTF-16 code unit order or in code point order. They differ only when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff). In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff. Definition at line 891 of file UString.d. References UText::content. Referenced by UText::compareFolded(), UText::endsWith(), and UText::startsWith(). |
|
The comparison can be done in UTF-16 code unit order or in code point order. They differ only when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff). In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff. Definition at line 909 of file UString.d. References UText::compareFolded(). |
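The two orderings can be contrasted with a short Python sketch. It does not call the compare() methods documented here; the function compare and its code_point_order flag merely mirror the documented parameter name for illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    raw = text.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def compare(a, b, code_point_order=False):
    """Return -1, 0, or 1 under the chosen ordering."""
    if code_point_order:
        key_a, key_b = [ord(c) for c in a], [ord(c) for c in b]
    else:
        key_a, key_b = utf16_units(a), utf16_units(b)
    return (key_a > key_b) - (key_a < key_b)


high_bmp = "\uFF5E"            # near the end of the BMP
supplementary = "\U00010000"   # stored as the surrogate pair D800 DC00

unit_order = compare(high_bmp, supplementary)
point_order = compare(high_bmp, supplementary, code_point_order=True)
```

In code unit order the BMP character sorts after the supplementary one, because 0xFF5E is greater than the lead surrogate 0xD800; in code point order the result is reversed.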
|
Helper for comparison methods. Definition at line 1398 of file UString.d. References len. |
|
Does this UText start with specified string? Definition at line 920 of file UString.d. References UText::get(). |
|
Does this UText start with specified string? Definition at line 931 of file UString.d. References UText::compareFolded(). |
|
Does this UText end with specified string? Definition at line 944 of file UString.d. References UText::get(). |
|
Does this UText end with specified string? Definition at line 955 of file UString.d. References UText::compareFolded(). |
|
Find the first occurrence of a BMP code point in a string. A surrogate code point is found only if its match in the text is not part of a surrogate pair. Definition at line 970 of file UString.d. References UText::pinIndex(). Referenced by UText::indexOf(). |
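The rule that a surrogate is matched only when it stands alone can be sketched in Python as follows. The function index_of is hypothetical, and returning -1 when nothing is found is an assumption made for the sketch, not the documented return value.

```python
def is_lead(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_trail(unit):
    return 0xDC00 <= unit <= 0xDFFF


def index_of(units, c, start=0):
    """Find the first standalone occurrence of code unit c at or after start."""
    for i in range(max(start, 0), len(units)):
        if units[i] != c:
            continue
        # A lead surrogate immediately followed by a trail is half of a pair.
        if is_lead(c) and i + 1 < len(units) and is_trail(units[i + 1]):
            continue
        # A trail surrogate immediately preceded by a lead is half of a pair.
        if is_trail(c) and i > 0 and is_lead(units[i - 1]):
            continue
        return i
    return -1  # assumed "not found" marker for this sketch
```

For example, searching [0xD834, 0xDD1E, 0x61, 0xD834] for 0xD834 skips offset 0, where the unit is paired, and reports the unpaired occurrence at offset 3.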
|
Find the first occurrence of a substring in a string. The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs. Definition at line 991 of file UString.d. References UText::get(), and UText::indexOf(). |
|
Find the first occurrence of a substring in a string. The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs. Definition at line 1008 of file UString.d. References UText::pinIndex(). |
|
Find the last occurrence of a BMP code point in a string. A surrogate code point is found only if its match in the text is not part of a surrogate pair. Definition at line 1025 of file UString.d. References UText::pinIndex(). Referenced by UText::lastIndexOf(). |
|
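Find the last occurrence of a substring in a string. The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs. Definition at line 1042 of file UString.d. References UText::get(), and UText::lastIndexOf(). |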
|
Find the last occurrence of a substring in a string. The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs. Definition at line 1059 of file UString.d. References UText::pinIndex(). |
|
Lowercase the characters into a separate UString. Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original. Note that the return value refers to the provided destination UString. Definition at line 1080 of file UString.d. References UText::Default. |
|
Lowercase the characters into a separate UString. Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original. Note that the return value refers to the provided destination UString. Definition at line 1097 of file UString.d. References ICU::toString(). |
|
Uppercase the characters into a separate UString. Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original. Note that the return value refers to the provided destination UString. Definition at line 1120 of file UString.d. References UText::Default. |
|
Uppercase the characters into a separate UString. Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original. Note that the return value refers to the provided destination UString. Definition at line 1137 of file UString.d. References ICU::toString(). |
|
Case-fold the characters into a separate UString. Case-folding is locale-independent and not context-sensitive, but there is an option for whether to include or exclude mappings for dotted I and dotless i that are marked with 'I' in CaseFolding.txt. The result may be longer or shorter than the original. Note that the return value refers to the provided destination UString. |
|
Converts a sequence of wchar (UTF-16) to UTF-8 bytes. If the output array is not provided, an array of appropriate size will be allocated and returned. Where the output is provided, it must be large enough to hold potentially four bytes per character for surrogate-pairs or three bytes per character for BMP only. Consider using UConverter where streaming conversions are required. Returns an array slice representing the valid UTF8 content. Definition at line 1188 of file UString.d. References ICU::testError(). |
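The sizing advice for a caller-supplied output array can be checked with a small Python sketch. The helper utf8_worst_case is hypothetical; it relies on the fact that a BMP code unit needs at most three UTF-8 bytes while a surrogate pair (two code units) needs four, so three bytes per code unit is always enough.

```python
def utf8_worst_case(unit_count):
    """Upper bound on UTF-8 bytes needed for unit_count UTF-16 code units."""
    # 1 BMP unit -> at most 3 bytes; 2 units (a surrogate pair) -> 4 bytes.
    return 3 * unit_count


def utf16_unit_count(text):
    return len(text.encode("utf-16-le")) // 2


samples = ["abc", "\u20AC", "\uFFFF", "a\U0001D11Eb"]
report = [
    (utf16_unit_count(s), len(s.encode("utf-8")), utf8_worst_case(utf16_unit_count(s)))
    for s in samples
]
```

Each tuple in report holds the code unit count, the actual UTF-8 length, and the bound; the bound is reached exactly by BMP characters in the range U+0800..U+FFFF.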
|
Remove leading and trailing whitespace from this UText. Note that we slice the content to remove leading space. Definition at line 1208 of file UString.d. References UText::charAt, and len. |
|
Unescape a string of characters and write the resulting Unicode characters to the destination buffer. The following escape sequences are recognized:
  \uhhhh       4 hex digits; h in [0-9A-Fa-f]
  \Uhhhhhhhh   8 hex digits
  \xhh         1-2 hex digits
  \x{h...}     1-8 hex digits
  \ooo         1-3 octal digits; o in [0-7]
  \cX          control-X; X is masked with 0x1F
as well as the standard ANSI C escapes:
  \a => U+0007, \b => U+0008, \t => U+0009, \n => U+000A, \v => U+000B, \f => U+000C, \r => U+000D, \e => U+001B, \" => U+0022, \' => U+0027, \? => U+003F, \\ => U+005C
Anything else following a backslash is generically escaped. For example, "[a\\-z]" returns "[a-z]". If an escape sequence is ill-formed, this method returns an empty string. An example of an ill-formed sequence is "\\u" followed by fewer than 4 hex digits. Definition at line 1257 of file UString.d. References append, UText::charAt, len, truncate(), and UString. |
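A reduced version of these rules can be sketched in Python. The function below is hypothetical and deliberately incomplete: it handles only the 4-digit and 8-digit hex forms, the single-letter ANSI C escapes, and the generic escape, and it omits the x, octal, and control-X forms. Treating a trailing lone backslash as ill-formed is an assumption of the sketch.

```python
import string

# Single-letter escapes that map to control characters.
SIMPLE = {"a": "\x07", "b": "\x08", "t": "\t", "n": "\n",
          "v": "\x0b", "f": "\x0c", "r": "\r", "e": "\x1b"}


def unescape(text):
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        i += 1
        if ch != "\\":
            out.append(ch)
            continue
        if i >= n:
            return ""  # assumption: a trailing backslash is ill-formed
        esc = text[i]
        i += 1
        if esc in ("u", "U"):
            width = 4 if esc == "u" else 8
            digits = text[i:i + width]
            if len(digits) != width or any(d not in string.hexdigits for d in digits):
                return ""  # ill-formed: too few hex digits
            value = int(digits, 16)
            if value > 0x10FFFF:
                return ""  # outside the code point range
            out.append(chr(value))
            i += width
        elif esc in SIMPLE:
            out.append(SIMPLE[esc])
        else:
            # Generic escape: the character stands for itself.
            out.append(esc)
    return "".join(out)
```

With this sketch, the documented example "[a\\-z]" yields "[a-z]", and "\\u" followed by only two hex digits yields an empty string.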
|
Is this code point a surrogate (U+d800..U+dfff)? |
|
Is this code unit a lead surrogate (U+d800..U+dbff)? |
|
Is this code unit a trail surrogate (U+dc00..U+dfff)? |
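The three range tests above reduce to bit masks on the 16-bit code unit. The Python sketch below shows one common way to write them; the function names are illustrative and this is not taken from the UString source.

```python
def is_surrogate(c):
    """True for U+D800..U+DFFF: the top five bits are 11011."""
    return (c & 0xF800) == 0xD800


def is_leading(c):
    """True for U+D800..U+DBFF: the top six bits are 110110."""
    return (c & 0xFC00) == 0xD800


def is_trailing(c):
    """True for U+DC00..U+DFFF: the top six bits are 110111."""
    return (c & 0xFC00) == 0xDC00
```

Every surrogate is either leading or trailing, never both, so the last two predicates partition the range accepted by the first.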
|
Adjust a random-access offset to a code point boundary at the start of a code point. If the offset points to the trail surrogate of a surrogate pair, then the offset is decremented. Otherwise, it is not modified. |
|
Adjust a random-access offset to a code point boundary after a code point. If the offset points at the trail surrogate of a surrogate pair (that is, immediately after its lead surrogate), then the offset is incremented. Otherwise, it is not modified. |
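Both boundary adjustments can be sketched in Python over a list of code units. The names char_start and char_limit are hypothetical stand-ins for the methods described above, and the bounds checks are assumptions of the sketch.

```python
def is_lead(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_trail(unit):
    return 0xDC00 <= unit <= 0xDFFF


def char_start(units, i):
    """Move i back by one if it points at the trail half of a pair."""
    if 0 < i < len(units) and is_trail(units[i]) and is_lead(units[i - 1]):
        return i - 1
    return i


def char_limit(units, i):
    """Move i forward by one if it splits a surrogate pair."""
    if 0 < i < len(units) and is_lead(units[i - 1]) and is_trail(units[i]):
        return i + 1
    return i


units = [0x61, 0xD834, 0xDD1E, 0x62]  # 'a', a surrogate pair, 'b'
```

For this array, offset 2 lies inside the pair: char_start moves it to 1 and char_limit moves it to 3, while offsets already on a boundary are returned unchanged.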
|
Pin the given index to a valid position. Definition at line 1371 of file UString.d. Referenced by UText::indexOf(), and UText::lastIndexOf(). |
|
Pin the given index and length to a valid position. Definition at line 1383 of file UString.d. Referenced by UText::codePoints(), UText::extract(), UText::hasSurrogates(), opCat(), remove(), and setTo(). |
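Pinning clamps caller-supplied values to the current content instead of failing. A minimal Python sketch follows, assuming unsigned inputs and a known content length; the function names and the UINT_MAX constant are illustrative and the real private methods modify their arguments in place.

```python
UINT_MAX = 0xFFFFFFFF  # stands in for the uint.max defaults in the signatures


def pin_index(x, content_len):
    """Clamp an index to at most the content length."""
    return min(x, content_len)


def pin_indices(start, length, content_len):
    """Clamp start, then clamp length to what remains after start."""
    start = min(start, content_len)
    length = min(length, content_len - start)
    return start, length
```

With a content length of 5, a request for (start=2, length=UINT_MAX) becomes (2, 3), which is how a default length of uint.max comes to mean "to the end of the string".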
|
|
Definition at line 660 of file UString.d. Referenced by UText::compareFolded(), UTransform::execute(), opCat(), UDateFormat::parse(), UNumberFormat::parseDouble(), UNumberFormat::parseInteger(), UNumberFormat::parseLong(), and setTo(). |
|
Initial value: [ {cast(void**) &u_strFindFirst, "u_strFindFirst"} |