
UString Class Reference

Inherits UText.

Public Types

typedef opCat append
typedef opIndexAssign setCharAt
typedef uint(* Formatter )(wchar *dst, uint len, inout Error e)

Public Member Functions

 this (uint space=0)
 this (wchar[] content, bool mutable=true)
 this (UText other, bool mutable=false)
 this (UString other, bool mutable=true)
UString opCat (UText other)
UString opCat (UText other, uint start, uint len=uint.max)
UString opCat (wchar chr)
UString opCat (wchar[] chars)
UString opCat (char[] chars)
UString setTo (wchar chr, uint start=0, uint len=uint.max)
UString setTo (wchar[] chars, bool mutable=true)
UString setTo (UText other, bool mutable=true)
UString setTo (UText other, uint start, uint len, bool mutable=true)
UString opIndexAssign (wchar chr, uint index)
UString remove (uint start, uint length=uint.max)
UString truncate (uint length=0)
UString padLeading (uint count, wchar padChar=0x0020)
UString padTrailing (uint length, wchar padChar=0x0020)
package void expand (uint count)
package UString format (Formatter format, char[] msg)

Private Types

typedef opIndex charAt
enum  CaseOption { Default = 0, SpecialI = 1 }

Private Member Functions

final void realloc (uint count=0)
final UString opCat (wchar *chars, uint count)
 this (wchar[] content)
package wchar[] get ()
override int opEquals (Object o)
override int opCmp (Object o)
override uint toHash ()
UString copy ()
UString extract (uint start, uint len=uint.max)
uint codePoints (uint start=0, uint length=uint.max)
bool hasSurrogates (uint start=0, uint length=uint.max)
wchar opIndex (uint index)
uint length ()
int compare (UText other, bool codePointOrder=false)
int compare (wchar[] other, bool codePointOrder=false)
int compareFolded (UText other, CaseOption option=CaseOption.Default)
int compareFolded (wchar[] other, CaseOption option=CaseOption.Default)
private int compareFolded (wchar[] s1, wchar[] s2, CaseOption option=CaseOption.Default)
bool startsWith (UText other)
bool startsWith (wchar[] chars)
bool endsWith (UText other)
bool endsWith (wchar[] chars)
uint indexOf (wchar c, uint start=0)
uint indexOf (UText other, uint start=0)
uint indexOf (wchar[] chars, uint start=0)
uint lastIndexOf (wchar c, uint start=uint.max)
uint lastIndexOf (UText other, uint start=uint.max)
uint lastIndexOf (wchar[] chars, uint start=uint.max)
UString toLower (UString dst)
UString toLower (UString dst, inout ULocale locale)
UString toUpper (UString dst)
UString toUpper (UString dst, inout ULocale locale)
UString toFolded (UString dst, CaseOption option=CaseOption.Default)
char[] toUtf8 (char[] dst=null)
UText trim ()
UString unEscape ()
uint getCharStart (uint i)
uint getCharLimit (uint i)
private void pinIndex (inout uint x)
private void pinIndices (inout uint start, inout uint length)

Static Private Member Functions

 this ()
bool isSurrogate (wchar c)
bool isLeading (wchar c)
bool isTrailing (wchar c)

Private Attributes

package uint len
package wchar[] content

Static Private Attributes

FunctionLoader Bind[] targets

Detailed Description

UString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String class.

In ICU, a Unicode string consists of 16-bit Unicode code units. A Unicode character may be stored with either one code unit — which is the most common case — or with a matched pair of special code units ("surrogates"). The data type for code units is UChar.

For single-character handling, a Unicode character code point is a value in the range 0..0x10ffff. ICU uses the UChar32 type for code points.

Indexes and offsets into strings, and lengths of strings, always count code units, not code points. This is the same as with multi-byte char* strings in traditional string handling. Operations on partial strings typically do not test for code point boundaries. If necessary, the user needs to take care of such boundaries by testing the code unit values or by using functions like getCharStart() and getCharLimit().
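As a rough illustration of the distinction between code units and code points, the following sketch tests code unit values for surrogates and combines a pair into the code point it encodes. It is written in present-day D rather than the 2005 dialect this reference documents, and the helper names isLead, isTrail and combine are hypothetical; they are not part of UString.

```d
// Hypothetical helpers for illustration; not part of UString.
bool isLead(wchar c)  { return c >= 0xD800 && c <= 0xDBFF; }
bool isTrail(wchar c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Combine a lead/trail surrogate pair into the code point it encodes.
dchar combine(wchar lead, wchar trail)
{
    return cast(dchar)(0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00));
}

void main()
{
    // U+0041 followed by U+1D11E, which is stored as the pair D834 DD1E.
    wstring text = "A\U0001D11E"w;

    assert(text.length == 3);                     // lengths count code units
    assert(!isLead(text[0]) && !isTrail(text[0]));
    assert(isLead(text[1]) && isTrail(text[2]));  // index 2 is mid-character
    assert(combine(text[1], text[2]) == 0x1D11E);
}
```

Slicing such a string at index 2 would split the pair, which is why partial-string operations may need a boundary check first.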

UString methods are more lenient with regard to input parameter values than other ICU APIs. In particular:

Definition at line 157 of file UString.d.


Member Typedef Documentation

typedef opCat append
 

Definition at line 159 of file UString.d.

Referenced by UText::unEscape().

typedef opIndexAssign setCharAt
 

Definition at line 160 of file UString.d.

typedef uint(* Formatter)(wchar* dst, uint len, inout Error e)
 

Internal method to support formatting into this UString. This is used by many of the ICU wrappers to append content into a UString.

Definition at line 619 of file UString.d.

Referenced by format().

typedef opIndex charAt [inherited]
 

Definition at line 653 of file UString.d.

Referenced by UText::trim(), and UText::unEscape().


Member Enumeration Documentation

enum CaseOption [inherited]
 

Enumeration values:
Default 
SpecialI 

Definition at line 668 of file UString.d.


Member Function Documentation

this (uint space = 0) [inline]
 

Create an empty UString with the specified available space

Definition at line 168 of file UString.d.

this (wchar[] content, bool mutable = true) [inline]
 

Create a UString upon the provided content. If said content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified.

Definition at line 183 of file UString.d.

References setTo().
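The effect of the 'mutable' flag can be pictured with a small copy-on-write sketch. This is only an assumption about the general technique, written in present-day D; the LazyText type and its members are hypothetical and do not reflect UString's actual fields or allocation strategy.

```d
// Hypothetical type for illustration; UString's real implementation differs.
struct LazyText
{
    private wchar[] data;     // current content (may alias the caller's array)
    private bool    aliased;  // true while data still aliases the caller's array

    // mutable = false: alias the content; mutable = true: copy it right away.
    this(wchar[] content, bool mutable = true)
    {
        if (mutable) { data = content.dup; aliased = false; }
        else         { data = content;     aliased = true;  }
    }

    // Copy the content on the first modification, then write.
    void setCharAt(size_t index, wchar c)
    {
        if (aliased) { data = data.dup; aliased = false; }
        data[index] = c;
    }

    const(wchar)[] get() const { return data; }
}

void main()
{
    wchar[] source = "abc"w.dup;

    auto text = LazyText(source, false);
    assert(text.get().ptr is source.ptr);    // no heap copy yet

    text.setCharAt(0, 'X');
    assert(text.get().ptr !is source.ptr);   // copied on first write
    assert(source == "abc"w);                // caller's array is untouched
    assert(text.get() == "Xbc"w);
}
```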

this (UText other, bool mutable = false) [inline]
 

Create a UString via the content of a UText. Note that the default is to assume the content is immutable (read-only).

Definition at line 195 of file UString.d.

this (UString other, bool mutable = true) [inline]
 

Create a UString via the content of a UString. If said content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified via UString methods.

Definition at line 210 of file UString.d.

UString opCat (UText other) [inline]
 

Append text to this UString

Definition at line 357 of file UString.d.

References UText::get().

Referenced by opCat().

UString opCat (UText other, uint start, uint len = uint.max) [inline]
 

Append partial text to this UString

Definition at line 368 of file UString.d.

References UText::content, opCat(), and UText::pinIndices().

UString opCat (wchar chr) [inline]
 

Append a single character to this UString

Definition at line 380 of file UString.d.

References opCat().

UString opCat (wchar[] chars) [inline]
 

Append text to this UString

Definition at line 391 of file UString.d.

References opCat().

UString opCat (char[] chars) [inline]
 

Converts a sequence of UTF-8 bytes to UChars (UTF-16)

Definition at line 402 of file UString.d.

References expand(), and format().

UString setTo (wchar chr, uint start = 0, uint len = uint.max) [inline]
 

Set a section of this UString to the specified character

Definition at line 422 of file UString.d.

References UText::pinIndices(), and realloc().

Referenced by padLeading(), padTrailing(), setTo(), and this().

UString setTo (wchar[] chars, bool mutable = true) [inline]
 

Set the content to the provided array. Parameter 'mutable' specifies whether the given array is likely to change. If not, the array is aliased until such time this UString is altered.

Definition at line 440 of file UString.d.

References UText::length().

UString setTo (UText other, bool mutable = true) [inline]
 

Replace the content of this UString. If the new content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified via one of these methods.

Definition at line 460 of file UString.d.

References UText::get(), and setTo().

UString setTo (UText other, uint start, uint len, bool mutable = true) [inline]
 

Replace the content of this UString. If the new content is immutable (read-only) then you might consider setting the 'mutable' parameter to false. Doing so will avoid allocating heap-space for the content until it is modified via one of these methods.

Definition at line 475 of file UString.d.

References UText::content, UText::pinIndices(), and setTo().

UString opIndexAssign (wchar chr, uint index) [inline]
 

Replace the character at the specified location.

Definition at line 487 of file UString.d.

UString remove (uint start, uint length = uint.max) [inline]
 

Remove a piece of this UString.

Definition at line 503 of file UString.d.

References memmove(), UText::pinIndices(), realloc(), and truncate().

UString truncate (uint length = 0) [inline]
 

Truncate the length of this UString.

Definition at line 527 of file UString.d.

Referenced by remove(), and UText::unEscape().

UString padLeading (uint count, wchar padChar = 0x0020) [inline]
 

Insert leading spaces in this UString

Definition at line 540 of file UString.d.

References expand(), memmove(), and setTo().

UString padTrailing (uint length, wchar padChar = 0x0020) [inline]
 

Append some trailing spaces to this UString.

Definition at line 554 of file UString.d.

References expand(), and setTo().

package void expand (uint count) [inline]
 

Check for available space within the buffer, and expand as necessary.

Definition at line 568 of file UString.d.

Referenced by format(), opCat(), padLeading(), and padTrailing().

final void realloc (uint count = 0) [inline, private]
 

Allocate memory due to a change in the content. We handle the distinction between mutable and immutable here.

Definition at line 581 of file UString.d.

Referenced by remove(), and setTo().

final UString opCat (wchar* chars, uint count) [inline, private]
 

Internal method to support UString appending

Definition at line 603 of file UString.d.

References expand().

package UString format (Formatter format, char[] msg) [inline]
 

Definition at line 621 of file UString.d.

References expand(), Formatter, and ICU::isError().

Referenced by UNormalize::concatenate(), UNumberFormat::format(), UMessageFormat::format(), UDateFormat::format(), UTimeZone::getDefault(), UCollator::getDisplayName(), USearch::getMatchedText(), UCommonFormat::getPattern(), UMessageFormat::getPattern(), UDateFormat::getPattern(), UCollator::getRules(), UCalendar::getTimeZoneName(), UDomainName::IdnToAscii(), UDomainName::IdnToUnicode(), opCat(), UStringPrep::prepare(), UDomainName::toAscii(), USet::toPattern(), and UDomainName::toUnicode().

this (wchar[] content) [inline, inherited]
 

Construct read-only wrapper around the given content

Definition at line 690 of file UString.d.

this () [inline, static, inherited]
 

Definition at line 1463 of file UString.d.

package wchar[] get () [inline, inherited]
 

Return the valid content from this UText

Definition at line 716 of file UString.d.

Referenced by UMessageFormat::Args::add(), USet::addString(), USet::applyPattern(), UNormalize::check(), UText::compare(), UNormalize::compare(), UDomainName::compare(), UNormalize::concatenate(), USet::containsString(), UText::endsWith(), UCollator::equal(), UTransform::execute(), UTimeZone::getDefault(), UCollator::getDisplayName(), UCollator::getRules(), UCollator::getSortKey(), UCollator::greater(), UCollator::greaterOrEqual(), UDomainName::IdnToAscii(), UDomainName::IdnToUnicode(), UText::indexOf(), UNormalize::isNormalized(), UText::lastIndexOf(), UNormalize::normalize(), opCat(), UStringPrep::prepare(), USet::removeString(), UTransform::setFilter(), USearch::setPattern(), UDecimalFormat::setPattern(), UMessageFormat::setPattern(), UDateFormat::setPattern(), USearch::setText(), UBreakIterator::setText(), setTo(), UCollator::setVariableTop(), UText::startsWith(), UCollator::strcoll(), UDomainName::toAscii(), and UDomainName::toUnicode().

override int opEquals (Object o) [inline, inherited]
 

Is this UText equal to another?

Definition at line 727 of file UString.d.

References UText::compare().

override int opCmp (Object o) [inline, inherited]
 

Compare this UText to another.

Definition at line 742 of file UString.d.

References UText::compare().

override uint toHash () [inline, inherited]
 

Hash this UText

Definition at line 760 of file UString.d.

UString copy () [inline, inherited]
 

Clone this UText into a UString

Definition at line 771 of file UString.d.

References UString.

UString extract (uint start, uint len = uint.max) [inline, inherited]
 

Clone a section of this UText into a UString

Definition at line 782 of file UString.d.

References UText::pinIndices(), and UString.

uint codePoints (uint start = 0, uint length = uint.max) [inline, inherited]
 

Count the Unicode code points within the given length of UChar code units of the string, beginning at start. A code point may occupy either one or two UChar code units. Counting code points involves reading all code units.

Definition at line 797 of file UString.d.

References UText::pinIndices().

Referenced by UText::hasSurrogates().

bool hasSurrogates (uint start = 0, uint length = uint.max) [inline, inherited]
 

Return an indication whether or not there are surrogate pairs within the string.

Definition at line 810 of file UString.d.

References UText::codePoints(), and UText::pinIndices().
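Since hasSurrogates() references codePoints(), one plausible reading is that pairs are detected by comparing the code point count with the code unit count. The sketch below, in present-day D, shows that idea on a plain array; the function names countCodePoints and hasPairs are hypothetical, and the real methods may differ in detail.

```d
// Hypothetical helpers for illustration; not part of UString.
bool isLead(wchar c)  { return c >= 0xD800 && c <= 0xDBFF; }
bool isTrail(wchar c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Count code points; a well-formed lead/trail pair counts once.
size_t countCodePoints(const(wchar)[] units)
{
    size_t count = 0;
    size_t i = 0;
    while (i < units.length)
    {
        if (isLead(units[i]) && i + 1 < units.length && isTrail(units[i + 1]))
            i += 2;   // one code point stored in two code units
        else
            i += 1;   // one code point stored in one code unit
        ++count;
    }
    return count;
}

// Pairs are present exactly when there are fewer code points than code units.
bool hasPairs(const(wchar)[] units)
{
    return countCodePoints(units) != units.length;
}

void main()
{
    assert(countCodePoints("abc"w) == 3);
    assert(countCodePoints("a\U00010000b"w) == 3);   // 4 code units, 3 code points
    assert(!hasPairs("abc"w));
    assert(hasPairs("a\U00010000b"w));
}
```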

wchar opIndex (uint index) [inline, inherited]
 

Return the character at the specified position.

Definition at line 822 of file UString.d.

uint length () [inline, inherited]
 

Return the length of the valid content

Definition at line 835 of file UString.d.

Referenced by UTransform::setFilter(), USearch::setPattern(), UDecimalFormat::setPattern(), UDateFormat::setPattern(), USearch::setText(), UBreakIterator::setText(), and setTo().

int compare (UText other, bool codePointOrder = false) [inline, inherited]
 

The comparison can be done in code unit order or in code point order. They differ only in UTF-16 when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff).

In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff.

Definition at line 853 of file UString.d.

References UText::get().

Referenced by UText::opCmp(), and UText::opEquals().
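The difference between the two orderings can be seen in a small sketch. It uses a remapping trick in which surrogate code units are moved above U+e000..U+ffff before comparing, which gives code point order for well-formed UTF-16. This is an illustration in present-day D, not the wrapper's implementation (which defers to ICU); fixup and compareUnits are hypothetical names.

```d
// Remap a code unit so that surrogates sort above U+E000..U+FFFF.
uint fixup(wchar c)
{
    if (c >= 0xE000) return c - 0x800;    // end of the BMP moves down
    if (c >= 0xD800) return c + 0x2000;   // surrogates move to the top
    return c;
}

// Compare two UTF-16 arrays; returns a negative, zero or positive value.
int compareUnits(const(wchar)[] a, const(wchar)[] b, bool codePointOrder = false)
{
    size_t n = a.length < b.length ? a.length : b.length;
    foreach (i; 0 .. n)
    {
        if (a[i] != b[i])
        {
            uint x = codePointOrder ? fixup(a[i]) : a[i];
            uint y = codePointOrder ? fixup(b[i]) : b[i];
            return x < y ? -1 : 1;
        }
    }
    // All shared units are equal, so the shorter string sorts first.
    return a.length < b.length ? -1 : (a.length > b.length ? 1 : 0);
}

void main()
{
    wstring bmp  = "\uFF5E"w;       // one code unit: FF5E
    wstring supp = "\U00010000"w;   // two code units: D800 DC00

    assert(compareUnits(bmp, supp) > 0);         // code unit order: FF5E > D800
    assert(compareUnits(bmp, supp, true) < 0);   // code point order: U+FF5E < U+10000
    assert(compareUnits("abc"w, "abd"w) < 0);
    assert(compareUnits("abc"w, "abc"w, true) == 0);
}
```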

int compare (wchar[] other, bool codePointOrder = false) [inline, inherited]
 

The comparison can be done in code unit order or in code point order. They differ only in UTF-16 when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff).

In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff.

Definition at line 871 of file UString.d.

int compareFolded (UText other, CaseOption option = CaseOption.Default) [inline, inherited]
 

The comparison can be done in UTF-16 code unit order or in code point order. They differ only when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff).

In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff.

Definition at line 889 of file UString.d.

References UText::content.

Referenced by UText::compareFolded(), UText::endsWith(), and UText::startsWith().

int compareFolded (wchar[] other, CaseOption option = CaseOption.Default) [inline, inherited]
 

The comparison can be done in UTF-16 code unit order or in code point order. They differ only when comparing supplementary code points (U+10000..U+10ffff) to BMP code points near the end of the BMP (i.e., U+e000..U+ffff).

In code unit order, high BMP code points sort after supplementary code points because supplementary code points are stored as pairs of surrogates, which lie at U+d800..U+dfff.

Definition at line 907 of file UString.d.

References UText::compareFolded().

private int compareFolded (wchar[] s1, wchar[] s2, CaseOption option = CaseOption.Default) [inline, inherited]
 

Helper for comparison methods

Definition at line 1397 of file UString.d.

bool startsWith (UText other) [inline, inherited]
 

Does this UText start with specified string?

Definition at line 918 of file UString.d.

References UText::get().

bool startsWith (wchar[] chars) [inline, inherited]
 

Does this UText start with specified string?

Definition at line 930 of file UString.d.

References UText::compareFolded().

bool endsWith (UText other) [inline, inherited]
 

Does this UText end with specified string?

Definition at line 943 of file UString.d.

References UText::get().

bool endsWith (wchar[] chars) [inline, inherited]
 

Does this UText end with specified string?

Definition at line 954 of file UString.d.

References UText::compareFolded().

uint indexOf (wchar c, uint start = 0) [inline, inherited]
 

Find the first occurrence of a BMP code point in a string. A surrogate code point is found only if its match in the text is not part of a surrogate pair.

Definition at line 969 of file UString.d.

References UText::pinIndex().

Referenced by UText::indexOf().
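The rule that a surrogate is matched only when it stands alone can be sketched as follows. This is an illustration in present-day D on a plain array; indexOfUnit is a hypothetical name, and the notFound value is an assumption, since this reference does not state what indexOf() returns when nothing matches.

```d
// Hypothetical helpers for illustration; not part of UString.
bool isLead(wchar c)  { return c >= 0xD800 && c <= 0xDBFF; }
bool isTrail(wchar c) { return c >= 0xDC00 && c <= 0xDFFF; }

enum size_t notFound = size_t.max;   // assumed "no match" value

// Find the first occurrence of c at or after start, skipping halves of pairs.
size_t indexOfUnit(const(wchar)[] text, wchar c, size_t start = 0)
{
    for (size_t i = start; i < text.length; ++i)
    {
        if (text[i] != c) continue;
        // A lead surrogate followed by a trail surrogate is half of a pair.
        if (isLead(c) && i + 1 < text.length && isTrail(text[i + 1])) continue;
        // A trail surrogate preceded by a lead surrogate is half of a pair.
        if (isTrail(c) && i > 0 && isLead(text[i - 1])) continue;
        return i;
    }
    return notFound;
}

void main()
{
    wchar[] text = "a\U0001D11Eb"w.dup;   // a, D834, DD1E, b
    text ~= cast(wchar) 0xD834;           // append an unpaired lead surrogate

    assert(indexOfUnit(text, 'b') == 3);
    assert(indexOfUnit(text, cast(wchar) 0xD834) == 4);          // paired one at index 1 is skipped
    assert(indexOfUnit(text, cast(wchar) 0xDD1E) == notFound);   // only occurs inside a pair
    assert(indexOfUnit(text, 'a', 1) == notFound);
}
```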

uint indexOf (UText other, uint start = 0) [inline, inherited]
 

Find the first occurrence of a substring in a string.

The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs.

Definition at line 990 of file UString.d.

References UText::get(), and UText::indexOf().

uint indexOf (wchar[] chars, uint start = 0) [inline, inherited]
 

Find the first occurrence of a substring in a string.

The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs.

Definition at line 1007 of file UString.d.

References UText::pinIndex().

uint lastIndexOf (wchar c, uint start = uint.max) [inline, inherited]
 

Find the last occurrence of a BMP code point in a string. A surrogate code point is found only if its match in the text is not part of a surrogate pair.

Definition at line 1024 of file UString.d.

References UText::pinIndex().

Referenced by UText::lastIndexOf().

uint lastIndexOf (UText other, uint start = uint.max) [inline, inherited]
 

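Find the last occurrence of a substring in a string.

The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs.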

Definition at line 1041 of file UString.d.

References UText::get(), and UText::lastIndexOf().

uint lastIndexOf (wchar[] chars, uint start = uint.max) [inline, inherited]
 

Find the last occurrence of a substring in a string.

The substring is found at code point boundaries. That means that if the substring begins with a trail surrogate or ends with a lead surrogate, then it is found only if these surrogates stand alone in the text. Otherwise, the substring edge units would be matched against halves of surrogate pairs.

Definition at line 1058 of file UString.d.

References UText::pinIndex().

UString toLower (UString dst) [inline, inherited]
 

Lowercase the characters into a separate UString.

Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original.

Note that the return value refers to the provided destination UString.

Definition at line 1079 of file UString.d.

References UText::Default.

UString toLower (UString dst, inout ULocale locale) [inline, inherited]
 

Lowercase the characters into a separate UString.

Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original.

Note that the return value refers to the provided destination UString.

Definition at line 1096 of file UString.d.

References ICU::toString().

UString toUpper (UString dst) [inline, inherited]
 

Uppercase the characters into a separate UString.

Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original.

Note that the return value refers to the provided destination UString.

Definition at line 1119 of file UString.d.

References UText::Default.

UString toUpper (UString dst, inout ULocale locale) [inline, inherited]
 

Uppercase the characters into a separate UString.

Casing is locale-dependent and context-sensitive. The result may be longer or shorter than the original.

Note that the return value refers to the provided destination UString.

Definition at line 1136 of file UString.d.

References ICU::toString().

UString toFolded (UString dst, CaseOption option = CaseOption.Default) [inline, inherited]
 

Case-fold the characters into a separate UString.

Case-folding is locale-independent and not context-sensitive, but there is an option for whether to include or exclude mappings for dotted I and dotless i that are marked with 'I' in CaseFolding.txt. The result may be longer or shorter than the original.

Note that the return value refers to the provided destination UString.

Definition at line 1162 of file UString.d.

char[] toUtf8 (char[] dst = null) [inline, inherited]
 

Converts a sequence of wchar (UTF-16) to UTF-8 bytes. If the output array is not provided, an array of appropriate size will be allocated and returned. Where the output is provided, it must be large enough to hold potentially four bytes per character for surrogate-pairs or three bytes per character for BMP only. Consider using UConverter where streaming conversions are required.

Returns an array slice representing the valid UTF8 content.

Definition at line 1187 of file UString.d.

References ICU::testError().
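To make the buffer sizing advice for toUtf8() concrete, the sketch below computes the exact number of UTF-8 bytes needed for well-formed UTF-16 input and checks it against the standard library's converter. It is written in present-day D; utf8Length and the surrogate helpers are hypothetical names, and the wrapper itself may size its buffer differently.

```d
import std.utf : toUTF8;

// Hypothetical helpers for illustration; not part of UString.
bool isLead(wchar c)  { return c >= 0xD800 && c <= 0xDBFF; }
bool isTrail(wchar c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Exact number of UTF-8 bytes needed for well-formed UTF-16 input.
size_t utf8Length(const(wchar)[] units)
{
    size_t bytes = 0;
    size_t i = 0;
    while (i < units.length)
    {
        wchar c = units[i];
        if (c < 0x80)       { bytes += 1; i += 1; }   // ASCII
        else if (c < 0x800) { bytes += 2; i += 1; }
        else if (isLead(c) && i + 1 < units.length && isTrail(units[i + 1]))
                            { bytes += 4; i += 2; }   // surrogate pair
        else                { bytes += 3; i += 1; }   // rest of the BMP
    }
    return bytes;
}

void main()
{
    // U+0061 (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), U+1D11E (4 bytes)
    wstring s = "a\u00E9\u20AC\U0001D11E"w;

    assert(s.length == 5);                 // five code units
    assert(utf8Length(s) == 10);
    assert(toUTF8(s).length == 10);        // agrees with the standard converter
    assert(utf8Length(s) <= s.length * 3); // three bytes per code unit always suffices
}
```

A surrogate pair occupies two code units but only four bytes, so a destination of three bytes per code unit is a safe upper bound.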

UText trim () [inline, inherited]
 

Remove leading and trailing whitespace from this UText. Note that we slice the content to remove leading space.

Definition at line 1207 of file UString.d.

References UText::charAt.

UString unEscape () [inline, inherited]
 

Unescape a string of characters and write the resulting Unicode characters to the destination buffer. The following escape sequences are recognized:

\uhhhh       4 hex digits; h in [0-9A-Fa-f]
\Uhhhhhhhh   8 hex digits
\xhh         1-2 hex digits
\x{h...}     1-8 hex digits
\ooo         1-3 octal digits; o in [0-7]
\cX          control-X; X is masked with 0x1F

as well as the standard ANSI C escapes:

\a => U+0007, \b => U+0008, \t => U+0009, \n => U+000A, \v => U+000B, \f => U+000C, \r => U+000D, \e => U+001B, \" => U+0022, \' => U+0027, \? => U+003F, \\ => U+005C

Anything else following a backslash is generically escaped. For example, "[a\\-z]" returns "[a-z]".

If an escape sequence is ill-formed, this method returns an empty string. An example of an ill-formed sequence is "\\u" followed by fewer than 4 hex digits.

Definition at line 1256 of file UString.d.

References append, UText::charAt, truncate(), and UString.

bool isSurrogate (wchar c) [inline, static, inherited]
 

Is this code point a surrogate (U+d800..U+dfff)?

Definition at line 1285 of file UString.d.

bool isLeading (wchar c) [inline, static, inherited]
 

Is this code unit a lead surrogate (U+d800..U+dbff)?

Definition at line 1296 of file UString.d.

bool isTrailing (wchar c) [inline, static, inherited]
 

Is this code unit a trail surrogate (U+dc00..U+dfff)?

Definition at line 1307 of file UString.d.

uint getCharStart (uint i) [inline, inherited]
 

Adjust a random-access offset to a code point boundary at the start of a code point. If the offset points to the trail surrogate of a surrogate pair, then the offset is decremented. Otherwise, it is not modified.

Definition at line 1321 of file UString.d.

uint getCharLimit (uint i) [inline, inherited]
 

Adjust a random-access offset to a code point boundary after a code point. If the offset is behind the lead surrogate of a surrogate pair, then the offset is incremented. Otherwise, it is not modified.

Definition at line 1339 of file UString.d.
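The two boundary adjustments described for getCharStart() and getCharLimit() can be sketched on a plain array as shown below. This is an illustration in present-day D under the behaviour stated above; charStart and charLimit are hypothetical free functions, not the methods themselves.

```d
// Hypothetical helpers for illustration; not part of UString.
bool isLead(wchar c)  { return c >= 0xD800 && c <= 0xDBFF; }
bool isTrail(wchar c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Move i back to the lead surrogate if it points at the trail half of a pair.
size_t charStart(const(wchar)[] text, size_t i)
{
    if (i > 0 && i < text.length && isTrail(text[i]) && isLead(text[i - 1]))
        return i - 1;
    return i;
}

// Move i forward past the trail surrogate if it sits just behind the lead half.
size_t charLimit(const(wchar)[] text, size_t i)
{
    if (i > 0 && i < text.length && isLead(text[i - 1]) && isTrail(text[i]))
        return i + 1;
    return i;
}

void main()
{
    wstring text = "a\U0001D11Eb"w;   // a, D834, DD1E, b

    assert(charStart(text, 2) == 1);  // index 2 is mid-pair: step back
    assert(charStart(text, 1) == 1);  // already on a boundary
    assert(charLimit(text, 2) == 3);  // index 2 is mid-pair: step forward
    assert(charLimit(text, 3) == 3);  // already on a boundary

    // Slicing between the adjusted offsets yields the whole character.
    assert(text[charStart(text, 2) .. charLimit(text, 2)] == "\U0001D11E"w);
}
```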

private void pinIndex (inout uint x) [inline, inherited]
 

Pin the given index to a valid position.

Definition at line 1370 of file UString.d.

Referenced by UText::indexOf(), and UText::lastIndexOf().

private void pinIndices (inout uint start, inout uint length) [inline, inherited]
 

Pin the given index and length to a valid position.

Definition at line 1382 of file UString.d.

Referenced by UText::codePoints(), UText::extract(), UText::hasSurrogates(), opCat(), remove(), and setTo().
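Pinning can be pictured as clamping a start/length pair to the valid range, which is what lets callers pass uint.max to mean "up to the end". The sketch below is an assumption about that behaviour, written in present-day D, where the inout parameters of the documented signatures correspond to ref. Unlike the private methods, these hypothetical free functions take the content length as an explicit len argument.

```d
// Clamp an index to the range 0 .. len.
void pinIndex(ref size_t x, size_t len)
{
    if (x > len) x = len;
}

// Clamp start to 0 .. len, then clamp length to what remains after start.
void pinIndices(ref size_t start, ref size_t length, size_t len)
{
    if (start > len) start = len;
    if (length > len - start) length = len - start;
}

void main()
{
    // "From index 2 to the end" of a 5-unit string.
    size_t start = 2, length = size_t.max;
    pinIndices(start, length, 5);
    assert(start == 2 && length == 3);

    // A start beyond the end is pinned to the end, leaving an empty range.
    start = 9;
    length = 4;
    pinIndices(start, length, 5);
    assert(start == 5 && length == 0);

    size_t x = 7;
    pinIndex(x, 5);
    assert(x == 5);
}
```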


Member Data Documentation

package uint len [inherited]
 

Definition at line 658 of file UString.d.

Referenced by USet::addString(), USet::applyPattern(), UNormalize::check(), UNormalize::compare(), UDomainName::compare(), UNormalize::concatenate(), USet::containsString(), UCollator::equal(), UTransform::execute(), UCollator::getDisplayName(), UCollator::getRules(), UCollator::getSortKey(), UCollator::greater(), UCollator::greaterOrEqual(), UDomainName::IdnToAscii(), UDomainName::IdnToUnicode(), UNormalize::isNormalized(), UNormalize::normalize(), UDateFormat::parse(), UNumberFormat::parseDouble(), UNumberFormat::parseInteger(), UNumberFormat::parseLong(), UStringPrep::prepare(), USet::removeString(), UTransform::setFilter(), UMessageFormat::setPattern(), UCollator::setVariableTop(), UCollator::strcoll(), UDomainName::toAscii(), and UDomainName::toUnicode().

package wchar [] content [inherited]
 

Definition at line 659 of file UString.d.

Referenced by UText::compareFolded(), UTransform::execute(), opCat(), UDateFormat::parse(), UNumberFormat::parseDouble(), UNumberFormat::parseInteger(), UNumberFormat::parseLong(), and setTo().

FunctionLoader Bind [] targets [static, inherited]
 

Initial value:

 
                [
                {cast(void**) &u_strFindFirst,      "u_strFindFirst"}

Definition at line 1442 of file UString.d.


The documentation for this class was generated from the following file:
UString.d
Generated on Tue Jan 25 21:18:44 2005 for Mango by doxygen 1.3.6