
UBreakIterator Class Reference

Inheritance diagram for UBreakIterator (image not preserved). Classes shown: ICU, UCharacterIterator, ULineIterator, URuleIterator, USentenceIterator, UTitleIterator, UWordIterator.

Public Member Functions

 this (Type type, inout ULocale locale, UText text)
 ~this ()
void setText (UText text)
uint current ()
uint next (uint offset=uint.max)
uint previous (uint offset=uint.max)
uint first ()
uint last ()
bool isBoundary (uint offset)
void getStatus (inout uint s)

Static Public Member Functions

 this ()
 ~this ()

Public Attributes

package Handle handle
const uint Done = uint.max

Static Public Attributes

FunctionLoader Bind[] targets

Private Types

enum  Type {
  Character, Word, Line, Sentence,
  Title
}
typedef void * Handle
enum  Error { OK, BufferOverflow = 15 }

Private Member Functions

 this ()
uint getStatus ()

Static Private Member Functions

bool isError (Error e)
void testError (Error e, char[] msg)
char * toString (char[] string)
wchar * toString (wchar[] string)
uint length (char *s)
uint length (wchar *s)
char[] toArray (char *s)
wchar[] toArray (wchar *s)

Static Private Attributes

void * library

Detailed Description

UBreakIterator defines methods for finding the location of boundaries in text. A UBreakIterator maintains a current position and scans over the text, returning the index of each character at which a boundary occurs.
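The scanning protocol described above (start at first(), call next() until it returns Done) can be sketched as follows. This is a minimal, self-contained illustration in present-day D and does not use the Mango or ICU bindings: ToyBreaks and allBoundaries are hypothetical names, and the whitespace-transition rule is only a stand-in for real boundary analysis.

```d
import std.ascii : isWhite;

/// Hypothetical stand-in, NOT the Mango class: a boundary falls wherever
/// the text switches between whitespace and non-whitespace.
struct ToyBreaks
{
    enum uint Done = uint.max;   // same sentinel value as UBreakIterator.Done

    string text;
    uint pos;

    /// Move to the start of the text and return that position.
    uint first() { pos = 0; return pos; }

    /// Return the most recently returned boundary.
    uint current() { return pos; }

    /// Advance to the next boundary, or return Done at the end of the text.
    uint next()
    {
        if (pos >= text.length) return Done;
        uint i = pos + 1;
        while (i < text.length && isWhite(text[i]) == isWhite(text[i - 1]))
            ++i;
        pos = i;
        return pos;
    }
}

/// Collect every boundary from first() until next() reports Done.
uint[] allBoundaries(string text)
{
    auto it = ToyBreaks(text);
    uint[] result;
    for (uint b = it.first(); b != ToyBreaks.Done; b = it.next())
        result ~= b;
    return result;
}
```

With a real UBreakIterator the loop would presumably have the same shape; only the construction of the iterator differs.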

Line boundary analysis determines where a text string can be broken when line-wrapping. The mechanism correctly handles punctuation and hyphenated words.

Sentence boundary analysis allows selection with correct interpretation of periods within numbers and abbreviations, and trailing punctuation marks such as quotation marks and parentheses.

Word boundary analysis is used by search and replace functions, as well as within text editing applications that allow the user to select words with a double click. Word selection provides correct interpretation of punctuation marks within and following words. Characters that are not part of a word, such as symbols or punctuation marks, have word-breaks on both sides.

Character boundary analysis allows users to interact with characters as they expect to, for example, when moving the cursor through a text string. Character boundary analysis provides correct navigation through character strings, regardless of how each character is stored. For example, an accented character might be stored as a base character and a diacritical mark. What users consider to be a character can differ between languages.

Title boundary analysis locates all positions, typically starts of words, that should be set to Title Case when title casing the text.

See this page for full details.

Definition at line 303 of file UBreakIterator.d.


Member Typedef Documentation

typedef void* Handle [protected, inherited]
 

Use this for the primary argument-type to most ICU functions

Definition at line 114 of file ICU.d.


Member Enumeration Documentation

enum Type [private]
 

internal types passed to C API

Enumeration values:
Character 
Word 
Line 
Sentence 
Title 

Definition at line 316 of file UBreakIterator.d.

enum Error [protected, inherited]
 

ICU error codes (the ones which are referenced)

Enumeration values:
OK 
BufferOverflow 

Definition at line 148 of file ICU.d.


Constructor & Destructor Documentation

~this () [inline]
 

Close a UBreakIterator

Definition at line 358 of file UBreakIterator.d.

~this () [inline, static]
 

Definition at line 560 of file UBreakIterator.d.


Member Function Documentation

this () [inline, private]
 

Internal use only!

Definition at line 332 of file UBreakIterator.d.

this (Type type, inout ULocale locale, UText text) [inline]
 

Open a new UBreakIterator for locating text boundaries for a specified locale. A UBreakIterator may be used for detecting character, line, word, and sentence breaks in text.

Definition at line 344 of file UBreakIterator.d.

References ICU::testError(), and ICU::toString().

void setText (UText text) [inline]
 

Sets an existing iterator to point to a new piece of text

Definition at line 369 of file UBreakIterator.d.

References UText::get(), UText::length(), and ICU::testError().

uint current () [inline]
 

Determine the most recently-returned text boundary

Definition at line 382 of file UBreakIterator.d.

uint next (uint offset = uint.max) [inline]
 

Determine the text boundary following the current text boundary, or Done if all text boundaries have been returned.

If offset is specified, determines the text boundary following the specified offset. The value returned is always greater than offset, or Done.

Definition at line 399 of file UBreakIterator.d.

uint previous (uint offset = uint.max) [inline]
 

Determine the text boundary preceding the current text boundary, or Done if all text boundaries have been returned.

If offset is specified, determines the text boundary preceding the specified offset. The value returned is always smaller than offset, or Done.

Definition at line 417 of file UBreakIterator.d.
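The offset forms of next and previous can be read as "first boundary strictly after offset" and "last boundary strictly before offset". The sketch below illustrates that contract over a plain sorted array of boundary offsets. It is self-contained present-day D, not the Mango implementation; following and preceding are hypothetical helper names, and Done mirrors the class constant.

```d
enum uint Done = uint.max; // mirrors UBreakIterator.Done

/// First boundary strictly greater than offset, or Done if there is none.
/// Assumes boundaries is sorted in ascending order.
uint following(const uint[] boundaries, uint offset)
{
    foreach (b; boundaries)
        if (b > offset) return b;
    return Done;
}

/// Last boundary strictly smaller than offset, or Done if there is none.
/// Assumes boundaries is sorted in ascending order.
uint preceding(const uint[] boundaries, uint offset)
{
    foreach_reverse (b; boundaries)
        if (b < offset) return b;
    return Done;
}
```

Note that the default argument in the documented signatures, uint.max, has the same value as Done, so a caller that passes no offset cannot be confused with one that passes a real text position.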

uint first () [inline]
 

Determine the index of the first character in the text being scanned. This is not always the same as index 0 of the text.

Definition at line 432 of file UBreakIterator.d.

uint last () [inline]
 

Determine the index immediately beyond the last character in the text being scanned. This is not the same as the index of the last character.

Definition at line 445 of file UBreakIterator.d.

bool isBoundary (uint offset) [inline]
 

Returns true if the specified position is a boundary position. As a side effect, leaves the iterator pointing to the first boundary position at or after "offset".

Definition at line 458 of file UBreakIterator.d.
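The side effect of isBoundary is easy to overlook, so here is a small self-contained sketch of the documented behaviour in present-day D. BoundaryCursor is a hypothetical stand-in that works on a precomputed, ascending list of boundaries; it is not the Mango class, and the clamping used when no boundary lies at or after offset is an assumption of this sketch rather than something the documentation states.

```d
/// Hypothetical stand-in for illustrating isBoundary's side effect.
struct BoundaryCursor
{
    const(uint)[] boundaries; // ascending boundary offsets, assumed non-empty
    uint pos;                 // current iterator position

    bool isBoundary(uint offset)
    {
        foreach (b; boundaries)
        {
            if (b >= offset)
            {
                pos = b;             // side effect: first boundary at or after offset
                return b == offset;  // true only when offset itself is a boundary
            }
        }
        pos = boundaries[$ - 1];     // assumption of this sketch: clamp to the end
        return false;
    }
}
```

A caller that needs to keep its place should therefore save the value of current() before calling isBoundary and restore the position afterwards.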

void getStatus (inout uint s) [inline]
 

Return the status from the break rule that determined the most recently returned break position.

Definition at line 470 of file UBreakIterator.d.

References getStatus().

uint getStatus () [inline, private]
 

Return the status from the break rule that determined the most recently returned break position.

The values appear in the rule source within brackets, {123}, for example. For rules that do not specify a status, a default value of 0 is returned.

For word break iterators, the possible values are defined in enum UWordBreak.

Definition at line 489 of file UBreakIterator.d.

Referenced by getStatus().

this () [inline, static]
 

Definition at line 551 of file UBreakIterator.d.

bool isError (Error e) [inline, static, protected, inherited]
 

Definition at line 158 of file ICU.d.

Referenced by UConverter::detectSignature(), UString::format(), UCollator::getLocale(), and UConverter::this().

void testError (Error e, char[] msg) [inline, static, protected, inherited]
 

Definition at line 176 of file ICU.d.

Referenced by UCalendar::add(), USet::applyPattern(), UChar::charFromName(), UNormalize::check(), URegex::clone(), UNormalize::compare(), UDomainName::compare(), UConverter::UTranscoder::convert(), UEnumeration::count(), UConverter::decode(), UConverter::encode(), URegex::end(), UTransform::execute(), USearch::first(), UResourceBundle::get(), UCalendar::get(), UCollator::getAttribute(), UResourceBundle::getBinary(), UCollator::getBound(), UChar::getCharName(), UChar::getComment(), UCollator::getContractions(), URegex::getFlags(), UResourceBundle::getInt(), UResourceBundle::getIntVector(), UCalendar::getLimit(), UResourceBundle::getLocale(), UCalendar::getMillis(), UConverter::getName(), UResourceBundle::getNextString(), URegex::getPattern(), UCollator::getShortDefinitionString(), UResourceBundle::getString(), UCollator::getTailoredSet(), UDateFormat::getTwoDigitYearStart(), UCollator::getVariableTop(), URegex::groupCount(), UCalendar::inDaylightTime(), UNormalize::isNormalized(), USearch::last(), URegex::match(), USearch::next(), URegex::next(), UEnumeration::next(), UCollator::normalizeShortDefinitionString(), UDateFormat::parse(), USearch::previous(), URegex::probe(), URegex::replaceAll(), URegex::replaceFirst(), URegex::reset(), UEnumeration::reset(), UCalendar::roll(), UCollator::setAttribute(), USearch::setCollator(), UCalendar::setDate(), UCalendar::setDateTime(), UTransform::setFilter(), USearch::setIterator(), UCalendar::setMillis(), USearch::setOffset(), USearch::setPattern(), UDecimalFormat::setPattern(), UMessageFormat::setPattern(), USearch::setText(), URegex::setText(), setText(), UCalendar::setTimeZone(), UDateFormat::setTwoDigitYearStart(), UCollator::setVariableTop(), URegex::split(), URegex::start(), UTransform::this(), UStringPrep::this(), USet::this(), USearch::this(), UResourceBundle::this(), URegex::this(), UNumberFormat::this(), UMessageFormat::this(), UDateFormat::this(), UCollator::this(), UCalendar::this(), this(), URuleIterator::this(), 
and UText::toUtf8().

char* toString (char[] string) [inline, static, protected, inherited]
 

Definition at line 186 of file ICU.d.

References string.

Referenced by UChar::charFromName(), UConverter::compareNames(), UCollator::getDisplayName(), UResourceBundle::getResource(), UCollator::getShortDefinitionString(), UResourceBundle::getString(), UCalendar::getTimeZoneName(), UCollator::normalizeShortDefinitionString(), UMessageFormat::setLocale(), UStringPrep::this(), UResourceBundle::this(), UDateFormat::this(), UCollator::this(), this(), UText::toLower(), and UText::toUpper().

wchar* toString (wchar[] string) [inline, static, protected, inherited]
 

Definition at line 208 of file ICU.d.

References string.

uint length (char* s) [inline, static, protected, inherited]
 

Definition at line 230 of file ICU.d.

References strlen().

uint length (wchar* s) [inline, static, protected, inherited]
 

Definition at line 239 of file ICU.d.

References wcslen().

char[] toArray (char* s) [inline, static, protected, inherited]
 

Definition at line 248 of file ICU.d.

References strlen().

Referenced by UConverter::detectSignature(), UResourceBundle::getKey(), UResourceBundle::getLocale(), UMessageFormat::getLocale(), UCollator::getLocale(), UConverter::getName(), UChar::getPropertyName(), UChar::getPropertyValueName(), and UConverter::opApply().

wchar[] toArray (wchar* s) [inline, static, protected, inherited]
 

Definition at line 259 of file ICU.d.

References wcslen().


Member Data Documentation

package Handle handle
 

Definition at line 305 of file UBreakIterator.d.

Referenced by USearch::setIterator().

const uint Done = uint.max
 

Definition at line 308 of file UBreakIterator.d.

void* library [static, private]
 

Bind the ICU functions from a shared library. This is complicated by the issues surrounding D and DLLs on the Windows platform.

Definition at line 503 of file UBreakIterator.d.

FunctionLoader Bind [] targets [static]
 

Initial value:

 
                [
                {cast(void**) &ubrk_open,               "ubrk_open"}

Definition at line 530 of file UBreakIterator.d.


The documentation for this class was generated from the following file:
Generated on Sat Apr 9 20:11:44 2005 for Mango by doxygen 1.3.6