
UBreakIterator Class Reference

Inheritance diagram for UBreakIterator:

[Inheritance diagram; classes shown: ICU, UCharacterIterator, ULineIterator, URuleIterator, USentenceIterator, UTitleIterator, UWordIterator]

List of all members.

Public Member Functions

 this (Type type, inout ULocale locale, UText text)
 ~this ()
void setText (UText text)
uint current ()
uint next (uint offset=uint.max)
uint previous (uint offset=uint.max)
uint first ()
uint last ()
bool isBoundary (uint offset)
void getStatus (inout uint s)

Static Public Member Functions

static this ()
static ~this ()

Public Attributes

package Handle handle
const uint Done = uint.max

Static Public Attributes

static FunctionLoader Bind[] targets

Private Types

enum  Type {
  Character, Word, Line, Sentence,
  Title
}
typedef void * Handle
enum  Error { OK, BufferOverflow = 15 }

Private Member Functions

 this ()
uint getStatus ()

Static Private Member Functions

static bool isError (Error e)
static void testError (Error e, char[] msg)
static char * toString (char[] string)
static wchar * toString (wchar[] string)
static uint length (char *s)
static uint length (wchar *s)
static char[] toArray (char *s)
static wchar[] toArray (wchar *s)

Static Private Attributes

static void * library

Detailed Description

UBreakIterator defines methods for finding the location of boundaries in text. A UBreakIterator maintains a current position and scans over the text, returning the index of each character at which a boundary occurs.

Line boundary analysis determines where a text string can be broken when line-wrapping. The mechanism correctly handles punctuation and hyphenated words.

Sentence boundary analysis allows selection with correct interpretation of periods within numbers and abbreviations, and trailing punctuation marks such as quotation marks and parentheses.

Word boundary analysis is used by search and replace functions, as well as within text editing applications that allow the user to select words with a double click. Word selection provides correct interpretation of punctuation marks within and following words. Characters that are not part of a word, such as symbols or punctuation marks, have word-breaks on both sides.
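As a rough illustration of the last point, the following Python sketch approximates word boundaries with a regular expression. It does not use this class or the ICU rule set; the function name word_boundaries and the tokenising pattern are assumptions made only for this example.

```python
import re

# Simplified tokeniser (an assumption, not ICU's word-break rules):
# a run of word characters, a run of whitespace, or one other character.
_TOKEN = re.compile(r"\w+|\s+|[^\w\s]")

def word_boundaries(text):
    """Return boundary offsets, starting at 0 and ending at len(text)."""
    bounds = [0]
    for match in _TOKEN.finditer(text):
        bounds.append(match.end())
    return bounds

print(word_boundaries("Hi, you!"))  # [0, 2, 3, 4, 7, 8]
```

The comma (offsets 2 to 3) and the exclamation mark (offsets 7 to 8) each end up with a boundary on both sides, which is the behaviour described above.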

Character boundary analysis allows users to interact with characters as they expect to, for example, when moving the cursor through a text string. Character boundary analysis provides correct navigation through character strings, regardless of how each character is stored. For example, an accented character might be stored as a base character and a diacritical mark. What users consider to be a character can differ between languages.
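The base-character-plus-diacritic case can be sketched in a few lines of Python. This is a deliberate simplification that only attaches combining marks to the preceding code point; real character boundary analysis covers many more cases. The function name character_boundaries is hypothetical.

```python
import unicodedata

def character_boundaries(text):
    """Offsets where a user-perceived character starts, plus len(text).

    Simplification: a code point with a non-zero canonical combining
    class is attached to the code point before it.
    """
    bounds = [0]
    for i in range(1, len(text)):
        if unicodedata.combining(text[i]) == 0:
            bounds.append(i)
    if text:
        bounds.append(len(text))
    return bounds

# "e" + COMBINING ACUTE ACCENT, then "t", then precomposed "é"
sample = "e\u0301t\u00e9"
print(character_boundaries(sample))  # [0, 2, 3, 4]
```

There is no boundary at offset 1, so a cursor stepping through the boundaries moves over the decomposed accented letter in one step, just as it does over the precomposed one.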

Title boundary analysis locates all positions, typically starts of words, that should be set to Title Case when title casing the text.

See this page for full details.

Definition at line 303 of file UBreakIterator.d.


Member Typedef Documentation

typedef void* Handle [protected, inherited]
 

Use this for the primary argument-type to most ICU functions

Definition at line 112 of file ICU.d.


Member Enumeration Documentation

enum Type [private]
 

internal types passed to C API

Enumeration values:
Character 
Word 
Line 
Sentence 
Title 

Definition at line 316 of file UBreakIterator.d.

enum Error [protected, inherited]
 

ICU error codes (the ones which are referenced)

Enumeration values:
OK 
BufferOverflow 

Definition at line 146 of file ICU.d.


Constructor & Destructor Documentation

~this ()  [inline]
 

Close a UBreakIterator

Definition at line 358 of file UBreakIterator.d.

static ~this ()  [inline, static]
 

Definition at line 560 of file UBreakIterator.d.


Member Function Documentation

this ()  [inline, private]
 

Internal use only!

Definition at line 332 of file UBreakIterator.d.

this (Type type, inout ULocale locale, UText text)  [inline]
 

Open a new UBreakIterator for locating text boundaries for a specified locale. A UBreakIterator may be used for detecting character, line, word, and sentence breaks in text.

Definition at line 344 of file UBreakIterator.d.

References ICU::testError(), and ICU::toString().

void setText (UText text)  [inline]
 

Sets an existing iterator to point to a new piece of text

Definition at line 369 of file UBreakIterator.d.

References UText::get(), UText::length(), and ICU::testError().

uint current ()  [inline]
 

Determine the most recently-returned text boundary

Definition at line 382 of file UBreakIterator.d.

uint next (uint offset=uint.max)  [inline]
 

Determine the text boundary following the current text boundary, or Done if all text boundaries have been returned.

If offset is specified, determines the text boundary following the specified offset. The value returned is always greater than offset, or Done.

Definition at line 399 of file UBreakIterator.d.

uint previous (uint offset=uint.max)  [inline]
 

Determine the text boundary preceding the current text boundary, or Done if all text boundaries have been returned.

If offset is specified, determines the text boundary preceding the specified offset. The value returned is always smaller than offset, or Done.
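The offset semantics of next() and previous() can be modelled over a precomputed, sorted list of boundaries. The Python sketch below is a toy model and not the Mango implementation: the class name BoundaryCursor is hypothetical, the list is assumed to be non-empty, and leaving the current position unchanged when Done is returned is an assumption, since this page does not specify it.

```python
from bisect import bisect_left, bisect_right

DONE = 2**32 - 1  # stands in for Done (uint.max)

class BoundaryCursor:
    """Toy model over a sorted, non-empty list of boundary offsets."""

    def __init__(self, boundaries):
        self.boundaries = list(boundaries)
        self.index = 0  # position of the current boundary in the list

    def current(self):
        return self.boundaries[self.index]

    def next(self, offset=DONE):
        if offset == DONE:
            i = self.index + 1                         # boundary after the current one
        else:
            i = bisect_right(self.boundaries, offset)  # first boundary > offset
        if i >= len(self.boundaries):
            return DONE                                # assumption: position unchanged
        self.index = i
        return self.boundaries[i]

    def previous(self, offset=DONE):
        if offset == DONE:
            i = self.index - 1                         # boundary before the current one
        else:
            i = bisect_left(self.boundaries, offset) - 1  # last boundary < offset
        if i < 0:
            return DONE                                # assumption: position unchanged
        self.index = i
        return self.boundaries[i]

cursor = BoundaryCursor([0, 2, 3, 4, 7, 8])
print(cursor.next())       # 2
print(cursor.next(3))      # 4
print(cursor.previous(3))  # 2
print(cursor.next(8) == DONE)  # True
```

Note that next(3) returns 4 and previous(3) returns 2 even though 3 is itself a boundary: the result is strictly greater or strictly smaller than the offset.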

Definition at line 417 of file UBreakIterator.d.

uint first ()  [inline]
 

Determine the index of the first character in the text being scanned. This is not always the same as index 0 of the text.

Definition at line 432 of file UBreakIterator.d.

uint last ()  [inline]
 

Determine the index immediately beyond the last character in the text being scanned. This is not the same as the index of the last character.

Definition at line 445 of file UBreakIterator.d.

bool isBoundary (uint offset)  [inline]
 

Returns true if the specified position is a boundary position. As a side effect, leaves the iterator pointing to the first boundary position at or after "offset".
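The side effect is easy to overlook, so here is a small Python model of it over a sorted list of boundaries. The function name is_boundary and the tuple return value are inventions for this sketch; the real method returns only the bool and updates the iterator. Returning None when no boundary lies at or after the offset is an assumption, as this page does not cover that case.

```python
from bisect import bisect_left

def is_boundary(boundaries, offset):
    """Return (result, new_current) for a sorted list of boundary offsets.

    new_current models the side effect: the first boundary at or after
    offset, or None if there is none (an assumption of this sketch).
    """
    i = bisect_left(boundaries, offset)
    if i == len(boundaries):
        return False, None
    return boundaries[i] == offset, boundaries[i]

bounds = [0, 2, 3, 4, 7, 8]
print(is_boundary(bounds, 4))  # (True, 4)
print(is_boundary(bounds, 5))  # (False, 7)
```

After a failed test at offset 5, the modelled position is 7, so a following call to current() would report 7 rather than the position held before the test.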

Definition at line 458 of file UBreakIterator.d.

void getStatus (inout uint s)  [inline]
 

Return the status from the break rule that determined the most recently returned break position.

Definition at line 470 of file UBreakIterator.d.

References getStatus().

uint getStatus ()  [inline, private]
 

Return the status from the break rule that determined the most recently returned break position.

The values appear in the rule source within brackets, {123}, for example. For rules that do not specify a status, a default value of 0 is returned.

For word break iterators, the possible values are defined in enum UWordBreak.

Definition at line 489 of file UBreakIterator.d.

Referenced by getStatus().

static this ()  [inline, static]
 

Definition at line 551 of file UBreakIterator.d.

static bool isError (Error e)  [inline, static, protected, inherited]
 

Definition at line 156 of file ICU.d.

Referenced by UCollator::getLocale().

static void testError (Error e, char[] msg)  [inline, static, protected, inherited]
 

Definition at line 174 of file ICU.d.

Referenced by UCalendar::add(), USet::applyPattern(), UChar::charFromName(), UNormalize::check(), URegex::clone(), UNormalize::compare(), UDomainName::compare(), UText::compareFolded(), UConverter::UTranscoder::convert(), UEnumeration::count(), UConverter::decode(), UConverter::encode(), URegex::end(), UTransform::execute(), USearch::first(), UResourceBundle::get(), UCalendar::get(), UCollator::getAttribute(), UResourceBundle::getBinary(), UCollator::getBound(), UChar::getCharName(), UChar::getComment(), UCollator::getContractions(), URegex::getFlags(), UResourceBundle::getInt(), UResourceBundle::getIntVector(), UCalendar::getLimit(), UResourceBundle::getLocale(), UCalendar::getMillis(), UConverter::getName(), UResourceBundle::getNextString(), URegex::getPattern(), UCollator::getShortDefinitionString(), UResourceBundle::getString(), UCollator::getTailoredSet(), UDateFormat::getTwoDigitYearStart(), UCollator::getVariableTop(), URegex::groupCount(), UCalendar::inDaylightTime(), UNormalize::isNormalized(), USearch::last(), URegex::match(), USearch::next(), URegex::next(), UEnumeration::next(), UCollator::normalizeShortDefinitionString(), UDateFormat::parse(), USearch::previous(), URegex::probe(), URegex::replaceAll(), URegex::replaceFirst(), URegex::reset(), UEnumeration::reset(), UCalendar::roll(), UCollator::setAttribute(), USearch::setCollator(), UCalendar::setDate(), UCalendar::setDateTime(), UTransform::setFilter(), USearch::setIterator(), UCalendar::setMillis(), USearch::setOffset(), USearch::setPattern(), UDecimalFormat::setPattern(), UMessageFormat::setPattern(), USearch::setText(), URegex::setText(), setText(), UCalendar::setTimeZone(), UDateFormat::setTwoDigitYearStart(), UCollator::setVariableTop(), URegex::split(), URegex::start(), UTransform::this(), UStringPrep::this(), USet::this(), USearch::this(), UResourceBundle::this(), URegex::this(), UNumberFormat::this(), UMessageFormat::this(), UDateFormat::this(), UCollator::this(), UCalendar::this(), this(), URuleIterator::this(), and UText::toUtf8().

static char* toString (char[] string)  [inline, static, protected, inherited]
 

Definition at line 184 of file ICU.d.

Referenced by UChar::charFromName(), UConverter::compareNames(), UCollator::getDisplayName(), UResourceBundle::getResource(), UCollator::getShortDefinitionString(), UResourceBundle::getString(), UCalendar::getTimeZoneName(), UCollator::normalizeShortDefinitionString(), UMessageFormat::setLocale(), UStringPrep::this(), UResourceBundle::this(), UDateFormat::this(), UCollator::this(), this(), UText::toLower(), and UText::toUpper().

static wchar* toString (wchar[] string)  [inline, static, protected, inherited]
 

Definition at line 206 of file ICU.d.

static uint length (char *s)  [inline, static, protected, inherited]
 

Definition at line 228 of file ICU.d.

References strlen().

Referenced by UConverter::UTranscoder::convert().

static uint length (wchar *s)  [inline, static, protected, inherited]
 

Definition at line 237 of file ICU.d.

References wcslen().

static char[] toArray (char *s)  [inline, static, protected, inherited]
 

Definition at line 246 of file ICU.d.

References strlen().

Referenced by UConverter::detectSignature(), UResourceBundle::getKey(), UResourceBundle::getLocale(), UMessageFormat::getLocale(), UCollator::getLocale(), UConverter::getName(), UChar::getPropertyName(), UChar::getPropertyValueName(), and UConverter::opApply().

static wchar[] toArray (wchar *s)  [inline, static, protected, inherited]
 

Definition at line 257 of file ICU.d.

References wcslen().


Member Data Documentation

package Handle handle
 

Definition at line 305 of file UBreakIterator.d.

Referenced by USearch::setIterator().

const uint Done = uint.max
 

Definition at line 308 of file UBreakIterator.d.

void* library [static, private]
 

Bind the ICU functions from a shared library. This is complicated by the issues regarding D and DLLs on the Windows platform

Definition at line 503 of file UBreakIterator.d.

FunctionLoader Bind [] targets [static]
 

Initial value:

 
                [
                {cast(void**) &ubrk_open,               "ubrk_open"}

Definition at line 530 of file UBreakIterator.d.


The documentation for this class was generated from the following file:

UBreakIterator.d

Generated on Fri Nov 11 18:44:45 2005 for Mango by  doxygen 1.4.0