Inheritance diagram for USearch:
Public Types | |
enum | Attribute { Overlap, CanonicalMatch, Count } |
enum | AttributeValue { Default = -1, Off, On, Count } |
Public Member Functions | |
this (UText pattern, UText text, inout ULocale locale, UBreakIterator iterator=null) | |
this (UText pattern, UText text, UCollator col, UBreakIterator iterator=null) | |
~this () | |
void | setOffset (uint position) |
uint | getOffset () |
uint | getMatchedStart () |
uint | getMatchedLength () |
void | getMatchedText (UString s) |
void | setText (UText t) |
UText | getText () |
void | setPattern (UText t) |
UText | getPattern () |
void | setIterator (UBreakIterator iterator) |
UBreakIterator | getIterator () |
uint | first () |
uint | last () |
uint | next (uint pos=uint.max) |
uint | previous (uint pos=uint.max) |
void | reset () |
UCollator | getCollator () |
void | setCollator (UCollator col) |
Static Public Member Functions | |
static | this () |
static | ~this () |
Public Attributes | |
const uint | Done = uint.max |
Static Public Attributes | |
static FunctionLoader Bind[] | targets |
Private Types | |
typedef void * | Handle |
enum | Error { OK, BufferOverflow = 15 } |
Static Private Member Functions | |
static bool | isError (Error e) |
static void | testError (Error e, char[] msg) |
static char * | toString (char[] string) |
static wchar * | toString (wchar[] string) |
static uint | length (char *s) |
static uint | length (wchar *s) |
static char[] | toArray (char *s) |
static wchar[] | toArray (wchar *s) |
Private Attributes | |
Handle | handle |
UBreakIterator | iterator |
Static Private Attributes | |
static void * | library |
The algorithm implemented is a modified form of Boyer-Moore search. For more information on the algorithm, see "Efficient Text Searching in Java", published in Java Report in February 1999.
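As a rough illustration of the Boyer-Moore family, the sketch below implements the simpler Horspool variant over plain characters. It is not the modified algorithm used by USearch, which works with collation rules rather than raw code points; the function name `horspool_search` is hypothetical and exists only for this example.

```python
def horspool_search(text, pattern):
    """Return the first index of pattern in text, or -1 if it is absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character shift table: for every pattern character except the
    # last one, the distance from its last occurrence to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        # Compare right to left, as Boyer-Moore variants do.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Skip ahead using the text character aligned with the pattern end;
        # characters not in the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1


print(horspool_search("here is a simple example", "example"))  # 17
```

The shift table is the kind of precomputed data that has to be rebuilt whenever the pattern or the comparison rules change.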
There are two match options for selection. Let S' be the substring of a text string S between the offsets start and end <start, end>. A pattern string P matches a text string S at the offsets <start, end> if
Option 2 is the default.
This search has APIs similar to those of other text iteration mechanisms, such as the break iterators in ubrk.h. Using these APIs, it is easy to scan through text looking for all occurrences of a given pattern. The search iterator allows the direction to be changed by calling reset() followed by next() or previous(). A direction change can also occur without calling reset() first, but this comes with some speed penalty. Generally, the matches found in the forward direction are the same as those found in the backward direction, in reverse order.
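The iteration pattern can be sketched with a toy model in Python. `ToySearch`, `collect`, and the sample text are hypothetical names for this example only; the model does plain substring matching with no collation, assumes a non-empty pattern, and uses `DONE` to stand in for the `Done` constant (`uint.max`).

```python
DONE = 0xFFFFFFFF  # stands in for USearch.Done (uint.max)


class ToySearch:
    """Toy stand-in for a search iterator: plain substring matching only."""

    def __init__(self, pattern, text):
        self.pattern = pattern
        self.text = text
        self.matched = DONE  # start of the most recent match, if any

    def reset(self):
        self.matched = DONE

    def next(self):
        # Start at the beginning, or just past the most recent match.
        start = 0 if self.matched == DONE else self.matched + len(self.pattern)
        idx = self.text.find(self.pattern, start)
        self.matched = DONE if idx < 0 else idx
        return self.matched

    def previous(self):
        # Start at the end, or just before the most recent match.
        end = len(self.text) if self.matched == DONE else self.matched
        idx = self.text.rfind(self.pattern, 0, end)
        self.matched = DONE if idx < 0 else idx
        return self.matched


def collect(step):
    """Call step() repeatedly until it reports DONE."""
    out = []
    pos = step()
    while pos != DONE:
        out.append(pos)
        pos = step()
    return out


search = ToySearch("fish", "one fish two fish red fish")
forward = collect(search.next)
search.reset()
backward = collect(search.previous)
print(forward)   # [4, 13, 22]
print(backward)  # [22, 13, 4]
```

In this model the backward results are exactly the forward results in reverse order, which is the general expectation stated above.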
USearch provides APIs to specify the starting position within the text string to be searched, e.g. setOffset(), previous(x) and next(x). Since the starting position is set exactly as specified, note that there are some dangerous positions at which the search may return incorrect results:
A break iterator can be used if only matches at logical breaks are desired. Using a break iterator gives only results that exactly match the boundaries given by the break iterator. For instance, the pattern "e" will not be found in the string "\u00e9" if a character break iterator is used.
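The effect of boundary filtering can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: it works on the decomposed form of "\u00e9" (the letter e followed by a combining acute accent), treats every non-combining code point as the start of a character, and uses the hypothetical helpers `character_boundaries` and `boundary_matches`. A real break iterator applies far more detailed rules.

```python
import unicodedata


def character_boundaries(text):
    """Offsets where a non-combining code point starts, plus the end offset."""
    bounds = {i for i, ch in enumerate(text) if unicodedata.combining(ch) == 0}
    bounds.add(len(text))
    return bounds


def boundary_matches(text, pattern, bounds):
    """Keep only matches whose start and end both fall on a boundary."""
    hits = []
    pos = text.find(pattern)
    while pos != -1:
        if pos in bounds and pos + len(pattern) in bounds:
            hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits


# Decompose U+00E9 into "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", "\u00e9")
print(decomposed.find("e"))  # 0: a raw substring search finds the base letter
print(boundary_matches(decomposed, "e", character_boundaries(decomposed)))  # []
```

The raw match ends at offset 1, in the middle of a character, so it is discarded once the boundaries are enforced.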
Options are provided to handle overlapping matches. For example, in English, overlapping matches produce the results 0 and 2 for the pattern "abab" in the text "ababab", whereas mutually exclusive matches produce only the result 0.
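The difference between the two modes can be reproduced with a short Python sketch. The function `find_matches` and its `overlap` flag are hypothetical and only mirror the idea of the Overlap attribute; matching here is plain substring comparison rather than collation-based.

```python
def find_matches(text, pattern, overlap):
    """Return the start offsets of all matches of pattern in text."""
    if not pattern:
        return []
    starts = []
    pos = text.find(pattern)
    while pos != -1:
        starts.append(pos)
        # Overlapping: resume one position after the match start.
        # Mutually exclusive: resume after the match end.
        step = 1 if overlap else len(pattern)
        pos = text.find(pattern, pos + step)
    return starts


print(find_matches("ababab", "abab", overlap=True))   # [0, 2]
print(find_matches("ababab", "abab", overlap=False))  # [0]
```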
Although collator attributes are taken into consideration while performing matches, there are no APIs here for setting and getting the attributes. These attributes can be set by getting the collator from getCollator() and using the APIs in UCollator. Finally, to update the string search to the new collator attributes, reset() has to be called.
See http://oss.software.ibm.com/icu/apiref/usearch_8h.html for full details.
Definition at line 175 of file USearch.d.
|
Use this as the primary argument type for most ICU functions. |
|
ICU error codes (the ones which are referenced) |
|
Close this USearch. Definition at line 239 of file USearch.d. References handle. |
|
Definition at line 603 of file USearch.d. References library. |
|
Creates a search iterator data struct using the language rule set of the argument locale. Definition at line 208 of file USearch.d. References handle, iterator, and ICU::testError(). |
|
Creates a search iterator data struct using the language rule set of the argument collator. Definition at line 224 of file USearch.d. References handle, iterator, and ICU::testError(). |
|
Sets the current position in the text string from which the next search will start. Definition at line 251 of file USearch.d. References handle, and ICU::testError(). |
|
Returns the current index in the string text being searched. Definition at line 265 of file USearch.d. References handle. |
|
Returns the index of the match in the text string that was searched. Definition at line 277 of file USearch.d. References handle. |
|
Returns the length of the text in the string which matches the search pattern. Definition at line 289 of file USearch.d. References handle. |
|
Returns the text that was matched by the most recent call to first(), next(), previous(), or last(). Definition at line 301 of file USearch.d. References UString::format(), and handle. |
|
Set the string text to be searched. Definition at line 317 of file USearch.d. References UText::get(), handle, UText::length(), and ICU::testError(). |
|
Return the string text to be searched. Note that this returns a read-only reference to the search text. Definition at line 332 of file USearch.d. References handle. |
|
Sets the pattern used for matching. Definition at line 346 of file USearch.d. References UText::get(), handle, UText::length(), and ICU::testError(). |
|
Gets the search pattern. Note that this returns a read-only reference to the pattern. Definition at line 361 of file USearch.d. References handle. |
|
Set the BreakIterator that will be used to restrict the points at which matches are detected. Definition at line 376 of file USearch.d. References UBreakIterator::handle, handle, and ICU::testError(). |
|
Get the BreakIterator that will be used to restrict the points at which matches are detected. Definition at line 392 of file USearch.d. References iterator. |
|
Returns the first index at which the string text matches the search pattern. Definition at line 404 of file USearch.d. References handle, and ICU::testError(). |
|
Returns the last index in the target text at which it matches the search pattern. Definition at line 420 of file USearch.d. References handle, and ICU::testError(). |
|
Returns the index of the next point at which the string text matches the search pattern, starting from the current position. If pos is specified, returns the first index greater than pos at which the string text matches the search pattern. Definition at line 440 of file USearch.d. References handle, and ICU::testError(). |
|
Returns the index of the previous point at which the string text matches the search pattern, starting at the current position. If pos is specified, returns the first index less than pos at which the string text matches the search pattern. Definition at line 464 of file USearch.d. References handle, and ICU::testError(). |
|
Resets the iteration. The search will begin at the start of the text string if a forward iteration is initiated before a backward iteration. Otherwise, if a backward iteration is initiated before a forward iteration, the search will begin at the end of the text string. Definition at line 486 of file USearch.d. References handle. |
|
Gets the collator used for the language rules. |
|
Sets the collator used for the language rules. This method causes internal data such as Boyer-Moore shift tables to be recalculated, but the iterator's position is unchanged. Definition at line 511 of file USearch.d. References UCollator::handle, handle, and ICU::testError(). |
|
Definition at line 156 of file ICU.d. Referenced by UCollator::getLocale(). |
|
Definition at line 228 of file ICU.d. References strlen(). Referenced by UConverter::UTranscoder::convert(). |
|
Definition at line 237 of file ICU.d. References wcslen(). |
|
Definition at line 246 of file ICU.d. References strlen(). Referenced by UConverter::detectSignature(), UResourceBundle::getKey(), UResourceBundle::getLocale(), UMessageFormat::getLocale(), UCollator::getLocale(), UConverter::getName(), UChar::getPropertyName(), UChar::getPropertyValueName(), and UConverter::opApply(). |
|
Definition at line 257 of file ICU.d. References wcslen(). |
|
Definition at line 177 of file USearch.d. Referenced by first(), getCollator(), getMatchedLength(), getMatchedStart(), getMatchedText(), getOffset(), getPattern(), getText(), last(), next(), previous(), reset(), setCollator(), setIterator(), setOffset(), setPattern(), setText(), this(), and ~this(). |
|
Definition at line 178 of file USearch.d. Referenced by getIterator(), and this(). |
|
Binds the ICU functions from a shared library. This is complicated by the issues regarding D and DLLs on the Windows platform. |
|
Initial value: [ {cast(void**) &usearch_open, "usearch_open"} Definition at line 564 of file USearch.d. Referenced by this(). |