
Utf Class Reference


Static Public Member Functions

static char[] toUtf8 (wchar[] input, char[] output=null)
static wchar[] toUtf16 (char[] input, wchar[] output=null)
static char[] toUtf8 (dchar[] input, char[] output=null)
static dchar[] toUtf32 (char[] input, dchar[] output=null)

Static Private Member Functions

static void error (char[] msg)

Detailed Description

Fast UTF-8 to Unicode transcoders. These routines are very sensitive to small changes on 32-bit x86 devices, because the register set of those devices is so small. Beware of subtle changes, which might extend the execution time by as much as 200%.

These routines were tuned on an Intel P3; other devices may work more efficiently with a slightly different approach, though this is likely to be reasonably optimal on AMD x86 CPUs also. These algorithms could benefit significantly from those extra AMD64 registers.

Note that foreach can produce noticeably more efficient code than equivalent for() loops, with either indices or pointers. The 0.98 compiler version exhibited some rather erratic behavior over the course of testing: in particular, the elapsed time of a method's execution is noticeably dependent upon its physical location within the file (or, more specifically, within the enclosing class). It sounds crazy, but if you switch the order of encode() and decode(), they consistently execute slower than as currently arranged.

Finally, please note that these are between 5 and 30 times faster than equivalent functions in the std.utf Phobos module (dependent upon the mix of char values). Those functions (strangely) often allocate memory on a character basis, so will become significantly slower where there's heap-contention by multiple threads.

Definition at line 73 of file Utf.d.


Member Function Documentation

static void error (char[] msg) [inline, static, private]
 

Definition at line 79 of file Utf.d.

Referenced by toUtf16(), toUtf32(), and toUtf8().

static char[] toUtf8 (wchar[] input, char[] output=null) [inline, static]
 

Encode Utf8 up to a maximum of 3 bytes long (four, five & six byte variations are not supported). Throws an exception where the input wchar is greater than 0xd800.

If the output is provided off the stack, it should be large enough to encompass the entire utf8 encoding. This option is provided purely as an optimization for those cases where all boundary conditions are explicitly checked for by the caller.
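As a rough illustration of the encoding described above, here is a minimal sketch in current D. The name encodeBmp is hypothetical and this is not Mango's implementation. It assumes a worst case of three bytes per input code unit, and it rejects only the surrogate range 0xD800 to 0xDFFF, which may differ from the exact check toUtf8() performs.

```d
// Illustrative only: a hypothetical encodeBmp(), not Mango's toUtf8().
// Encodes non-surrogate UTF-16 code units as UTF-8, at most 3 bytes each.
char[] encodeBmp(const(wchar)[] input, char[] output = null)
{
    // Assumed worst case: three bytes per input code unit.
    // Allocate only when the caller's buffer is missing or too small.
    if (output.length < input.length * 3)
        output = new char[input.length * 3];

    size_t n = 0;
    foreach (wchar c; input)
    {
        if (c < 0x80)
        {
            // One byte: plain ASCII.
            output[n++] = cast(char) c;
        }
        else if (c < 0x800)
        {
            // Two bytes: 110xxxxx 10xxxxxx
            output[n++] = cast(char) (0xC0 | (c >> 6));
            output[n++] = cast(char) (0x80 | (c & 0x3F));
        }
        else if (c < 0xD800 || c > 0xDFFF)
        {
            // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            output[n++] = cast(char) (0xE0 | (c >> 12));
            output[n++] = cast(char) (0x80 | ((c >> 6) & 0x3F));
            output[n++] = cast(char) (0x80 | (c & 0x3F));
        }
        else
        {
            // Assumption: surrogates are rejected rather than paired up.
            throw new Exception("surrogate code units are not supported");
        }
    }
    // Return only the part of the buffer that was actually written.
    return output[0 .. n];
}
```

A caller that has already sized a stack buffer, for example char[48] buf for a 16-unit input, can pass buf[] as the second argument and get back a slice of that same buffer with no heap allocation.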

Definition at line 97 of file Utf.d.

References error().

static wchar[] toUtf16 (char[] input, wchar[] output=null) [inline, static]
 

Decode Utf8 produced by the toUtf8(wchar[]) method above.

If the output is provided off the stack, it should be large enough to encompass the entire utf16 result. This option is provided purely as an optimization for those cases where all boundary conditions are explicitly checked for by the caller.
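The reverse direction can be sketched in the same spirit. The name decodeBmp is hypothetical and this is not Mango's code. It assumes the input holds only one-, two- and three-byte sequences, as the encoder above produces, so the output never needs more elements than the input has bytes. It does not check for overlong encodings.

```d
// Illustrative only: a hypothetical decodeBmp(), not Mango's toUtf16().
// Decodes 1- to 3-byte UTF-8 sequences into UTF-16 code units.
wchar[] decodeBmp(const(char)[] input, wchar[] output = null)
{
    // Each sequence yields one code unit, so input.length is a safe bound.
    if (output.length < input.length)
        output = new wchar[input.length];

    size_t n = 0;
    for (size_t i = 0; i < input.length; )
    {
        uint b = input[i];
        uint value;   // bits accumulated so far
        uint need;    // number of continuation bytes expected

        if (b < 0x80)                { value = b;        need = 0; }
        else if ((b & 0xE0) == 0xC0) { value = b & 0x1F; need = 1; }
        else if ((b & 0xF0) == 0xE0) { value = b & 0x0F; need = 2; }
        else
            throw new Exception("unsupported or invalid lead byte");

        if (i + need >= input.length)
            throw new Exception("truncated sequence");

        // Fold in six payload bits from each continuation byte.
        foreach (k; 1 .. need + 1)
        {
            uint t = input[i + k];
            if ((t & 0xC0) != 0x80)
                throw new Exception("invalid continuation byte");
            value = (value << 6) | (t & 0x3F);
        }

        output[n++] = cast(wchar) value;
        i += need + 1;
    }
    return output[0 .. n];
}
```

As with the encoder sketch, passing a slice of a sufficiently large stack array as output avoids the heap allocation entirely.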

Definition at line 151 of file Utf.d.

References error().

static char[] toUtf8 (dchar[] input, char[] output=null) [inline, static]
 

Encode Utf8 up to a maximum of 4 bytes long (five & six byte variations are not supported). Throws an exception where the input dchar is greater than 0x10ffff.

If the output is provided off the stack, it should be large enough to encompass the entire utf8 encoding. This option is provided purely as an optimization for those cases where all boundary conditions are explicitly checked for by the caller.
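For the dchar overload the only structural difference is the extra four-byte form and the upper bound of 0x10ffff. A minimal sketch follows, again under assumptions: the name encodeUtf32 is hypothetical, the worst case is taken as four bytes per code point, and no check beyond the documented upper bound is made.

```d
// Illustrative only: a hypothetical encodeUtf32(), not Mango's toUtf8().
// Encodes UTF-32 code points as UTF-8, at most 4 bytes each.
char[] encodeUtf32(const(dchar)[] input, char[] output = null)
{
    // Assumed worst case: four bytes per code point.
    if (output.length < input.length * 4)
        output = new char[input.length * 4];

    size_t n = 0;
    foreach (dchar c; input)
    {
        if (c < 0x80)
        {
            output[n++] = cast(char) c;
        }
        else if (c < 0x800)
        {
            output[n++] = cast(char) (0xC0 | (c >> 6));
            output[n++] = cast(char) (0x80 | (c & 0x3F));
        }
        else if (c < 0x10000)
        {
            output[n++] = cast(char) (0xE0 | (c >> 12));
            output[n++] = cast(char) (0x80 | ((c >> 6) & 0x3F));
            output[n++] = cast(char) (0x80 | (c & 0x3F));
        }
        else if (c <= 0x10FFFF)
        {
            // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            output[n++] = cast(char) (0xF0 | (c >> 18));
            output[n++] = cast(char) (0x80 | ((c >> 12) & 0x3F));
            output[n++] = cast(char) (0x80 | ((c >> 6) & 0x3F));
            output[n++] = cast(char) (0x80 | (c & 0x3F));
        }
        else
        {
            throw new Exception("code point above 0x10FFFF");
        }
    }
    return output[0 .. n];
}
```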

Definition at line 207 of file Utf.d.

References error().

static dchar[] toUtf32 (char[] input, dchar[] output=null) [inline, static]
 

Decode Utf8 produced by the toUtf8(dchar[]) method above.

If the output is provided off the stack, it should be large enough to encompass the entire utf32 result. This option is provided purely as an optimization for those cases where all boundary conditions are explicitly checked for by the caller.

Definition at line 270 of file Utf.d.

References error().


The documentation for this class was generated from the following file:
Generated on Fri May 27 18:12:05 2005 for Mango by  doxygen 1.4.0