
TranscoderUtf8 Class Reference

Inherits Transcoder.

Inherited by TranscoderUtf8Checked.

Public Member Functions

char[] toString ()
char[] encode (dchar[] input, char[] output, out uint consumed)
dchar[] decode (char[] input, dchar[] output, out uint consumed)

Private Member Functions

void fault (char[] msg)

Detailed Description

Fast UTF-8 to Unicode transcoder. These routines are very sensitive to small changes on 32-bit x86 devices, because the register set of those devices is so small. Beware of subtle changes, which might extend execution time by as much as 200%.

These routines were tuned on an Intel P3; other devices may work more efficiently with a slightly different approach, though this is likely to be reasonably optimal on AMD x86 CPUs also. These algorithms could benefit significantly from those extra AMD64 registers.

Note that foreach can produce noticeably more efficient code than equivalent for() loops, with either indices or pointers. Version 0.98 of the compiler exhibited some rather erratic behavior over the course of testing: in particular, the elapsed time of a method's execution is noticeably dependent upon its physical location within the file (or, more specifically, within the enclosing class). It sounds crazy, but if you swap the order of encode() and decode(), both consistently execute slower than in the current arrangement.
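For readers unfamiliar with the three loop forms being compared, the sketch below sums the elements of a dchar array with foreach, with an indexed for() loop, and with a pointer loop. The function names are hypothetical and the code is written for a current D compiler rather than version 0.98. It only shows what the forms look like; it makes no claim about their relative speed on a modern compiler.

```d
// Hypothetical illustrations of the three loop forms; not taken from Utf8.d.

// foreach form: the compiler manages the iteration itself.
uint sumForeach(const(dchar)[] text)
{
    uint total = 0;
    foreach (c; text)
        total += c;
    return total;
}

// for() form with an index.
uint sumIndexed(const(dchar)[] text)
{
    uint total = 0;
    for (size_t i = 0; i < text.length; ++i)
        total += text[i];
    return total;
}

// for() form with a pointer.
uint sumPointer(const(dchar)[] text)
{
    uint total = 0;
    const(dchar)* p = text.ptr;
    const(dchar)* end = p + text.length;
    for (; p < end; ++p)
        total += *p;
    return total;
}
```

All three functions compute the same result, so a timing harness can swap one for another without changing behavior.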

Finally, please note that these are between 5 and 30 times faster than the equivalent functions in the std.utf Phobos module (depending on the mix of char values). Those functions (strangely) often allocate memory on a per-character basis, so they become significantly slower where multiple threads contend for the heap.

Definition at line 70 of file Utf8.d.


Member Function Documentation

char[] toString ()  [inline]
 

Returns the encoding name of this transcoder.

Reimplemented from Transcoder.

Definition at line 76 of file Utf8.d.

char[] encode (dchar[] input, char[] output, out uint consumed)  [inline]
 

Encodes each input dchar as UTF-8, using at most 4 bytes per character (five- and six-byte variations are not supported). Throws an exception where an input dchar is greater than 0x10ffff.
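As a rough illustration of the rule described here, the following stand-alone sketch encodes dchar values into a caller-supplied buffer using one to four bytes each. It is written for a current D compiler and is not the Utf8.d implementation. The name `encodeSketch`, the decision to stop when the output buffer is full, and the meaning given to `consumed` (the number of input characters encoded) are all assumptions.

```d
// Hypothetical stand-alone sketch; not the code in Utf8.d.
// Assumption: `consumed` counts the input dchars that were fully encoded.
char[] encodeSketch(const(dchar)[] input, char[] output, out uint consumed)
{
    size_t o = 0; // next free index in output

    foreach (dchar c; input)
    {
        if (c < 0x80)
        {
            // one byte: 0xxxxxxx
            if (o + 1 > output.length) break;
            output[o++] = cast(char) c;
        }
        else if (c < 0x800)
        {
            // two bytes: 110xxxxx 10xxxxxx
            if (o + 2 > output.length) break;
            output[o++] = cast(char) (0xC0 | (c >> 6));
            output[o++] = cast(char) (0x80 | (c & 0x3F));
        }
        else if (c < 0x10000)
        {
            // three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            if (o + 3 > output.length) break;
            output[o++] = cast(char) (0xE0 | (c >> 12));
            output[o++] = cast(char) (0x80 | ((c >> 6) & 0x3F));
            output[o++] = cast(char) (0x80 | (c & 0x3F));
        }
        else if (c <= 0x10FFFF)
        {
            // four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            if (o + 4 > output.length) break;
            output[o++] = cast(char) (0xF0 | (c >> 18));
            output[o++] = cast(char) (0x80 | ((c >> 12) & 0x3F));
            output[o++] = cast(char) (0x80 | ((c >> 6) & 0x3F));
            output[o++] = cast(char) (0x80 | (c & 0x3F));
        }
        else
        {
            // Values above 0x10ffff have no four-byte form.
            throw new Exception("invalid dchar: value exceeds 0x10ffff");
        }

        ++consumed;
    }

    return output[0 .. o];
}

unittest
{
    char[16] buf;
    uint consumed;
    auto bytes = encodeSketch("aé€"d, buf[], consumed);
    assert(bytes == "aé€"); // 1 + 2 + 3 bytes
    assert(consumed == 3);
}
```

Writing into a buffer supplied by the caller keeps the sketch free of heap allocation.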

Reimplemented from Transcoder.

Definition at line 89 of file Utf8.d.

References Transcoder::fault().

dchar[] decode (char[] input, dchar[] output, out uint consumed)  [inline]
 

Decodes UTF-8 produced by the encode() method above. No validation is performed, so this executes notably faster than the validating version (see TranscoderUtf8Checked).
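A non-validating decoder of this kind can be sketched as follows. This is not the Utf8.d code: the name `decodeSketch`, the handling of a sequence split across the end of the input, and the meaning given to `consumed` (the number of input bytes decoded) are assumptions, and the code targets a current D compiler. It trusts the lead byte to give the sequence length and never checks the continuation bytes, so malformed input silently yields wrong characters.

```d
// Hypothetical stand-alone sketch; not the code in Utf8.d.
// Assumption: the input is well formed, and `consumed` counts input bytes.
dchar[] decodeSketch(const(char)[] input, dchar[] output, out uint consumed)
{
    size_t i = 0; // read index in input
    size_t o = 0; // write index in output

    while (i < input.length && o < output.length)
    {
        uint b = input[i];
        size_t extra; // continuation bytes that follow the lead byte

        if (b < 0x80)      { extra = 0; }
        else if (b < 0xE0) { extra = 1; b &= 0x1F; }
        else if (b < 0xF0) { extra = 2; b &= 0x0F; }
        else               { extra = 3; b &= 0x07; }

        // Stop at a sequence that is split across the end of the input.
        if (i + extra >= input.length) break;

        // Fold in six payload bits from each continuation byte.
        foreach (k; 1 .. extra + 1)
            b = (b << 6) | (input[i + k] & 0x3F);

        output[o++] = cast(dchar) b;
        i += extra + 1;
    }

    consumed = cast(uint) i;
    return output[0 .. o];
}

unittest
{
    dchar[8] buf;
    uint consumed;
    auto chars = decodeSketch("aé€", buf[], consumed);
    assert(chars == "aé€"d);
    assert(consumed == 6); // 1 + 2 + 3 bytes
}
```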

Reimplemented from Transcoder.

Reimplemented in TranscoderUtf8Checked.

Definition at line 159 of file Utf8.d.

void fault (char[] msg)  [inline, protected, inherited]
 

Overridable exception thrower.
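The idea of an overridable thrower can be sketched with two stand-in classes. Nothing below comes from Transcoder.d: the class names, the `check` method, and the `faults` counter are hypothetical, and `string` replaces `char[]` for the message because the sketch targets a current D compiler.

```d
// Hypothetical stand-ins; the real Transcoder is defined in Transcoder.d.
class SketchTranscoder
{
    // Default behavior: report the problem by throwing.
    protected void fault(string msg)
    {
        throw new Exception(msg);
    }

    // A routine that detects bad input reports it through fault().
    void check(dchar c)
    {
        if (c > 0x10FFFF)
            fault("invalid dchar");
    }
}

// A subclass changes how errors surface without touching check().
class CountingTranscoder : SketchTranscoder
{
    uint faults;

    protected override void fault(string msg)
    {
        ++faults; // record the error instead of throwing
    }
}
```

Routing every error through one virtual method lets a subclass substitute its own exception type or recovery policy in a single place.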

Definition at line 78 of file Transcoder.d.

References Exception.

Referenced by TranscoderUtf8Checked::decode(), encode(), and TranscoderIso8859_1::encode().


The documentation for this class was generated from the following file:

Utf8.d

Generated on Sun Oct 24 22:31:32 2004 for Mango by doxygen 1.3.6