
Copy of UnicodeFile.d File Reference


Functions

class UnicodeFile (T)
typedef UnicodeFile (char) UnicodeFile8
typedef UnicodeFile (wchar) UnicodeFile16
typedef UnicodeFile (dchar) UnicodeFile32

Variables

module mango.io.UnicodeFile
import mango.io.FilePath
import mango.io.FileStyle
import mango.io.FileProxy
import mango.io.Exception
import mango.io.FileConduit
import mango.sys.ByteSwap
import mango.convert.Type
import mango.convert.Unicode


Function Documentation

class UnicodeFile (T)
 

Read and write unicode files

For our purposes, unicode files are an encoding of textual content. The goal of this module is to interface that external-encoding with a programmer-defined internal-encoding. This internal encoding is declared via the template argument T, whilst the external encoding is either specified or derived via methods herein.

Three internal encodings are supported: char, wchar, and dchar. The methods within operate upon arrays of this type. For example, read() returns an array of the type, whilst write() and append() expect an array of said type.
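As a rough illustration of how the three internal encodings differ, the sketch below transcodes one string into char, wchar, and dchar arrays and counts the code units in each. It assumes a modern D compiler with Phobos (std.conv) rather than the Mango converters; Widths and codeUnitCounts are hypothetical names that are not part of this module.

```d
import std.conv : to;

// Hypothetical helper type: code-unit counts for one piece of text.
struct Widths
{
    size_t utf8Units;   // length as a char array
    size_t utf16Units;  // length as a wchar array
    size_t utf32Units;  // length as a dchar array
}

// Transcode the same text into each internal encoding and measure it.
// Uses Phobos (std.conv.to), not the Mango Unicode converters.
Widths codeUnitCounts(string text)
{
    immutable(char)[]  s8  = text;              // char  holds UTF-8 code units
    immutable(wchar)[] s16 = to!wstring(text);  // wchar holds UTF-16 code units
    immutable(dchar)[] s32 = to!dstring(text);  // dchar holds UTF-32 code units
    return Widths(s8.length, s16.length, s32.length);
}
```

A code point outside the Basic Multilingual Plane, such as U+1F600, occupies four char units, two wchar units, or one dchar unit, which is why the array lengths differ between instantiations of T.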

Supported external encodings are as follows (from Unicode.d):

Unicode.Unknown Unicode.UTF_8 Unicode.UTF_8N Unicode.UTF_16 Unicode.UTF_16BE Unicode.UTF_16LE Unicode.UTF_32 Unicode.UTF_32BE Unicode.UTF_32LE

These can be divided into non-explicit and explicit encodings:

Non-explicit: Unicode.Unknown Unicode.UTF_8 Unicode.UTF_16 Unicode.UTF_32

Explicit: Unicode.UTF_8N Unicode.UTF_16BE Unicode.UTF_16LE Unicode.UTF_32BE Unicode.UTF_32LE

The former group of non-explicit encodings may be used to 'discover' an unknown encoding, by examining the first few bytes of the file content for a signature. This signature is optional for all files, but is often written such that the content is self-describing. When the encoding is unknown, using one of the non-explicit encodings will cause the read() method to look for a signature and adjust itself accordingly. It is possible that a ZWNBSP character might be confused with the signature; today's files are supposed to use the WORD-JOINER character instead.

The group of explicit encodings is for use when the file encoding is known. These *must* be used when writing or appending, since written content must be in a known format. It should be noted that, during a read operation, the presence of a signature is in conflict with these explicit varieties.
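The split between the two groups, and the rule that writing needs an explicit encoding, can be sketched as follows. This is a minimal illustration in modern D: the Encoding enum is a hypothetical stand-in for the Unicode.xx constants (its numeric values are arbitrary), and isExplicit and requireExplicit are assumed helper names, not functions from this module.

```d
// Hypothetical mirror of the Unicode.xx constants; values are arbitrary.
enum Encoding
{
    Unknown, UTF_8, UTF_8N,
    UTF_16, UTF_16BE, UTF_16LE,
    UTF_32, UTF_32BE, UTF_32LE
}

// True for encodings that name one exact byte layout.
bool isExplicit(Encoding e)
{
    switch (e)
    {
        case Encoding.UTF_8N,
             Encoding.UTF_16BE, Encoding.UTF_16LE,
             Encoding.UTF_32BE, Encoding.UTF_32LE:
            return true;
        default:
            // Unknown, UTF_8, UTF_16, UTF_32: resolved via a signature
            return false;
    }
}

// Guard for a write or append path: reject non-explicit encodings.
void requireExplicit(Encoding e)
{
    if (!isExplicit(e))
        throw new Exception("writing requires an explicit encoding");
}
```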

Method read() returns the current content of the file, whilst write() sets the file content, and file length, to the provided array. Method append() adds content to the tail of the file. When appending, it is your responsibility to ensure the existing and current encodings are correctly matched.

Methods to inspect the file system, check the status of a file or directory, and other facilities are made available via the FileProxy superclass.

Note that the convert() method can be used to convert an arbitrary array of content; said content can come from somewhere other than a file (a socket, for example).

See:

http://www.utf-8.com/
http://www.hackcraft.net/xmlUnicode/
http://www.unicode.org/faq/utf_bom.html/
http://www.azillionmonkeys.com/qed/unicode.html/
http://icu.sourceforge.net/docs/papers/forms_of_unicode/

Construct a UnicodeFile from a text string. The provided encoding represents the external file encoding, and should be one of the Unicode.xx types.

Construct a UnicodeFile from the provided FilePath. The given encoding represents the external file encoding, and should be one of the Unicode.xx types.

Return the current encoding. This is either the originally specified encoding, or a derived one obtained by inspecting the file content for a BOM. The latter is performed as part of the read() method.

Return the content of the file. The content is inspected for a BOM signature, which is stripped. An exception is thrown if a signature is present when, according to the encoding type, it should not be. Conversely, an exception is thrown if there is no known signature where the current encoding expects one to be present.
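The two failure cases described for reading can be illustrated for the UTF-8 family alone (Unicode.UTF_8 expects a signature, Unicode.UTF_8N forbids one). This is a sketch under assumptions, not the module's code: readUtf8 and its expectSignature flag are hypothetical, and a plain Exception stands in for whatever exception type the library raises.

```d
// The UTF-8 byte order mark.
immutable ubyte[] utf8Bom = [0xEF, 0xBB, 0xBF];

// Validate and strip the signature from raw file content.
// expectSignature: true models a non-explicit encoding (signature required),
//                  false models an explicit one (signature forbidden).
const(ubyte)[] readUtf8(const(ubyte)[] content, bool expectSignature)
{
    immutable bool found = content.length >= utf8Bom.length
                        && content[0 .. utf8Bom.length] == utf8Bom;

    if (found && !expectSignature)
        throw new Exception("explicit encoding does not permit a signature");
    if (!found && expectSignature)
        throw new Exception("unknown or missing signature");

    // Strip the signature when present; otherwise return content unchanged.
    return found ? content[utf8Bom.length .. $] : content;
}
```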

Set the file content and length to reflect the given array. The content will be encoded accordingly.

Append content to the file; the content will be encoded accordingly.

Note that it is your responsibility to ensure the existing and current encodings are correctly matched.

Convert the provided content. The content is inspected for a BOM signature, which is stripped. An exception is thrown if a signature is present when, according to the encoding type, it should not be. Conversely, an exception is thrown if there is no known signature where the current encoding expects one to be present.

Internal method to perform writing of content. Note that the encoding must be of the explicit variety by the time we get here.

Scan the BOM signatures looking for a match. We scan in reverse order to get the longest match first.
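The reason for scanning in reverse is that the UTF-16LE signature (FF FE) is a prefix of the UTF-32LE signature (FF FE 00 00), so the longer entry has to be tried first. A minimal sketch of such a scan is shown below; the Signature struct, the table layout, and detectSignature are assumptions made for illustration and may differ from the module's internal table.

```d
// Hypothetical table entry: a label plus the signature bytes.
struct Signature
{
    string name;             // label only, not a Unicode.xx constant
    immutable(ubyte)[] bom;  // bytes expected at the start of the content
}

// Ordered so that the longer signatures sit at the end of the table.
immutable Signature[] signatures = [
    Signature("UTF-8",    [0xEF, 0xBB, 0xBF]),
    Signature("UTF-16BE", [0xFE, 0xFF]),
    Signature("UTF-16LE", [0xFF, 0xFE]),
    Signature("UTF-32BE", [0x00, 0x00, 0xFE, 0xFF]),
    Signature("UTF-32LE", [0xFF, 0xFE, 0x00, 0x00])
];

// Returns the label of the first match when scanning from the end of the
// table (longest signatures first), or null when nothing matches.
string detectSignature(const(ubyte)[] content)
{
    foreach_reverse (sig; signatures)
    {
        if (content.length >= sig.bom.length
            && content[0 .. sig.bom.length] == sig.bom)
            return sig.name;
    }
    return null;
}
```

Scanning forward instead would report UTF-16LE for any UTF-32LE file. The reverse scan has the opposite weakness: UTF-16LE content whose first character after the signature is U+0000 is indistinguishable from UTF-32LE.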

Swap bytes around, as required by the encoding.

Configure this instance with unicode converters.
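Byte swapping is needed when the external byte order (for example UTF-16BE on a little-endian machine) differs from the native one. The module imports mango.sys.ByteSwap for this; the sketch below is a stand-alone approximation with hypothetical names swap16 and swap32, not that module's API.

```d
// Reverse the two bytes of every 16-bit unit, in place.
void swap16(ubyte[] data)
{
    assert(data.length % 2 == 0, "UTF-16 data must have an even byte count");
    for (size_t i = 0; i + 1 < data.length; i += 2)
    {
        immutable ubyte tmp = data[i];
        data[i] = data[i + 1];
        data[i + 1] = tmp;
    }
}

// Reverse the four bytes of every 32-bit unit, in place.
void swap32(ubyte[] data)
{
    assert(data.length % 4 == 0, "UTF-32 data must be a multiple of 4 bytes");
    for (size_t i = 0; i + 3 < data.length; i += 4)
    {
        immutable ubyte first = data[i];
        immutable ubyte second = data[i + 1];
        data[i] = data[i + 3];
        data[i + 1] = data[i + 2];
        data[i + 2] = second;
        data[i + 3] = first;
    }
}
```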

Definition at line 134 of file Copy of UnicodeFile.d.

References assert(), convert(), FileConduit, from(), into(), FileConduit::length(), type(), UnicodeFile, version, and FileProxy::write().

typedef UnicodeFile (char) UnicodeFile8

typedef UnicodeFile (wchar) UnicodeFile16

typedef UnicodeFile (dchar) UnicodeFile32



Variable Documentation

module mango.io.UnicodeFile

Definition at line 39 of file Copy of UnicodeFile.d.

import mango.io.FilePath

Definition at line 41 of file Copy of UnicodeFile.d.

import mango.io.FileStyle

Definition at line 43 of file Copy of UnicodeFile.d.

import mango.io.FileProxy

Definition at line 43 of file Copy of UnicodeFile.d.

import mango.io.Exception

Definition at line 43 of file Copy of UnicodeFile.d.

import mango.io.FileConduit

Definition at line 43 of file Copy of UnicodeFile.d.

import mango.sys.ByteSwap

Definition at line 48 of file Copy of UnicodeFile.d.

import mango.convert.Type

Definition at line 50 of file Copy of UnicodeFile.d.

import mango.convert.Unicode

Definition at line 50 of file Copy of UnicodeFile.d.


Generated on Sat Dec 24 17:28:35 2005 for Mango by  doxygen 1.4.0