
Charset conversion
[LibTDS API]

Convert between different charsets.

Defines

#define CHUNK_ALLOC   4

Functions

iconv_t tds_sys_iconv_open (const char *tocode, const char *fromcode)
 Inputs must be FreeTDS canonical names; no other names are accepted.

int tds_sys_iconv_close (iconv_t cd)
size_t tds_sys_iconv (iconv_t cd, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft)
void tds_iconv_open (TDSSOCKET *tds, const char *charset)
void tds_iconv_close (TDSSOCKET *tds)
void tds_iconv_free (TDSSOCKET *tds)
size_t tds_iconv (TDSSOCKET *tds, const TDSICONV *conv, TDS_ICONV_DIRECTION io, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft)
 Wrapper around iconv(3).

size_t tds_iconv_fread (iconv_t cd, FILE *stream, size_t field_len, size_t term_len, char *outbuf, size_t *outbytesleft)
 Read a data file, passing the data through iconv().

void tds_srv_charset_changed (TDSSOCKET *tds, const char *charset)
void tds7_srv_charset_changed (TDSSOCKET *tds, int sql_collate, int lcid)
const char * tds_canonical_charset_name (const char *charset_name)
 Determine canonical iconv character set name.

const char * tds_sybase_charset_name (const char *charset_name)
 Determine the name Sybase uses for a character set, given a canonical iconv name.

TDSICONV * tds_iconv_from_collate (TDSSOCKET *tds, int sql_collate, int lcid)
 Get iconv information from an LCID (to support different column encoding under MSSQL2K).


Detailed Description

Convert between different charsets.

Set up the initial iconv conversion descriptors. When the socket is allocated, three TDSICONV structures are attached to iconv. They have fixed meanings:

Other designs that use less data are possible, but these three conversions are needed very often. By reserving them, we avoid searching the array for our most common purposes.

To cope with differing iconv charset names and other portability problems, FreeTDS uses a somewhat complex method. It maintains a list of all aliases of a given charset. First it discovers a few required charsets (UTF-8, ISO8859-1 and UCS2), and then it tries to discover the others from those (this discovery happens only when required).
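The discovery step can be pictured as probing the local iconv with each known spelling of a charset until one is accepted. What follows is only a minimal sketch of that idea, not FreeTDS's actual code: `sketch_discover_name` is a hypothetical helper, and opening a descriptor from a name to itself is an assumed way of testing whether the name is recognised.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: return the first name in a NULL-terminated list
 * that the local iconv(3) accepts, or NULL if none is accepted. */
const char *sketch_discover_name(const char *const *candidates)
{
    assert(candidates != NULL);

    for (; *candidates != NULL; candidates++) {
        /* Assumption: a name counts as "known" if iconv_open()
         * accepts it as both source and target. */
        iconv_t cd = iconv_open(*candidates, *candidates);

        if (cd != (iconv_t) -1) {
            iconv_close(cd);
            return *candidates;
        }
    }
    return NULL;
}
```

A caller would typically pass the spellings of one charset and remember the returned name, so that the probe is not repeated.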

There is a list of canonical names (GNU iconv names) and two sets of aliases (one for other iconv implementations and another for Sybase). For every canonical charset name, we cache the iconv name found during discovery.
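One way to picture the canonical-name list and its alias sets is a small lookup table. The sketch below is an assumption about shape only: the struct, the function names `sketch_canonical_name` and `sketch_sybase_name`, and the table entries are illustrative and are not taken from FreeTDS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALIASES 4

struct charset_entry {
    const char *canonical;             /* GNU iconv name */
    const char *sybase;                /* name used by Sybase, or NULL */
    const char *aliases[MAX_ALIASES];  /* other spellings, NULL-terminated */
};

/* Illustrative entries only; not copied from FreeTDS. */
static const struct charset_entry charset_table[] = {
    { "ISO-8859-1", "iso_1", { "ISO8859-1", "iso88591", "latin1", NULL } },
    { "UTF-8",      "utf8",  { "UTF8", "utf8", NULL } },
};

#define TABLE_LEN (sizeof charset_table / sizeof charset_table[0])

/* Map a canonical name or one of its aliases to the canonical name.
 * Returns NULL if the name is not in the table. */
const char *sketch_canonical_name(const char *name)
{
    size_t i, j;

    assert(name != NULL);
    for (i = 0; i < TABLE_LEN; i++) {
        const struct charset_entry *e = &charset_table[i];

        if (strcmp(name, e->canonical) == 0)
            return e->canonical;
        for (j = 0; j < MAX_ALIASES && e->aliases[j] != NULL; j++) {
            if (strcmp(name, e->aliases[j]) == 0)
                return e->canonical;
        }
    }
    return NULL;
}

/* Map a canonical name to the Sybase name. Aliases are not consulted.
 * Returns NULL if the lookup fails. */
const char *sketch_sybase_name(const char *canonical)
{
    size_t i;

    assert(canonical != NULL);
    for (i = 0; i < TABLE_LEN; i++) {
        if (strcmp(canonical, charset_table[i].canonical) == 0)
            return charset_table[i].sybase;
    }
    return NULL;
}
```

Both lookups return NULL on failure, which mirrors the return convention documented for tds_canonical_charset_name() and tds_sybase_charset_name().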


Function Documentation

const char* tds_canonical_charset_name (const char *charset_name)
 

Determine canonical iconv character set name.

Returns:
canonical name, or NULL if lookup failed.
Remarks:
Returned name can be used in bytes_per_char(), above.

size_t tds_iconv (TDSSOCKET *tds, const TDSICONV *conv, TDS_ICONV_DIRECTION io, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft)
 

Wrapper around iconv(3).

Same parameters, with slightly different behavior.

Parameters:
io Enumerated value indicating whether the data are being sent to or received from the server.
conv information about the encodings involved, including the iconv(3) conversion descriptors.
inbuf address of pointer to the input buffer of data to be converted.
inbytesleft address of count of bytes in inbuf.
outbuf address of pointer to the output buffer.
outbytesleft address of count of bytes in outbuf.
Return values:
number of irreversible conversions performed. -1 on error, see iconv(3) documentation for a description of the possible values of errno.
Remarks:
Unlike iconv(3), none of the arguments may be NULL or point to NULL. Like iconv(3), all pointers will be updated. Success is signified by a nonnegative return code and *inbytesleft == 0. If the conversion descriptor in conv is -1 or NULL, inbuf is copied to outbuf, and all parameters are updated accordingly.
In the event that a character in inbuf cannot be converted because no such character exists in the outbuf character set, we emit messages similar to the ones Sybase emits when it fails such a conversion. The message varies depending on the direction of the data. On a read error, we emit Msg 2403, Severity 16 (EX_INFO): "WARNING! Some character(s) could not be converted into client's character set. Unconverted bytes were changed to question marks ('?')." On a write error we emit Msg 2402, Severity 16 (EX_USER): "Error converting client characters into server's character set. Some character(s) could not be converted." and return an error code. Client libraries relying on this routine should reflect an error back to the application.
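The calling convention in the remarks (no NULL arguments, pointers always updated, plain copy when there is no usable descriptor) can be illustrated with plain iconv(3). This is a hedged sketch, not the FreeTDS implementation: `sketch_iconv` and `sketch_succeeded` are hypothetical names, and reporting E2BIG when the copy does not fit is an assumption borrowed from iconv(3).

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the wrapper: same buffer arguments as
 * iconv(3), but a missing descriptor means "copy input to output". */
size_t sketch_iconv(iconv_t cd, const char **inbuf, size_t *inbytesleft,
                    char **outbuf, size_t *outbytesleft)
{
    /* Unlike iconv(3), no argument may be NULL or point to NULL. */
    assert(inbuf != NULL && *inbuf != NULL && inbytesleft != NULL);
    assert(outbuf != NULL && *outbuf != NULL && outbytesleft != NULL);

    if (cd == (iconv_t) -1 || cd == (iconv_t) 0) {
        size_t n = *inbytesleft < *outbytesleft ? *inbytesleft : *outbytesleft;

        memcpy(*outbuf, *inbuf, n);
        *inbuf += n;
        *inbytesleft -= n;
        *outbuf += n;
        *outbytesleft -= n;
        if (*inbytesleft != 0) {
            errno = E2BIG;      /* assumption: mirror iconv(3) */
            return (size_t) -1;
        }
        return 0;
    }
    /* POSIX declares the input buffer as char **, hence the cast. */
    return iconv(cd, (char **) inbuf, inbytesleft, outbuf, outbytesleft);
}

/* Success test from the remarks: no error return and no input left. */
int sketch_succeeded(size_t rc, size_t inbytesleft)
{
    return rc != (size_t) -1 && inbytesleft == 0;
}
```

Because the return type is size_t, "nonnegative" in practice means "not equal to (size_t) -1", which is what `sketch_succeeded` checks.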

Todo:
Check for variable multibyte non-UTF-8 input character set.

Use more robust error message generation.

For reads, cope with encodings that don't have the equivalent of an ASCII '?'.

Support alternative to '?' for the replacement character.
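The read-side behaviour described in the remarks, where unconvertible characters become question marks, can be sketched as a retry loop around iconv(3). This is an illustration under stated assumptions rather than the FreeTDS code: `sketch_read_with_replacement` is a hypothetical function, the input is assumed to be UTF-8, and the target encoding is assumed to contain an ASCII '?', which is exactly the limitation the to-do items mention.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Convert UTF-8 input to `tocode`, writing '?' for every character the
 * target charset cannot represent. Returns the number of bytes written,
 * or (size_t) -1 on any other failure. */
size_t sketch_read_with_replacement(const char *tocode, const char *in,
                                    size_t inlen, char *out, size_t outlen)
{
    iconv_t cd;
    char *ip = (char *) in;   /* iconv(3) does not modify the input */
    char *op = out;
    size_t inleft = inlen, outleft = outlen;

    assert(tocode != NULL && in != NULL && out != NULL);

    cd = iconv_open(tocode, "UTF-8");
    if (cd == (iconv_t) -1)
        return (size_t) -1;

    while (inleft > 0) {
        if (iconv(cd, &ip, &inleft, &op, &outleft) != (size_t) -1)
            break;            /* everything that remained was converted */
        if (errno != EILSEQ || outleft == 0) {
            iconv_close(cd);
            return (size_t) -1;
        }
        /* Skip the offending character: its lead byte plus any
         * UTF-8 continuation bytes (10xxxxxx). */
        do {
            ip++;
            inleft--;
        } while (inleft > 0 && ((unsigned char) *ip & 0xC0) == 0x80);
        *op++ = '?';
        outleft--;
    }
    iconv_close(cd);
    return outlen - outleft;
}
```

Some iconv implementations substitute their own replacement character and report an irreversible conversion instead of failing with EILSEQ, so the byte that ends up in the output may differ from '?' on those systems.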

size_t tds_iconv_fread (iconv_t cd, FILE *stream, size_t field_len, size_t term_len, char *outbuf, size_t *outbytesleft)
 

Read a data file, passing the data through iconv().

Returns:
Count of bytes either not read, or read but not converted. Returns zero on success.
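The return convention (bytes not read plus bytes read but not converted, zero on success) can be illustrated with a simplified reader. This is a sketch under assumptions, not the FreeTDS routine: `sketch_iconv_fread` is a hypothetical name, it handles a field in a single read of at most 256 bytes, and it skips the terminator only after a fully converted field.

```c
#include <assert.h>
#include <iconv.h>
#include <stdio.h>

/* Read one field of field_len bytes from stream, convert it with cd,
 * then skip term_len terminator bytes. Returns the count of bytes not
 * read, or read but not converted; zero means success. */
size_t sketch_iconv_fread(iconv_t cd, FILE *stream, size_t field_len,
                          size_t term_len, char *outbuf, size_t *outbytesleft)
{
    char buf[256];   /* simplification: one chunk only */
    char *ip = buf;
    size_t want = field_len < sizeof buf ? field_len : sizeof buf;
    size_t unconverted;

    assert(stream != NULL && outbuf != NULL && outbytesleft != NULL);

    unconverted = fread(buf, 1, want, stream);
    field_len -= unconverted;          /* what remains was never read */

    /* iconv(3) decrements the count as it consumes input, so afterwards
     * it holds the bytes that were read but not converted. */
    if (unconverted > 0)
        (void) iconv(cd, &ip, &unconverted, &outbuf, outbytesleft);

    if (field_len == 0 && unconverted == 0) {
        for (; term_len > 0; term_len--) {
            if (fgetc(stream) == EOF)
                break;
        }
    }
    return field_len + unconverted;
}
```

A production version would loop over chunks and carry a partial multibyte sequence from one chunk to the next; the single fixed buffer here only keeps the accounting easy to follow.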

const char* tds_sybase_charset_name (const char *charset_name)
 

Determine the name Sybase uses for a character set, given a canonical iconv name.

Returns:
Sybase name, or NULL if lookup failed.
Remarks:
Returned name can be sent to a Sybase server.

iconv_t tds_sys_iconv_open (const char *tocode, const char *fromcode)
 

Inputs must be FreeTDS canonical names; no other names are accepted.

No alias list is consulted.


Generated on Tue Mar 29 19:52:37 2005 for FreeTDS API by doxygen1.3