What is UTF-8 and UTF-16?

UTF-8 and UTF-16 are two different ways of representing Unicode characters.

To briefly touch on the technical side:

In UTF-8, between one and four 8-bit units (bytes) are used to represent a particular Unicode character.
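As a rough illustration of the variable length of UTF-8, the following Python sketch counts the 8-bit units needed for a few characters. The sample characters and the helper name utf8_length are chosen only for this example and are not part of the original help text.

```python
# Sample characters picked for illustration only (an assumption of this sketch).
SAMPLES = [
    "A",           # U+0041, Latin capital letter A
    "\u00E9",      # U+00E9, Latin small letter e with acute
    "\u0B85",      # U+0B85, Tamil letter A
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]


def utf8_length(ch):
    """Return how many 8-bit units (bytes) UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


for ch in SAMPLES:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02X}" for b in encoded)
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} unit(s) -> {hex_bytes}")
```

With these samples, the counts come out as 1, 2, 3 and 4 units respectively, which covers the full range mentioned above.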

In UTF-16, one or two 16-bit units (each stored in either big-endian or little-endian byte order) are used to represent a particular Unicode character.
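The effect of byte order in UTF-16 can be sketched in Python as below. The helper name utf16_units and the sample characters are assumptions made for this example only. The sketch shows that the 16-bit unit values stay the same in both formats, and only the order of the two bytes inside each unit changes.

```python
def utf16_units(ch, byteorder="big"):
    """Return the 16-bit units UTF-16 uses for one character.

    byteorder is "big" or "little" and selects the matching Python codec.
    """
    codec = "utf-16-be" if byteorder == "big" else "utf-16-le"
    data = ch.encode(codec)
    # Read the encoded bytes back two at a time, in the same byte order.
    return [int.from_bytes(data[i:i + 2], byteorder) for i in range(0, len(data), 2)]


tamil_a = "\u0B85"      # U+0B85, Tamil letter A (fits in one 16-bit unit)
emoji = "\U0001F600"    # U+1F600 (needs two 16-bit units, a surrogate pair)

print(tamil_a.encode("utf-16-be"))  # bytes 0B 85
print(tamil_a.encode("utf-16-le"))  # bytes 85 0B
print([f"{u:04X}" for u in utf16_units(tamil_a)])  # one unit
print([f"{u:04X}" for u in utf16_units(emoji)])    # two units
```

Note that Python's plain "utf-16" codec also writes a byte order mark at the start of the output, which is why the sketch uses the explicit "utf-16-be" and "utf-16-le" codecs instead.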

[There is also the UTF-32 format, which uses a single 32-bit unit for every Unicode character and is mostly used on Unix-like systems.]





Document version 4.0.1
Copyright 2004 - 2006 Azhagi.com

For the current/updated version of this document, always visit http://www.azhagi.com/azUnicodeHelp.html