Mixing extended character sets in ANSI C

Bob Jervis uunet!Eng.Sun.COM!Robert.Jervis
Tue Mar 1 19:51:42 PST 1994


 
> The C Standard specifies the support of extended character sets and locale
> specific behaviour. However, the Standard does not mention anything about
> codesets.

An extended character set is a codeset.

> 
> Consider the following program :
> 
> > main()
> > {
> >         setlocale (LC_ALL, "japanese");
> >         printf("taC\n");     /* this is a SJIS string */
> >         printf("%U%!%$%k%n");     /* this is a EUC  string */
> > }
> 
> The program sets the locale to be "japanese" and defines two strings. One
> of the strings uses Shift JIS characters and the other uses EUC, which is
> a totally different encoding. I cannot find anything in the Standard to
> allow this. Is this a violation of an ANSI rule (if so, which one)? Or does
> this fall under "locale specific behaviour"? Comments?

My understanding of locales is that different codesets are different locales,
so this program is in error.  I can't put my hands on the specific clause, but
I don't see how you could implement a locale that would make the above example
work (without a psychic hotline included): the library would have to guess,
string by string, which encoding each one was written in.





