Mixing extended character sets in ANSI C
Bob Jervis
uunet!Eng.Sun.COM!Robert.Jervis
Tue Mar 1 19:51:42 PST 1994
> The C Standard specifies the support of extended character sets and locale
> specific behaviour. However, the Standard does not mention anything about
> codesets.
An extended character set is a codeset.
>
> Consider the following program :
>
> > #include <locale.h>
> > #include <stdio.h>
> >
> > int main(void)
> > {
> >     setlocale(LC_ALL, "japanese");
> >     printf("taC\n");        /* this is a SJIS string */
> >     printf("%U%!%$%k%n");   /* this is a EUC string */
> >     return 0;
> > }
>
> The program sets the locale to be "japanese" and defines two strings. One
> of the strings uses Shift JIS characters and the other uses EUC which is
> a totally different encoding. I cannot find anything in the Standard to
> allow this. Is this a violation of an ANSI rule (if so, which one)? Or does
> this fall under "locale specific behaviour"? Comments?
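My understanding of locales is that each codeset corresponds to a different
locale, so a program that mixes two codesets under one setlocale() call is in
error. I can't put my hands on the specifics in the Standard, but a locale
selects a single encoding for interpreting multibyte characters, and I don't
see how you could implement a locale that would make the above example work
(without a psychic hotline included).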
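
To make the point concrete, here is a minimal sketch (not anything taken from
the Standard's text) of how the library decodes a byte string according to
whichever single locale is current. The helper name count_chars_in_locale is
hypothetical, and locale names other than "C" are implementation-defined, so
any Japanese locale name you pass in is an assumption about your system.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the characters in `bytes` as decoded under the
 * locale named `locale_name`.
 * Returns -1 if the locale is unavailable on this system, or -2 if the bytes
 * are not a valid multibyte string in that locale's codeset.
 */
long count_chars_in_locale(const char *locale_name, const char *bytes)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);  /* query, do not change */
    const char *p = bytes;
    long count = 0;

    if (current == NULL || strlen(current) >= sizeof saved)
        return -1;
    strcpy(saved, current);      /* copy: a later setlocale may overwrite it */

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;               /* no such locale here */

    mblen(NULL, 0);              /* reset any shift state */
    while (*p != '\0') {
        int len = mblen(p, strlen(p));  /* bytes in the next character */
        if (len <= 0) {          /* invalid sequence in this codeset */
            count = -2;
            break;
        }
        p += len;
        count++;
    }

    setlocale(LC_CTYPE, saved);  /* restore the caller's locale */
    return count;
}
```

If a system happened to provide both a Shift JIS locale and an EUC locale (say
under names like "ja_JP.SJIS" and "ja_JP.eucJP", which are assumptions), then
calling the helper on the same bytes with each name could give different
counts, or -2 for one of them. Nothing in the bytes themselves tells the
library which codeset was intended; only the current locale does.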