Why should toupper()'s argument be cast to unsigned char?

The argument to any of the character handling functions (to*() and is*()) should generally include a cast to unsigned char to coerce it into the proper range, as in toupper ((unsigned char) c). The problem is that most of the time it is a char that is passed to these functions, whereas their arguments are required to be within the range of unsigned char (or the special value EOF). Regrettably, the ranges of char and unsigned char can be different; in particular, char can have the same range as signed char, in which case it can hold negative values. Adding the cast maps negative char values into positive unsigned char values, fixing the problem.
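As a minimal sketch of the usual case, the following uppercases a string in place. The helper name upcase_string() is made up for illustration and is not part of any library; the only point is the cast on the value handed to toupper().

```c
#include <ctype.h>

/* Uppercases the null-terminated string S in place.
   upcase_string() is a hypothetical helper, used here only to
   illustrate the cast. */
void
upcase_string (char *s)
{
  for (; *s != '\0'; s++)
    {
      /* Where char is signed, *s may be negative.  The cast maps it
         into the range 0...UCHAR_MAX that toupper() requires. */
      *s = (char) toupper ((unsigned char) *s);
    }
}
```

Calling upcase_string() on a modifiable buffer that holds "hello" leaves "HELLO" in it. Bytes outside the basic character set are passed to toupper() as positive values rather than negative ones.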

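However, under some circumstances, casting to unsigned char is unnecessary or even harmful. It is unnecessary when the argument already has type unsigned char. It is harmful when the argument is an int returned by a function such as getc(): such a value is already either EOF or within the range of unsigned char, and the cast would turn EOF into an ordinary character value.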
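One case where the cast should be left out is a loop over the int values returned by getc(). The sketch below assumes a hypothetical helper named count_upper(); it is an illustration, not a library function.

```c
#include <ctype.h>
#include <stdio.h>

/* Returns the number of uppercase letters remaining in STREAM.
   count_upper() is a hypothetical helper, used here only to
   illustrate passing getc()'s result without a cast. */
size_t
count_upper (FILE *stream)
{
  size_t n = 0;
  int c;                        /* int, not char, so EOF stays distinct. */

  while ((c = getc (stream)) != EOF)
    {
      /* No cast: getc() returns either EOF or a value already in the
         range of unsigned char. */
      if (isupper (c))
        n++;
    }
  return n;
}
```

Casting the result of getc() to unsigned char before comparing it to EOF would be a bug, since EOF would then look like an ordinary character and the loop might never end.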

Many implementations are flexible about arguments to character handling functions, reporting correct results whether the argument is in the proper range or not. Others will crash, because an out-of-range argument yields undefined behavior, so it's best to be careful.


Last updated 28 Feb 2021 15:21. Copyright © 2021 Ben Pfaff.
May be freely redistributed, but copyright notice must be retained.