--- src/lib/libc/stdio/printf.3 2005/08/05 23:22:23 1.5 +++ src/lib/libc/stdio/printf.3 2006/08/26 10:27:55 1.6 @@ -34,10 +34,10 @@ .\" SUCH DAMAGE. .\" .\" @(#)printf.3 8.1 (Berkeley) 6/4/93 -.\" $FreeBSD: src/lib/libc/stdio/printf.3,v 1.17.2.11 2003/03/02 07:29:33 tjr Exp $ +.\" $FreeBSD: src/lib/libc/stdio/printf.3,v 1.59 2005/09/05 09:49:33 tjr Exp $ .\" $DragonFly$ .\" -.Dd March 2, 2003 +.Dd August 26, 2006 .Dt PRINTF 3 .Os .Sh NAME @@ -83,11 +83,13 @@ The family of functions produces output according to a .Fa format as described below. -.Fn Printf +The +.Fn printf and .Fn vprintf +functions write output to -.Pa stdout , +.Dv stdout , the standard output stream; .Fn fprintf and @@ -115,24 +117,25 @@ string that specifies how subsequent arg .Xr stdarg 3 ) are converted for output. .Pp -Upon success, these functions return the number of characters printed +These functions return the number of characters printed (not including the trailing .Ql \e0 -used to end output to strings), -or, in the case of +used to end output to strings) or a negative value if an output error occurs, +except for .Fn snprintf and .Fn vsnprintf , -the number of characters that would have been printed if the +which return the number of characters that would have been printed if the .Fa size were unlimited (again, not including the final .Ql \e0 ) . -All of these function return a negative value if an output error occurs. .Pp -.Fn Asprintf +The +.Fn asprintf and .Fn vasprintf +functions set .Fa *ret to be a pointer to a buffer sufficiently large to hold the formatted string. @@ -143,15 +146,17 @@ If sufficient space cannot be allocated, .Fn asprintf and .Fn vasprintf -will return -1 and set +will return \-1 and set .Fa ret to be a .Dv NULL pointer. 
.Pp -.Fn Snprintf +The +.Fn snprintf and .Fn vsnprintf +functions will write at most .Fa size Ns \-1 of the characters printed into the output string @@ -163,10 +168,13 @@ if the return value is greater than or e .Fa size argument, the string was too short and some of the printed characters were discarded. +The output is always null-terminated. .Pp -.Fn Sprintf +The +.Fn sprintf and .Fn vsprintf +functions effectively assume an infinite .Fa size . .Pp @@ -200,12 +208,9 @@ If unaccessed arguments in the format st are accessed the results will be indeterminate. .It Zero or more of the following flags: -.Bl -hyphen -.It -A -.Cm # -character -specifying that the value should be converted to an +.Bl -tag -width ".So \ Sc (space)" +.It Sq Cm # +The value should be converted to an .Dq alternate form . For .Cm c , d , i , n , p , s , @@ -229,7 +234,7 @@ for .Cm X conversions) prepended to it. For -.Cm e , E , f , g , +.Cm e , E , f , F , g , and .Cm G conversions, the result will always contain a decimal point, even if no @@ -241,11 +246,8 @@ and .Cm G conversions, trailing zeros are not removed from the result as they would otherwise be. -.It -A -.Cm 0 -(zero) -character specifying zero padding. +.It So Cm 0 Sc (zero) +Zero padding. For all conversions except .Cm n , the converted value is padded on the left with zeros rather than blanks. @@ -256,10 +258,9 @@ and the .Cm 0 flag is ignored. -.It -A negative field width flag -.Cm \- -indicates the converted value is to be left adjusted on the field boundary. +.It Sq Cm \- +A negative field width flag; +the converted value is to be left adjusted on the field boundary. Except for .Cm n conversions, the converted value is padded on the right with blanks, @@ -269,16 +270,14 @@ A overrides a .Cm 0 if both are given. 
-.It -A space, specifying that a blank should be left before a positive number +.It So "\ " Sc (space) +A blank should be left before a positive number produced by a signed conversion -.Cm ( d , e , E , f , g , G , +.Cm ( d , e , E , f , F , g , G , or .Cm i ) . -.It -A -.Cm + -character specifying that a sign always be placed before a +.It Sq Cm + +A sign must always be placed before a number produced by a signed conversion. A .Cm + @@ -301,9 +300,9 @@ This gives the minimum number of digits and .Cm X conversions, the number of digits to appear after the decimal-point for -.Cm e , E , +.Cm e , E , f , and -.Cm f +.Cm F conversions, the maximum number of significant digits for .Cm g and @@ -313,79 +312,70 @@ string for .Cm s conversions. .It -The optional character -.Cm h , -specifying that a following -.Cm d , i , o , u , x , -or -.Cm X -conversion corresponds to a -.Vt short int -or -.Vt unsigned short int -argument, or that a following -.Cm n -conversion corresponds to a pointer to a -.Vt short int -argument. -.It -The optional character -.Cm l -(ell) specifying that a following -.Cm d , i , o , u , x , -or -.Cm X -conversion applies to a pointer to a -.Vt long int -or -.Vt unsigned long int -argument, or that a following -.Cm n -conversion corresponds to a pointer to a -.Vt long int -argument. -.It -The optional characters -.Cm ll -(ell ell) specifying that a following -.Cm d , i , o , u , x , +An optional length modifier, that specifies the size of the argument. +The following length modifiers are valid for the +.Cm d , i , n , o , u , x , or .Cm X -conversion applies to a pointer to a -.Vt long long int -or -.Vt unsigned long long int -argument, or that a following -.Cm n -conversion corresponds to a pointer to a -.Vt long long int -argument. 
-.It -The optional character -.Cm q , -specifying that a following -.Cm d , i , o , u , x , +conversion: +.Bl -column ".Cm q Em (deprecated)" ".Vt signed char" ".Vt unsigned long long" ".Vt long long *" +.It Sy Modifier Ta Cm d , i Ta Cm o , u , x , X Ta Cm n +.It Cm hh Ta Vt "signed char" Ta Vt "unsigned char" Ta Vt "signed char *" +.It Cm h Ta Vt short Ta Vt "unsigned short" Ta Vt "short *" +.It Cm l No (ell) Ta Vt long Ta Vt "unsigned long" Ta Vt "long *" +.It Cm ll No (ell ell) Ta Vt "long long" Ta Vt "unsigned long long" Ta Vt "long long *" +.It Cm j Ta Vt intmax_t Ta Vt uintmax_t Ta Vt "intmax_t *" +.It Cm t Ta Vt ptrdiff_t Ta (see note) Ta Vt "ptrdiff_t *" +.It Cm z Ta (see note) Ta Vt size_t Ta (see note) +.It Cm q Em (deprecated) Ta Vt quad_t Ta Vt u_quad_t Ta Vt "quad_t *" +.El +.Pp +Note: +the +.Cm t +modifier, when applied to a +.Cm o , u , x , or .Cm X -conversion corresponds to a -.Vt quad int -or -.Vt unsigned quad int -argument, or that a following +conversion, indicates that the argument is of an unsigned type +equivalent in size to a +.Vt ptrdiff_t . +The +.Cm z +modifier, when applied to a +.Cm d +or +.Cm i +conversion, indicates that the argument is of a signed type equivalent in +size to a +.Vt size_t . +Similarly, when applied to an .Cm n -conversion corresponds to a pointer to a -.Vt quad int -argument. -.It -The character -.Cm L -specifying that a following -.Cm e , E , f , g , +conversion, it indicates that the argument is a pointer to a signed type +equivalent in size to a +.Vt size_t . +.Pp +The following length modifier is valid for the +.Cm e , E , f , F , g , or .Cm G -conversion corresponds to a -.Vt long double -argument. 
+conversion: +.Bl -column ".Sy Modifier" ".Cm a , A , e , E , f , F , g , G" +.It Sy Modifier Ta Cm e , E , f , F , g , G +.It Cm l No (ell) Ta Vt double +(ignored, same behavior as without it) +.It Cm L Ta Vt "long double" +.El +.Pp +The following length modifier is valid for the +.Cm c +or +.Cm s +conversion: +.Bl -column ".Sy Modifier" ".Vt wint_t" ".Vt wchar_t *" +.It Sy Modifier Ta Cm c Ta Cm s +.It Cm l No (ell) Ta Vt wint_t Ta Vt "wchar_t *" +.El .It A character that specifies the type of conversion to be applied. .El @@ -403,11 +393,12 @@ argument supplies the field width or pre A negative field width is treated as a left adjustment flag followed by a positive field width; a negative precision is treated as though it were missing. -If a single format directive mixes positional (nn$) +If a single format directive mixes positional +.Pq Li nn$ and non-positional arguments, the results are undefined. .Pp The conversion specifiers and their meanings are: -.Bl -tag -width "diouxX" +.Bl -tag -width ".Cm diouxX" .It Cm diouxX The .Vt int @@ -425,11 +416,11 @@ and .Cm X ) notation. The letters -.Cm abcdef +.Dq Li abcdef are used for .Cm x conversions; the letters -.Cm ABCDEF +.Dq Li ABCDEF are used for .Cm X conversions. @@ -438,7 +429,7 @@ appear; if the converted value requires the left with zeros. .It Cm DOU The -.Vt long int +.Vt "long int" argument is converted to signed decimal, unsigned octal, or unsigned decimal, as if the format had been .Cm ld , lo , @@ -450,7 +441,9 @@ These conversion characters are deprecat The .Vt double argument is rounded and converted in the style -.Oo \- Oc Ns d Ns Cm \&. Ns ddd Ns Cm e Ns \\*[Pm]dd +.Sm off +.Oo \- Oc Ar d Li \&. 
Ar ddd Li e \\*[Pm] Ar dd +.Sm on where there is one digit before the decimal-point character and the number of digits after it is equal to the precision; @@ -460,17 +453,38 @@ zero, no decimal-point character appears An .Cm E conversion uses the letter -.Cm E +.Ql E (rather than -.Cm e ) +.Ql e ) to introduce the exponent. The exponent always contains at least two digits; if the value is zero, the exponent is 00. -.It Cm f +.Pp +For +.Cm e , E , f , F , g , +and +.Cm G +conversions, positive and negative infinity are represented as +.Li inf +and +.Li -inf +respectively when using the lowercase conversion character, and +.Li INF +and +.Li -INF +respectively when using the uppercase conversion character. +Similarly, NaN is represented as +.Li nan +when using the lowercase conversion, and +.Li NAN +when using the uppercase conversion. +.It Cm fF The .Vt double argument is rounded and converted to decimal notation in the style -.Oo \- Oc Ns ddd Ns Cm \&. Ns ddd , +.Sm off +.Oo \- Oc Ar ddd Li \&. Ar ddd , +.Sm on where the number of digits after the decimal-point character is equal to the precision specification. If the precision is missing, it is taken as 6; if the precision is @@ -484,6 +498,8 @@ argument is converted in style or .Cm e (or +.Cm F +or .Cm E for .Cm G @@ -493,19 +509,42 @@ If the precision is missing, 6 digits ar it is treated as 1. Style .Cm e -is used if the exponent from its conversion is less than -4 or greater than +is used if the exponent from its conversion is less than \-4 or greater than or equal to the precision. Trailing zeros are removed from the fractional part of the result; a decimal point appears only if it is followed by at least one digit. +.It Cm C +Treated as +.Cm c +with the +.Cm l +(ell) modifier. .It Cm c The .Vt int argument is converted to an -.Vt unsigned char , +.Vt "unsigned char" , and the resulting character is written. 
+.Pp +If the +.Cm l +(ell) modifier is used, the +.Vt wint_t +argument shall be converted to a +.Vt wchar_t , +and the (potentially multi-byte) sequence representing the +single wide character is written, including any shift sequences. +If a shift sequence is used, the shift state is also restored +to the original state after the character. +.It Cm S +Treated as +.Cm s +with the +.Cm l +(ell) modifier. .It Cm s The -.Vt char * +.Vt "char *" argument is expected to be a pointer to an array of character type (pointer to a string). Characters from the array are written up to (but not including) @@ -519,9 +558,34 @@ need be present; if the precision is not the size of the array, the array must contain a terminating .Dv NUL character. +.Pp +If the +.Cm l +(ell) modifier is used, the +.Vt "wchar_t *" +argument is expected to be a pointer to an array of wide characters +(pointer to a wide string). +For each wide character in the string, the (potentially multi-byte) +sequence representing the +wide character is written, including any shift sequences. +If any shift sequence is used, the shift state is also restored +to the original state after the string. +Wide characters from the array are written up to (but not including) +a terminating wide +.Dv NUL +character; +if a precision is specified, no more than the number of bytes specified are +written (including shift sequences). +Partial characters are never written. +If a precision is given, no null character +need be present; if the precision is not specified, or is greater than +the number of bytes required to render the multibyte representation of +the string, the array must contain a terminating wide +.Dv NUL +character. .It Cm p The -.Vt void * +.Vt "void *" pointer argument is printed in hexadecimal (as if by .Ql %#x or @@ -529,7 +593,7 @@ or .It Cm n The number of characters written so far is stored into the integer indicated by the -.Vt int * +.Vt "int *" (or variant) pointer argument. No argument is converted. 
.It Cm % @@ -542,8 +606,13 @@ is .Ql %% . .El .Pp +The decimal point +character is defined in the program's locale (category +.Dv LC_NUMERIC ) . +.Pp In no case does a non-existent or small field width cause truncation of -a field; if the result of a conversion is wider than the field width, the +a numeric field; if the result of a conversion is wider than the field +width, the field is expanded to contain the conversion result. .Sh EXAMPLES To print a date and time in the form @@ -574,16 +643,87 @@ To allocate a 128 byte string and print #include <stdarg.h> char *newfmt(const char *fmt, ...) { - char *p; - va_list ap; - if ((p = malloc(128)) == NULL) - return (NULL); - va_start(ap, fmt); - (void) vsnprintf(p, 128, fmt, ap); - va_end(ap); - return (p); + char *p; + va_list ap; + if ((p = malloc(128)) == NULL) + return (NULL); + va_start(ap, fmt); + (void) vsnprintf(p, 128, fmt, ap); + va_end(ap); + return (p); } .Ed +.Sh SECURITY CONSIDERATIONS +The +.Fn sprintf +and +.Fn vsprintf +functions are easily misused in a manner which enables malicious users +to arbitrarily change a running program's functionality through +a buffer overflow attack. +Because +.Fn sprintf +and +.Fn vsprintf +assume an infinitely long string, +callers must be careful not to overflow the actual space; +this is often hard to assure. +For safety, programmers should use the +.Fn snprintf +interface instead. +For example: +.Bd -literal +void +foo(const char *arbitrary_string, const char *and_another) +{ + char onstack[8]; + +#ifdef BAD + /* + * This first sprintf is bad behavior. Do not use sprintf! + */ + sprintf(onstack, "%s, %s", arbitrary_string, and_another); +#else + /* + * The following two lines demonstrate better use of + * snprintf(). 
+ */ + snprintf(onstack, sizeof(onstack), "%s, %s", arbitrary_string, + and_another); +#endif +} +.Ed +.Pp +The +.Fn printf +and +.Fn sprintf +family of functions are also easily misused in a manner +allowing malicious users to arbitrarily change a running program's +functionality by either causing the program +to print potentially sensitive data +.Dq "left on the stack" , +or causing it to generate a memory fault or bus error +by dereferencing an invalid pointer. +.Pp +.Cm %n +can be used to write arbitrary data to potentially carefully-selected +addresses. +Programmers are therefore strongly advised to never pass untrusted strings +as the +.Fa format +argument, as an attacker can put format specifiers in the string +to mangle your stack, +leading to a possible security hole. +This holds true even if the string was built using a function like +.Fn snprintf , +as the resulting string may still contain user-supplied conversion specifiers +for later interpolation by +.Fn printf . +.Pp +Always use the proper secure idiom: +.Pp +.Dl "snprintf(buffer, sizeof(buffer), \*q%s\*q, string);" .Sh ERRORS In addition to the errors documented for the .Xr write 2 @@ -591,14 +731,21 @@ system call, the .Fn printf family of functions may fail if: .Bl -tag -width Er +.It Bq Er EILSEQ +An invalid wide character code was encountered. .It Bq Er ENOMEM Insufficient storage space is available. .El .Sh SEE ALSO .Xr printf 1 , -.Xr scanf 3 +.Xr fmtcheck 3 , +.Xr scanf 3 , +.Xr setlocale 3 , +.Xr wprintf 3 .Sh STANDARDS -The +Subject to the caveats noted in the +.Sx BUGS +section below, the .Fn fprintf , .Fn printf , .Fn sprintf , @@ -608,7 +755,15 @@ and .Fn vsprintf functions conform to -.St -isoC . +.St -ansiC +and +.St -isoC-99 . +With the same reservation, the +.Fn snprintf +and +.Fn vsnprintf +functions conform to +.St -isoC-99 . .Sh HISTORY The functions .Fn asprintf @@ -650,14 +805,8 @@ nonsensical combinations such as are not standard; such combinations should be avoided. 
.Pp -Because -.Fn sprintf -and -.Fn vsprintf -assume an infinitely long string, -callers must be careful not to overflow the actual space; -this is often hard to assure. -For safety, programmers should use the -.Fn snprintf -interface instead. -Unfortunately, this interface is not portable. +The +.Nm +family of functions do not correctly handle multibyte characters in the +.Fa format +argument.