NetBSD-Bugs archive


standards/60369: mbrtowc, mbrlen have wrong return value for some invalid byte sequences



>Number:         60369
>Category:       standards
>Synopsis:       mbrtowc, mbrlen have wrong return value for some invalid byte sequences
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    standards-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jun 25 18:20:00 +0000 2026
>Originator:     Bruno Haible
>Release:        10.0
>Organization:
GNU
>Environment:
NetBSD ... 10.0 NetBSD 10.0 (GENERIC) ...
>Description:
When encountering an invalid byte sequence, mbrtowc and mbrlen must return (size_t)(-1). The return value (size_t)(-2) is reserved for incomplete byte sequences, that is, for byte sequences to which one or more bytes need to be appended before it can be decided whether the augmented byte sequence is valid or invalid.
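
To make the distinction concrete, here is a minimal sketch of how a caller might interpret the return value. The helper name describe_mbrtowc_result is hypothetical and not part of any libc; it only encodes the return-value convention described above.

```c
#include <stddef.h>

/* Hypothetical helper (not a libc function): maps an mbrtowc/mbrlen
   return value to a short description of what the caller should do.  */
const char *
describe_mbrtowc_result (size_t ret)
{
  if (ret == (size_t)(-1))
    return "invalid";     /* encoding error: no amount of further input helps */
  if (ret == (size_t)(-2))
    return "incomplete";  /* all n bytes consumed, more bytes are needed */
  if (ret == 0)
    return "null";        /* the null wide character was decoded */
  return "character";     /* ret bytes formed one complete character */
}
```

A caller that trusts a (size_t)(-2) result will typically wait for more input, so reporting (size_t)(-2) for a sequence that can never become valid may delay or hide the error.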

In a UTF-8 locale, for some invalid byte sequences, NetBSD's mbrtowc and mbrlen return (size_t)(-2) when they should return (size_t)(-1).
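
As an illustration of which prefixes are invalid rather than merely incomplete, the following sketch classifies a byte prefix independently of the locale. It is not NetBSD's implementation; the names utf8_status and utf8_prefix_status are hypothetical, and the byte ranges are assumed to follow the Unicode table of well-formed UTF-8 byte sequences.

```c
#include <stddef.h>

/* Hypothetical classification of a byte prefix (not a libc API).  */
enum utf8_status { UTF8_COMPLETE, UTF8_INCOMPLETE, UTF8_INVALID };

/* Classifies the first character started by s[0..n-1].  */
enum utf8_status
utf8_prefix_status (const unsigned char *s, size_t n)
{
  if (n == 0)
    return UTF8_INCOMPLETE;

  unsigned char c = s[0];
  size_t len;                          /* length implied by the lead byte */
  unsigned char lo = 0x80, hi = 0xBF;  /* allowed range of the second byte */

  if (c < 0x80)
    return UTF8_COMPLETE;              /* ASCII */
  else if (c >= 0xC2 && c <= 0xDF)
    len = 2;
  else if (c >= 0xE0 && c <= 0xEF)
    {
      len = 3;
      if (c == 0xE0) lo = 0xA0;        /* excludes overlong encodings */
      if (c == 0xED) hi = 0x9F;        /* excludes surrogates */
    }
  else if (c >= 0xF0 && c <= 0xF4)
    {
      len = 4;
      if (c == 0xF0) lo = 0x90;        /* excludes overlong encodings */
      if (c == 0xF4) hi = 0x8F;        /* excludes values above U+10FFFF */
    }
  else
    return UTF8_INVALID;               /* stray continuation or bad lead byte */

  for (size_t i = 1; i < len; i++)
    {
      if (i >= n)
        return UTF8_INCOMPLETE;        /* ran out of input: more bytes needed */
      if (i == 1 ? (s[i] < lo || s[i] > hi) : (s[i] < 0x80 || s[i] > 0xBF))
        return UTF8_INVALID;           /* cannot be repaired by appending bytes */
    }
  return UTF8_COMPLETE;
}
```

Under these assumptions, a lead byte 0xE0 or 0xF0 followed by 'x' (0x78) is classified as invalid, because 0x78 is not a continuation byte and no appended bytes can change that; a lone 0xE0 is classified as incomplete.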
>How-To-Repeat:
Save this program as foo.c.
==============================================================
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

int
main (void)
{
  if (setlocale (LC_ALL, "en_US.UTF-8") == NULL)
    return 1;

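  // In each call below, the lead byte (0xE0 or 0xF0) is followed by 'x'
  // (0x78), which is not a continuation byte, so the sequence is invalid.
  // Expected: -1 -1 -1 -1
  // NetBSD:   -2 -2 -2 -2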
  {
    mbstate_t state;
    memset (&state, 0, sizeof (mbstate_t));
    wchar_t wc = 0xDEADBEEF;
    size_t ret = mbrtowc (&wc, "\xE0x\xE0", 2, &state);
    printf ("ret = %d\n", (int)ret);
  }
  {
    mbstate_t state;
    memset (&state, 0, sizeof (mbstate_t));
    wchar_t wc = 0xDEADBEEF;
    size_t ret = mbrtowc (&wc, "\xF0x\xF0", 3, &state);
    printf ("ret = %d\n", (int)ret);
  }

  {
    mbstate_t state;
    memset (&state, 0, sizeof (mbstate_t));
    size_t ret = mbrlen ("\xE0x\xE0", 2, &state);
    printf ("ret = %d\n", (int)ret);
  }
  {
    mbstate_t state;
    memset (&state, 0, sizeof (mbstate_t));
    size_t ret = mbrlen ("\xF0x\xF0", 3, &state);
    printf ("ret = %d\n", (int)ret);
  }
  return 0;
}
==============================================================
$ cc foo.c
$ ./a.out

Expected output:
ret = -1
ret = -1
ret = -1
ret = -1

Actual output:
ret = -2
ret = -2
ret = -2
ret = -2

>Fix:



