tech-kern archive


Re: UEFI: caveats about not utf-8 dir entries



On Thu, Jan 12, 2023 at 09:20:34AM +0100, tlaronde%polynum.com@localhost wrote:
> I don't know if this is for tech-kern or tech-userlevel (perhaps the
> two).
> 
> I just read today, on the UEFI edk2 devel list, in patches for
> ext4, a comment on the problem of the encoding of dir entries.
> 
> The problem is that, generally in fs, no encoding is specified: dir
> entries are just a sequence of bytes, whether nul byte terminated or
> with the length of the entry given (the latter for ext4).
> 
> UEFI (edk2) deals, internally, with UCS-2 strings.
> 
> With ext4 (and I expect this is the same for other fs drivers),
> conversion is attempted from utf-8. Here, if the "from utf-8" conversion
> errors (not utf-8), the dir entry is skipped, meaning that not everything
> on a readable fs can be reached by the UEFI code.
> 
> This has to be kept in mind when populating a msdos partition for
> booting and for people wandering in a filesystem using the UEFI shell:
> even if the fs is readable, perhaps not everything will be accessible.

Not a problem: none of our boot code is likely to use anything
beyond ASCII-compatible code points, and for the foreseeable future
we'll be using the FAT-formatted ESP, where the long file name support
is supposed to be UCS-2 anyway (that is, not UTF-16).

If you need multi-astral-plane-codepoint Unicode emoji to boot an OS
you're doing something very wrong.
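
To make the distinction concrete, here is a minimal Python sketch of the
behaviour described above. It is not the edk2 code: the function names
to_ucs2 and visible_entries are hypothetical, and rejecting code points
above U+FFFF is an assumption that follows from UCS-2 having no surrogate
pairs, not something the quoted patch discussion states.

```python
def to_ucs2(raw: bytes):
    """Return the name as a list of UCS-2 code units, or None.

    Hypothetical helper, not the edk2 implementation.
    """
    try:
        # Strict decoding: any byte sequence that is not valid UTF-8 raises.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
    # Assumption: UCS-2 covers only the Basic Multilingual Plane, so a code
    # point above U+FFFF (which UTF-16 would encode as a surrogate pair)
    # is treated as unrepresentable.
    if any(ord(ch) > 0xFFFF for ch in text):
        return None
    return [ord(ch) for ch in text]


def visible_entries(raw_names):
    """Keep only the directory entries whose names convert; skip the rest."""
    return [name for name in raw_names if to_ucs2(name) is not None]


names = [
    b"boot.cfg",                   # plain ASCII
    b"caf\xc3\xa9",                # valid UTF-8, inside the BMP
    b"caf\xe9",                    # Latin-1 byte, not valid UTF-8
    "\U0001F600".encode("utf-8"),  # valid UTF-8, but an astral code point
]
print(visible_entries(names))
```

ASCII-only names, which is all the boot code is expected to use, always
pass both checks.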

