NetBSD-Bugs archive


kern/54595: SD-MMC initialization randomly fails to detect devices at startup

>Number:         54595
>Category:       kern
>Synopsis:       Lost interrupt causes SD-MMC device detection to randomly fail
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Oct 03 11:35:00 +0000 2019
>Originator:     Tom Ivar Helbekkmo
>Release:        NetBSD 9.99.15 as of 2019-10-02
System: NetBSD 9.99.15 NetBSD 9.99.15 (TARA) #66: Thu Oct  3 12:14:32 CEST 2019 evbarm
Architecture: aarch64
Machine: evbarm

With the current source code, specifically with
sys/arch/arm/sunxi/sunxi_mmc.c at revision 1.37, my Pinebook will more
often than not fail to attach its eMMC device correctly at boot,
stopping instead to ask for a root device.  (None is available.)

The problem turns out to be a lost interrupt: during initialization, an
MMC_SEND_EXT_CSD command is sent to the controller.  As this is a DMA
operation, it results in two interrupts: first a CMD_DONE one for the
command itself, and then an AUTO_CMD_DONE one when the DMA is
complete.

In the function sunxi_mmc_wait_rint(), in
sys/arch/arm/sunxi/sunxi_mmc.c, there are two ways to wait for an
interrupt to arrive: polled or non-polled.  Note that the polled
version will query the controller for non-acknowledged interrupts,
whereas the non-polled one expects the interrupt handler to do this.

During initialization, the only interrupts that arrive are the ones
explicitly caused by the commands the initialization routine sends.
So if the second of the two interrupts described above arrives while
we're handling the first, the handler doesn't get triggered for it,
and no later interrupt comes along to make the handler notice it.  It
can then only be detected by the polled path in
sunxi_mmc_wait_rint().
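
The race described above can be sketched with a small stand-alone
model.  This is not the sunxi_mmc.c code: the struct, the bit values
and every function name below are hypothetical, and the assumption
that the handler takes a single snapshot of the pending bits per
invocation is mine, made only to show how a second interrupt could go
unnoticed by the non-polled wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this model. */
#define CMD_DONE      (1u << 0)
#define AUTO_CMD_DONE (1u << 1)

struct model {
	uint32_t hw_rint;    /* raw pending bits in the "controller" */
	uint32_t seen_rint;  /* bits collected by the interrupt handler */
	bool     in_handler; /* true while the handler is running */
};

/* The controller raises a bit; returns true if the handler is triggered. */
bool hw_raise(struct model *m, uint32_t bit)
{
	m->hw_rint |= bit;
	return !m->in_handler;   /* no new trigger while already handling */
}

/* Handler entry: snapshot the pending bits once, ack and record them. */
void handler_begin(struct model *m)
{
	m->in_handler = true;
	uint32_t snap = m->hw_rint;
	m->hw_rint &= ~snap;
	m->seen_rint |= snap;
}

void handler_end(struct model *m)
{
	m->in_handler = false;
}

/* Non-polled wait: trusts only what the handler has collected. */
bool wait_nonpolled(struct model *m, uint32_t mask)
{
	return (m->seen_rint & mask) == mask;
}

/* Polled wait: first queries the controller for unacknowledged bits. */
bool wait_polled(struct model *m, uint32_t mask)
{
	m->seen_rint |= m->hw_rint;
	m->hw_rint = 0;
	return (m->seen_rint & mask) == mask;
}

/* Replay the failing sequence: the second interrupt lands mid-handler. */
struct model run_race(void)
{
	struct model m = {0};

	if (hw_raise(&m, CMD_DONE))
		handler_begin(&m);       /* snapshot holds CMD_DONE only */
	if (hw_raise(&m, AUTO_CMD_DONE)) /* arrives while handling */
		handler_begin(&m);       /* not reached in this sequence */
	handler_end(&m);

	assert(m.seen_rint & CMD_DONE);
	return m;
}
```

In this model, run_race() leaves AUTO_CMD_DONE pending in hw_rint:
wait_nonpolled() never reports it, whereas wait_polled() picks it up.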

To reproduce, boot a current kernel from eMMC and observe that it
often stops, asking for a root device.

Observing that the MMC_SEND_EXT_CSD is only sent during
initialization, in the function sdmmc_mem_mmc_init(), in
sys/dev/sdmmc/sdmmc_mem.c, I decided that a simple workaround to check
that my reasoning was correct would be to enable polled waiting for
that particular command.  The command was already being specially
handled in sdmmc_mem_send_cxd_data(), in sys/dev/sdmmc/sdmmc_mem.c, so
the least intrusive way to ensure its result is polled for is this:

Index: sys/dev/sdmmc/sdmmc_mem.c
RCS file: /cvsroot/src/sys/dev/sdmmc/sdmmc_mem.c,v
retrieving revision 1.68
diff -u -p -r1.68 sdmmc_mem.c
--- sys/dev/sdmmc/sdmmc_mem.c	6 Jun 2019 20:50:46 -0000	1.68
+++ sys/dev/sdmmc/sdmmc_mem.c	3 Oct 2019 10:33:48 -0000
@@ -1484,10 +1484,12 @@ sdmmc_mem_send_cxd_data(struct sdmmc_sof
 	cmd.c_opcode = opcode;
 	cmd.c_arg = 0;
 	cmd.c_flags = SCF_CMD_ADTC | SCF_CMD_READ | SCF_RSP_SPI_R1;
-	if (opcode == MMC_SEND_EXT_CSD)
+	if (opcode == MMC_SEND_EXT_CSD) {
+		SET(cmd.c_flags, SCF_POLL);
 		SET(cmd.c_flags, SCF_RSP_R1);
-	else
+	} else {
 		SET(cmd.c_flags, SCF_RSP_R2);
+	}
 	if (ISSET(sc->sc_caps, SMC_CAPS_DMA))
 		cmd.c_dmamap = sc->sc_dmap;

Of course, this also requests polled waiting for the result in
drivers other than the sunxi one, so it may be undesirable for that
reason.
Furthermore, it is quite possible that the observed problem is really
indicative of a design flaw in the general functioning of the driver,
and should be solved differently.
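
To make the scope of the workaround concrete, here is a minimal
stand-alone model of the flag selection.  The SET/ISSET macros have
the same shape as the NetBSD ones, but the MODEL_* flag values and
both function names are hypothetical, and the assumption that a host
driver chooses its wait path purely from the poll flag is a
simplification.  The opcode numbers are the standard MMC ones (CMD8
for SEND_EXT_CSD, CMD9 for SEND_CSD).

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the NetBSD SET/ISSET helpers. */
#define SET(t, f)   ((t) |= (f))
#define ISSET(t, f) ((t) & (f))

/* Hypothetical flag values; the real ones live in the sdmmc headers. */
#define MODEL_SCF_POLL   0x0001u
#define MODEL_SCF_RSP_R1 0x0010u
#define MODEL_SCF_RSP_R2 0x0020u

#define MODEL_SEND_EXT_CSD 8 /* MMC CMD8 */
#define MODEL_SEND_CSD     9 /* MMC CMD9 */

/* Mirrors the patched branch: EXT_CSD reads also ask for polling. */
uint32_t cxd_flags(int opcode)
{
	uint32_t flags = 0;

	if (opcode == MODEL_SEND_EXT_CSD) {
		SET(flags, MODEL_SCF_POLL);
		SET(flags, MODEL_SCF_RSP_R1);
	} else {
		SET(flags, MODEL_SCF_RSP_R2);
	}
	return flags;
}

/* Any host driver that honours the poll flag takes its polled path. */
bool driver_uses_polled_wait(uint32_t flags)
{
	return ISSET(flags, MODEL_SCF_POLL) != 0;
}
```

Because the flag is set in the shared sdmmc layer, every host driver
that looks at it sees the request for this command, not only sunxi.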

For now, I'm running with the above, and it's working well for me.
