External RAM and lost MCUCSR

The ATmega's MCUCSR status register reports the reason for the last restart: the reset pin pulled low, a watchdog-initiated reset, or a brown-out caused by low supply voltage. That is valuable information for tracking device defects. Reading MCUCSR early in a test program worked fine, but in the real system with a bootloader and external RAM I always got an incorrect value. It took me a couple of hours to pin down the cause of that single incorrect byte.
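As a rough illustration of how such a value can be turned into a restart reason, here is a minimal sketch. The bit positions assume an ATmega128-style MCUCSR layout and should be checked against the datasheet of the actual part; the RC_* macros, the reset_cause_t enum and the reset_cause() helper are hypothetical names made up for this example.

```c
#include <stdint.h>

/* Assumed bit positions (ATmega128-style MCUCSR); verify in the datasheet. */
#define RC_PORF   (1u << 0)  /* power-on reset */
#define RC_EXTRF  (1u << 1)  /* external reset (reset pin) */
#define RC_BORF   (1u << 2)  /* brown-out reset */
#define RC_WDRF   (1u << 3)  /* watchdog reset */
#define RC_JTRF   (1u << 4)  /* JTAG reset */

typedef enum {
	RESET_UNKNOWN,
	RESET_POWER_ON,
	RESET_EXTERNAL,
	RESET_BROWN_OUT,
	RESET_WATCHDOG,
	RESET_JTAG
} reset_cause_t;

/* Several flags can be set at once if software never clears them.
 * This sketch reports only the first match; the order is an arbitrary choice. */
reset_cause_t reset_cause(uint8_t mcucsr) {
	if (mcucsr & RC_JTRF)  return RESET_JTAG;
	if (mcucsr & RC_WDRF)  return RESET_WATCHDOG;
	if (mcucsr & RC_BORF)  return RESET_BROWN_OUT;
	if (mcucsr & RC_EXTRF) return RESET_EXTERNAL;
	if (mcucsr & RC_PORF)  return RESET_POWER_ON;
	return RESET_UNKNOWN;
}
```

Under these assumptions a value of 0x08 would decode as a watchdog reset and 0x02 as a reset-pin restart.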

At first I checked that the bootloader does not mess with MCUCSR; my simple test application ran fine. I wanted to make a copy of MCUCSR very early in the start sequence, before reaching main(), so I added a lastMCUCSR variable to my application code:

uint8_t lastMCUCSR __attribute__ ((section (".noinit")));
 
__attribute__ ((section (".init3"), naked, used))
void reboot_init() {
	lastMCUCSR = MCUCSR;
	...
}

Since the C runtime (CRT) clears globals at a later stage (after .init3), lastMCUCSR was declared so that it is not cleared. I also had the external memory initialization routine in the .init3 stage. Both were working fine.

__attribute__ ((section (".init3"), naked, used))
void xmem_init(void) {
	// External memory interface enable
	MCUCR |= (1 << SRE);
	XMCRA = 0;
	// PC7 released, 32kB SRAM addressing
	XMCRB |= (1 << XMM0);
}

The test application, which used both .init3 routines and was started by my bootloader, worked fine, while my full-blown application read 0xF0 from the status register again and again. The lower nibble was cleared even after forcing the watchdog to reset the MCU or after grounding the reset pin. At the same time the 4 upper bits were set, as if JTAG were engaged. But JTAG was disabled in the fuses.

I poked around in the LSS and MAP files to make sure initialization was going right, and after a while I still had no clue. Then one more restart gave me 0xB0 instead of 0xF0. It clicked: my variable surely held garbage. I instantly recalled my earlier exercise with external RAM initialization and relocation of memory segments. I realized that my test application worked with and without external RAM simply because the lastMCUCSR variable was allocated in internal MCU memory. My real application has a huge number of static/global buffers brought in by different libraries, so lastMCUCSR was placed at a higher address, effectively landing in the external RAM address space. Looking again at the MAP file, I understood the problem:

 *(.init3)
 .init3         0x00002d96       0x14 ./utils/reboot.o
                0x00002d96                reboot_init
 .init3         0x00002daa       0x14 ./xmem.o
                0x00002daa                xmem_init
 *(.init3)
 *(.init4)
 .init4         0x00002dbe       0x1a c:/developer/winavr-20100110/bin/...
                0x00002dbe                __do_copy_data
 .init4         0x00002dd8       0x10 c:/developer/winavr-20100110/bin/...
                0x00002dd8                __do_clear_bss

The order of functions within the same init section is linker-specific and cannot be orchestrated (and judging by the .init4 section it is not alphabetical either). In my case, reboot_init was writing to lastMCUCSR before xmem_init had made its address valid. Reading the variable back then returned an uninitialized byte of an external RAM cell, random but mostly 0xF0. Moving xmem_init to the .init2 section gave me the correct program behavior.
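The reordering can be sketched roughly as follows, assuming avr-gcc with avr-libc. The INIT_STAGE and NOINIT macros and the host-side fallback (plain variables standing in for the registers, with ATmega128-style bit positions) are illustrative additions so the snippet also compiles off-target; they are not part of the original project.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
/* Startup-code fragments: placed in an .initN section, never called. */
#define INIT_STAGE(sec) __attribute__((section(sec), naked, used))
#define NOINIT          __attribute__((section(".noinit")))
#else
/* Host build only (assumption for illustration): plain variables stand
 * in for the registers; bit positions follow an ATmega128-style layout. */
static uint8_t MCUCR, MCUCSR, XMCRA, XMCRB;
#define SRE  7
#define XMM0 0
#define INIT_STAGE(sec)
#define NOINIT
#endif

/* Kept out of .bss so the startup code does not clear it. */
uint8_t lastMCUCSR NOINIT;

/* .init2 is linked before .init3, so external RAM is enabled first. */
INIT_STAGE(".init2")
void xmem_init(void) {
	MCUCR |= (1 << SRE);    /* enable the external memory interface */
	XMCRA = 0;
	XMCRB |= (1 << XMM0);   /* PC7 released, 32kB SRAM addressing */
}

/* The write now lands correctly even if lastMCUCSR is linked into external RAM. */
INIT_STAGE(".init3")
void reboot_init(void) {
	lastMCUCSR = MCUCSR;
}
```

The point of the design is that the ordering between different .initN sections is fixed by the linker script, whereas the ordering inside one section is not something the application can rely on.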

This way I earned a double face-palm 🙂
