I got to hacking on OpenPXA today. This brought some noticable progress. For some time already, there was a bothersome issue where one had to prepare a NTIM and IPL binary for the PXA300/PXA310 and a special combined IPLNTIM for PXA320.
Knowing it's most probably no use trying to get NTIM working on PXA320, I used a horrible hack. Firstly, lets recapitulate the facts. The first important fact is, that the first four bytes of the NTIM contain minimal version of BootROM this image will run on. The second important fact is that the four bytes at offset 0xc contain a date when the NTIM was created and is ignored by the BootROM.
On PXA300 and PXA310 which have BootROM v3.xx, the version string is compared to the BootROM version and if it's lower, the NTIM header is used to further control the boot process. On PXA320 on the other side, the code at 0x0 is copied to SRAM and normally executed. So the trick with unified NTIM now seems simple. It has a problem though.
Firstly I thought placing a branch instruction at offset 0x0 which would just jump past the NTIM should do the trick. How mistaken I was. The branch instruction was encoded as 0xea000028, if converted to LE it is 0x280000ea, whereas the version string in LE is 0x00030102 for BootROM v3.12, so this can't possibly work. Though there is a solution.
Branch which jumps eight bytes forward is encoded as 0xea000000, in LE 0x000000ea, which is lower than the version string. With this trick, I used the date entry and inserted an instruction which branches right past the NTIM header. You probably noticed by now, that the branch jumps 8 bytes forward, but the date entry is at 0xc. This is no problem as the entry at 0x8 is the HMIT section offset, which is zero and zero means nop.
This change has one more advantage. The IPL is now stored in block 0 of NAND just past the NTIM. It was necessary to patch the linker command so it'd correctly adjust the location of built in variables. But the result now is, one more block is free and can contain U-Boot most likely and there's no need to explicitly choose which CPU will the NTIM run at. As of today, the CPU parameter is dropped and only file called iplntim.bin is generated, holding the contents of NAND block 0.
Besides this crucial change, I moved the stack setup into a self standing function which then calls the main function of the IPL. This allows the main function to have it's variables on the stack and correctly allocated.