Sunday, October 31, 2010

Erratum ENGcm09395 and OpenOCD

Apparently, Erratum ENGcm09395 for the iMX515 explains the OpenOCD problems: it states that the location of the ROM Table is misreported. Still, I wanted to implement ROM Table location autodetection, as Oyvind Harboe suggested.

Therefore I added a function for handling quirks like this and did some code shuffling in OpenOCD. The patches for this are already on the OpenOCD mailing list. I'm still a bit uncertain about them, but I hope that after one or two iterations, they will land in mainline OpenOCD.

Saturday, October 30, 2010

Studies, U-Boot, OpenOCD and CortexA8

The lack of additions on this blog recently wasn't caused by me disappearing out of the blue. It was caused by me preparing for my degree's exams, then fighting with U-Boot and recently working on new hardware. Read on.

Studies
In case you're only interested in technical stuff, you can skip this part, which is mostly about gaining my bachelor's degree at the Faculty of Mathematics and Physics of Charles University and the OpenBSD thesis.

Due to circumstances that were personal rather than anything else, I ended up finishing my first academic degree at the end of the summer rather than at the beginning of it. Well, whatever.

That means, the OpenBSD hacking days are past me since I finished my bachelor's thesis. I believe I wrote about this in some of the previous blogs, but now it's officially past me and sealed away. In the end, the thesis was graded A by both the opponent (Mgr. Martin Decky) and the supervisor (RNDr. Leo Galambos, Ph.D.) so the final grade from the thesis was A as well.

Note, the grading system in our country is different than in most other countries. Here, it's numeric, but "1" is equal to A, "2" to B, "3" to C and "4" to Failed. I'll hereby stick with the system using letters to make it less confusing.

Then I had to prepare for the exams, which were really thorough so the preparation certainly couldn't be underestimated. That took me about a month where I didn't hack on stuff much, I was just reviewing patches and from time to time I sent some minor fixes. At the exam, I finished with a total grade of B.

Apparently, on the diploma, there is only one grade, not two (one from thesis and one from examination), so the examination committee has to ultimately decide this grade. My final grade was A, which is what will be on my bachelor's diploma, which I'll receive later next month.

U-Boot
Once the exam was behind me, I got back to maintaining U-Boot PXA. Apparently, there were a lot of changes going on inside U-Boot.

Firstly, Wolfgang Denk pushed me to rework pxa-regs.h so the standard accessors can be used to access PXA CPU registers instead of the preprocessor goo that was there before. This was a patch over 6000 lines big as it affected a lot of code in U-Boot, not only pxa-regs.h. Obviously, platform code as well as drivers were affected and had to be patched to use {read[blw],write[bwl]}() accessors. Catching all the damaged code wasn't really easy in some cases, but I think I caught them all.

With that out of the way, I checked the pxa-lcd driver, fixed various, mostly coding-style related problems and added a few new LCD definitions. Then, I fixed a PXA250 UART problem, where the UART was omitted from the initialization routine in U-Boot on PXA25x and PXA26x.

Also, the Palm LifeDrive, Palm Tungsten|C and Balloon3 support went in, Voipac PXA270 got various bugs fixed and there were also fixes for the memory init code by Mikhail Kshevetskiy. I also got a USB 3.0 flash disk, but apparently, it didn't work with USB on PXA in U-Boot, so I submitted a patch for this.

Then though, Heiko Schocher introduced a change in U-Boot which allows it to relocate in RAM. In more detail, this change allows U-Boot to always sit at the end of RAM, which is useful on systems where there are various models with different sizes of RAM detected at runtime. This change broke the PXA port of U-Boot altogether.

This was a good enough motivation to completely rework the PXA init code. I rewrote half of the start.S assembler bootstrap from scratch, because the code there was really old, messy and, with the relocation patches in, no longer suitable for the purpose. The start.S can no longer use RAM until a certain point, but it needs to use a stack. Sure, for the stack, SRAM can be used on PXA27x, but there are PXA CPUs older than that which have no SRAM. This presented me with a challenge -- where should I point the stack pointer register if there's no RAM?

The solution is fairly easy, but you need to understand how the caches on the ARMv5/XScale core work. What was the trick? The caches allow the user to lock entries in them. That is, they allow the user to write into the cached area while convincing the CPU not to actually write the data out to that area, but only keep it in the cache. But to use the caches, the MMU has to be enabled. So the big picture is: I manually generated a 1:1 VA:PA mapping MMU table within the U-Boot binary and, at bootstrap, I point the CPU at this table and enable the MMU. Within this table, at the place which indexes RAM start + 1 MB, I made a special entry that configures the area as cacheable. Then I used the information from the XScale core documentation to lock 4 KB of data to create a tiny, 4 KB in-cache RAM. So writing to that area of RAM behaves as if the RAM was there and was working, even though there's no backing store. The rest was just a matter of pointing the stack pointer there, end of the fairy tale.
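To make the table part of the trick more concrete, here is a minimal sketch of how a 1:1 first-level table with a single cacheable 1 MB section could be generated. This is not the code from my U-Boot tree: the function name, the macro names and the choice of access permissions are made up for illustration, and the bit layout is the generic ARMv5 section descriptor. On real hardware the table also has to be 16 KB aligned, and the cache line locking itself is done with XScale-specific coprocessor operations that are not shown here.

```c
#include <stdint.h>

/* Generic ARMv5 first-level "section" descriptor bits (1 MB per entry). */
#define SECTION_TYPE   0x2u        /* bits [1:0] = 0b10: section entry  */
#define SECTION_B      (1u << 2)   /* bufferable                        */
#define SECTION_C      (1u << 3)   /* cacheable                         */
#define SECTION_BIT4   (1u << 4)   /* should be written as 1 on ARMv5   */
#define SECTION_AP_RW  (3u << 10)  /* read/write access (illustrative)  */
#define NUM_SECTIONS   4096u       /* 4096 x 1 MB covers the 4 GB space */

/*
 * Fill a 1:1 VA:PA table. Every section maps onto itself and is
 * uncached, except the one containing cached_base, which is marked
 * cacheable and bufferable so that cache lines can be locked in it.
 */
static void build_identity_table(uint32_t *table, uint32_t cached_base)
{
	uint32_t i;

	for (i = 0; i < NUM_SECTIONS; i++) {
		uint32_t desc = (i << 20) | SECTION_AP_RW |
				SECTION_BIT4 | SECTION_TYPE;

		if (i == (cached_base >> 20))
			desc |= SECTION_C | SECTION_B;
		table[i] = desc;
	}
}
```

Assuming the SDRAM starts at 0xa0000000, calling build_identity_table(table, 0xa0100000) marks only the section at RAM start + 1 MB as cacheable.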

This wasn't the only thing about the relocation patches though. Because there was no need to init the RAM in assembler code anymore, I could push the RAM init into the C routines, which are executed much later. I did this and in the current state, most of the hardware initialization takes place much later. This has one very big advantage in the long term. As the hardware is now initialized in C code, it'll be easily possible to walk a Flattened Device Tree and do the hardware initialization according to that. So imagine having a single U-Boot binary work on many CPUs and only supplying an FDT for each board. That's the long-term plan now.

There were other changes related to this relocation stuff, which in the end generated about 30 patches in total. One more thing is interesting though. Because there are boards which have multiple variants and it's not possible to detect which variant is which on the fly, Makefile pollution started to appear in U-Boot, which Wolfgang Denk didn't like. We discussed this and agreed on a solution. I implemented the first version, then he improved it and committed it. So now, all of the U-Boot board variant setup should finally happen in boards.cfg and there's no longer any need to touch the Makefile.

OpenOCD and CortexA8
About a month ago, I received a Genesi USA EfikaMX Smarttop from their developer sponsorship program called Power Developer. My task was to support this device with OpenOCD so common people can JTAG it with those cheap FT2232-based JTAG dongles.

The system contains a Freescale iMX515 CPU, for which apparently there were some previous unsuccessful attempts to make it work. I based my efforts on these, but investigated the matter further. "Sadly", all the support for Cortex A8 was already in place, which spoiled my fun ;-)

And in the end, it turned out the solution of the problem was very simple. The people trying to get the iMX515 working with OpenOCD forgot that the Cortex A8 support there was OMAP3-centric. The DAP address was hardcoded for OMAP3, which has it at address 0x54011000, but for iMX515, the DAP address is 0x60008000. Adjusting this in the source code made the CPU mostly work.

The problem with this, though, is that this stuff should be auto-detectable. So far, I have been digging in the CortexA8 stuff for only about two days, so I still have a lot to learn about it. But Oyvind Harboe and Zack Welch pointed me to the "dap info 1" command and suggested looking around that. The command reads the MEM-AP's BASE register to figure out the location of the ROM Table. And by parsing the ROM Table, it's possible to figure out the location of the DAP, which is exactly the address I mentioned in the previous paragraph that needs to be autodetected.
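For the curious, the ROM Table parsing boils down to something like the sketch below. It is not the OpenOCD code; the function name and the way the entries are passed in are my own simplification, and the entries are assumed to have already been read from the target. It relies on the generic CoreSight ROM Table entry format as I understand it: bit 0 says the entry is present, bits [31:12] hold an offset relative to the ROM Table base (possibly negative, in two's complement), and an all-zero entry ends the table.

```c
#include <stddef.h>
#include <stdint.h>

#define ROM_ENTRY_PRESENT 0x1u         /* bit 0: component is present  */
#define ROM_OFFSET_MASK   0xfffff000u  /* bits [31:12]: address offset */

/*
 * Walk ROM Table entries that were already read from the target.
 * Stores up to 'max' component base addresses in 'out' and returns
 * how many were stored. A zero entry terminates the table.
 */
static size_t rom_table_components(uint32_t rom_base,
				   const uint32_t *entries, size_t count,
				   uint32_t *out, size_t max)
{
	size_t i, found = 0;

	for (i = 0; i < count && entries[i] != 0; i++) {
		if (!(entries[i] & ROM_ENTRY_PRESENT))
			continue;
		/* Unsigned wrap-around takes care of negative offsets. */
		if (found < max)
			out[found++] = (rom_base & ROM_OFFSET_MASK) +
				       (entries[i] & ROM_OFFSET_MASK);
	}
	return found;
}
```

With a made-up table at 0x80000000 holding the entries 0x00001003, 0x00002002, 0xfffff003 and 0, the function reports two components, at 0x80001000 and 0x7ffff000. These numbers are purely illustrative and are not the iMX515 ROM Table contents.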

The problem is, this command for some reason misdetects the location of the ROM Table and reports that the ROM Table is empty. By manually adjusting the source code and pointing the command to the proper location of the ROM Table, the parsing algorithm catches up and correctly dumps the contents. You can keep an eye on further development of this matter for example here (or on the OpenOCD mailing list, obviously).

There, you can also find preliminary patches with which the iMX515 works, but the autodetection of the ROM Table base is still missing. So, if you want to use iMX515 with OpenOCD, you can, but those three patches probably won't make it into mainline OpenOCD until the autodetection problem is solved.

Monday, August 16, 2010

U-Boot: PXA27x Matrix keypad driver

Today, I hacked together a PXA27x matrix keypad driver for the U-Boot bootloader. It behaves as a usual stdin. Then I made ZipitZ2 utilize this new driver. I also implemented support for various modifier keys directly in the driver for convenience's sake. The driver still has a few rough edges and needs some more care, but it does its job on ZipitZ2 well.

The source code can be found in u-boot-pxa -devel branch.

Saturday, August 14, 2010

Last month's hacking report

Today's blog entry will be a long one as it will span my activities over the last month. I focused more on hacking than on blogging, but the time has come and the FIFO got filled so it's time to empty it.

Voipac PXA270
Firstly, I got a bit involved with Android. Radoslav Deak and I did some kernel and userland hacking on it to get it running on the Voipac PXA270 board. The Android kernel changes didn't make me feel well, but the userland made me cry, no more words needed here I believe. Anyway, Android does not support 18bpp LCD, therefore I was told they'd send me another board with a 16bpp LCD. A mistake happened though and I got a board with an 18bpp LCD again. Luckily enough, I don't have to hack on Android anymore, but it's not only Android that doesn't support 18bpp LCD, it's also X.org.

Here you'll probably already start questioning my sanity when I say I started digging in the X.org to get this fixed. The X.org developers are actually very nice and very helpful people and they helped me with patching the X.org. One thing worth noting is that I compiled the X.org on the other Voipac PXA270 board which is running Debian already. The building took a little bit of time anyway. The patch is already in mainline X.org I believe and it was not all that hard in the end.

The only trouble with this change is that it uses four bytes of memory for each pixel and with the 104MHz SDRAM on the Voipac PXA270 and a 640x480 LCD, this eats a lot of memory bus bandwidth.
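To put a rough number on that, here is a back-of-the-envelope calculation as a tiny helper. It assumes each 18bpp pixel is stored in a 32-bit word and that the panel is refreshed 60 times per second; the refresh rate is my assumption for the sake of the example, not a measured value, and the function name is made up.

```c
#include <stdint.h>

/* Bytes per second the LCD controller has to fetch to refresh the panel. */
static uint64_t lcd_fetch_bandwidth(uint32_t width, uint32_t height,
				    uint32_t bytes_per_pixel,
				    uint32_t refresh_hz)
{
	return (uint64_t)width * height * bytes_per_pixel * refresh_hz;
}
```

For 640x480 at four bytes per pixel and 60 Hz, this gives 73,728,000 bytes per second, which is twice the 36,864,000 bytes per second a 16bpp framebuffer at two bytes per pixel would need.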

One more thing is worth noting, that is the xserver-xfbdev, which is capable of 18bpp already, but it's not certain whether the apps connecting to this kind of Xserver won't have problems with 18bpp. On the other hand, this kind of problem may happen even with the standard Xserver, but so far, GDM worked well (and probably whole GTK does, I didn't test anything but that).

There were some other minor changes on the Voipac PXA270 front, but these were mostly bugfixes. There were also bootloader related changes and fixes though. The most relevant one was the one adding support for 128MB variant of the module. All these changes can be found either in my kernel tree or in my u-boot tree.

The OpenBSD
Here comes one sad piece of news. Sad, because I had to leave a piece of code unmaintained. I won't disclose the reasons why as I believe interested people will be able to find it themselves. To put it short, I'm not a member of the OpenBSD developer community anymore. I released the rest of the patches for Palm port to their mailing list and I hope jasper@ and drahn@ will at least not let the code bitrot.

On the other hand, it's actually quite a relief as I can now fully focus on Linux kernel hacking and U-Boot bootloader hacking. My opinion of the OpenBSD kernel code is biased and therefore you won't find it here. Though I think drahn@ did a very good job on the ARM code.

Well, at least I managed to finish my thesis and the system actually works quite well on Palm hardware, including Xenocara (the OpenBSD version of X.org), touchscreen etc.

Palm hacking again
A lot of redundant code in the Palm hardware support code and Eric's whining on that matter some time ago motivated me to consider trimming it a bit. Though hacking on the Palm hardware was different from the old Hack&Dev days this time.

I introduced a set of common functions, each of which registers one piece of hardware on the Palm handheld in question. This allowed me to remove about six copies of redundant data structures and other stuff. While at this, I wrote the common functions in a modular way, so in case the support for a particular piece of hardware isn't compiled into the kernel, the support structures are removed as well.

I also added a long lost patch for the MAX1586A PMIC into the common code as that's used in all Palms while at that. Further on, I modularized the rest of the non-common code in all the Palm devices in the same fashion. I'm actually quite happy about the code quality now. And Eric did his little Linus-like gag about the number of deletions ;-)

This was not all on the Palm front though. As my Palm Tungsten|C is a nice toy I like quite a bit and I needed a PXA255 testing platform, I updated the U-Boot on this device to mainline version. This can be found in my u-boot tree again. I also modularized the kernel code the same way I did with PXA27x Palms, though I don't use the common code as this platform is quite different. While at that, I added support for LEDs and switched the UDC to gpio-vbus. All these changes can be found in my kernel tree.

U-Boot
I already said this earlier probably, but I'm now the U-Boot-PXA maintainer. Therefore support for some new machines hit mainline and support for a few others is underway. This time, the new ones are surprisingly the Palm LifeDrive, Palm Tungsten|C and Balloon3. The Palm Tungsten|C is no news as the code to support that hardware in U-Boot is about one year old. So this one was only updated, see above.

The Palm LifeDrive is a different thing though. Because I wanted to avoid destroying my unit, I figured out a way to pseudo-dual-boot PalmOS and U-Boot for testing purposes. This turned out to be quite simple, though without JTAG one has to be very careful. Firstly, the size of flash in Palm LifeDrive is only 512 KB -- 0x80000 bytes -- and the last 0x20000 bytes are unused and PalmOS ignores them. That's enough for a slightly stripped down U-Boot binary, but there's one more thing missing: in case U-Boot is installed in that part of flash, something must execute it.

Executing it means jumping to its first instruction. Looking at the disassembly dump of the Palm LifeDrive flash, very close to the beginning, the code clears register "r0" by moving zero into it. By replacing this instruction with a jump to the 0x60000 address, U-Boot can be executed. But that would make the dual-boot impossible in case there was just pure U-Boot at 0x60000.

The trick is to add code at 0x60000 that checks the PSPR register, aka the only register that survives suspend-to-mem. In PalmOS, this register always contains a value that is divisible by four, i.e. is aligned to four bytes. The last two bits are unused though and that's the trick.

The point is to load U-Boot just after this code that checks PSPR. In case these last two bits in PSPR are set, the code will not jump right past the place where the "r0" was zeroed, but instead will just continue executing the code further on, which means the CPU will reach U-Boot. Otherwise, the PalmOS loader will just continue execution. I also had to zero the "r0" before jumping back to the PalmOS loader in the latter case.
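The decision the stub makes can be written down in a few lines of C. The real stub is a handful of ARM instructions, so take this only as a model of the logic; the names are made up, and I'm assuming here that "set" means both low bits are one.

```c
#include <stdint.h>

#define FLASH_SIZE        0x80000u  /* 512 KB of flash                   */
#define FLASH_UNUSED_SIZE 0x20000u  /* tail that PalmOS leaves untouched */
#define STUB_ADDR         (FLASH_SIZE - FLASH_UNUSED_SIZE)  /* 0x60000  */

enum boot_target { BOOT_PALMOS, BOOT_UBOOT };

/*
 * PalmOS always leaves a 4-byte aligned value in PSPR, so its two
 * lowest bits can serve as a "boot U-Boot instead" flag.
 * Assumption: the flag means both low bits are set.
 */
static enum boot_target choose_boot_target(uint32_t pspr)
{
	return ((pspr & 0x3u) == 0x3u) ? BOOT_UBOOT : BOOT_PALMOS;
}
```

Any aligned value, such as 0xa0001000, sends the CPU back to the PalmOS loader, while a deliberately broken one, such as 0xdeadbeef, falls through to U-Boot.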

One more thing is important here. To set the PSPR, I adjusted the suspend routine in Linux kernel to store some totally broken number into it and when I then pressed any of the "unsuspend" buttons, the U-Boot came up. Note, the U-Boot I used for this had the wakeup capability disabled, which is very important.

To get U-Boot to wake the kernel up properly, it's necessary to replace the Palm LifeDrive bootloader altogether. That's what I did and it worked very well for me, until I wanted to test a newly compiled version for Tomas Cech. The battery choked while the system was being updated. Well, that's sad. Luckily, I already have a replacement device.

Cypress EZUSB SX2
Having U-Boot on Palm LifeDrive means it can now be fully used with Linux. That also means a need for some kind of connection to the outside world. Most of the people involved with Hack&Dev certainly recall the USB 2.0 chip in Palm LifeDrive. The chip is actually quite simple, but also quite incompatible with Linux's perception of a USB gadget chip.

The SX2 requires the so called Descriptor RAM to be filled with proper USB descriptor in order to enumerate at all. I started writing a driver for the SX2, but it's still very incomplete. In the current state, it can send data from host and to host and is behaving as 4 bulk endpoints. That implies it can be used probably only with g_serial for now. But it's still far from complete. The draft can be found in my kernel tree and anyone willing to hack on it is welcome.

The drinking party
During the Linaro conference in Prague, some other people -- Eric Miao, Nicolas Pitre, Daniel Mack, to name a few -- and I went for a beer. The party turned into an awesome thing, it was probably the most enjoyable evening and night of the year for me.

Anyway, Eric decided to give up his maintainership of the Sharp Akita and pushed it onto me. The current support for Sharp Akita in mainline is kind of poor, but it is improving. The power management code is very broken and there are still a few other issues.

I decided to try fixing the power management issue so I joined efforts with Pavel Machek, who started putting together a new battery driver for Sharp Spitz. Actually, this also works on Sharp Akita and Sharp Borzoi. The driver currently has more rough edges than I'd love it to have, but it's kind of usable given enough knowledge on the user side. The problem with it is that it still doesn't check the battery temperature, so the user has to be careful not to overcharge the battery. Actually, there should be some GPIO that prevents the battery from turning the device into a lethal weapon and the driver checks this GPIO, but according to Pavel, this is not safe enough.

I met with Wookey the other day and got a Balloon3 board with some additional peripherals. That obviously ended up in me porting U-Boot to this board and doing some heavy rework on the side of mainline kernel.

The first problem I faced was the FPGA that's on the Balloon3 board. That's used to route various CPU lines to hardware. It also controls NAND, CF and such. The PCMCIA/CF driver had to be slightly fixed on the Balloon3, because in case of FPGA, the only working firmware is version 0.3. For CPLD, which is integrated on some other Balloon3 boards, there is already version 1.0 of the firmware which has a different behaviour. I hope the fix for the FPGA firmware is underway as Hector Oron told me he'd look into it.

The worse thing was the NAND driver. For that, I used the gen_nand driver and implemented various necessary callbacks based on the original out-of-the-tree NAND driver for Balloon3. I heavily reworked this so the driver was used only as a technical documentation. Pretty much all the hardware is implemented on the board I have by now.

When I was hacking on the NAND driver, I noticed a bug in the gen_nand driver where it ignored the number of chips passed to the driver. The gen_nand assumed the number of chips was always one, which was true for all the boards that used the driver so far. But on balloon3, there were four chips in total. I fixed this and even implemented a check for invalid value of number of chips as per David Woodhouse's suggestion.

Toradex Colibri PXA3xx
Recently, there were a few people concerned about the status of Toradex Colibri PXA3xx. I had been aware for quite some time already that the code needed to be reworked. Just recently, I was motivated enough to go forth and do it. Therefore the code that's now in my tree is the reworked support for the board and the CPU card. I also reworked support for Colibri PXA300/PXA310 while at it. It is done in a very similar fashion to the PXA27x Colibri support. Actually, I used the evalboard code for Colibri PXA27x. Now all the Colibris use the same evalboard code and that's how it should be.

This is not all though. There was some stuff going on in the OpenPXA project about the Colibri PXA320 recently too and I was also pestering the tech support in Toradex. The result is that the E-Boot bootloader supplied by Toradex is not the only one that can update the CPLD firmware on the Colibri PXA320 board anymore.

Toradex provided me with the information on how the CPLD JTAG is connected to the CPU GPIO pins and an xsvf file for the CPLD containing the latest version of the CPLD firmware. And by using an xsvf player that I found is already in U-Boot and used on some PowerPC platforms, I was able to reprogram the CPLD to the latest version v1.7 of the CPLD firmware.

The xsvf player needed some obvious changes to operate the GPIO-emulated JTAG correctly, but the code is well abstracted and I think I'll make it into generic code soon. Currently there are two copies of the same code in my U-Boot tree, which is no good. Note that v1.7 of the CPLD firmware is probably some internal testing version from Toradex and I still have to ask them about the license for that file. It'd be best if they just put this xsvf file on their website.
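In case "GPIO-emulated JTAG" sounds mysterious, the idea is roughly the following. This is a generic bit-banging sketch and not the xsvf player from U-Boot: the struct, the function names and the host-side loopback "pins" at the bottom are all invented for illustration, and a real port would toggle the CPU GPIOs wired to the CPLD instead.

```c
#include <stdint.h>

/* Hypothetical pin accessors; a real port would toggle CPU GPIOs here. */
struct jtag_gpio {
	void (*set_tck)(int level);
	void (*set_tms)(int level);
	void (*set_tdi)(int level);
	int (*get_tdo)(void);
};

/* Clock one bit: drive TMS and TDI, sample TDO, then pulse TCK. */
static int jtag_clock_bit(const struct jtag_gpio *io, int tms, int tdi)
{
	int tdo;

	io->set_tms(tms);
	io->set_tdi(tdi);
	tdo = io->get_tdo();
	io->set_tck(1);
	io->set_tck(0);
	return tdo;
}

/* Shift up to 32 bits, LSB first; TMS is raised on the last bit. */
static uint32_t jtag_shift(const struct jtag_gpio *io, uint32_t out, int nbits)
{
	uint32_t in = 0;
	int i;

	for (i = 0; i < nbits; i++) {
		int last = (i == nbits - 1);

		if (jtag_clock_bit(io, last, (out >> i) & 1))
			in |= 1u << i;
	}
	return in;
}

/* Host-side stand-in for the pins: TDO is wired straight back to TDI. */
static int sim_tdi, sim_tms, sim_tck, sim_rising_edges;

static void sim_set_tck(int level)
{
	if (level && !sim_tck)
		sim_rising_edges++;
	sim_tck = level;
}

static void sim_set_tms(int level) { sim_tms = level; }
static void sim_set_tdi(int level) { sim_tdi = level; }
static int sim_get_tdo(void) { return sim_tdi; }

static const struct jtag_gpio sim_io = {
	sim_set_tck, sim_set_tms, sim_set_tdi, sim_get_tdo,
};
```

With the loopback pins, shifting the eight bits of 0xa5 through jtag_shift(&sim_io, 0xa5, 8) returns the same value after eight TCK pulses, which is a handy sanity check before pointing the accessors at real hardware.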

Still, this is not the end. Because I wanted to use CF memory card on the Toradex PXA320 board, I decided to start implementing the PXA320 PCMCIA support. I first had to patch the pxa2xx_base to support different locations of the Static Memory controller registers, the MCIO, MCATT, MCMEM and MECR registers. That was the easier part.

After a bit of struggling with the Toradex CPLD and GPIO pin configuration and muxing, which is doing some obscure stuff to the PCMCIA signals, I managed to get the PCMCIA partly working. Note the struggle means I was stuck with it for more than two days until I figured out what Toradex actually did. Their tech support on the other hand is very good and helpful even if a little bit slow, I'm happy about this.

The partly working above means, that if I insert a Marvell CF8385 into the CF slot, the card is detected, firmware is downloaded and no problems observed. Once I insert a memory card though, nothing happens. The card insertion GPIO is asserted, but it probably reads zeroes or something from the CIS. This issue can actually be just about anything, starting from wrong timing to misconfigured GPIOs. The more confusing thing here is that if I insert an Emtec or Patriot card, the card insertion is only detected. In case I insert a SanDisk card, the pata_pcmcia driver is loaded, but in the end, the disk is not recognized.

The cluster
You might have noticed the hardware kind of started piling up. That's true, but it's not unused actually. I use the boards that are in good shape software- and hardware-wise as ARM compile cluster nodes. I run distcc on these and use it to build Debian packages. It's also an awesome stability testing and bug hunting mechanism for these platforms.

The cluster is still growing, but currently there are these machines:
  • Palm Tungsten|C
    • PXA255 400MHz
    • 64MB RAM
    • 1GB MMC+ card
    • USB CDC Ethernet
    • Debian SID

  • Voipac PXA270
    • PXA270C5 520MHz
    • 128MB RAM
    • 2GB SD card
    • Ethernet connection
    • Debian SID

  • Voipac PXA270
    • PXA270C5 520MHz
    • 256MB RAM
    • 160GB harddrive
    • Ethernet connection
    • Debian SID

  • Balloon3
    • PXA270C0 520MHz
    • 384MB RAM
    • 1GB NAND + 4GB CF card
    • USB Ethernet connection
    • Debian SID
More devices will be added as they reach a good enough state. I was also thinking of granting access to the cluster to a few fellow developers once it stabilizes well enough.

Other
Just a few days ago, Mike Rapoport arranged for me to receive a CM-X300 board. I have not been able to replace the bootloader there yet as I still don't have JTAG connected to that board.

I also started plumbing into the Debian u-boot package and prepared a patch. The problem is the current maintainer is totally ignoring any requests from me so far. I hope it's just temporary as he was busy with organising the debconf.

I hope you enjoyed reading the text.

Friday, July 9, 2010

Marvell Zylonite PXA300

Just a quick entry concerning OpenPXA today. In need for some rest, I hacked on the Marvell Zylonite PXA300 board I got some time ago. It is currently supported by the OpenPXA OBM2 as well as U-Boot (u-boot-pxa -openpxa).

I also tested the new SDHC capable PXAMCI driver I recently added to U-Boot on PXA300. It works if I disable the High-Speed support, which is a new feature on PXA3xx where the card runs at 26 MHz. I think that it's just some really simple problem, maybe miscomputation of the clock speed or something, but I'll analyze that later. For now, I use the old driver on pxa3xx.

I'll also need to sort out the PXA3xx branch of U-Boot (the -openpxa branch), as the mess there starts being bigger and bigger. Once that's done and I have the Zylonite PXA320 fixed, I'll start pushing this stuff into mainline U-Boot.

Not to forget a quick summary of other happenings, I added LCD support to ZipitZ2 U-Boot port. That also implies a SPI interface, which I did using the softspi (GPIO driven SPI) driver. Lastly, I started pushing the PXA2xx fixes and improvements into mainline U-Boot.

Wednesday, June 30, 2010

U-Boot: PXA SDHC support

Today, it'll be just a quick entry about a recent achievement in the U-Boot land. Some people from the Zipit Z2 project reported to me that U-Boot doesn't detect their SDHC cards. That was really troublesome and so was the state of the pxa_mmc driver in U-Boot.

Actually, pxa_mmc was probably the last SD/MMC driver in U-Boot that used the old driver interface there. I decided it needed a rewrite. I based the new code on the Linux version of the pxa_mmc driver. The resulting pxa_mmc_gen.c now uses the generic SD/MMC framework in U-Boot and the code is much cleaner. It also supports SDHC well. And one more benefit is that the timing issues that were observed with the old driver are gone now.

The code wasn't merged mainline yet and is currently held as a separate driver to make the transition from the old driver to the new one as smooth as possible. Also, I'd like the driver to get some testing before some other boards are converted to use it.

The code is currently available in the u-boot-pxa.git -devel branch. Also, it was submitted to the u-boot mailing list. To enable the new driver, you'll have to replace
#define CONFIG_PXA_MMC
with
#define CONFIG_GENERIC_MMC
#define CONFIG_PXA_MMC_GENERIC

And recompile the U-Boot.

To use the new driver, you can use commands like:
$ mmcinfo
$ fatls mmc 0

etc. Note, that the "mmcinfo" command is new and the old "mmc" command class is gone. So no "mmc init" anymore.

Monday, June 21, 2010

OpenPXA: first release

The OpenPXA project proudly released first beta version of the bootloader package for PXA3xx. The package supports the following boards:
  • Toradex Colibri PXA300
  • Toradex Colibri PXA310
  • Toradex Colibri PXA320
  • Marvell Littleton PXA310
The software is labeled "beta" because there might still be problems that we are unaware of.

The package contains the OpenPXA OBM2 and U-Boot bootloader binaries for the boards as well as installation instructions. The package can be downloaded from the OpenPXA website.

Zaurus hacking

Just a quick update on the problems I described on Zaurus. I've been hacking a little bit further into it and figured there's a problem in PCMCIA. Actually, there was always a warning in the pxa2xx_base.c stating that the timing recomputation for the PCMCIA devices is based off the CPU core frequency and that's probably wrong. I changed this and based the timing off the CPU bus frequency, which fixed one of the problems with Zaurus. It fixed the problem where at 312MHz the harddrive died. The other problem with weird lines on the LCD still persists, though not in such a big scale.

On the other hand, the LCD problem now appears even at 416 MHz, which makes me think the PCMCIA timing recomputation has some other bug. Maybe the timing is now too tight, so the PCMCIA device takes too much bus bandwidth, triggering the LCD FIFO underruns.

There is one more thing that brings me to that conclusion: when I ran hdparm -T /dev/sda on the harddrive attached to PCMCIA, I got the following results:

Frequency   Read speed without fix   Read speed with fix
416 MHz     8 MB/s                   12 MB/s
208 MHz     2 MB/s                   4 MB/s
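To show why the choice of clock matters for the PCMCIA timing, here is a simplified stand-in for the conversion from nanoseconds to clock cycles. It is not the code from pxa2xx_base.c; the function name and the example figures (a 120 ns timing, a 416 MHz core and a 208 MHz bus) are picked just for illustration.

```c
#include <stdint.h>

/* Convert a timing in nanoseconds to clock cycles, rounding up. */
static uint32_t ns_to_cycles(uint32_t ns, uint32_t clk_khz)
{
	uint64_t product = (uint64_t)ns * clk_khz;

	return (uint32_t)((product + 999999u) / 1000000u);
}
```

With these numbers, 120 ns comes out as 50 cycles when computed from the 416 MHz core clock, but only 25 cycles when computed from the 208 MHz bus clock. If the controller counts bus cycles, the first value makes every access take about twice as long as needed, which would be consistent with the read speed going up once the bus frequency is used.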

Thursday, June 17, 2010

Recent XScale hacking

About a month has passed since my last post so it's probably time for a recap. The last month was again very busy and the number of platforms that underwent changes outgrew the size of the header for this entry. For starters, let's just say that even the always so popular and beloved Zaurus got some cosmetic changes. There were also some changes in the PXA3xx land and the ZipitZ2 land wasn't left intact either.

ZipitZ2
The first turning point on this platform was the release of the U-Boot bootloader update kit for public testing. Some people installed this and reported various bugs to me. There was for example a grave typo in the new U-Boot macros for pxa2xx which made wake-up from suspend-to-mem impossible. That was easily located and fixed right away. The new version of the update kit was obviously uploaded very fast. Actually, there have been no more bugs reported since then.

The only drawback is that U-Boot doesn't support booting from SDHC cards, only SD and MMC are supported. This can be easily worked around by flashing the kernel into the NOR at address 0x60000. In case the U-Boot bootloader is unable to either detect a card or find a script on the card, it boots the kernel from the 0x60000 location.

The other changes related to ZipitZ2 happened within the Linux kernel. Firstly, the NOR flash layout was standardized. It is incompatible with the old layout, though on the other hand, now there are strict rules on how the NOR flash layout will look and no more anarchy will take place. The final layout is like this:
  • 0x0-0x40000 -- U-Boot bootloader
  • 0x40000-0x60000 -- U-Boot environment
  • 0x60000-END -- Linux kernel or possibly some small filesystem
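The layout above can also be expressed as a small table, which is handy when checking where a given flash offset belongs. The struct and function names are made up for this example and the total flash size is passed in by the caller, since the last region simply runs to the end of the chip.

```c
#include <stddef.h>
#include <stdint.h>

struct nor_region {
	const char *name;
	uint32_t start;
	uint32_t end;	/* exclusive; 0 means "up to the end of flash" */
};

static const struct nor_region zipit_nor_layout[] = {
	{ "u-boot",     0x00000, 0x40000 },
	{ "u-boot-env", 0x40000, 0x60000 },
	{ "kernel",     0x60000, 0 },
};

/* Return the index of the region containing 'offset', or -1 if none. */
static int nor_region_index(uint32_t offset, uint32_t flash_size)
{
	size_t i;
	size_t n = sizeof(zipit_nor_layout) / sizeof(zipit_nor_layout[0]);

	if (offset >= flash_size)
		return -1;
	for (i = 0; i < n; i++) {
		const struct nor_region *r = &zipit_nor_layout[i];
		uint32_t end = r->end ? r->end : flash_size;

		if (offset >= r->start && offset < end)
			return (int)i;
	}
	return -1;
}
```

For example, with a hypothetical flash size of 0x800000, offset 0x60000 lands in region 2, the kernel, which is also the address U-Boot falls back to when no usable card is found.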
A bug in the screen blanking on ZipitZ2 was also reported to me. Not long ago, Alberto Panizzo wrote a driver for his LCD based on my LMS283GF05 driver (the LCD in ZipitZ2). Back when I wrote this driver though, I made a mistake and screen blanking wasn't working correctly. Sadly, the same mistake got into the new driver by Alberto as well. I therefore sent a patch for both of the drivers.

In the end, I must not forget to mention, that the WM8750 codec is now finally registered in the platform definition file for both ZipitZ2 and Sharp Zaurus/Spitz. This brings the conversion of the WM8750 codec which I started before the 2.6.34 was out to an end.

Income SBC
In one of the older entries here, I mentioned I got my hands on the Income SBC and that my patch started some discussion on the Linux-ARM kernel mailing list. Just to explain what this Income SBC device is, it is a custom-made board with an LCD which is powered by the Toradex Colibri PXA270 module. The conclusion of the discussion was that we need to separate the board support code from the module support code.

A friend of mine, Daniel Mack, took the Income SBC patch and the Colibri PXA270 code and did the split, which was very helpful. Before the split, I sent a pile of patches which added support for various devices on the Toradex Colibri PXA270 module/baseboard as the support was quite incomplete in mainline.

Sharp Zaurus
Just recently, a friend of mine, Metan, was complaining that the LCD screen on his Sharp Zaurus goes crazy when he uses the mainline Linux kernel and changes the frequency of the CPU with CPUfreq to just about anything but 416MHz. He also complained that if he changes the frequency to 312MHz, the harddrive in the machine fails. He lent me the machine for a while with an explanation of how to boot a kernel on it.

It didn't take long and I came up with a first patch. That was actually just a cleanup. The Zaurus support code in mainline Linux kernel was in a terrible state. The first patch therefore only improved the readability of the code. Actually, there was also a minor revamp of the CF/SD power control function. All in all, the patch was pretty much a complete rewrite of the file.

Then, there was another patch. The Zaurus contains an Intersil ISL6271A PMIC for the CPU. Even before I got my hands on the Zaurus device, I implemented a driver for it and asked Metan if he could test it. Well, in the end, he lent me the Zaurus instead. The initial version of the ISL6271A driver contained bugs, but after about three revisions of the patch, the driver was clean and was applied to the regulator tree.

There was still the LCD going crazy though. I discussed this with Eric Miao and he explained to me that the CPU bus probably isn't capable of transferring so much data and therefore the LCD FIFOs underrun. I checked this by enabling the Input and Output FIFO Underrun interrupt in the PXAFB driver. And bingo, the underruns really happened. This means that either something consumes the bus bandwidth or the bus is simply slow. I also tested some other kernels which supported CPUfreq and noticed the issue was there as well. Sadly, I wasn't able to get CPUfreq working with a stock Zaurus ROM.

Also there is still a lingering issue with the harddrive and 312MHz CPU speed. This will need further investigation. It'll probably involve the Sharp SCOOP chip and PCMCIA drivers undergoing a rewrite.

Marvell
Apparently, the popularity of the OpenPXA project is rising. I assume this from the fact that I get more requests for help and more bug reports. Some of them helped the project greatly, even if there was no patch included with them. What's even better is that the project actually gets testers, and that really helps.

The first important bug report that arrived was from some person from HCLTech. He had issues with saving the U-Boot environment into the NAND on the Marvell Littleton PXA310 board. The whole problem was that the size of the environment sector differed from the size of a NAND block, and the U-Boot MTD subsystem can only write whole blocks. Also, the environment position wasn't aligned to a block boundary. That was quickly solved and a patch was applied on top of u-boot-pxa, openpxa branch.
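
The two conditions described above (environment size being a whole number of NAND blocks, environment offset sitting on a block boundary) can be sketched as a quick check. This is a hypothetical Python illustration, not the actual U-Boot patch; the function names and the 128 KiB block size used in the usage note are assumptions.

```python
def env_fits_nand_blocks(env_offset, env_size, block_size):
    """True if the environment starts on a block boundary and spans whole blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return env_offset % block_size == 0 and env_size % block_size == 0


def round_up_to_block(value, block_size):
    """Round an offset or size up to the next multiple of the NAND block size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return (value + block_size - 1) // block_size * block_size
```

With an assumed block size of 0x20000, an environment at 0x41000 fails the check, and round_up_to_block(0x41000, 0x20000) gives the next usable boundary.
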

He then wanted to do a TFTP boot, though, and figured out the ethernet chip doesn't work in U-Boot. This took some further investigation. I figured out that if I flash a Marvell BBU into the board and execute U-Boot from it, the ethernet magically works. I dumped various interesting registers from this setup and bisected the one which caused the problems with ethernet. The funniest part is that the problem was caused by a bit marked as "SETALWAYS" not being set. Apparently, these bits marked as "SETALWAYS" mean something Marvell doesn't want other people to know. Armed with this knowledge, I fixed the ethernet and applied a patch.

The other batch of problems came from a guy owning a Colibri PXA320 board. He complained that he can't even boot a kernel from U-Boot. I figured out this issue was only there if the kernel was compiled with a built-in initramfs. In the end, it was fixed by a quick patch. The problem was a mess in the RAM base addresses. It's an agreement in Linux-ARM that the RAM on XScale starts at 0xa0000000. On PXA3xx though, the RAM is mirrored from 0x80000000 to 0xa0000000 by default for compatibility reasons. U-Boot was inconsistent about this and used both the 0x80000000 and 0xa0000000 addresses. Unifying these addresses to 0xa0000000 fixed the problem.
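
To illustrate what "unifying" the two views of RAM means, here is a small Python sketch that maps any address in the 0x80000000 window onto the 0xa0000000 one. The helper name and the RAM size used in the usage note are assumptions for the example; the real fix was made in the U-Boot board code, not in anything like this.

```python
LINUX_RAM_BASE = 0xA0000000  # where Linux-ARM expects XScale RAM to start
ALIAS_RAM_BASE = 0x80000000  # the other window showing the same RAM on PXA3xx


def canonical_ram_address(addr, ram_size):
    """Translate an address in the 0x80000000 window to the 0xa0000000 window.

    Addresses outside the alias window are returned unchanged.
    """
    if ALIAS_RAM_BASE <= addr < ALIAS_RAM_BASE + ram_size:
        return addr - ALIAS_RAM_BASE + LINUX_RAM_BASE
    return addr
```

Assuming 64 MiB of RAM, canonical_ram_address(0x80008000, 0x04000000) and canonical_ram_address(0xA0008000, 0x04000000) both yield 0xA0008000, so every consumer agrees on one base.
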

Furthermore, this person complained that the Colibri PXA320 board doesn't support USB device mode in Linux. A quick hack fixed that and the patch already hit LAKML. With this guy happy and no longer considering buying a very expensive BSP from a certain corporation, I could have taken a rest. I noticed some things worth fixing in U-Boot for this board along the way, though. I fixed the ethernet support on the Colibri PXA320 board in U-Boot and added USB host support.

This is still not all though. I mentioned in one of the older entries that I got those Marvell Zylonites. I decided to take a look at those, particularly at the one with the PXA320 C0 CPU. Firstly, I figured out the CPU contains BootROM v3.38. This was a big surprise, as it implies the CPU supports the NTIM header. Furthermore, U-Boot apparently supported this board, but with the ongoing changes in the U-Boot bootloader, the platform was broken. I therefore took it and rewrote it from scratch, reusing just some of the static memory controller configuration values from it. U-Boot works just fine. There is still a problem with this board though.

With the Zylonite320, I decided to try a new approach. I pushed all of the low-level init into the OpenPXA OBM2. I also implemented DDR DRAM initialization code in the OBM2. This should in the end allow me to totally skip and remove lowlevel_init.S in U-Boot. With just a few tweaks in the OpenPXA OBM2 and the NTIM header, I managed to get this working on the Zylonite320 as well.

And now to the problem. The expected behavior would be that if a power supply is attached to the device, it powers up and U-Boot comes up. For some unknown reason, this doesn't happen on the Zylonite320. The only way I was able to get the Zylonite320 to boot from NAND was to issue a "reset run" in the OpenOCD shell over JTAG. Then the board booted, no problem. I investigated this a little with a Marvell XDB and figured out the NFC is maybe misconfigured in such a way that the NAND is detected as x16 instead of x8 by the BootROM. Then again, maybe there are just some switches that toggle the boot-up behaviour of the CPU.

Other
This month was quite rich in finishing conversions. Not only was the WM8750 conversion finished, but also the final removal of wm97xx_batt.h happened. The file was lingering in the kernel tree for about a year now, to be accurate ever since it was possible to pass platform data through the AC97 bus. So now the file is gone and all the platforms that used this file are now fixed.

In the end, I must not forget I got into a little fight with Russell King. We exchanged a few rough mails etc. Also, if I mention Pavel Machek was involved, I believe no further explanation is necessary. Actually, I was only teasing Russell a bit, but he probably took it personally.

I hope you enjoyed the reading, it was quite a lot of boring text as always.

Friday, May 14, 2010

Today's OpenPXA progress

I got to hacking on OpenPXA today. This brought some noticeable progress. For some time already, there was a bothersome issue where one had to prepare an NTIM and an IPL binary for the PXA300/PXA310 and a special combined IPLNTIM for PXA320.

Knowing it's most probably no use trying to get NTIM working on PXA320, I used a horrible hack. Firstly, let's recapitulate the facts. The first important fact is that the first four bytes of the NTIM contain the minimal version of BootROM this image will run on. The second important fact is that the four bytes at offset 0xc contain the date when the NTIM was created, and this field is ignored by the BootROM.

On PXA300 and PXA310, which have BootROM v3.xx, the version string is compared to the BootROM version and if it's lower, the NTIM header is used to further control the boot process. On PXA320, on the other hand, the code at 0x0 is copied to SRAM and executed normally. So the trick with a unified NTIM now seems simple. It has a problem though.

At first I thought that placing a branch instruction at offset 0x0 which would just jump past the NTIM should do the trick. How mistaken I was. The branch instruction was encoded as 0xea000028, which converted to LE is 0x280000ea, whereas the version string in LE is 0x00030102 for BootROM v3.12, so this can't possibly work. There is a solution though.

A branch which jumps eight bytes forward is encoded as 0xea000000, in LE 0x000000ea, which is lower than the version string. With this trick, I used the date entry and inserted an instruction which branches right past the NTIM header. You probably noticed by now that the branch jumps 8 bytes forward, but the date entry is at 0xc. This is no problem, as the entry at 0x8 is the HMIT section offset, which is zero, and zero means nop.
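
The numbers above can be reproduced with a few lines of Python. This is only a sketch: it encodes the unconditional ARM B instruction (condition AL, 24-bit word offset relative to the instruction address plus 8) and mirrors the byte-swapped comparison as it is described in this post. The helper names are made up, the target address 0xa8 is simply what the 0xea000028 encoding implies for a branch placed at 0x0, and how the BootROM really performs the comparison is not verified here.

```python
import struct


def encode_arm_branch(src, dst):
    """Encode an unconditional ARM 'B' located at src that jumps to dst."""
    offset = dst - (src + 8)  # the PC reads as the instruction address + 8
    if offset % 4:
        raise ValueError("branch target must be word aligned")
    imm24 = (offset >> 2) & 0xFFFFFF  # signed word offset, 24 bits
    return 0xEA000000 | imm24


def byteswap32(word):
    """Return the 32-bit word with its four bytes reversed."""
    return struct.unpack(">I", struct.pack("<I", word))[0]


VERSION_V3_12 = 0x00030102  # version value quoted in the post for BootROM v3.12

far_branch = encode_arm_branch(0x0, 0xA8)   # jump past the header in one go
short_branch = encode_arm_branch(0x0, 0x8)  # jump only eight bytes forward

# The far branch ends up above the version value, the short one below it.
far_is_lower = byteswap32(far_branch) < VERSION_V3_12
short_is_lower = byteswap32(short_branch) < VERSION_V3_12
```

Here far_branch is 0xea000028 and short_branch is 0xea000000, which matches the two encodings discussed above; only the short one stays below the version value.
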

This change has one more advantage. The IPL is now stored in block 0 of NAND just past the NTIM. It was necessary to patch the linker command so it'd correctly adjust the location of built-in variables. But the result is that one more block is free, most likely to hold U-Boot, and there's no need to explicitly choose which CPU the NTIM will run on. As of today, the CPU parameter is dropped and only a file called iplntim.bin is generated, holding the contents of NAND block 0.

Besides this crucial change, I moved the stack setup into a standalone function which then calls the main function of the IPL. This allows the main function to have its variables on the stack and correctly allocated.

Thursday, May 13, 2010

Marvell, Voipac, U-Boot and insomnia

Again the time has come to push out a summary of things that I worked on recently. Mostly, I worked on the Voipac Technologies a.s. hardware, but there is some great news from Marvell as well.

Voipac
In the last entry, I mentioned adding support for the Voipac PXA270 SBC. Most of the support for this device hit mainline and will probably be available in 2.6.35. The rest will hopefully hit 2.6.36. That's not all: I established cooperation with the corporation itself and they sent me a baseboard and a CPU card equipped with OneNAND memory. This kept me busy for a few days until I figured out how to properly prepare the OneNAND IPL that is in U-Boot.

Voipac also sent me their JTAG adapter, which I'm very grateful for as it helped me a lot. Support for this JTAG adapter as well as for the Voipac board with NOR flash hit mainline OpenOCD already. It's currently impossible to flash OneNAND directly from OpenOCD, but one can load U-Boot into SRAM using OpenOCD and execute it from there. Then one can simply reflash the board from that U-Boot.

I then hacked on the SparseMEM support for the board a little more. There was a bug in the kernel which caused it to crash in very early stages with SparseMEM enabled. The bug was caused by a recent patch that was applied to the tree. Catalin had proposed a fix a few days before I finished debugging it, but the fix was pending until someone tested it. After I gave my ACK, the fix was applied and I submitted a SparseMEM support patch. The SparseMEM support is still unapplied though.

Besides, the company told me they have a DMA capable driver for the ParallelATA disk attached to the board. I had only used the pata-platform driver, which was only PIO capable and gave about 3800 kB/s reads. This was a no-go. The company didn't send me the harddrive extension for that board, so I had to put together a hardware hack which allowed me to connect a CDROM drive. But ATAPI has specific needs when it comes to DMA and I was unable to get the DMA working with the CDROM drive. So I made another hardware hack, even worse this time, and connected a PATA 2.5" harddrive. Hacking on this driver for a while resulted in a generic pata-pxa driver, which uses the PXA DMA controller and is capable of getting over 11084 kB/s reads, but only for a harddrive. For a CDROM the driver falls back to PIO, but it's unlikely anyone will connect a CDROM to this SBC.

Hacking on the Voipac platform was great because the hardware is very well documented and the design is very clean.

U-Boot
I prepared a few macros that drop the need for copying so much code between PXA2xx boards in their lowlevel_init.S files. Those macros allow easy GPIO configuration, SDRAM configuration, clock and interrupt configuration as well as the wakeup procedure. When using these macros, the lowlevel_init.S file consists of about 6 lines. This is a great achievement in cleaning up u-boot-pxa. The ZipitZ2 platform was converted to use these new macros and works well.

Besides these macros, I added support for some new platforms. Namely the Voipac PXA270 with OneNAND flash and a Toradex Colibri PXA270. All of this new stuff is available in the u-boot-pxa devel branch. The branches were also recently rebased to bring the newest code from mainline u-boot.

Marvell
Recently, a friend of mine at Marvell arranged it so I was able to receive two Marvell Zylonite units, one with a Marvell PXA320 CPU and one with an Intel PXA300 (the CPU was apparently made in an Intel factory and is possibly even a prototype, though I'm unsure). This will be a great help for the OpenPXA project and he deserves a big thanks.

Also, I was told that PXA320 does not support the NTIM header because the CPU is old (and has BootROM v2.22), which means I probably don't have to RE the BootROM anymore and I'd better focus on implementing a single NTIM header for both PXA320 and PXA310/300 (BootROM v3.xx). This is very likely doable through some slightly non-standard alterations to the NTIM header. Though it's also likely I'll still continue to RE the BootROM just for curiosity's sake. For example, I found out that the WTM registers are mapped at 0x43000000 (this is undocumented in the PXA3xx manual).

Others
I got my hands on one more platform -- the smarthouse.cz SH-Dmaster. It's a platform based on the Toradex Colibri PXA270 module, but it uses its own baseboard. The patch for this platform is pending and is uncertain to be applied, as it started a discussion about splitting development platforms into hardware that's on the mainboard and hardware that's on the CPU card.

I must not forget one more thing, I added support for the Guillemot Jet Leader Force Feedback joystick into the iforce driver. I had the patch lying around from the times of 2.6.16 or so, but never got around to sending it. Now it's already applied in linux-input. As I didn't want to reboot my PC to test the driver, I debugged it on the Voipac board.

Conclusion
Well, it's getting pretty busy recently, but I'm happy with the direction things are taking. Also, most of the hardware I have available now works well.

Wednesday, April 14, 2010

Recent happening -- OpenPXA, U-Boot, Marvell, Voipac, Zipit,...

Quite a lot happened in recent weeks. Now the time has come to sum this stuff up, so expect quite a long entry. I divided this entry into some subsections for the sake of readability.

Marvell and OpenPXA
The most important thing first. The OpenPXA project underwent quite big changes. The most significant one is OBM2, which I'll describe further. Besides this, the websites were updated and the u-boot patches were also updated to match the latest GIT HEAD.

As for the OBM, the original OpenPXA OBM turned out not to be scalable or modular enough, and adding new platforms was very hard. Besides, understanding the assembly in there was not for everyone. Also, this version of OBM didn't use a stack and there were no function calls; it was a linear program and everything that looked like a function was a macro. This made working with this OBM even more complicated and made extending it basically impossible.

To quench the need for a better OBM, OBM2 was born. It is written completely from scratch in C with little bits of inline assembly. This new OBM2 has a very modular design and uses normal function calls, which puts no more limitations on developers wanting to work on it. Besides this, new platforms can be easily added through a unified interface where the developer only has to fill in about three easy functions. Moreover, OBM2 already carries a library of hardware drivers for the PXA3xx CPUs, like a driver for UART and a driver for NAND flash. The NAND flash driver is fundamental, of course. But enough praise, even this OBM2 has its drawbacks. Let's take a better look at those now.

Before we take a look at the problems of OBM2, we better describe the boot procedure of a PXA3xx CPU. The CPU loads the first page of NAND into SRAM location 0x5c008000. In case there is a so called NTIM header, which is generated by a program called mkntim (included with the OBM2), the BootROM within the PXA3xx CPU copies the portions of memory according to the NTIM header. In the NTIM header, there is a configuration section called HMIT, a section describing the NTIM header called OBMI and a section with the OBM2 called BOOT. And the BOOT section is executed.

Now to the problem with OBM2. It's actually not really a problem with OBM2 but more with the PXA320 CPU BootROM. The BootROM in the PXA320 is older (v 2.22 in RTPXA320B2) than in the PXA310 I have available (v 3.33 in PXA310 A2). With the newer BootROM in PXA310, there is no problem with the NTIM header, it works fine, but with the PXA320 the NTIM header does not work at all. That's why I dumped the BootROM, started disassembling it and tracing the instruction flow. Not much luck yet though. But the fact that the NTIM header doesn't work on PXA320 doesn't render it unusable with OpenPXA OBM2. There is a way to boot the PXA320 CPU without an NTIM header by putting 8 zero bytes before the OBM2. Interestingly enough, the CPU drops the first two instructions, copies the rest to 0x5c014000 and executes it.

I introduced a variable CPU which determines at compile time which kind of image should be generated. Setting it to PXA320 generates a file called 'iplntim.bin' which goes to 0x00 of NAND and is actually the OBM2 prefixed with those 8 zero bytes. Setting it to something else produces a normal NTIM header and an OBM2 binary. In both cases, the compile process ends with a small help describing which file should be flashed where.
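
The PXA320 variant of the image is simple enough to sketch in a few lines. The Python below is only an illustration of the "8 zero bytes, then OBM2" layout described above; the function name is made up and the real image is produced by the OpenPXA build system, not by a script like this.

```python
PXA320_PREFIX_LEN = 8  # number of zero bytes placed before the OBM2 on PXA320


def build_pxa320_iplntim(obm2_image):
    """Return the contents of iplntim.bin for PXA320: 8 zero bytes + OBM2."""
    if not obm2_image:
        raise ValueError("OBM2 image must not be empty")
    return b"\x00" * PXA320_PREFIX_LEN + bytes(obm2_image)
```

Given the raw OBM2 binary as bytes, the result is what would be written to offset 0x00 of the NAND.
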

As for the hardware support on the OpenPXA OBM2 side, currently supported platforms are the Marvell Littleton PXA310 and the Toradex Colibri PXA320. Both of those platforms boot well with the OBM2.

U-Boot
I became the maintainer of U-Boot/PXA. This means fixes will hopefully flow in faster in this area. I'm looking forward to your patches and cooperation. There are a few fixes pending already. The GIT repository is here.

Some time ago, I wrote about a strange bug in U-Boot that caused printf() and the NAND driver to malfunction when VSPRINTF64 (simply 64bit support for printf()) was enabled. I hit this bug on my first PXA3xx platform, but was unable to debug it. This ended up with various weird workarounds which were in the OpenPXA "U-Boot patches" tree for some time. Just recently, I thought it'd be better if I took another look at the issue, as those patches had side effects (like NAND being read-only etc.).

After a few hours of debugging, a redeeming thought arrived. I took a look at the stack alignment; stupid me, I didn't think of this earlier. The LDRD and STRD instructions, which move data between a 64bit memory area and two consecutive ARM registers, need the memory area aligned to 8 bytes. Looking at the addresses of the variables, they were aligned to only four bytes even with the GCC variable attribute __attribute__ ((aligned(8))). It turned out that aligned(8) does not mean the variable is aligned to 8 bytes from the start of memory, but from the top of the stack. Therefore the final patch was simple; it was just a question of aligning the stack to 8 bytes.
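
The arithmetic behind this can be shown with a tiny Python sketch. The stack address below is hypothetical and the helpers are made up for the example; the point is only that a slot placed at a multiple of 8 below a stack top that is itself only 4-byte aligned stays misaligned, while rounding the stack top down to 8 bytes first fixes it.

```python
def align_down(addr, alignment):
    """Round addr down to a multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return addr & ~(alignment - 1)


def is_aligned(addr, alignment):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


initial_sp = 0xA3FFFFFC           # hypothetical stack top, only 4-byte aligned
slot_before = initial_sp - 8      # a 64bit slot 8 bytes below the stack top
fixed_sp = align_down(initial_sp, 8)
slot_after = fixed_sp - 8         # the same slot once the stack top is aligned
```

Here slot_before is not 8-byte aligned, so an LDRD/STRD on it would be a problem, whereas slot_after is.
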

Also, there was one more patch that caused problems for users on the PXA CPUs, and not only the PXA3xx was affected. It was in the PXAMCI driver for the MMC card. The first attempt to detect the card timed out in most cases. It was because the driver had different delays for PXA27x than for PXA25x. Removing these shorter loops for PXA27x fixed the problem.

Voipac
I established cooperation with Voipac Technologies a.s. This was actually started by a friend of mine who lent me their hardware and said it's a pretty interesting box. The software was too limited though, meaning the preinstalled u-boot only allowed TFTP boot, there were no patches for the VPAC270 board in the mainline kernel etc.

So I prepared a patchset for the VPAC270 and pushed it mainline. The only patch remaining outside mainline is the one for SparseMEM support on this board. It probably needs to be properly reworked before merging, and this needs some further thought. The patches that made it in add pretty much all the functionality for the VPAC270 board.

Zipit Z2
I also prepared a patchset for the Aeronix Zipit Z2. The patchset made it into mainline, though it was a slow process because it spanned multiple kernel trees. There were drivers for various peripherals, namely a battery driver and an ASoC audio driver.

As for the ASoC driver, Mark Brown pointed out the WM8750 codec still hadn't been converted to the new API. I therefore converted it over and then pushed both the conversion patch and a vastly reworked Z2 ASoC audio driver.

Conclusion
And this is it. That's all. I guess I should update this place more often rather than creating such huge pieces of text once every few weeks.

Monday, April 5, 2010

ZipitZ2 U-Boot update

I was recently asked by Magon if I could put together a u-boot for his Z2 that could be operated without a serial console. It took a little bit of investigation to figure out that u-boot can run scripts from a memory card. There still was a problem with the memory card though.

The PXAMCI is known to be crappy, though fixing this particular bug was simple. The problem was that the mmc init command had to be called twice (or multiple times, as some people reported) before the card was detected. The card was just not fast enough to respond to commands, so increasing the delays in the initialization routine fixed the problem.

Next was the scripting. I put a small conditional statement into the u-boot bootcmd variable which tries starting the MMC card and, in case the card is there, starts a script called uboot.script from it. If the card is not present, it tries to start the kernel from 0x60000.

The script for u-boot has to be compiled in a little special way. For that, there's a utility called mkimage. This utility takes input files, adds CRC and various other goodies and spits out an image that u-boot can use. To compile uboot.script, run the command like this (this is a one-liner):

mkimage -A arm -O u-boot -T script -C none -a 0 -e 0 -n uboot.script -d SOURCEFILE uboot.script

where SOURCEFILE is the file with the script and uboot.script is the result that goes onto the memory card. For example, my script that only boots the kernel looks like this:

if fatload mmc 0 0xa0000000 uImage ; then
bootm 0xa0000000 ;
fi
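
If you build such scripts often, the mkimage invocation shown above can be wrapped in a small helper. The Python below is just a convenience sketch: it only assembles the exact argument list from the one-liner above (the helper name is made up) and leaves running it, for example through subprocess.run, to the caller.

```python
def mkimage_script_argv(source_file, output_file="uboot.script"):
    """Build the mkimage argument list used above to wrap a U-Boot script."""
    return [
        "mkimage",
        "-A", "arm",        # target architecture
        "-O", "u-boot",     # operating system field
        "-T", "script",     # image type: script
        "-C", "none",       # no compression
        "-a", "0",          # load address
        "-e", "0",          # entry point
        "-n", output_file,  # image name, same as in the one-liner above
        "-d", source_file,  # input file with the script source
        output_file,        # resulting image for the memory card
    ]
```

Joining the list with spaces gives back the one-liner shown earlier, with SOURCEFILE replaced by your own file name.
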

As for the updated u-boot, the patches and a binary are available here:
Compiling and flashing (start from pt. 5) instructions:
  1. Apply all three patches in correct order (use git am)
  2. make CROSS_COMPILE=arm-crosscc- clean mrproper
  3. make CROSS_COMPILE=arm-crosscc- zipitz2_config
  4. make CROSS_COMPILE=arm-crosscc- -j9
  5. Flash u-boot.bin to 0x00 and erase 0x40000 (for environment) in your flash. In old u-boot, run:
    1. loady 0xa0000000 (here send the u-boot.bin over Ymodem)
    2. protect off all
    3. erase 0x00 +0x20000
    4. erase 0x40000 +0x20000
    5. cp.b 0xa0000000 0x0 0x1ea88
    6. reset
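
Before running the erase and cp.b steps with a binary you built yourself, it is worth checking that the image still fits into the erased area and that the two erased areas do not overlap. The Python below is a small sanity-check sketch using the numbers from the steps above; the helper names are made up, and the image size 0x1ea88 applies only to the binary from this post.

```python
UBOOT_OFFSET = 0x00000   # where u-boot.bin is copied to
UBOOT_ERASE = 0x20000    # length erased for the bootloader
ENV_OFFSET = 0x40000     # where the environment lives
ENV_ERASE = 0x20000      # length erased for the environment
IMAGE_SIZE = 0x1EA88     # byte count passed to cp.b in the steps above


def fits_in_erased_area(image_size, erase_len):
    """True if a non-empty image of image_size bytes fits in erase_len bytes."""
    return 0 < image_size <= erase_len


def ranges_overlap(start_a, len_a, start_b, len_b):
    """True if the half-open ranges [start, start + len) intersect."""
    return start_a < start_b + len_b and start_b < start_a + len_a
```

With the values above, the image fits and the bootloader and environment areas stay separate; a larger self-built u-boot.bin would need a correspondingly larger erase length and cp.b count.
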