Entirely polled V-USB? [yes, it works!]

Posted: Tue Nov 19, 2013 9:49 pm
by blargg
Any reasons an entirely polled V-USB wouldn't work? I gave it a try but got host errors (polling the interrupt flag between usbPoll() calls).

Fundamentally, you've got the loop that calls usbPoll(), and periodic interrupts that run the timing-critical I/O code. The I/O interrupts come at regular intervals. If you could arrange to do slightly less work between two of them than fits in the time until the next one, you could poll the interrupt flag instead.

Polling shouldn't be a timing problem since the comments say that you have many tens of cycles before the ISR needs to actually fire (a polling loop maybe adds 7 cycles).

The question is how much work you need to do between the USB interrupts. Apparently it's not just a little bit, or my test would have worked. And also the I/O handling eats into some of the time, depending on how much data is being exchanged. I wonder whether the CRC routines eat a lot during initial configuration when several packets are being sent back to the host.

I became interested in this after reading that someone had implemented USBaspLoader on an ATtiny chip that lacked a separate set of vectors for the bootloader, and thus had to share the vectors with the application. If V-USB could be made to work with polling, the bootloader would only need to share the reset vector, which is far easier to share.

Re: Entirely polled V-USB?

Posted: Sat Nov 23, 2013 2:55 pm
by cpldcpu
That would actually be a really useful approach for micronucleus (https://github.com/micronucleus/micronucleus), which is probably one of the bootloaders you are referring to. Apart from USB handling and writing the flash, the CPU is basically idling. Interrupts have to be disabled during flash writing anyhow. So there is nothing to lose.

What exactly did you try?

Re: Entirely polled V-USB?

Posted: Sat Nov 23, 2013 8:54 pm
by blargg
The basic approach (with interrupts always disabled of course, and the interrupt handler modified to end in RET instead of RETI):

Code: Select all

for ( ;; )
{
    USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
    while ( !(USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
        { }
   
    USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
    INT0_vect();
   
    usbPoll();
}

I tried variations, like calling usbPoll() a few times in a row between waits, modifying usbPoll() to return after each potentially-lengthy task:

Code: Select all

USB_PUBLIC void usbPoll(void)
{
    ...
    if ( len >= 0 )
    {
        ...
        usbProcessRx(usbRxBuf + USB_BUFSIZE + 1 - usbInputBufOffset, len);
        ...
        return; // ** ADDED
    }
   
    if ( usbTxLen & 0x10 && usbMsgLen != USB_NO_MSG )
    {
        usbBuildTxBlock();
        return; // ** ADDED
    }
    ...
}

I also tried putting the above USB_INTR_PENDING_BIT wait loop into the asm just before the interrupt handler.

The fast CRC routine was enabled.

On the host it starts to negotiate connection, but gets errors. I guess I need to add some timing checks to the code to see how long usbPoll() is taking, or just look at it on an o-scope.

I hadn't seen your bootloader project, and its existence (as well as wide adoption) gives me further enthusiasm to get this working.

Re: Entirely polled V-USB?

Posted: Sat Nov 23, 2013 9:19 pm
by cpldcpu
Quite a nice way of hijacking the interrupt. My first thought was to poll the port directly. But this opens an interesting opportunity to debug: You could read the interrupt flag after usbPoll() returns to see whether any transmission was missed.
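
A minimal host-side sketch of that diagnostic, under stated assumptions: `sim_flag_reg`, `SIM_PENDING_BIT`, `late_count`, and `check_after_poll()` are hypothetical names used to model the idea on a PC and are not part of V-USB. On the AVR the check would read USB_INTR_PENDING directly, right after usbPoll() returns and before the main loop clears the flag again.

```c
#include <stdint.h>

/* Hypothetical stand-in for the AVR interrupt flag register. */
uint8_t sim_flag_reg;
#define SIM_PENDING_BIT 6

/* Number of times USB activity was found pending after usbPoll(). */
unsigned late_count;

/* Nonzero if the simulated interrupt flag is set. */
int flag_pending(void)
{
    return (sim_flag_reg >> SIM_PENDING_BIT) & 1;
}

/* Call immediately after usbPoll() returns. A set flag means a USB edge
   arrived while usbPoll() was running, so the polled loop may have been
   too late to service it. The flag is deliberately left set here so the
   main loop can still handle it (on real hardware it would later be
   cleared by writing a 1 to the bit). */
void check_after_poll(void)
{
    if (flag_pending())
        late_count++;
}
```

A nonzero `late_count` would not prove a packet was lost, only that activity overlapped with usbPoll(); it is a hint about where to look with a scope.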

Btw, I am not the original author of micronucleus. Bluebie is. I am just playing a bit around with it, trying to improve it.

Re: Entirely polled V-USB?

Posted: Sat Nov 23, 2013 9:45 pm
by blargg
Yeah, I was going to just poll the port when I realized that the interrupt flag tells us whenever the interesting conditions occurred without having to dig around to see which edge it wants the interrupt on, etc. (and as you noted, past-tense as well; will have to try that as a diagnostic). I am also working on improving the current bootloaders (almost done with USBaspLoader).

Re: Entirely polled V-USB?

Posted: Sun Nov 24, 2013 12:06 am
by blargg
OK, after thinking it was working, then that I was merely running the bootloader and being fooled, I'm sure it's working now. The critical change was a check for the interrupt in the small loop in usbPoll(). It's odd because that loop only checks a register, so the loop should only take a few microseconds. I'm still desiring a better understanding of why it's so critical.

Code: Select all

USB_PUBLIC void usbPollInterrupt( void )
{
    if ( (USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
    {
        USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
        INT0_vect();
    }
}

USB_PUBLIC void usbPoll(void)
{
    ...
    for(i = 20; i > 0; i--){
   
        usbPollInterrupt(); // *** Added
       
        uchar usbLineStatus = USBIN & USBMASK;
        if(usbLineStatus != 0)  /* SE0 has ended */
            goto isNotReset;
    }
    ...
}

I also put a call to usbPollInterrupt() in a modified boot_spm_busy_wait() when writing flash:

Code: Select all

void my_spm_busy_wait( void )
{
    do {
        usbPollInterrupt();
    }
    while ( boot_spm_busy() );
}

I believe some flash operations take ~4ms, so without this the USB interrupt would time out (I tried without this and flash writing did fail).

I've run some 4K programs now using this polled version of USBaspLoader and it's gotten no errors, nothing bad on dmesg. Glad you posted some encouragement to keep trying!

I hope you don't mind if I take a crack at making micronucleus polled (though I lack a tiny85, just have atmega8).

EDIT: Was too much to try to adapt the micronucleus code, so I just adapted USBaspLoader to micronucleus' protocol and it worked. Not handling the vector modification, it's 1636 bytes of code for 16.5MHz, though that lacks the OSCCAL stuff you need for the tiny85. My plan is to see if the vector modification can just be done on the host side, keeping the code minimal on the avr side.

Re: Entirely polled V-USB?

Posted: Sun Nov 24, 2013 2:20 pm
by cpldcpu
Congrats on getting it to work!

The small loop to detect the reset condition is odd. It should exit anyhow if no reset was asserted? And if a reset was asserted, it should not enter the IRQ. So I don't understand why you had to add the polling loop there. Shouldn't it be sufficient to poll at "isNotReset"?

Re: Entirely polled V-USB?

Posted: Mon Nov 25, 2013 4:44 am
by blargg
I solved the mystery, and now feel much more confident about polling. I now see it as a legitimate alternative to an interrupt where you know the timing of your code well and aren't doing much. It might help for really constrained systems.

Here is what I found with a scope. First, I modified the code to output a signal to compare on the scope with USB data:

Code: Select all

for ( ;; )
{
   DDRD |= 2;
   PORTD |= 2; // scope signal = +5V

   // Wait for interrupt
   while ( !(USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
      { }
   
   PORTD &= ~2; // scope signal = 0V

   USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
   INT0_vect();

   DDRD &= ~2; // Scope signal = 2.5V (I have two 1K resistors to +5V and GND, forming a divider when the pin is an input)

   usbPoll();
   // usbPoll() internally does this:
   for ( n = 20; n; n-- )
   {
      if ( USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT) )
      {
         USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
         INT0_vect();
      }
      ...
   }
}

It outputs +5V while waiting for an interrupt, 0V when processing it, and 2.5V when processing the received data (usbPoll() and whatever it calls). But also, usbPoll() might check for further interrupts and process them, so those are included. Those aren't important for what I found.
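
The three-level tracer can be modelled on a host PC as a small sketch. The function name `tracer_millivolts` is made up for illustration, and the model assumes a 5 V supply, two equal divider resistors, and that the AVR's internal pull-up is negligible next to the 1K resistors.

```c
#include <stdint.h>

/* Approximate voltage (in millivolts) on the traced pin for given
   DDR and PORT register values. `mask` selects the pin, e.g. 2 for
   bit 1 of port D as in the code above. */
unsigned tracer_millivolts(uint8_t ddr, uint8_t port, uint8_t mask)
{
    if (ddr & mask)                         /* pin is driven as an output */
        return (port & mask) ? 5000u : 0u;  /* high = +5V, low = 0V */
    return 2500u;                           /* pin is an input: divider sets mid-rail */
}
```

With `mask` set to 2, DDR and PORT both set gives 5000 (waiting), DDR set with PORT clear gives 0 (in the handler), and DDR clear gives 2500 (in usbPoll()), which matches the three states described above.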

I then had a version of USBaspLoader running with avrdude constantly doing a flash dump back to the host, so it was sending lots of data continuously.

Image

The scale here is 20 us/division. A is this tracer signal, B is USB.

At 0.5 divisions, A goes from +5V to 0V, where it's noticed USB and is running V-USB's interrupt routine. At 1.75 divisions, the interrupt routine finishes and we call usbPoll(). But note how further USB data comes in, which apparently usbPoll() catches because it takes a while to return. At 5.2div, you can see V-USB responding to the host, as V-USB's USB signal level is slightly higher than the host's. We go back to waiting for the next interrupt at 6.0div.

So the problem was that sometimes we'd get more USB data just after we had finished handling the current data. Thus, that extra check for a new interrupt in usbPoll() was just what was needed to catch it in time.

So I simply moved that equivalent loop into my main loop:

Code: Select all

for ( ;; )
{
   DDRD |= 2;
   PORTD |= 2; // scope signal = +5V

   // Wait for interrupt
   while ( !(USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
      { }
   
   PORTD &= ~2; // scope signal = 0V

   USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
   INT0_vect();
   
   // Wait a little while for any more USB interrupts
   for ( n = 50; n; n-- )
   {
      if ( USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT) )
      {
         USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
         INT0_vect();
      }
   }
   
   DDRD &= ~2; // Scope signal = 2.5V

   usbPoll(); // now usbPoll never checks the interrupt flag
}


I also changed V-USB's code back to the stock code, except for that one change of RETI to RET (I could probably disable INT0 before calling the interrupt handler, then do a CLI and re-enable INT0 after and be able to make zero changes to it for polling to work). It works just as well as before (I'd think better, since it's now not missing any USB interrupts), and looks great on the scope:

Image

It starts handling interrupts at 0.5div, and doesn't stop checking until 7.5div, well after USB activity is done. So it doesn't miss anything and is always ready to respond immediately. Then it spends 1.5div (30 us) processing things before going back to waiting for the next USB interrupt.

Finally, an overview of timing (and a demonstration of my camera's white balance making the green trace look blue):

Image

Here it's 0.2 ms/div. You can see that for every 1 ms USB interrupt, it only spends about 0.1 ms (10%) in handling USB, and about 0.05 ms (5%) doing other calculation before it goes to waiting for the next USB interrupt. So there's plenty of time for doing calculations each USB interrupt between polls. I forget how fast these microcontrollers are (and this is on a 12MHz part as well).
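
To put those percentages in cycle terms, here is a small arithmetic sketch. The helper name `cycles_in_us` is hypothetical; the 12 MHz clock and the 1 ms, 0.1 ms, and 0.05 ms figures are the ones read off the scope above.

```c
#include <stdint.h>

/* CPU cycles that elapse in `us` microseconds at a clock of `f_cpu_hz`.
   The multiplication is done in 64 bits so it cannot overflow. */
uint32_t cycles_in_us(uint32_t f_cpu_hz, uint32_t us)
{
    return (uint32_t)(((uint64_t)f_cpu_hz * us) / 1000000u);
}
```

At 12 MHz this gives 12000 cycles per 1 ms interval, of which roughly 1200 go to USB handling and 600 to other processing, leaving on the order of 10200 cycles free before the next USB interrupt.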

EDIT: I've improved the code slightly; now it waits a little less time for USB activity, but if it finds some, resets the delay for any further activity:

Code: Select all

for ( ;; )
{
    while ( !(USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
        { }
   
    for ( ;; )
    {
        USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
        INT0_vect();
   
        unsigned char n = 20;
        while ( !(USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
            if ( !--n )
                goto usb_idle;
    }
usb_idle:
   
    usbPoll();
}

In the wait loop while flash activity is going, it's a slight variation:

Code: Select all

while ( whatever_were_waiting_for )
{
    if ( (USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
    {
        for ( ;; )
        {
            USB_INTR_PENDING = 1<<USB_INTR_PENDING_BIT;
            INT0_vect();
   
            unsigned char n = 20;
            while ( !(USB_INTR_PENDING & (1<<USB_INTR_PENDING_BIT)) )
                if ( !--n )
                    goto usb_idle;
        }
    usb_idle:;
    }
}


I specifically do not want to call usbPoll() from within one of its callbacks, because it could become recursive.
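
The retriggering wait in the edited loop above can be tried out on a host PC with a simulated flag. This is only a sketch under assumptions: `sim_t`, `sim_flag_set()`, and `service_burst()` are hypothetical names, and the trace array stands in for what the real code would see when it reads USB_INTR_PENDING.

```c
#include <stdint.h>
#include <stddef.h>

/* Simulated flag source: pending[i] is nonzero if the interrupt flag
   would be found set on the i-th poll. After the trace ends the bus
   is treated as idle. */
typedef struct {
    const uint8_t *pending;
    size_t len;
    size_t pos;        /* number of polls consumed so far */
    unsigned handled;  /* number of times the handler ran */
} sim_t;

int sim_flag_set(sim_t *s)
{
    if (s->pos >= s->len)
        return 0;
    return s->pending[s->pos++];
}

/* Mirrors the inner for(;;) loop: run the handler, then poll the flag
   up to idle_polls times. Any hit runs the handler again and restarts
   the countdown. Returns the number of handler runs once the bus has
   stayed quiet for idle_polls consecutive polls. */
unsigned service_burst(sim_t *s, unsigned char idle_polls)
{
    for (;;) {
        s->handled++;  /* stands in for clearing the flag and calling INT0_vect() */

        unsigned char n = idle_polls;
        while (!sim_flag_set(s))
            if (!--n)
                return s->handled;  /* idle: the caller would now run usbPoll() */
    }
}
```

With a countdown of 3, a second packet that shows up on the third poll is handled in the same burst, while one that shows up on the fourth poll is left for the next pass through the main loop. The real value of 20 was chosen from scope measurements, so this model says nothing about what count is safe on hardware.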

Re: Entirely polled V-USB? [yes, it works!]

Posted: Mon Nov 25, 2013 5:38 pm
by ulao
Trying to follow along here. I'm wondering if this would be an effective way to cheat the fact that Windows takes 8-10 ms between polls on an interrupt transfer. Or am I way off here?

Re: Entirely polled V-USB? [yes, it works!]

Posted: Mon Nov 25, 2013 8:15 pm
by blargg
Yes, you're way off :)

This doesn't alter USB behavior, rather simply allows one to avoid using hardware interrupts on the AVR side. The main use is in a bootloader on an AVR without a separate vector table, where you want to preserve the user program's vectors and not hijack the INT0 handler (or rather pin change as I believe it is on an attiny).

It might also be useful for implementing a USB handler that doesn't have to preserve registers, and can rely on some being preloaded. Perhaps for an 8MHz implementation of V-USB (which I'm going to research today).

Re: Entirely polled V-USB? [yes, it works!]

Posted: Mon Nov 25, 2013 8:38 pm
by ulao
Ok, got it! Thx for that summary. Good luck on the 8 MHz!

Re: Entirely polled V-USB? [yes, it works!]

Posted: Tue Nov 26, 2013 7:09 am
by cpldcpu
Thanks for the explanation. I will try to implement this in the bootloader.

I like the debugging by oscilloscope. I am using a bit-banged, maximum-speed SPI implementation as debug output on two unused pins. Then I use a logic analyzer to capture both the SPI and USB traffic to get real-time debugging information.

Re: Entirely polled V-USB? [yes, it works!]

Posted: Fri Nov 29, 2013 8:31 pm
by cpldcpu
Ok, managed to get it working as well.

1) I did not realize your routine was jumping to INT0 and my code used the pin change interrupt PCINT0. One of the most annoying bugs ever! Whenever USB traffic appeared the device reset and I could not figure out why.

2) I removed all occurrences of SEI in the code and the first instruction after reset is CLI. Yet the code fails if the PCINT0 vector is not initialized, meaning that an IRQ is asserted somewhere. Really odd.

Re: Entirely polled V-USB? [yes, it works!]

Posted: Fri Nov 29, 2013 9:04 pm
by blargg
So the .lss (from avr-objdump, not gcc) file doesn't show any occurrences of SEI? I wouldn't trust source code.

Anything mis-restoring the status register?

Oh, and you've changed RETI to RET in asmcommon.inc, right? :)

Re: Entirely polled V-USB? [yes, it works!]

Posted: Sat Nov 30, 2013 10:47 am
by cpldcpu
Yes, yes and yes :)

My guess is also on status register restoring, but I reviewed the disassembly and found nothing fishy. Looks like I have to dig further.