Minimal USB implementation
Posted: Sat Jan 04, 2014 5:16 pm
by cpldcpu
To understand how to optimize the memory footprint of V-USB, I created a small ATtiny85-based device that controls a single WS2812 RGB LED via USB. This is very similar to the Blink[1] and other devices.
Current functionality
* Enumeration.
* Only SETUP requests can be received.
* Responses are limited to strings from flash memory or zero-sized replies.
* All SETUP packets that are not system requests are forwarded to a WS2812 RGB LED on PB0.
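The last point depends on telling "system" requests apart from everything else. A minimal sketch of that check, assuming "system requests" means USB standard requests, which are marked by bits 6..5 of bmRequestType being zero. The helper name is hypothetical and not taken from the u-wire source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 6..5 of bmRequestType (byte 0 of the 8-byte SETUP packet) encode the
 * request type: 0 = standard, 1 = class, 2 = vendor. */
#define REQUEST_TYPE_MASK 0x60u

/* Hypothetical helper, for illustration only: returns true for standard
 * requests (e.g. GET_DESCRIPTOR, SET_ADDRESS), false for class/vendor ones. */
static bool is_standard_request(const uint8_t setup[8])
{
    return (setup[0] & REQUEST_TYPE_MASK) == 0;
}
```

Under this assumption, anything for which the helper returns false would be the data passed on to the LED.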
I went as far as stripping down all the code from usbdrv.c and integrating it into a single function. I removed all code that was not required for the core functionality and combined some of the remaining functions:
* Removed the data section and initialized variables "by hand".
* Turned global SRAM variables into local variables in the main loop.
* Reduced input buffer to single size.
* Removed handling of USB reset.
* Used the 16 MHz V-USB code instead of the 16.5 MHz version.
* Included an assembler implementation of osccal.c.
My ultimate goal was to implement the code on an ATtiny10. Currently this is not possible, because not enough SRAM is left for the stack and not enough flash is left for the 12 MHz V-USB implementation.
Current resource usage:
* 1018 bytes Flash
* 28 bytes SRAM
* Uses only regs R16-R31
You can find the code here:
https://github.com/cpldcpu/u-wire
Maybe somebody has an idea how to reduce the SRAM or flash footprint further?
Re: Minimal USB implementation
Posted: Sun Jan 05, 2014 7:14 am
by blargg
Very interesting. Your effort should also be useful to someone learning how USB works hands-on, as it clears away all but the essentials.
I've noticed that calling assembly routines frustrates the C optimizer, because it must assume the worst about which registers are preserved. Here is an attempt I made at inlining the CRC routine and telling the compiler which registers it trashes (though it may get this wrong, as I still find the asm constraint syntax confusing):
Code:
static inline __attribute__((always_inline)) void usbCrc16Append( volatile unsigned char* data, unsigned char len )
{
    /* Computes the USB CRC16 over data[0..len-1] and stores it, low byte first,
       right after the data. Relies on r1 being zero (avr-gcc's __zero_reg__). */
    asm volatile (
        "\n ldi r20, 0xFF   ; r21:r20 = running CRC, starts at 0xFFFF"
        "\n ldi r21, 0xFF"
        "\n rjmp 2f"
        "\n1:               ; per-byte loop (numeric labels stay unique when inlined)"
        "\n ld r18, Z+"
        "\n eor r18, r20    ; r18 is now 'x' in table()"
        "\n mov r19, r18    ; compute parity of 'x'"
        "\n swap r18"
        "\n eor r18, r19"
        "\n mov r20, r18"
        "\n lsr r18"
        "\n lsr r18"
        "\n eor r18, r20"
        "\n inc r18"
        "\n andi r18, 2     ; r18 is now parity(x) << 1"
        "\n cp r1, r18      ; c = (r18 != 0), then put in high bit"
        "\n ror r19         ; so that after xoring, shifting, and xoring, it gives"
        "\n ror r18         ; the desired 0xC0 with r21"
        "\n mov r20, r18"
        "\n eor r20, r21"
        "\n mov r21, r19"
        "\n lsr r19"
        "\n ror r18"
        "\n eor r21, r19"
        "\n eor r20, r18"
        "\n2:               ; loop test"
        "\n subi %1, 1      ; subi only works on r16..r31, hence the d constraint"
        "\n brsh 1b"
        "\n com r20"
        "\n com r21"
        "\n st Z+, r20"
        "\n st Z, r21"
        "\n"
        : "+z" (data), "+d" (len)   /* both are read and modified */
        :
        : "memory", "cc", "r18", "r19", "r20", "r21" );
}
Also, that's for the optimized CRC routine, so you'll want to convert the slower, shorter one.
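For comparison, the slower, shorter variant is essentially a bit-by-bit CRC. A portable C sketch of it, assuming the usual USB data CRC16 parameters (reflected polynomial 0xA001, initial value 0xFFFF, result complemented) and the same low-byte-first append order as the two st instructions above. The function names are hypothetical and this is not the V-USB source.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC16 as used for USB data packets (assumed parameters:
 * reflected polynomial 0xA001, init 0xFFFF, final complement). */
static uint16_t usb_crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    while (len--) {
        crc ^= *data++;
        for (uint8_t bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0xA001);
            else
                crc >>= 1;
        }
    }
    return (uint16_t)~crc;
}

/* Appends the CRC after the payload, low byte first.
 * The buffer must have room for len + 2 bytes. */
static void usb_crc16_append(uint8_t *data, size_t len)
{
    uint16_t crc = usb_crc16(data, len);
    data[len] = (uint8_t)(crc & 0xFF);
    data[len + 1] = (uint8_t)(crc >> 8);
}
```

With these parameters the ASCII string "123456789" gives 0xB4C8, the published check value for this CRC variant, which makes a handy sanity test for any hand-optimized version.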
Re: Minimal USB implementation
Posted: Tue Jan 14, 2014 7:28 pm
by cpldcpu
Update: I managed to get it to work on a meager ATtiny10!
> Here was an attempt I made at inlining the CRC routine and communicating what registers it trashed
Excellent idea. In fact, I ran into a lot of trouble with registers on the ATtiny10. I will look into this.
Re: Minimal USB implementation
Posted: Sat Jan 18, 2014 9:52 am
by cpldcpu
I tried inlining the CRC routine. Unfortunately, it only saved two bytes.
Re: Minimal USB implementation
Posted: Sat Jan 18, 2014 12:44 pm
by stf92
Why don't you include the CRC routine in usbdrvasm.S? Christian Starkjohann did this in his first implementations. That way the compiler will have nothing to do with it and the code will be minimal.
Re: Minimal USB implementation
Posted: Sat Jan 18, 2014 12:59 pm
by cpldcpu
That is where it is right now. The idea was that inlining saves code space.
Re: Minimal USB implementation
Posted: Sun Jan 19, 2014 4:30 am
by stf92
You mean call overhead, I see!
Re: Minimal USB implementation
Posted: Sun Jan 19, 2014 7:55 am
by blargg
Actually not so much call overhead, but richer information to the optimizer about exactly what registers are modified. It could probably also be written to let the compiler assign all the registers it uses.
Re: Minimal USB implementation
Posted: Sun Jan 19, 2014 5:14 pm
by cpldcpu
blargg wrote: Actually not so much call overhead, but richer information to the optimizer about exactly what registers are modified. It could probably also be written to let the compiler assign all the registers it uses.
I wish.
I have not found a way to define variables in the assembler code without having the compiler initialize them.
Re: Minimal USB implementation
Posted: Mon Jan 20, 2014 2:44 am
by blargg
Can you just set them as out variables? That would force the compiler to give them registers and let it know that you're modifying them.
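One way to read this suggestion, sketched for avr-gcc: a scratch variable that is listed only as an early-clobber output ("=&r") is given a register by the compiler but is not loaded beforehand, so it can stay uninitialized in C. The helper name and the non-AVR fallback branch (there only so the snippet can be tried on a PC) are assumptions for illustration. Whether this avoids the extra initialization code in the actual driver is untested.

```c
#include <stdint.h>

/* Hypothetical helper: XORs a byte with its nibble-swapped copy, similar to
 * the first parity steps of the CRC routine. */
static inline uint8_t fold_nibbles(uint8_t x)
{
#if defined(__AVR__)
    uint8_t tmp;  /* deliberately uninitialized: only ever written by the asm */
    __asm__ (
        "mov  %[tmp], %[x]  \n\t"
        "swap %[tmp]        \n\t"
        "eor  %[x], %[tmp]  \n\t"
        : [x] "+r" (x),       /* read and modified */
          [tmp] "=&r" (tmp)   /* output only, compiler picks the register */
    );
    return x;
#else
    /* Portable fallback with the same result, for testing off-target. */
    uint8_t tmp = (uint8_t)((x << 4) | (x >> 4));
    return (uint8_t)(x ^ tmp);
#endif
}
```

Because no register is named explicitly, the clobber list stays empty and the optimizer is free to place x and tmp wherever it has room.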
Re: Minimal USB implementation
Posted: Tue Jan 21, 2014 9:52 am
by cpldcpu
Whenever I tried that, the compiler would also initialize the variables, which took up more space than it saved.
Re: Minimal USB implementation
Posted: Wed Jan 22, 2014 2:53 am
by blargg
At least with avr-gcc 4.5.3, I think I was able to silence the optimizer warnings by initializing a variable with itself, e.g. char c = c;. Too bad there are so many snags to inlining assembly, as otherwise it could be possible to get more optimal code by using just C rather than mixing it with assembly.
Re: Minimal USB implementation
Posted: Wed Jan 22, 2014 5:51 pm
by cpldcpu
That looks like something that would break arbitrarily with new compiler versions.
Re: Minimal USB implementation
Posted: Wed Mar 19, 2014 10:31 pm
by cpldcpu