
Sometimes it does ... sometimes it doesn't

Posted: Thu Jan 02, 2014 5:16 am
by rszemeti
I'm having a bit of a headache with a (modified) version of Automator ... and at least part of the problem seems to be erratic comms.

Running #lsusb -D <device path> or opening the device from a random Perl script will either yield the manufacturer name, or the product name, or both, or neither ... depending on how it happens to feel at the time.

I ran usbmon and mounted the kernel debugfs and got this example of a partially failed transaction:

Code:

daacee00 3258841209 S Ci:6:028:0 s 80 06 0300 0000 00ff 255 <
daacee00 3258847165 C Ci:6:028:0 0 4 = 04030904
daacee00 3258847184 S Ci:6:028:0 s 80 06 0301 0409 00ff 255 <
daacee00 3258856170 C Ci:6:028:0 -84 16 = 1a037700 77007700 2e006f00 62006400
f0b1d980 3258856251 S Ci:6:028:0 s 80 06 0300 0000 00ff 255 <
f0b1d980 3258862165 C Ci:6:028:0 0 4 = 04030904
f0b1d980 3258862185 S Ci:6:028:0 s 80 06 0302 0409 00ff 255 <
f0b1d980 3258873169 C Ci:6:028:0 0 20 = 14034100 75007400 6f006d00 61007400 6f007200
f0b1d980 3258873264 S Ci:6:028:0 s 80 06 0300 0000 00ff 255 <
f0b1d980 3258880172 C Ci:6:028:0 0 4 = 04030904
f0b1d980 3258880215 S Ci:6:028:0 s 80 06 0300 0409 00ff 255 <
f0b1d980 3258886165 C Ci:6:028:0 0 4 = 04030904
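
For readers following the capture: each "S" line carries the control setup packet as five hex words after the "s" marker (bmRequestType, bRequest, wValue, wIndex, wLength). The following is a minimal Python sketch of how those words could be unpacked, assuming the usbmon text format shown above; the function name decode_setup and the dictionary keys are made up for illustration.

```python
def decode_setup(line):
    """Unpack the setup packet from a usbmon 'S' (submission) line."""
    fields = line.split()
    # Layout: tag, timestamp, event, address, 's', then the five setup words.
    if fields[2] != "S" or fields[4] != "s":
        raise ValueError("not a control submission with a setup packet")
    bm_request_type, b_request, w_value, w_index, w_length = (
        int(word, 16) for word in fields[5:10]
    )
    return {
        "direction": "device-to-host" if bm_request_type & 0x80 else "host-to-device",
        "request": "GET_DESCRIPTOR" if b_request == 0x06 else hex(b_request),
        # For GET_DESCRIPTOR, wValue holds the descriptor type (high byte)
        # and the descriptor index (low byte).
        "descriptor_type": w_value >> 8,
        "descriptor_index": w_value & 0xFF,
        "language_id": w_index,
        "max_length": w_length,
    }


setup = decode_setup("daacee00 3258847184 S Ci:6:028:0 s 80 06 0301 0409 00ff 255 <")
print(setup)
```

Read this way, the request is a GET_DESCRIPTOR for string descriptor (type 3) index 1, language ID 0x0409, with room for up to 255 bytes.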


the line:

"daacee00 3258856170 C Ci:6:028:0 -84 16 = 1a037700 77007700 2e006f00 62006400"

appears to be the troublesome one ... it seems to be saying it's going to send 0x1A (26) bytes, I can see it manages "www.obd" (16 bytes) and then seems to crap out ... not sure what the -84 code is .. I guess it's an error code for something?
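
The data words on that completion line can be read as a USB string descriptor: the first byte is bLength, the second is bDescriptorType (0x03 for a string), and the remainder is UTF-16LE text. Here is a minimal Python sketch of that decoding, assuming the hex words are copied by hand from the usbmon output; the function name decode_string_descriptor is hypothetical.

```python
def decode_string_descriptor(hex_words):
    """Decode usbmon data words as a (possibly truncated) USB string descriptor."""
    raw = bytes.fromhex("".join(hex_words.split()))
    declared_len = raw[0]   # bLength: size the device says the descriptor has
    desc_type = raw[1]      # bDescriptorType: 0x03 means STRING
    payload = raw[2:]
    # Drop a dangling odd byte so a truncated transfer still decodes.
    payload = payload[: len(payload) - len(payload) % 2]
    text = payload.decode("utf-16-le")
    return declared_len, desc_type, len(raw), text


declared, dtype, received, text = decode_string_descriptor(
    "1a037700 77007700 2e006f00 62006400"
)
print(f"declared {declared} bytes, received {received}, text so far: {text!r}")
```

For the failed line this reports a declared length of 26 bytes against 16 actually received, with the text cut off at "www.obd".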

I've disabled other interrupts on the system (I'm running on INT2, so I've cleared INT0 and INT1) .. any clues where to start looking? It's a 3.3V system running through a pair of 68R resistors into the Linux box; the lines appear to go close to rail and look lovely and crisp on the scope, so I doubt it is levels ...

The initial setup seems to go OK: I can see it finds the device and works out what it is, vendor ID, product ID etc, and that all seems to go off without a hitch .. it's the later queries and anything from the fltk application that seem to be cursed ... any clues on where to begin to dig?

Re: Sometimes it does ... sometimes it doesn't

Posted: Thu Jan 02, 2014 7:48 pm
by blargg
Your modifications are a good place to start. Does the stock firmware work? Is your hardware any different? What clock source are you using?

Re: Sometimes it does ... sometimes it doesn't

Posted: Fri Jan 03, 2014 5:06 am
by rszemeti
I've tried with the stock 'Automator' code, stripped down to essentials, with all the events and polling removed except for the usbPoll(); loop ... and still get pretty much the same result. Disconnecting the JTAG debugger and running production rather than debug code seems to make a little improvement, but maybe I'm imagining it. Even so, just acquiring manufacturer and product from the device fails about 25% of the time, and that is core V-USB functionality.

It generally seems to be the same -84 error, which apparently is EILSEQ, a CRC error. Sometimes it is -71, EPROTO, a protocol error.
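
The status column in usbmon is a negated errno value, so it can be looked up rather than guessed. Here is a minimal Python sketch, assuming it is run on the same Linux host (errno numbering differs between operating systems); the helper name describe_urb_status is made up for illustration.

```python
import errno
import os


def describe_urb_status(status):
    """Translate a usbmon status field (0 or a negative errno) into a name."""
    if status == 0:
        return "OK"
    code = -status
    # errno.errorcode maps the platform's numeric codes to symbolic names.
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


for status in (0, -84, -71):
    print(status, "->", describe_urb_status(status))
```

On Linux, 84 is EILSEQ and 71 is EPROTO, which matches the two codes seen in the captures.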

The USB waveform looks really crisp and is going 0V to 3.3V rail to rail, or as close as you can imagine, the 1K5 pulls up to around 3V so it looks about right.

Just trying the standard Automator 'download file', I get it to work around 50% of the time ... still seeing protocol errors when trying to probe the manufacturer and product

Code:

f652f480 1711975673 S Ci:6:099:0 s 80 06 0301 0409 00ff 255 <
f652f480 1711990667 C Ci:6:099:0 0 26 = 1a037700 77007700 2e006f00 62006400 65007600 2e006100 7400


and next time

Code:

f23a0e80 1713236684 S Ci:6:099:0 s 80 06 0301 0409 00ff 255 <
f23a0e80 1713251665 C Ci:6:099:0 -71 26 = 1a037700 77007700 2e006f00 62006400 65007600 2e006100 7400


The probe is the same, the received packet is the same, but Linux seems to think there is a protocol error ... the notes I can find on the 'net say:

-EPROTO (*, **)   a) bitstuff error
                  b) no response packet received within the
                     prescribed bus turn-around time
                  c) unknown USB error

-EILSEQ (*, **)   a) CRC mismatch
                  b) no response packet received within the
                     prescribed bus turn-around time
                  c) unknown USB error

                  In cases b) and c) either -EPROTO or -EILSEQ
                  may be returned. Note that often the controller
                  hardware does not distinguish among cases a),
                  b), and c), so a driver cannot tell whether
                  there was a protocol error, a failure to respond
                  (often caused by device disconnect), or some
                  other fault.


So the packets look identical, but there is a protocol error ... I would say it's either timing related or something deeper in the handshake?
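
The comparison between the good and the bad transfer can be made mechanical. This is a minimal Python sketch, assuming the usbmon text format from the captures above, with the timestamp column in microseconds; the function name summarise is hypothetical.

```python
def summarise(submit_line, complete_line):
    """Return (status, latency_us, data_hex) for one submit/complete pair."""
    s = submit_line.split()
    c = complete_line.split()
    if s[0] != c[0] or s[2] != "S" or c[2] != "C":
        raise ValueError("lines are not a matching submission/completion pair")
    status = int(c[4])                 # 0 or a negative errno
    latency_us = int(c[1]) - int(s[1])  # completion time minus submission time
    # Data words follow the '=' marker, if any data was captured.
    data_hex = "".join(c[7:]) if len(c) > 6 and c[6] == "=" else ""
    return status, latency_us, data_hex


good = summarise(
    "f652f480 1711975673 S Ci:6:099:0 s 80 06 0301 0409 00ff 255 <",
    "f652f480 1711990667 C Ci:6:099:0 0 26 = 1a037700 77007700 2e006f00 62006400 65007600 2e006100 7400",
)
bad = summarise(
    "f23a0e80 1713236684 S Ci:6:099:0 s 80 06 0301 0409 00ff 255 <",
    "f23a0e80 1713251665 C Ci:6:099:0 -71 26 = 1a037700 77007700 2e006f00 62006400 65007600 2e006100 7400",
)
print("same data:", good[2] == bad[2])
print("status:", good[0], "vs", bad[0])
print("latency (us):", good[1], "vs", bad[1])
```

Both transfers return the same 26 bytes and complete in roughly 15 ms; only the status differs. That is consistent with the failure happening after the data stage, for instance in the handshake that closes the control transfer, though the capture alone cannot confirm it.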

Solved

Posted: Fri Jan 03, 2014 8:36 pm
by rszemeti
Sorted.

At some point I had uncommented the line:

// #define USB_INTR_CFG MCUCR

in usbconfig.h ... which caused it to work erratically ... for reasons not totally clear to me ... I think I did that while trying to configure for INT2

Anyway, commenting that line out again solved it, and it runs on INT2 just fine.