[pvrusb2] 24xx hardware instability.
Mike Isely
isely at isely.net
Fri Mar 31 19:49:36 CST 2006
On Fri, 31 Mar 2006, Martin Andrew Galese wrote:
> Hi Mike,
>
> I wanted to let you know that I'm still seeing occasions where the
> cx25840 gets wedged. The timing seems to be random, but definitely is
> related to the number of "shows" recorded by mythtv. The recording time
> doesn't seem to matter. As in, I can reliably record a continuous 6 hour
> show, but I can only actually record a series of 6, 1 hour shows, 10-15%
> of the time. After 3 distinct recordings (and almost never before) there
> seems to be a good chance that the cx25840 wedges.
>
> I had thought this was fixed with the mpeg2 garbage filter, but that
> only seemed to make the device somewhat more stable.
>
> Has anyone else had this issue? Have you seen it?
>
Martin:
The remaining problem I have seen to date in my testing with the new
hardware happens only when the hardware is first initialized. In my case
there's a definite probability that the cx25840 module fails to detect the
cx25843 chip. Unfortunately after that happens (and if you have the
msp3400 module in your system) then msp3400 might come along and falsely
detect the cx25840 as an msp3400. Then the msp3400 module goes batty
(because obviously this is not an msp3400 chip), generates lots of noise
in the log and then fails. And when it fails, frequently the kernel gets
corrupted - in fact at that point the outward behavior is the same as what
we just had to deal with involving the old hardware. (In the "old"
hardware case, the msp3400 failure cause was different but the endgame was
the same.) And before anyone (i.e. Hans) asks: No, I do _NOT_ know if
msp3400 is the cause for the kernel corruption. That needs to be chased.
All I can say is that the only times I have found the kernel corrupted
here have been after the msp3400 module fails.
Normally msp3400 should never falsely detect a cx25840 as an msp3400.
The msp3400 module does its detection by looking for revision info from
the chip, and under normal circumstances it won't get that info from the
cx25843 chip and we're fine. However, in the scenario I'm seeing, the
cx25840 module fails to detect the cx25843 chip because the chip is
spewing garbage data back to the host (usually 0x04 or 0x0a) for any
subaddress that is probed. Unfortunately that garbage looks like a valid
revision to msp3400, so msp3400 then comes into the fracas and really
screws things up.
The behavior of msp3400 going nuts and (apparently) corrupting the system
is collateral damage after the initial problem has happened (and it's the
same collateral damage from the bug I just fixed for the old hardware).
The root cause here involves figuring out why the cx25843 chip is spewing
garbage. I've already spent several days chasing that so far without
success. This is the last real problem I know of involving the new
hardware - everything else I understand. I will revisit this problem
after I finish dealing with issues surrounding getting the driver into the
kernel. FYI, there is also an issue getting wm8775 to detect correctly
(which is why that "force=-1,27" option is needed) but I already know a
clean way to fix that; I just need to implement the fix.
Anyway, I'm stating all this so you can examine the problem you're seeing
and maybe find some common ground. Note: If you manually modprobe cx25840
into the kernel with the option "debug=1" then you'll get useful info in
the log reporting that module's status with respect to the hardware.
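As a rough sketch of that modprobe step (the function name and the 40-line tail are just illustrative choices, not anything the driver requires), something like this should work when run as root before the pvrusb2 driver has pulled in cx25840 on its own:

```shell
# Hypothetical helper: load cx25840 with debugging on, then show what it logged.
# Note: if cx25840 is already loaded, modprobe does nothing and debug=1 is
# ignored, so the module has to be unloaded first in that case.
load_cx25840_debug() {
    modprobe cx25840 debug=1 || return 1
    # Show only the most recent kernel log lines that mention the module.
    dmesg | grep -i 'cx25840' | tail -n 40
}
```

Call it as `load_cx25840_debug` and then plug in (or re-plug) the device so the detection messages land in the log.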
Another trick you can do with the driver to help diagnose problems is just
simply to do this:
    cat /sys/class/pvrusb2/sn-xxxx/debuginfo
(Replace "xxxx" with your device serial number.) When you issue that cat
command two things will take place. First, you'll get a compact dump to
stdout reporting information about each I2C client module (e.g.
cx25840.ko, msp3400.ko, saa7115.ko, tuner.ko, etc) that has attached to
the driver. Second, this action will also trigger a LOG_STATUS request to
all attached I2C modules, and typically modules will respond by dumping a
blob of status info into the kernel log. You can do that cat command at
_any_ time; it is a non-destructive action. It's a quick 'n easy way to
get the pulse of the driver.
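If you have more than one device attached, or don't remember the serial number, the cat trick can be wrapped in a small shell function. This is only a sketch: the function name and the optional directory argument are made up for illustration; the sysfs path is the one given above.

```shell
# Hypothetical helper: dump debuginfo for every pvrusb2 device found.
# Optional argument: the sysfs class directory (defaults to the real one).
dump_pvrusb2_debuginfo() {
    root="${1:-/sys/class/pvrusb2}"
    found=0
    for f in "$root"/sn-*/debuginfo; do
        # Skip the unexpanded pattern when no device directory matches.
        [ -r "$f" ] || continue
        found=1
        echo "== $f =="
        cat "$f"
    done
    if [ "$found" -ne 1 ]; then
        echo "no pvrusb2 devices found under $root" >&2
        return 1
    fi
}
```

Running `dump_pvrusb2_debuginfo` and then looking at the tail of the kernel log (for example with `dmesg | tail -n 50`) should show both halves: the compact dump on stdout and whatever status the attached I2C modules wrote to the log.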
So far the problem I describe _only_ happens when the hardware is first
initialized. Once it is successfully initialized then it is stable from
that point forward (until you replug the device, power cycle, reinsert the
driver, etc). Sounds like the problem you are describing can happen long
after the hardware has been initialized. That's new behavior.
Admittedly I haven't tried any long duration burn-in tests with the new
hardware yet so maybe I just haven't seen this. I'll keep my eyes open
for it though.
It would be valuable information to learn if someone else is seeing this.
However don't anyone treat this as a "request" yet since I actually
haven't yet gotten around to officially documenting how to use the driver
with the new hardware :-)
-Mike
--
                   |         Mike Isely          |     PGP fingerprint
  Spammers Die!!   |                             | 03 54 43 4D 75 E5 CC 92
                   |   isely @ pobox (dot) com   | 71 16 01 E2 B5 F5 C1 E8
                   |                             |