[pvrusb2] Driver instability
Mike Isely
isely at isely.net
Tue Mar 28 20:02:45 CST 2006
There appears to be a problem in the pvrusb2 driver, and worse still I
think it's been lurking for a while. Lots of additional details further
down.
On Tue, 28 Mar 2006, Steven Karp wrote:
> On Tuesday 28 March 2006 1:21 pm, Barry Jett wrote:
>> I am happy to report success from the new snapshot on my Debian "Etch"
>> system and 2040:2400 Hauppauge box.
>
> Hmmm. Can somebody clarify if these steps:
Sorry - I'm in the middle of a bunch of non-trivial changes (needed for
the merge window which closes in a few days) so I haven't had the chance
to elaborate on this stuff. The web page is completely silent on this topic.
At the moment I've been less interested in getting new instructions
published and more interested in getting a snapshot out before I really
started slicing into the code (which I'm doing now).
>
>> 2) Apt-get install the kernel image and modify the FWSEND of
>> ./linux/drivers/media/video/cx25840/cx25840.c per Mike's instructions
>>
>> 3) Configure & compile the kernel but don't install. I know there is a way
>> to compile a single module but I've long since forgotten how.
>>
>> 4) Rename the existing cx25840.ko (like cx25840.ko.old) and move the newly
>> compiled module in its place.
>>
>> 7) After reboot, do a 'rmmod wm8775' followed by a 'modprobe wm8775
>> force=-1,27' per Mikes instructions.
>
> are necessary for *all* versions of the hardware, or just the new version?
Just the new hardware. For the old hardware, nothing has changed. In
theory...
The new hardware has two new chips, a cx25843 and a wm8775. The
additional steps are specific to that hardware. With the old hardware
none of those steps are relevant.
>
> I can confirm that if you do *not* do these steps, then the new snapshot does
> *not* work on my Suse 10.0 setup with the older hardware. All the modules
> appear to load, but /dev/video* doesn't appear, and attempting to modprobe -r
> the pvrusb2 module hangs my system to the point that I need to hit the power
> button to shut it off.
:-(
>
> The 18-Mar-06 snapshot is working fine.
>
> If someone can confirm the necessity for the above steps, I'm willing to give
> it a shot, especially if there's a way to just compile cx25840 instead of the
> whole kernel... Note, though, that my Hauppauge box is borrowed, so if I'm
> going to try this, it'll need to be in the next couple of days.
>
I too have started seeing a problem in the driver that affects old
hardware. I have managed to reproduce the problem with driver versions
going at least as far back as just after the Feb snapshot (haven't tried
anything any older yet).
Sigh...
The problem is very non-deterministic and only happens when the msp3400
module is loaded, and only at the point when the driver first initializes.
What happens is that an I2C transaction to msp3400 times out, then msp3400
goes nuts and decides to stop talking to the hardware (you'll see lots of
messages in the log). At this point things "jam". You won't be able to
unload pvrusb2, because I believe msp3400 won't "let go" of it. If you
unplug the device, odds are good there will be a slab corruption oops
error and you'll have to reboot.
I've only recently noticed this, and only in the past week have I started
to see it really become a problem. I'd suspect my recent changes, but I
have also managed to reproduce the problem with older snapshots. It might
be a problem in the msp3400 module itself, but I've seen the problem now
with at least 3 different versions of msp3400. It's possible that the
problem has been there all along, but recent changes have made it more
likely to manifest.
There are actually 2 problems here.
First, it should simply be IMPOSSIBLE for an I2C transfer to time out.
Under normal circumstances the slave device must always respond, and if
there is no slave device then the master will rapidly detect this at the
selection phase of the transfer (due to lack of an ack bit). For a real
timeout to happen, basically the slave (or somebody) has to ground the
I2C clock signal and hold it down forever - which is never supposed to
happen. This would suggest some kind of weird hardware thing going on
(but it's not a hardware defect because I've triggered it now with 2
different devices).
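As a rough illustration of the distinction drawn above, here is a toy
Python model. It is not the pvrusb2 or kernel I2C code; ToyBus, transfer,
stretch_ticks and max_wait_ticks are names made up for this sketch, and the
addresses in the usage note are arbitrary. It only shows why a missing
slave gives a quick "no ack" result, while a timeout needs something
holding the clock line low past the master's patience.

```python
from enum import Enum


class Result(Enum):
    OK = "ok"
    NACK = "nack"          # nobody acknowledged the address byte
    TIMEOUT = "timeout"    # clock line never came back up


class ToyBus:
    """Hypothetical stand-in for an I2C bus, for illustration only."""

    def __init__(self, slaves, stretch_ticks=0):
        self.slaves = set(slaves)
        # How long something holds SCL low; None means "forever".
        self.stretch_ticks = stretch_ticks


def transfer(bus, addr, max_wait_ticks=1000):
    """Classify one transfer attempt by the master."""
    # The master cannot clock anything while SCL is held low, so a stuck
    # clock is checked first. Only this case can produce a timeout.
    if bus.stretch_ticks is None or bus.stretch_ticks > max_wait_ticks:
        return Result.TIMEOUT
    # With a working clock, a missing slave shows up immediately as a
    # missing ack bit after the address byte.
    if addr not in bus.slaves:
        return Result.NACK
    return Result.OK
```

For example, transfer(ToyBus({0x40}), 0x41) yields Result.NACK, while
transfer(ToyBus({0x40}, stretch_ticks=None), 0x40) yields Result.TIMEOUT.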
The second problem of course is the Very Bad behavior that happens in the
kernel after this timeout. Ideally there needs to be a fix to get rid of
the bad behavior, then the root cause (i.e. the timeout) can be chased.
This whole situation would be my top priority right now, were it not for
the fact that I'm trying to finish these other changes so that the driver
can make it into the 2.6.17 kernel before the merge window closes this
weekend. This problem will be at the top of my stack once I get these
other changes done. Promise :-)
In the meantime, if others here would like to attack the problem - if
only to try to better characterize the symptoms - that would be an immense
help. _I_ have seen these symptoms in driver code as far back as shortly
after the 20060209 snapshot but that's only because I haven't tried
anything any earlier yet. Just because you haven't seen it doesn't mean
the problem isn't there. This is a non-deterministic bug. It may be fine
for 10 passes in a row and then fail on the 11th. Note: The problem
_only_ happens when the driver initializes the hardware. Once you get
past initialization - and survive it - then everything is fine until the
next time you replug the hardware or reinsert the driver, power cycle,
etc... I'm sorry that this is happening, but if you want to help then
this is something that a parallel effort could probably make some headway
on...
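For anyone who wants to characterize how often initialization fails, one
possible approach is to reload the driver in a loop and note the first pass
that goes bad. The Python sketch below is only a starting point under
several assumptions: the marker string is a placeholder (substitute
whatever text actually appears in your log when the failure happens),
settle_seconds is a guess at how long initialization takes, and the helper
names are invented here. It needs root, and it deliberately does not try to
unload the module after a failed pass, since unloading is expected to hang
at that point.

```python
import subprocess
import time


def reload_once(run=subprocess.run, marker="timed out", settle_seconds=5):
    """Load pvrusb2, scan the new kernel log text for `marker`, unload.

    Returns True if the pass looked clean, False if `marker` appeared.
    `marker` is a placeholder, not a known pvrusb2 or msp3400 message.
    """
    before = run(["dmesg"], capture_output=True, text=True, check=True).stdout
    run(["modprobe", "pvrusb2"], check=True)
    time.sleep(settle_seconds)  # give initialization time to finish
    after = run(["dmesg"], capture_output=True, text=True, check=True).stdout
    # Only look at what was logged since the load, if the log did not wrap.
    new_text = after[len(before):] if after.startswith(before) else after
    if marker in new_text:
        return False  # leave the module loaded; unloading may hang
    run(["modprobe", "-r", "pvrusb2"], check=True)
    return True


def first_failure(passes, attempt):
    """Run `attempt` up to `passes` times; return the 1-based failing pass.

    Returns None if every pass succeeded.
    """
    for n in range(1, passes + 1):
        if not attempt():
            return n
    return None
```

Calling first_failure(50, reload_once) would then report the first bad pass
out of 50, or None if all of them survived. Because the bug is
non-deterministic, a None result says nothing about the next 50.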
-Mike
--
| Mike Isely | PGP fingerprint
Spammers Die!! | | 03 54 43 4D 75 E5 CC 92
| isely @ pobox (dot) com | 71 16 01 E2 B5 F5 C1 E8
| |