[pvrusb2] Ability to fully reset a PVRUSB2 Device
Mike Isely
isely at isely.net
Sat Sep 21 20:14:35 CDT 2019
An update on this...
1. There are two kernel threads involved. One manages contexts, the
other is involved in a kernel work queue for managing the hardware. A
week ago I first thought it was that context-managing thread, but now it
appears to be that second thread which is jamming, triggering a kernel
oops and then aborting, leaving the driver in a fubar state.
2. The v5.3.1 kernel happens to now include an upstream fix that deals
with a potential null pointer dereference problem in the sysfs part of
the pvrusb2 driver. This is a new change, since at least 5.2.13 (the
version I'm focusing on right now). This would be something that gets
hit on tear-down so without that fix things MIGHT go awry. But right
now I don't know if that is the same problem we're looking at here.
This is because...
3. After turning on additional trace print, I've noticed another problem
that might be masking things. Some background... The pvrusb2 driver
doesn't do "everything" on its own. Rather, like many v4l drivers, it
relies on common external v4l chip-level drivers to self-manage various
parts of the video pipeline. In these cases the pvrusb2 driver provides
a datapath for all these things to reach the hardware, via an I2C master
interface that is carried over the USB cable (tunneled, effectively).
Every chip-level driver in v4l that accesses stuff on the
pvrusb2-related hardware does so through this pvrusb2-provided I2C
interface. Well when you unplug the device / kill its power / whatever,
obviously that datapath is severed. When this happens any further
attempts to access that I2C master interface are met with an EIO error
back to the caller (and you'll see a kernel log message "pvrusb2:
Attempted to execute control transfer when device not ok"). During
tear-down that's actually expected. However the tear-down can't
complete until all these chip-level drivers in v4l stop trying to use
this interface. And one of these isn't giving up - the driver is
getting into what appears to be an infinite loop of these errors and
never getting out. This leads me to suspect a v4l chip-level driver may
have a problem dealing with a hot-unplug situation. Given that those
drivers are managed outside of the pvrusb2 driver (for obvious reasons),
it's possible that a change in one of those might be a contributor to
the problem here.
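
To make the failure mode in #3 concrete, here is a minimal user-space
sketch. It is NOT code from pvrusb2 or from any v4l chip-level driver;
fake_i2c_bus, fake_i2c_xfer and chip_poll are hypothetical names made up
for illustration. It assumes a caller that retries transient errors
(-EAGAIN) but treats -EIO as "the device is gone" and stops at once:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the USB-tunneled I2C master (not the pvrusb2 API). */
struct fake_i2c_bus {
    bool device_ok;         /* false once the device has been unplugged */
    unsigned int busy_left; /* transient "busy" replies still to hand out */
    unsigned int calls;     /* transfer attempts seen so far */
};

/* One transfer attempt: -EIO if the datapath is severed, -EAGAIN if busy. */
static int fake_i2c_xfer(struct fake_i2c_bus *bus)
{
    bus->calls++;
    if (!bus->device_ok)
        return -EIO;
    if (bus->busy_left > 0) {
        bus->busy_left--;
        return -EAGAIN;
    }
    return 0;
}

/*
 * Hypothetical chip-level poll loop. Transient errors are retried up to
 * max_tries times, but -EIO ends the loop immediately. A loop that retried
 * on every error, with no bound, would never exit after a hot-unplug.
 */
static int chip_poll(struct fake_i2c_bus *bus, unsigned int max_tries)
{
    for (unsigned int i = 0; i < max_tries; i++) {
        int ret = fake_i2c_xfer(bus);

        if (ret == 0)
            return 0;
        if (ret == -EIO)
            return ret; /* device gone: give up so tear-down can finish */
    }
    return -ETIMEDOUT;
}
```

With the device present and two busy replies pending, chip_poll() succeeds
on the third attempt. With device_ok cleared it returns -EIO after a single
attempt instead of spinning.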
So I'm trying to suss out #3 above first. That should hopefully clear
the air to solve #1 and figure out if #2 is related to any of this.
-Mike
On Mon, 9 Sep 2019, Mike Isely wrote:
>
> Stay tuned. And pester me again if I go quiet for too long.
>
> The pvrusb2 driver sets up a single internal kernel thread to take care
> of various bits of background activity. That thread also performs part
> of the setup and most of the tear-down when a device is hotplugged /
> hot-unplugged. The oops is definitely happening in that thread - which
> is a good thing because it means that it should be possible to rule out
> lots of bizarre interactions involving other threads calling into the
> driver. I am going to add printk's before each step of the tear-down
> process so I can start to get an idea where it is going awry. I hope to
> do that tonight.
>
> -Mike
>
>
> On Sun, 8 Sep 2019, Diego Rivera wrote:
>
> > No problem! I can imagine how normal life has you pegged down, just like it does with us all!
> > Thanks for circling back to it, though. Is there anything I can do on my end to help you?
> > Cheers!
> >
> > On Sat, 2019-09-07 at 14:26 -0500, isely at isely.net wrote:
> > > Hi Diego,
> > > I am sorry. I had gotten completely distracted away from this.
> > > I just updated to the latest kernel and have confirmed that it's still getting an oops when the
> > > device is hot-unplugged. I'm looking at it right now. At first glance this looks like a fairly
> > > > nasty tear-down race - which long ago didn't use to be there. So there has to be some kind of
> > > environmental change leading to this behavior.
> > > -Mike
> > > On Wed, 21 Aug 2019, Diego Rivera wrote:
> > > > > Hi, Mike! Any luck with this? I haven't poked you in some time so I figured I'd check to see if
> > > > > you've had the opportunity to debug this anymore, and if there's any way I can help with the
> > > > > process... Let me know! Cheers!
> > > > On Sat, 2019-04-20 at 20:16 -0600, Diego Rivera wrote:
> > > > > This is the result of a 2nd attempt with a hot-unplug. I don't see many differences beyond
> > > > > > the values of some registers changing between one instance and the other. Cheers!
> > > > > > --
> > > > >
> > > > >
> > > > > Diego Rivera
> > > > > On Sat, 2019-04-20 at 20:09 -0600, Diego Rivera wrote:
> > > > > > > Guinea pig #1 responding as ordered, sir! ☺ One is the kernel log from connection, the other
> > > > > > > is what happens if I try to do a modprobe -r. I noticed there's a call trace with registers
> > > > > > > - I'm wondering if I need to add more symbols packages so that trace can be more verbose and
> > > > > > > offer up more info. Thoughts? Let me know if you want me to try anything else. I'm going to
> > > > > > > produce the output now for hot-unplug of the same device, see how that differs. Cheers!
> > > > > > > --
> > > > > >
> > > > > >
> > > > > > Diego Rivera
> > > > > > On Sat, 2019-04-20 at 20:26 -0500, isely at isely.net wrote:
> > > > > > > > Status update. Nothing really useful to report except that I am seeing some screwy
> > > > > > > > behavior just on hotplug / hotunplug operations with the device just sitting idle not being
> > > > > > > > touched by anything. In this case I tested an old 29032 model - a very early module but
> > > > > > > > it's a useful test subject because it is simpler than the HVR-1950 yet still exercises most
> > > > > > > > of the key pieces of the driver. I ran a freshly compiled 5.0.9 kernel (latest stable) for
> > > > > > > > this test.
> > > > > > > >
> > > > > > > > Sorry this has taken so long. As was guessed earlier, I haven't worked on this
> > > > > > > > in a very long time and I had to unbox a lot of stuff. I also spent far too much time
> > > > > > > > today setting up a separate purpose-built computer which I can trash / crash / hang with
> > > > > > > > wild abandon without losing anything of value. This approach allows me to keep my dev
> > > > > > > > environment on a machine separate from the one that is running test kernels.
> > > > > > > >
> > > > > > > > I was able to
> > > > > > > > cleanly modprobe -r pvrusb2 every time so far, but if the issue is on the DVB side of the
> > > > > > > > fence, then the old 29032 model I've just tried won't exhibit that issue. So a lot more
> > > > > > > > characterization to do.
> > > > > > > >
> > > > > > > > Diego: It would be useful if you could post to me the section of your
> > > > > > > > /var/log/kern.log (or equivalent) showing all the kernel messages from the point when you
> > > > > > > > plug in the device to when the fireworks are happening after trying to tear down. If I
> > > > > > > > find that same pattern here then we'll know for sure that we are chasing the same issue.
> > > > > > > >
> > > > > > > > -Mike
> >
>
> --
>
> Mike Isely
> isely @ isely (dot) net
> PGP: 03 54 43 4D 75 E5 CC 92 71 16 01 E2 B5 F5 C1 E8
> _______________________________________________
> pvrusb2 mailing list
> pvrusb2 at isely.net
> http://www.isely.net/cgi-bin/mailman/listinfo/pvrusb2
>
--
Mike Isely
isely @ isely (dot) net
PGP: 03 54 43 4D 75 E5 CC 92 71 16 01 E2 B5 F5 C1 E8