[pvrusb2] Ability to fully reset a PVRUSB2 Device

Diego Rivera diego.rivera.cr at gmail.com
Sat Sep 21 20:18:39 CDT 2019


Thanks for the update!
It occurred to me: what if for #3, instead of the driver not handling the error, it's simply
expecting a different/new (type of) error to be raised in order to go through a code path that leads
to it not getting borked? Bah ... I'm sure you've thought of this ☺
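To make that idea concrete, here is a rough user-space sketch in C. It is not driver code and it
assumes a lot: fake_dev, fake_i2c_xfer and poll_chip are names made up for illustration, and the
retry-until-a-specific-error behavior is only my guess at what a chip-level driver might be doing.
The one thing taken from Mike's description is that the transfer returns EIO once the device is gone.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for device state; NOT the real pvrusb2 structure. */
struct fake_dev {
    bool connected;
};

/* Simulated I2C transfer: returns -EIO once the datapath is severed,
 * which is the behavior described for the pvrusb2 I2C interface. */
static int fake_i2c_xfer(const struct fake_dev *dev)
{
    return dev->connected ? 0 : -EIO;
}

/* Hypothetical chip-driver poll loop (an assumption, not real v4l code).
 * It retries until the transfer succeeds or returns fatal_err.
 * max_tries only bounds this demo; without it, a caller waiting for an
 * error that never arrives would spin forever.
 * Returns the number of transfer attempts made. */
static int poll_chip(const struct fake_dev *dev, int fatal_err, int max_tries)
{
    int tries = 0;

    while (tries < max_tries) {
        int rc = fake_i2c_xfer(dev);

        tries++;
        if (rc == 0 || rc == fatal_err)
            break;
    }
    return tries;
}
```

With a disconnected device, poll_chip(&dev, -ENODEV, 1000) burns through all 1000 attempts because
-ENODEV never shows up, while poll_chip(&dev, -EIO, 1000) stops after the first one.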
Cheers!
On Sat, 2019-09-21 at 20:14 -0500, Mike Isely wrote:
> An update on this...
> 1. There are two kernel threads involved.  One manages contexts, the other is involved in a kernel
> work queue for managing the hardware.  A week ago I first thought it was that context-managing
> thread, but now it appears to be that second thread which is jamming, triggering a kernel oops and
> then aborting, leaving the driver in a fubar state.
> 2. The v5.3.1 kernel happens to now include an upstream fix that deals with a potential null
> pointer dereference problem in the sysfs part of the pvrusb2 driver.  This is a new change, since
> at least 5.2.13 (the version I'm focusing on right now).  This would be something that gets hit on
> tear-down so without that fix things MIGHT go awry.  But right now I don't know if that is the
> same problem we're looking at here.  This is because...
> 3. After turning on additional trace print, I've noticed another problem that might be masking
> things.  Some background...  The pvrusb2 driver doesn't do "everything" on its own.  Rather, like
> many v4l drivers, it relies on common external v4l chip-level drivers to self-manage various parts
> of the video pipeline.  In these cases the pvrusb2 driver provides a datapath for all these things
> to reach the hardware, via an I2C master interface that is carried over the USB cable (tunneled,
> effectively).  Every chip-level driver in v4l that accesses stuff on the pvrusb2-related hardware
> does so through this pvrusb2-provided I2C interface.  Well, when you unplug the device / kill its
> power / whatever, obviously that datapath is severed.  When this happens any further attempt to
> access that I2C master interface is met with an EIO error back to the caller (and you'll see a
> kernel log message "pvrusb2: Attempted to execute control transfer when device not ok").  During
> tear-down that's actually expected.  However the tear-down can't complete until all these
> chip-level drivers in v4l stop trying to use this interface.  And somebody there isn't giving up -
> the driver is getting into what appears to be an infinite loop of these errors and never getting
> out.  This leads me to suspect a v4l chip-level driver may have a problem dealing with a
> hot-unplug situation.  Given that those drivers are managed outside of the pvrusb2 driver (for
> obvious reasons), it's possible that a change in one of those might be a contributor to the
> problem here.
> So I'm trying to suss out #3 above first.  That should hopefully clear the air to solve #1 and
> figure out if #2 is related to any of this.
>   -Mike
> 
> On Mon, 9 Sep 2019, Mike Isely wrote:
> > Stay tuned.  And pester me again if I go quiet for too long.
> > The pvrusb2 driver sets up a single internal kernel thread to take care of various bits of
> > background activity.  That thread also performs part of the setup and most of the tear-down when
> > a device is hotplugged / hot-unplugged.  The oops is definitely happening in that thread - which
> > is a good thing because it means that it should be possible to rule out lots of bizarre
> > interactions involving other threads calling into the driver.  I am going to add printk's before
> > each step of the tear-down process so I can start to get an idea where it is going awry.  I hope
> > to do that tonight.
> >   -Mike
> > 
> > On Sun, 8 Sep 2019, Diego Rivera wrote:
> > > No problem! I can imagine how normal life has you pegged down, just like it does with us
> > > all!
> > > Thanks for circling back to it, though. Is there anything I can do on my end to help you?
> > > Cheers!
> > > On Sat, 2019-09-07 at 14:26 -0500, isely at isely.net wrote:
> > > > Hi Diego,
> > > > I am sorry.  I had gotten completely distracted away from this.
> > > > I just updated to the latest kernel and have confirmed that it's still getting an oops when
> > > > the device is hot-unplugged.  I'm looking at it right now.  At first glance this looks like
> > > > a fairly nasty tear-down race - which long ago didn't use to be there.  So there has to be
> > > > some kind of environmental change leading to this behavior.
> > > >   -Mike
> > > > On Wed, 21 Aug 2019, Diego Rivera wrote:
> > > > > Hi, Mike!
> > > > > Any luck with this? I haven't poked you in some time so I figured I'd check to see if
> > > > > you've had the opportunity to debug this anymore, and if there's any way I can help with
> > > > > the process...
> > > > > Let me know!
> > > > > Cheers!
> > > > > On Sat, 2019-04-20 at 20:16 -0600, Diego Rivera wrote:
> > > > > > This is the result of a 2nd attempt with a hot-unplug.  I don't see many differences
> > > > > > beyond the values of some registers changing between one instance and the other.
> > > > > > Cheers!
> > > > > > -- 
> > > > > > Diego Rivera
> > > > > > On Sat, 2019-04-20 at 20:09 -0600, Diego Rivera wrote:
> > > > > > > Guinea pig #1 responding as ordered, sir! ☺
> > > > > > > One is the kernel log from connection, the other is what happens if I try to do a
> > > > > > > modprobe -r.  I noticed there's a call trace with registers - I'm wondering if I need
> > > > > > > to add more symbols packages so that trace can be more verbose and offer up more info.
> > > > > > > Thoughts?
> > > > > > > Let me know if you want me to try anything else.  I'm going to produce the output now
> > > > > > > for hot-unplug of the same device, see how that differs.
> > > > > > > Cheers!
> > > > > > > -- 
> > > > > > > Diego Rivera
> > > > > > > On Sat, 2019-04-20 at 20:26 -0500, isely at isely.net wrote:
> > > > > > > > Status update.  Nothing really useful to report except that I am seeing some screwy
> > > > > > > > behavior just on hotplug / hotunplug operations with the device just sitting idle
> > > > > > > > not being touched by anything.  In this case I tested an old 29032 model - a very
> > > > > > > > early module but it's a useful test subject because it is simpler than the HVR-1950
> > > > > > > > yet still exercises most of the key pieces of the driver.  I ran a freshly compiled
> > > > > > > > 5.0.9 kernel (latest stable) for this test.
> > > > > > > > Sorry this has taken so long.  As was guessed earlier, I haven't worked on this in a
> > > > > > > > very long time and I had to unbox a lot of stuff.  I also spent far too much time
> > > > > > > > today setting up a separate purpose-built computer which I can trash / crash / hang
> > > > > > > > with wild abandon without losing anything of value.  This approach allows me to keep
> > > > > > > > my dev environment on a machine separate from the one that is running test kernels.
> > > > > > > > I was able to cleanly modprobe -r pvrusb2 every time so far, but if the issue is on
> > > > > > > > the DVB side of the fence, then the old 29032 model I've just tried won't exhibit
> > > > > > > > that issue.  So a lot more characterization to do.
> > > > > > > > Diego: It would be useful if you could post to me the section of your
> > > > > > > > /var/log/kern.log (or equivalent) showing all the kernel messages from the point
> > > > > > > > when you plug in the device to when the fireworks are happening after trying to tear
> > > > > > > > down.  If I find that same pattern here then we'll know for sure that we are chasing
> > > > > > > > the same issue.
> > > > > > > >   -Mike
> > 
> > -- 
> > Mike Isely
> > isely @ isely (dot) net
> > PGP: 03 54 43 4D 75 E5 CC 92 71 16 01 E2 B5 F5 C1 E8
> > _______________________________________________
> > pvrusb2 mailing list
> > pvrusb2 at isely.net
> > http://www.isely.net/cgi-bin/mailman/listinfo/pvrusb2
> > 
-- 



Diego Rivera


