Re: [patch 2/2] w1: unlock correct lock on error in w1_seq_show()

From: David Fries
Date: Tue Jun 02 2015 - 22:40:02 EST


On Tue, Jun 02, 2015 at 09:54:10PM -0400, Matt Campbell wrote:
> Thanks for the response, I will try to answer your questions.
>
> The other function besides temp conversion and EEPROM write that
> requires a delay is the initial CHAIN on command. The datasheet says
> to "Wait for chain to charge" but it does not offer ANY indication on
> how to do so. I did discover that my sleep was a sleep(0) long after
> I submitted the patch, but as you observed it was not hurting
> anything.

Then it should be removed as it gives the wrong impression and that
variable shouldn't be used in the slave code anyway.

> The 64 was, in all honesty, a starting point that I picked for lack of
> finding a better option. I only currently have 10 sensors available
> for me to test with, so I was not able to do any significant tests to
> see what would happen if I raised that limit and the loop did not exit
> properly at the end of the chain. I was also not able to find an edge
> case that would cause the loop to not find a match. The w1_seq file
> only shows for the one type sensor that supports it so in the absence
> of a bus error, a match is guaranteed.

That's easy enough to test, comment out your if...break; in the loop.

Is there a good reason that the loop doesn't terminate when the device
you're after sl->reg_num.id is found? By making your loop terminate
on a different condition, W1_42_FINISHED_BYTE, if the slave you are
after isn't found you'll still return seq of 0. I'm wondering if seq
should be initialized to -1.

Do this, make up the right id of the correct family code and manually
tell the kernel it is there. That would be the same as if you removed
a chip before the kernel's search timed it out. Then do your sequence
scan on your devices including this new one which isn't there.
cd /sys/devices/w1_bus_master1
echo 42-000001234567 > w1_master_add

The devices return the rom numbers, I wonder about returning the rom
list (which is in the sequence), instead of the sequence number for a
device if that would be more or less useful to users. You would be
able to do it in one pass.

> The output of the function in the event the pins are hooked up as GPIO
> and not a chain is undefined. Since the chips are very tiny,
> prototyping this function would require me to design and order some
> boards that I never was able to use. To my knowledge it is not
> possible to determine that things are wired this way in the software.

Something like "Output is undefined if all direct sequence sensors on
this one wire bus aren't part of one chain. GPIO control of the two
pins is not support by this driver." for the comment above
w1_seq_show, in addition to the other comments I was asking about. I
see now you have some documentation in
Documentation/w1/slaves/w1_therm already that I didn't realize
earlier.

> Apologies, this is my first foray in submitting any sort of kernel
> code so I am not sure of exactly how I would submit a fixup patch for
> gregkh's tree. If someone wants to walk me through that off list I
> would be most appreciative!

git remote add gregkh_char-misc git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
git fetch gregkh_char-misc
git rebase remotes/gregkh_char-misc/char-misc-next
and add a new commit from there.

> Matt Campbell
> MattRCampbell@xxxxxxxxx
>
>
> On Tue, Jun 2, 2015 at 9:07 PM, David Fries <David@xxxxxxxxx> wrote:
> > The source code is as good as it gets, that's why I posted those two
> > lines from the w1.h file. Maybe it could be expanded upon, mutex is
> > more for adding or removing slaves and the like modifications to
> > w1_master data structure, where bus_mutex is what make sure something
> > else doesn't try to talk on the bus while the current device is using
> > it. What really matters is how other code is using the locks. For
> > instance if you look at w1_therm.c bus_mutex is the only one used.
> > Then you look at what your routine is doing, and what the existing
> > routine is doing, they are both reading/writing from the one wire bus,
> > and then return their output to the user, so then use the same lock.
> > If it's the wrong lock then it's at least consistent and easier to
> > cleanup when someone figures out it's the wrong lock. If you think
> > it's the wrong lock raise a flag and ask.
> >
> > My patch review comments.
> >
> > The temperature sensor uses a strong pullup or delay of 750 ms to do
> > the temperature conversion. A quick read through of this sensor has
> > temperature conversion and EEPROM write as the only operations
> > requiring a significant time delay, everything else is in the us, so I
> > would expect this to not require any delay.
> >
> > Read the comments to w1_next_pullup in w1_io.c as pullup_duration will
> > always be 0 in msleep(sl->master->pullup_duration); which isn't likely
> > what you expect. Apparently it is working for you so that further
> > reinforces my previous point.
> >
> > Where did the 64 come from? I have 14 temperature sensors in my
> > single family residence right now, and it would take another 7 to
> > cover each area/room. I could imagine an actual industrial setting to
> > far exceed the 64 count, but I don't work in that area so I don't know
> > the practical limits. How long does it take to read if it loops
> > through 64 times without finding a match? What is reg_num value if a
> > sensor doesn't reply? If it is all 00's or all ff's would that be a
> > better test for the end of the line? How long would it take to loop
> > 16384 times? Maybe you stop if it reaches 16384 or if the end of line
> > value is read 64 times. I also assume this is pretty much a static
> > answer, so you aren't going to be querying it except once per device,
> > once per program setup and rarely or infrequently after that so the
> > time shouldn't matter too much. Maybe the program queries again if a
> > sensor is detected or removed.
> >
> > I would like to see some documentation (comment) at the top of
> > w1_seq_show as to what the sysfs number returns means. I can see the
> > variable is named seq, and it returns it as a signed number. What
> > does a negative value mean? Are they expected? Will the first device
> > in the chain be 0 or 1, and what defines what end that is? What kind
> > of value is returned if the pins are connected as GPIO and not in the
> > direct sequence configuration? Maybe it needs a comment that all
> > sensors of this type must be in the direct sequence configuration for
> > this routine to give meaningful values. "Place all devices in CHAIN
> > state" that will address all devices including other types of devices,
> > would you expect this to cause a problem with other devices?
> >
> > Line up the constants after #define to the same tab location before
> > w1_seq_show.
> >
> > Do you plan to send in a mutex fixup patch? In my experience gregkh's
> > tree only accepts patches to what is already there, he won't swap out a
> > patch he already has (he could as it isn't in Linus's tree, but he
> > doesn't).
> >
> > On Tue, Jun 02, 2015 at 08:35:41AM -0400, Matt Campbell wrote:
> >> David, not arguing just curious how you decide which lock to use? Is there
> >> a standard documented?
> >>
> >> --m
> >>
> >> Matt Campbell
> >> MattRCampbell@xxxxxxxxx
> >>
> >> On Tue, Jun 2, 2015 at 8:12 AM, David Fries <David@xxxxxxxxx> wrote:
> >>
> >> > Thanks for including me, I'm going to have to nack this though, it is
> >> > an error, but bus_mutex is correct, mutex is incorrect, so the other
> >> > two lock/unlock in that routine are the ones that need to be changed.
> >> >
> >> > w1.h
> >> > * @mutex: protect most of w1_master
> >> > * @bus_mutex: pretect concurrent bus access
> >> >
> >> > On Mon, Jun 01, 2015 at 12:56:32PM +0300, Dan Carpenter wrote:
> >> > > Smatch complains that we don't unlock "master->mutex" on this error
> >> > > path. It looks like it is a typo and we unlock ->bus_mutext where
> >> > > ->mutex was intended.
> >> > >
> >> > > Fixes: d9411e57dc7f ('w1: Add support for DS28EA00 sequence to w1-therm')
> >> > > Signed-off-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> >> > >
> >> > > diff --git a/drivers/w1/slaves/w1_therm.c b/drivers/w1/slaves/w1_therm.c
> >> > > index 3351be6..79ecaf8 100644
> >> > > --- a/drivers/w1/slaves/w1_therm.c
> >> > > +++ b/drivers/w1/slaves/w1_therm.c
> >> > > @@ -412,7 +412,7 @@ static ssize_t w1_seq_show(struct device *device,
> >> > > c -= snprintf(buf + PAGE_SIZE - c, c, "%d\n", seq);
> >> > > return PAGE_SIZE - c;
> >> > > error:
> >> > > - mutex_unlock(&sl->master->bus_mutex);
> >> > > + mutex_unlock(&sl->master->mutex);
> >> > > return -EIO;
> >> > > }
> >> > >
> >> >
> >> > --
> >> > David Fries <david@xxxxxxxxx>
> >> >
> >
> > --
> > David Fries <david@xxxxxxxxx>

--
David Fries <david@xxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/