Discussion:
[PATCH 0/4] add task handling notifier
Christoph Hellwig
2007-12-23 12:26:21 UTC
Permalink
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.

Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.

For some reason neither ever made a lot of progess (performance
problems?).

As said by other we really should have a register/unregister API instead
of exposing the implementation. Also please don't export anything until
we actuall grow modular users in the tree.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Andrew Morton
2007-12-25 22:05:26 UTC
Permalink
Post by Christoph Hellwig
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.
Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.
For some reason neither ever made a lot of progess (performance
problems?).
I had it in -mm, sorted out all the problems but ended up not pulling the
trigger.

Problem is, it adds runtime overhead purely for the convenience of kernel
programmers, and I don't think that's a good tradeoff.

Sprinkling direct calls into a few well-known sites won't kill us, and
we've survived this long. Why not keep doing that, and save everyone a few
cycles?
Jan Beulich
2008-01-08 13:38:03 UTC
Permalink
Post by Andrew Morton
Post by Christoph Hellwig
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.
Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.
For some reason neither ever made a lot of progess (performance
problems?).
I had it in -mm, sorted out all the problems but ended up not pulling the
trigger.
Problem is, it adds runtime overhead purely for the convenience of kernel
programmers, and I don't think that's a good tradeoff.
Sprinkling direct calls into a few well-known sites won't kill us, and
we've survived this long. Why not keep doing that, and save everyone a few
cycles?
Am I to conclude then that there's no point in addressing the issues other
people pointed out? While I (obviously, since I submitted the patch disagree),
I'm not certain how others feel. My main point for disagreement here is (I'm
sorry to repeat this) that as long as certain code isn't allowed into the kernel
I think it is not unreasonable to at least expect the kernel to provide some
fundamental infrastructure that can be used for those (supposedly
unacceptable) bits. All I did here was utilizing the base infrastructure I want
added to clean up code that appeared pretty ad-hoc.

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Andrew Morton
2008-01-08 22:14:24 UTC
Permalink
On Tue, 08 Jan 2008 13:38:03 +0000
Post by Jan Beulich
Post by Andrew Morton
Post by Christoph Hellwig
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.
Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.
For some reason neither ever made a lot of progess (performance
problems?).
I had it in -mm, sorted out all the problems but ended up not pulling the
trigger.
Problem is, it adds runtime overhead purely for the convenience of kernel
programmers, and I don't think that's a good tradeoff.
Sprinkling direct calls into a few well-known sites won't kill us, and
we've survived this long. Why not keep doing that, and save everyone a few
cycles?
Am I to conclude then that there's no point in addressing the issues other
people pointed out? While I (obviously, since I submitted the patch disagree),
I'm not certain how others feel. My main point for disagreement here is (I'm
sorry to repeat this) that as long as certain code isn't allowed into the kernel
I think it is not unreasonable to at least expect the kernel to provide some
fundamental infrastructure that can be used for those (supposedly
unacceptable) bits. All I did here was utilizing the base infrastructure I want
added to clean up code that appeared pretty ad-hoc.
Ah. That's a brand new requirement.

The requirement which I thought we were addressing was "clean the code up
by adding a notifier chain so multiple subsystems don't need to patch in
hard-coded calls".

My contention is that the code clarity which this gains isn't worth the
runtime cost.



Now we have a new requirement: "allow out-of-tree code to hook into these
spots without needing to patch those few callsites".

I think we'd need a pretty detailed description of the pain which this
would relieve before we would take such an extraordinary step. What are
those (unidentified) add-on features doing at present? Patching calls into
fork.c/exec.c/exit.c?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Paul Jackson
2008-01-09 00:03:09 UTC
Permalink
What are those (unidentified) add-on features doing at present?
Patching calls into fork.c/exec.c/exit.c?
Most likely. I suspect we have general agreement and awareness
that such patching is not something that sells well in Linux-land.
And for good reason in my personal view ... such patching by loadable
modules could open the door to compromising the integrity of Linux in
ways that could be dangerous.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <***@sgi.com> 1.940.382.4214
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Andrew Morton
2008-01-09 00:31:20 UTC
Permalink
On Tue, 8 Jan 2008 18:03:09 -0600
Post by Paul Jackson
What are those (unidentified) add-on features doing at present?
Patching calls into fork.c/exec.c/exit.c?
Most likely. I suspect we have general agreement and awareness
that such patching is not something that sells well in Linux-land.
And for good reason in my personal view ... such patching by loadable
modules could open the door to compromising the integrity of Linux in
ways that could be dangerous.
err, no.

What I meant was that the providers of these mystery features are
presumably also patching into fork.c/exec.c/exit.c at the source code level
so as to enable the mystery features within their overall kernel package.

If so, this doesn't sounds terribly onerous.
Matt Helsley
2008-01-09 02:47:00 UTC
Permalink
Post by Andrew Morton
On Tue, 08 Jan 2008 13:38:03 +0000
Post by Jan Beulich
Post by Andrew Morton
Post by Christoph Hellwig
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.
Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.
For some reason neither ever made a lot of progess (performance
problems?).
I had it in -mm, sorted out all the problems but ended up not pulling the
trigger.
Problem is, it adds runtime overhead purely for the convenience of kernel
programmers, and I don't think that's a good tradeoff.
Sprinkling direct calls into a few well-known sites won't kill us, and
we've survived this long. Why not keep doing that, and save everyone a few
cycles?
Am I to conclude then that there's no point in addressing the issues other
people pointed out? While I (obviously, since I submitted the patch disagree),
I'm not certain how others feel. My main point for disagreement here is (I'm
sorry to repeat this) that as long as certain code isn't allowed into the kernel
I think it is not unreasonable to at least expect the kernel to provide some
fundamental infrastructure that can be used for those (supposedly
unacceptable) bits. All I did here was utilizing the base infrastructure I want
added to clean up code that appeared pretty ad-hoc.
Ah. That's a brand new requirement.
In all fairness it's not really a brand new requirement -- just one that
wasn't strongly emphasized during prior attempts to get something like
this in.

I had a mostly-working patch for this on top of the Task Watchers v2
patch set. I never posted that specific patch because it had a race with
module unloading and the fix only increased the overhead you were
unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
description for Task Watchers v2 (http://lwn.net/Articles/207873/):

"TODO:
...
I'm working on three more patches that add support for creating a task
watcher from within a module using an ELF section. They haven't recieved
as much attention since I've been focusing on measuring the performance
impact of these patches."

<snip>

Would tainting the kernel upon registration of out-of-tree "notifiers"
be more acceptable?

Cheers,
-Matt Helsley

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Andrew Morton
2008-01-09 03:22:07 UTC
Permalink
Post by Matt Helsley
...
Post by Andrew Morton
Post by Jan Beulich
Am I to conclude then that there's no point in addressing the issues other
people pointed out? While I (obviously, since I submitted the patch disagree),
I'm not certain how others feel. My main point for disagreement here is (I'm
sorry to repeat this) that as long as certain code isn't allowed into the kernel
I think it is not unreasonable to at least expect the kernel to provide some
fundamental infrastructure that can be used for those (supposedly
unacceptable) bits. All I did here was utilizing the base infrastructure I want
added to clean up code that appeared pretty ad-hoc.
Ah. That's a brand new requirement.
In all fairness it's not really a brand new requirement -- just one that
wasn't strongly emphasized during prior attempts to get something like
this in.
I had a mostly-working patch for this on top of the Task Watchers v2
patch set. I never posted that specific patch because it had a race with
module unloading and the fix only increased the overhead you were
unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
...
I'm working on three more patches that add support for creating a task
watcher from within a module using an ELF section. They haven't recieved
as much attention since I've been focusing on measuring the performance
impact of these patches."
<snip>
Would tainting the kernel upon registration of out-of-tree "notifiers"
be more acceptable?
How does that work? module.c does the register/deregister on behalf of the
module?

I certainly encourage people to disagreee with me here, but my current
thinking is:

- the cleanup aspect isn't worth the runtime overhead and

- the support-modular-users aspect is largely new and would need a lot
more description and justification (with examples) before we can even
begin to evaluate it.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Jan Beulich
2008-01-09 09:52:01 UTC
Permalink
Post by Andrew Morton
Post by Jan Beulich
Am I to conclude then that there's no point in addressing the issues other
people pointed out? While I (obviously, since I submitted the patch disagree),
I'm not certain how others feel. My main point for disagreement here is (I'm
sorry to repeat this) that as long as certain code isn't allowed into the kernel
I think it is not unreasonable to at least expect the kernel to provide some
fundamental infrastructure that can be used for those (supposedly
unacceptable) bits. All I did here was utilizing the base infrastructure I want
added to clean up code that appeared pretty ad-hoc.
Ah. That's a brand new requirement.
I'm sorry, but I didn't feel this was important, as I didn't expect the cleanup
effect to cause much debate...
Post by Andrew Morton
I think we'd need a pretty detailed description of the pain which this
would relieve before we would take such an extraordinary step. What are
those (unidentified) add-on features doing at present? Patching calls into
fork.c/exec.c/exit.c?
Yes. And the unidentified feature is NLKD. But as with other notifiers (most
notably the module unload one), all reasonable kernel debuggers should
need them (or do explicit patching of the mentioned source files). As I
explained before, I think that if kernel debuggers aren't allowed into the
tree, they should at least be allowed to co-exist (since the argument of
requiring in-tree users and submitting code for mainline inclusion is void
if political/personal reasons exclude certain code from even being
considered for inclusion).

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Christoph Hellwig
2008-01-09 10:03:44 UTC
Permalink
Post by Jan Beulich
Yes. And the unidentified feature is NLKD. But as with other notifiers (most
notably the module unload one), all reasonable kernel debuggers should
need them (or do explicit patching of the mentioned source files). As I
explained before, I think that if kernel debuggers aren't allowed into the
tree, they should at least be allowed to co-exist (since the argument of
requiring in-tree users and submitting code for mainline inclusion is void
if political/personal reasons exclude certain code from even being
considered for inclusion).
We already have a kernel debugger in tree, xmon. There's anotherone
hopefully getting in soon (kgdb), so they can do the right thing as the
other subsystems.

And as usual, we're not going to provide hooks for out of tree mess.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Matt Helsley
2008-01-09 02:24:07 UTC
Permalink
Post by Christoph Hellwig
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.
Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.
Apologies for the late reply -- I haven't had internet access for the
last few weeks.
Post by Christoph Hellwig
For some reason neither ever made a lot of progess (performance
problems?).
Yeah. Some discussion on measuring the performance of Task Watchers:
http://thread.gmane.org/gmane.linux.lse/4698

The requirements for Task Watchers were:

Allow sleeping in most/all notifier functions in these paths:
fork
exec
exit
change [re][ug]id
No performance overhead
One "chain" per path ("I only care about exec().")
Easy to use
Scales to large numbers of CPUs
Useful to make most in-tree code more readable. Task Watchers took
direct calls to these pieces of code out of the fork/exec/exit paths:
audit
semundo
cpusets
mempolicy
trace irqflags
lockdep
keys (for processes -- not for thread groups)
process events connector
Useful for loadable modules

Performance overhead in microbenchmarks was measurable at around 1% (see
the URL above). Overhead on benchmarks like kernbench on the other hand
were in the noise margins (which were around 1.6%) and hence I couldn't
determine the overhead there.

I never got the loadable module part completely working due to races
between notifier functions and the module unload path. The solution to
the races seemed to require adding more overhead to the notifier
function paths (SRCU-like grace periods).

I stopped pushing the patch set because I hadn't found any new
optimizations to offset the overheads while still meeting all the
requirements and Andrew still felt that the "make it more readable"
argument was not sufficient to justify its inclusion.

Jan, instead of adding notifiers could utrace be used or made to work
for modules? Also, please add me to the Cc list for any reposts of the
entire series. Thanks!

Cheers,
-Matt Helsley

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Matthew Helsley
2008-01-09 03:27:23 UTC
Permalink
Post by Matt Helsley
Post by Christoph Hellwig
With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.
I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.
Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.
Apologies for the late reply -- I haven't had internet access for the
last few weeks.
Post by Christoph Hellwig
For some reason neither ever made a lot of progess (performance
problems?).
http://thread.gmane.org/gmane.linux.lse/4698
fork
exec
exit
change [re][ug]id
No performance overhead
One "chain" per path ("I only care about exec().")
Easy to use
Scales to large numbers of CPUs
Useful to make most in-tree code more readable. Task Watchers took
audit
semundo
cpusets
mempolicy
trace irqflags
lockdep
keys (for processes -- not for thread groups)
process events connector
Useful for loadable modules
Performance overhead in microbenchmarks was measurable at around 1% (see
the URL above). Overhead on benchmarks like kernbench on the other hand
were in the noise margins (which were around 1.6%) and hence I couldn't
determine the overhead there.
I never got the loadable module part completely working due to races
between notifier functions and the module unload path. The solution to
the races seemed to require adding more overhead to the notifier
function paths (SRCU-like grace periods).
I stopped pushing the patch set because I hadn't found any new
optimizations to offset the overheads while still meeting all the
requirements and Andrew still felt that the "make it more readable"
argument was not sufficient to justify its inclusion.
Oops. It's been nearly two years so I've forgotten exactly where Task
Watchers v2 was when I stopped pushing it. After a bit more searching I
found a more recent posting:
http://lkml.org/lkml/2006/12/14/384

And here's why I think the microbenchmark results improved to the point
there was a small performance improvement over mainline:
http://lkml.org/lkml/2006/12/19/124

I seem to recall kernbench was still too noisy to tell.

The patch allowing modules to register Task Watchers still isn't posted
there for the reasons I've already described.

Cheers,
-Matt Helsley
Loading...