perfsonar-user - Re: [perfsonar-user] Problems debugging pscheduler: "Run was preempted."

Subject: perfSONAR User Q&A and Other Discussion

  • From: Mark Feit
  • To: Brian Candler
  • Subject: Re: [perfsonar-user] Problems debugging pscheduler: "Run was preempted."
  • Date: Thu, 5 Sep 2019 12:16:55 +0000

(Sorry; I thought I’d sent this, but it got buried in a bunch of other windows.  I’ll reply to your others separately.)

 

Brian Candler writes:

/var/log/pscheduler/pscheduler.log just says:

Sep  1 16:29:24 perf1 runner INFO     6945: Run was preempted.

OK, at this point I'm stumped.  I think I've dug further into the innards of pscheduler than any end-user really is supposed to do.  And all I've found is: both ad-hoc and scheduled tests are being dropped on the floor, with message "Run was preempted.", and I believe this is because run_can_proceed() is returning false.

I think you’ve just passed the programming section of the interview.  ;-)

4.2 adds a feature to the limit system that lets runs be prioritized when their tasks may contend for time on the schedule (which, for all practical purposes, means throughput).  This was announced in the release notes and the mechanics are covered in http://docs.perfsonar.net/release_candidates/4.2.0/config_pscheduler_limits.html#priorities-which-runs-happen-and-which-do-not.  The original use cases were to preempt repetitive tasks so ad-hoc testing could happen faster and to allow remote, ad-hoc testing on systems where the repetitive stuff is important.

What wasn’t made clear is that the configuration that ships with the toolkit gives marginally-higher priority to runs of tasks that originate on the local system.  That’s on me since I put it in there.  I may disable that in 4.2.1 so we can regroup.

Since throughput involves two systems, any run will need time scheduled at both ends.  Because the receiving end isn’t where the task originated, the run at that end could end up with a lower priority than some other run and be preempted.  The scheduler attempts to work around this where it can if the task is given enough slip, but on systems with congested schedules, it may not be able to, and the run is preempted.  I’m going to have a look at whether or not the first participant should explicitly ask the others for whatever priority it got.  There are some implications around having to trust whoever’s asking for a higher priority not to abuse it, so making a decision on that will require some thought.
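
The asymmetry described above can be sketched with a toy model in Python.  This is not pScheduler's code or its limit configuration: the Run class, the LOCAL_BONUS value, the tie-breaking rule and the host names perf2 and perf3 are all assumptions made up for illustration.  It only shows why the same run can rank higher on the host where its task originated than on the receiving host.

```python
from dataclasses import dataclass

# Assumed size of the "marginally-higher" local priority; the real value
# in the toolkit's limit configuration may differ.
LOCAL_BONUS = 1


@dataclass(frozen=True)
class Run:
    name: str
    origin: str            # host where the task was created
    base_priority: int = 0


def effective_priority(run: Run, host: str) -> int:
    """Priority of `run` as seen by `host`; local tasks get a small bonus."""
    return run.base_priority + (LOCAL_BONUS if run.origin == host else 0)


def winner(host: str, scheduled: Run, challenger: Run) -> Run:
    """Return the run that keeps a contended slot on `host`.

    Assumption: the challenger wins only with strictly higher priority,
    so ties go to the run already on the schedule.
    """
    if effective_priority(challenger, host) > effective_priority(scheduled, host):
        return challenger
    return scheduled


# perf1 starts a throughput test toward perf2, while perf2 has its own
# throughput test (toward a hypothetical perf3) wanting the same time slot.
remote = Run("perf1->perf2 throughput", origin="perf1")
local = Run("perf2->perf3 throughput", origin="perf2")

for host in ("perf1", "perf2"):
    print(host, effective_priority(remote, host), effective_priority(local, host))

# On perf2 the locally-originated run outranks the one that came from perf1,
# so the perf1 run loses its slot at the receiving end.
print("kept on perf2:", winner("perf2", remote, local).name)
```

In this model the run from perf1 has the higher priority on perf1 but the lower one on perf2, which is the situation where the receiving end preempts it.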

Going back to what I said at the start: whilst I would appreciate hints to help fix the specific problem here, I also think that pscheduler could do a better job of reporting problems.  If it decides that it's not a good idea to launch a tool, I think it should say *why* it has decided this.  Or at least document this error message: googling "+pscheduler run was preempted" turns up nothing.

The “run was preempted” message is new with the priority feature.  I have an informal list of what the task states mean (https://github.com/perfsonar/pscheduler/wiki/Run-States), but no entry in my informal list of error messages (https://github.com/perfsonar/pscheduler/wiki/Error-Messages).  I’ll add something to the latter.  I’m not sure what I could add to the message beyond “run was preempted by a higher-priority task” to make it more useful.

pScheduler stores a lot of diagnostic information that you can get at by running “pscheduler result --diags <RUN-URL>”.  The limit system diagnostics are shown for the lead participant only, but I can slip a change into 4.2.1 that will add it for the others.

--Mark

 



