Full workload + random = boom


This is a bit technical in nature, but the concepts are relatively simple. A Poisson process is a random process that obeys the following rules:

  1. Random events: A stream of events are randomly occurring over some duration of time, such as work arriving in a queue.
  2. Independence: Each random event happens independent of all other events, and
  3. Statically distributed: The chances of an event occurring in a time slice depends only on the size of the slice, and not when the slice starts.

Now for the fun part! And by the fun part, I mean the word problem.

Poisson process puzzle

When free, a worker waits for a task to arrive in the queue. The worker performs tasks serially (no multitasking allowed). A manager adds tasks to the queue according to a homogeneous Poisson process, where the inter-arrival times are exponentially distributed with parameter λ. This just means that the mean arrival time of the next task is 1/λ. For example, you might have λ = 4 tasks per second (on average), and so the mean arrival time would be 0.25 seconds.

Simplifying assumptions:

  1. Each and every task is an identical amount of work which takes the same amount of time to complete.
  2. At the start, the worker is free with no tasks in the queue.

Let t = the time to complete one task.
Let s = some duration of time over which the queue and worker is active.

Q: How many tasks has the worker processed after some duration of time?
A: Define D(x) = the event that x tasks were dequeued by the worker during the interval [0,s].

Q: How many tasks are waiting in the queue after some duration of  time?
A: Define Q(y) = the event that y tasks were in the queue at the end of the interval [0,s].

Q: What is the distribution of D and Q when the mean input is {10%, 90%, 100%, 110%, 99%, ...} of the max output.  In other words: Let t = 0.10/λ ... Let t = 0.90/λ ... Let t = 1.00/λ ...  and so on.

Q: How is the variance of D affected as the ratio t/s shrinks?

Q: With a full 100% workload streaming in, what happens to the queue size over time?

Power benchmark becomes a practical application of Poisson puzzle

Many of the surprising aspects of answers to these Poisson puzzlings were in fact discovered experimentally, while the power_ssj2008 benchmark was under development. We uncovered the answers as persistent anomalies (suspected bugs), and we had to work backwards to determine the questions that made sense of it all.

Incidentally, this benchmark has been around for some time now.

This benchmark has the worker running at s = 4 minutes per interval. In actuality t is on the order of milliseconds, but is system dependent and multi-valued. Specifically, multiple job types have differing length and the exact completion rates are dependent on the server measured. Also, there are typically N workers (one per processor thread). By measuring a system’s full workload (infinite queuing is simulated) for 3 intervals, the benchmark estimates a fixed value for λ that would cause the mean input to equal the max output (t = 1/λ). Then, the manager targets workloads running an interval measurement at each of t = 100%/λ, t = 90%/λ, t = 80%/λ and all the way down to t = 10%/λ.

Puzzler now ponders personalized Poisson processes pessimistically

Here is another practical application of this puzzle, that I would be remiss not to express:

I'll be the worker who answers the queue in this example (for example, my work e-mail). I will set the start time at Monday morning, and just say t = 1/2 hour and s = 50 hours to keep it simple. Let λ = 2 tasks per hour to represent that the mean arrival rate equals my maximum work output. At some point in the work week, I need to say the week is almost over! I am only handling what's currently in the queue and I will decline all new tasks until next week. So when (on average) would I need to stop accepting tasks in the queue?

The evidence is overwhelming that I haven’t figured this one out yet! Until I do, I will go on with the feeling that it's all one big conspiracy to drown me in tasks!