Progress Bars haven’t progressed much for over a decade.
Let me tell you about the Progress Bar component that I will definitely one day develop, with one proviso: I will probably never develop it.
Each different kind of task that the Progress Bar encounters will have a unique id, so the component knows when it is tracking a repeat of a task.
Of course, the progress bar will accept from the client software frequent calls containing a “percentage complete value” for clients that have a bounded task – e.g. downloading 43MB.
As a nice gesture, it will also accept two values – units of progress and goal, so the client can say “I have downloaded 10,000,000 bytes of 45,088,768.” and have the progress bar take care of calculating percentages for it.
Or, it will simply accept a progress unit. “I have downloaded a packet, and I don’t know how many there will be.”
Or, it will simply accept a step name. “I have finished boiling the kettle.” (It will store these to get an historical picture of the steps involved. Described further below.)
Or, it will accept a pipe of log output from a naive command that doesn’t even know it is being monitored, and a third party can specify regular expressions to extract any of the above.
Or, with some magic I don’t yet understand, it will store the entire log file, and, with a large enough sample, try to automatically extract features correlated to progress.
Or, it will be able to poll external items (like the size of the file being downloaded, or a count of records in a table). It will also get a feel for CPU load and RAM usage.
Oh, and it will accept these calls directly, or over a network, so I can track complex processes that cross machines.
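To make the log-pipe idea a little more concrete, here is a minimal sketch of how a third party might supply a regular expression that turns a log line into a fraction complete. The pattern, the group names `done` and `goal`, and the function name are all assumptions of mine for illustration; nothing here is a real API.

```python
import re

# Hypothetical pattern a third party might register for one kind of task.
# It matches lines like "downloaded 10,000,000 bytes of 45,088,768".
BYTES_PATTERN = re.compile(
    r"downloaded (?P<done>\d[\d,]*) bytes of (?P<goal>\d[\d,]*)"
)

def extract_fraction(line, pattern=BYTES_PATTERN):
    """Return progress as a fraction in [0, 1], or None if the line has no clue."""
    match = pattern.search(line)
    if match is None:
        return None
    # Strip thousands separators before converting to integers.
    done = int(match.group("done").replace(",", ""))
    goal = int(match.group("goal").replace(",", ""))
    if goal <= 0:
        return None
    # Clamp, in case the client overshoots its own goal.
    return min(done / goal, 1.0)
```

A line that matches yields a fraction; a line like “Connection lost. Trying to reconnect.” yields None here, and would need its own pattern if we wanted to learn from it.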
Clients will be able to indicate the structure of tasks and subtasks. So a client might say “First, I will be formatting a USB stick, then writing database 1 on it.” or “First, I will be formatting a USB stick, and then writing database 2 on it.”, so the Progress Bar can recognise that the first subtask is very similar. A client can also indicate that two subtasks are running concurrently, so the progress bar can provide estimates of when the second one – whichever that might be – will finish.
Sure, it could do what most progress bars do, and either (a) show the percentage progress linearly progressing to 100%, or (b) show other progress by spinning an icon. In fact, the first few times it sees a task it would do that.
It could try to estimate the completion time by looking at progress and time elapsed. In fact, the first few times it sees a task, it would do that.
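For reference, the naive estimate is just a linear extrapolation. A minimal sketch, assuming progress arrives as a fraction between 0 and 1 (the function name is mine):

```python
def naive_eta(fraction_done, elapsed_seconds):
    """Estimate seconds remaining, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        # No progress yet (or a nonsense value): nothing to extrapolate from.
        return None
    # If fraction_done took elapsed_seconds, the rest takes proportionally longer.
    return elapsed_seconds * (1 - fraction_done) / fraction_done
```

So a task that is a quarter done after 30 seconds is guessed to need another 90. That guess is only as good as the assumption that every unit of progress costs the same amount of time.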
But as it sees a task multiple times, it will start growing its database of historical data.
If there is a percentage progress provided, does this task really hit the 50% mark halfway through the run? Does the first or last unit of progress take longer than you might expect with initialisation and clean up? The component would collect evidence to adjust the displayed progress appropriately, deviating from what the naive client thinks.
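A minimal sketch of that adjustment, assuming the component has already boiled its history down to a handful of (reported fraction, actual fraction of total time) pairs. The curve values and names below are invented for illustration, and linear interpolation is just the simplest thing that could work.

```python
from bisect import bisect_right

# Hypothetical history: when the client claimed 50%, runs were typically
# 80% of the way through their eventual wall-clock time.
# Reported fractions must be strictly increasing.
CURVE = [(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)]

def calibrated_fraction(reported, curve=CURVE):
    """Map a client-reported fraction to the historically observed
    fraction of total time, interpolating linearly between curve points."""
    reported = min(max(reported, 0.0), 1.0)
    xs = [x for x, _ in curve]
    i = bisect_right(xs, reported)
    if i == 0:
        return curve[0][1]
    if i == len(curve):
        return curve[-1][1]
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (reported - x0) / (x1 - x0)
```

With this made-up curve, a client that proudly reports 25% would be displayed as roughly 40% of the way through, because the early part of this task has historically been the slow part.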
What about when the client has no idea how long a task will take, or even that it is being monitored? That is where the real value of this component lies. Rather than modifying a program so that it knows how it is progressing, that job is outsourced to the Progress Bar component.
How long does the task normally take? If step names are provided, how long (proportionally and absolutely) does each step take? How is progress typically affected by RAM, CPU or I/O usage by other processes? If step 1 runs slowly, is that correlated with step 3?
Are there any clues that can be garnered? For example, if packets are received, how many are normally received? Does the output contain the number of files to be processed? Does it contain the phrase “Connection lost. Trying to reconnect.”? How much time does that normally add?
If a client can tell us how it is progressing against a goal, that is terrific. But, when it has no firm metric to count (e.g. compile this file, and all the files it depends upon), we can either litter the compiler with naive attempts to estimate the work required, or use an overseeing component that knows more about how long it normally takes to compile the code than the compiler does.
Here is the genius part.
The output is not a single bar showing progress from 0-100%. We are more sophisticated than that. (Well, I am. For all I know, you might still be responding to progress bars that lie to you.)
Instead there will be a graph. The graph shows a distribution curve. How long does it normally take to create a summary coversheet for a TPS report? It varies. Each of the previous attempts is plotted against a time axis. The median and the 90th percentile are marked, so you know what to expect. As progress is made, the estimate is updated. For example, if you make it to the median, and it is not finished, it can examine the historical record of previous times the report wasn’t finished by the median mark, and give you a new median based on this information.
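The re-estimation step could be sketched as follows, assuming the history is nothing more than a list of past total durations in seconds. The function names and the nearest-rank percentile are my own choices for illustration, not a settled design.

```python
from statistics import median

def percentile(sorted_values, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    rank = (p * len(sorted_values) + 99) // 100  # integer ceiling of p*n/100
    return sorted_values[rank - 1]

def updated_estimate(history, elapsed):
    """Return (median, 90th percentile) of total duration, using only the
    past runs that were still unfinished after `elapsed` seconds."""
    survivors = sorted(d for d in history if d > elapsed)
    if not survivors:
        # We have never seen a run take this long: no evidence to offer.
        return None
    return median(survivors), percentile(survivors, 90)

# Made-up durations of ten previous runs, in seconds.
history = [50, 55, 60, 62, 65, 70, 80, 95, 120, 180]
```

With this invented history, the estimate at the start is a median of 67.5 seconds and a 90th percentile of 120. If 67.5 seconds pass and the report still isn’t done, only the five slower runs remain relevant, and the estimate moves to a median of 95 and a 90th percentile of 180.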
Oh, and when it is finished, it will play a notification. The client code doesn’t know if it is part of a larger task or not, so it doesn’t know whether to play a completion sound, but the all-encompassing progress bar knows.
One day, I will create the greatest progress bar since the last one. Its killer advantages are (a) it can retrofit clients that don’t have a notion of percentage complete, and use historical data to make estimates for them, and (b) it displays the range of possible outcomes, rather than simplifying to a single misleading number.
My current progress towards developing this component is 0%.