Propulsion Monitoring, Smart Modelling, Big Data, Noon Reports, High Frequency Data, Clone Records & Random Sampling

Lately I have read some articles claiming that propulsion monitoring needs high frequency measurements, as opposed to low frequency noon reports.

I think that there is a huge misunderstanding and a lot of confusion in the above statement. Some clarifications are needed.

As usual some comments are "tongue in cheek", I hope the readers will not mind... but the article is not a joke, quite the contrary.

Noon reports are a simple accounting and management tool (and a tool of the past in that respect). Noon reports are totally inadequate for any meaningful propulsion monitoring activity. The fact that even today the majority of shipping companies still use noon reports to pretend to do some sort of propulsion monitoring is a telltale sign of the sorry condition of the shipping industry and of its lack of scientific knowledge and education.

I underline the word "scientific", excuse me if I do, I know it's not sexy, I know many will frown upon hearing it, but I just cannot help it. Shipping suffers from too much praxis and too little scientific background.

Noon reports contain both lump-sum and average quantities related to a very long time interval (distance run is a lump sum, time elapsed is a lump sum, speed is an average). The problem with average quantities is that they satisfy the physical equation describing a phenomenon only if that phenomenon is linear in nature.

In particular the power vs speed relation is a power law, therefore in taking averages we know that we are introducing errors in the data, even in case of "perfect" measurements.

For the above reason taking averages is a capital sin.

If you are doing it (and if you are using noon reports you are definitely doing it, and averaging on 24 hours!) you are shooting yourself in the foot.
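
A small numerical sketch may make the averaging error concrete. It assumes, purely for illustration, a cubic law P = k * v^3 and a day split evenly between two speeds. The exponent, the constant k and the speeds are assumptions, not measured data.

```python
def power_kw(speed_kn, k=1.0, exponent=3.0):
    """Hypothetical power law P = k * v**exponent (k and exponent are illustrative)."""
    return k * speed_kn ** exponent

# Assumed day: 12 hours at 10 kn, then 12 hours at 14 kn.
speeds_kn = [10.0, 14.0]

# What the ship actually needed: the average of the power at each speed.
true_mean_power = sum(power_kw(v) for v in speeds_kn) / len(speeds_kn)

# What a daily average suggests: the power at the average speed.
mean_speed = sum(speeds_kn) / len(speeds_kn)
power_at_mean_speed = power_kw(mean_speed)

# Relative error introduced purely by averaging, with "perfect" measurements.
averaging_error = power_at_mean_speed / true_mean_power - 1.0

print(f"mean of powers:      {true_mean_power:.0f}")
print(f"power at mean speed: {power_at_mean_speed:.0f}")
print(f"averaging error:     {averaging_error:.1%}")
```

Under these assumptions the power at the average speed (1728) understates the average power (1872) by roughly 7.7%, without any measurement error at all. A different exponent or speed split changes the size of the error, not its existence.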

The above should have clarified to the reader that noon reports cannot be used for propulsion monitoring. That is if propulsion monitoring is being done for real and not just to tick some box.

For the very same reason noon reports cannot be used to account for emissions, since sailing 12 hours at 50% MCR and 12 hours at 100% MCR is not equivalent to sailing 24 hours at 75% MCR.

So much for the noon reports.

Now let us focus on so called high frequency data.

First of all, the various articles I read never stated clearly what frequency counts as high frequency. 1 per second? 1 per minute? 1 every 5 minutes? 1 every 10 minutes? The fact that this is not defined suggests that those talking about high frequency don't have very clear ideas.

Now, if the goal is to replace 1 daily noon report with one report every 5 or 10 minutes, it is in principle clear that some of the deficiencies of noon reports will be solved, or at least reduced in magnitude. Some. But many remain. And in addition this brute force approach is totally inefficient.

In particular, if the method used to evaluate the propulsion data is statistical in nature (as most systems currently offered on the market are), high frequency solves very little.

What is the final purpose of high frequency? To reduce the statistical error? Think again!

The key concept here is that, for a population, the larger the number of random samples, the lower the statistical error.

Tossing a coin we have a 50% probability of getting tails and a 50% probability of getting heads. If we toss only a few times we might not actually get 50% heads and 50% tails, but if we toss many times the statistics will be very close to the 50% probability.

Unless you are Rosencrantz, or was it Guildenstern?

But the above is true for random phenomena, or random sampling. Which is not the matter at hand.

Let us consider current.

Statistically we can consider that for a large number of samples current effects cancel themselves out (which is strictly not true due to nonlinearities, but let us neglect this as second order).

Current speed and angle off the bow will vary randomly, sometimes current will be against, sometimes it will be in favour, the average effect will cancel out for a large population.

Yes? Not necessarily! Why?

Because current itself is not random!

If a ship is travelling in a quasi constant current at a quasi constant heading (which is the case for a short time window), taking high frequency measurements instead of sparse ones will not help filter the current out. Why?

Because each time we take a new measurement we don't "toss" the current coin. Current is not random in that place at that time: we only end up duplicating a previous measurement. Therefore by increasing the frequency we haven't increased the size of the sample, we have cloned the records instead!
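
The cloning effect can be sketched with a toy simulation. Everything in it is an assumption made for illustration: the speed through water, the size of the current, the sensor noise, and the very idea that a "random" current could be drawn afresh for each record. It compares averaging n records when the current really is a fresh coin toss each time with averaging n records taken in the same frozen current.

```python
import random
import statistics

TRUE_SPEED_KN = 12.0      # assumed speed through water
FROZEN_CURRENT_KN = 0.5   # assumed constant current during the time window
NOISE_SIGMA_KN = 0.1      # assumed random sensor noise

def averaged_error(n_records, current_is_random, seed=0):
    """Error (kn) of the mean of n records against the true speed."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_records):
        if current_is_random:
            # Hypothetical case: each record sees an independent current.
            current = rng.uniform(-1.0, 1.0)
        else:
            # Short time window: every record sees the same current.
            current = FROZEN_CURRENT_KN
        noise = rng.gauss(0.0, NOISE_SIGMA_KN)
        records.append(TRUE_SPEED_KN + current + noise)
    return statistics.fmean(records) - TRUE_SPEED_KN

for n in (10, 1000, 100000):
    random_case = averaged_error(n, current_is_random=True)
    frozen_case = averaged_error(n, current_is_random=False)
    print(f"n={n:6d}  random current: {random_case:+.3f}  frozen current: {frozen_case:+.3f}")
```

With a random current the error shrinks towards zero as n grows. With a frozen current only the sensor noise averages out: the error settles at the current itself (0.5 kn here) no matter how many records are taken.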

So much for high frequency data!

So, at the end of the day, what should be measured? How much data is really needed? How frequently should it be measured? There is no straightforward answer.

It actually all depends on what one is looking at and on what one is looking for.

For pure propulsion monitoring sparse records are sufficient. This is because the time constants of the system are long, and changes take time to develop.

But if we want to monitor fuel consumption (which has nothing to do with propulsion monitoring) we need very high frequency measurements of the generated engine power, as we need to integrate over time to calculate the generated energy and then compare it with the fuel oil consumption. Actually both power measurement and integration are generally performed automatically by the torque-meter, so even this very high frequency measurement could become low frequency data.
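
As a rough sketch of why the integration needs dense power readings, the snippet below integrates a made-up power log with the trapezoidal rule and compares it with what only the first and last readings would give. The log values and the function name `energy_kwh` are illustrative assumptions, not data from any ship or torque-meter.

```python
def energy_kwh(times_h, powers_kw):
    """Trapezoidal integral of power over time (kW * h = kWh)."""
    if len(times_h) != len(powers_kw):
        raise ValueError("times and powers must have the same length")
    total = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        total += 0.5 * (powers_kw[i] + powers_kw[i - 1]) * dt
    return total

# Assumed log: 5000 kW for three hours, a half-hour ramp, then 10000 kW for one hour.
log_times_h = [0.0, 3.0, 3.5, 4.5]
log_powers_kw = [5000.0, 5000.0, 10000.0, 10000.0]

dense = energy_kwh(log_times_h, log_powers_kw)

# The same period seen through only its first and last reading.
sparse = energy_kwh([log_times_h[0], log_times_h[-1]],
                    [log_powers_kw[0], log_powers_kw[-1]])

print(f"energy from the full log:     {dense:.0f} kWh")
print(f"energy from two readings:     {sparse:.0f} kWh")
```

In this invented case the two readings overstate the energy by 5000 kWh (33750 against 28750), simply because they miss when the load change happened. If the integration is done on board at high frequency, only the running total needs to be reported, and that can be read as sparsely as one likes.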

Therefore at the end of the day it is not about big vs small data or high vs low frequency.

It is about understanding the phenomena being investigated, knowing physics, understanding statistics and sampling, building a coherent and efficient model of the phenomena, and devising strategies for taking good measurements of what can be measured, for calculating what cannot be measured but can be calculated, and for guesstimating what cannot be calculated.

We can call it smart modelling for lack of a better word (plus smart is such a perfect buzzword).