Validating data on the Geotab GO device

How does Geotab handle the vast amount of different input information? Developer Ian Grzegorczyk discusses the process of validating data for telematics.

By Ian Grzegorczyk

Jan 30, 2018

Updated: Apr 24, 2023

During regular operation, Geotab’s GO device has to handle a vast amount of input from an array of different sources. Engine computers and third-party Add-Ons that interface with the device all supply a steady stream of data that needs processing. However, the vast majority of the data sampled cannot be trusted at face value and requires closer inspection before it meets Geotab requirements and can be logged. This blog post reviews the process of validating data on the Geotab telematics device.

Taking in a Stream of Data

One of the most complex tasks is properly identifying which valuable data is encoded within these info streams, from engine parameters such as fuel use, odometer, and engine hours, to third-party data that is logged through the Geotab GO device’s network connection.

Challenges to Processing Data

All this data streaming into the Geotab GO device needs to be verified in three ways:

The data is what it is expected to be.
The format is correct.
The values are accurate.

All vehicles behave and communicate differently and present data via different formats. Plus, third-party Add-Ons often communicate via custom protocols, which means that almost no data can be trusted without some basic form of validation.

There are a few ways to tackle these challenges. Two common methods are validating the data against an already known and trusted source, or confirming that the data arrives in a pattern that only the data the device is looking for could produce.

Validating Data by Comparing to a Trusted Source

How does the Geotab GO device validate data? One of the first steps is that it must compare data to a known, trusted source.

Let’s use engine hour data as an example of what this means. For those unfamiliar with the term, engine hours is a record of how long the engine of a particular vehicle has been running, including idle time, since it was manufactured (as opposed to the odometer, which shows actual miles driven). See our Telematics Glossary for more definitions of commonly used terms.

The problem: While engine hours are supposed to be presented in a standardized form, it is rarely that simple when dealing with many different vehicles. There are always exceptions.

In field testing, it has been found that some vehicles report engine hours in the same format, but with a different scaling factor. In fact, some increment their counter 60 times faster than one would expect a normal engine to do. Essentially, this means they are reporting engine minutes, which would make the vehicle appear to be used much more than it actually was.

How this is corrected: Fortunately, there is a relatively straightforward solution to such an issue. On vehicle start-up, the Geotab GO device can save a timestamp from the system’s internally tracked time (i.e. the trustworthy source in this example) together with the currently reported engine hours. Later, the device takes a second timestamp and engine hours reading, and compares the change in engine hours with the actual time elapsed. With a scaling error of 60 times, it becomes fairly easy to determine which scaling the vehicle operates under.
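The comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not Geotab firmware: the names Sample, detect_scale and to_hours, the ten-minute minimum window, and the ratio thresholds are all assumptions made for the example, and it assumes the engine runs for the whole interval between the two samples.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    clock_s: float         # trusted device clock, in seconds
    engine_counter: float  # raw "engine hours" value reported by the vehicle


def detect_scale(start: Sample, later: Sample, min_elapsed_s: float = 600.0) -> str:
    """Classify the vehicle's counter as 'hours', 'minutes' or 'unknown'.

    Assumes the engine was running for the whole interval between samples.
    """
    elapsed_s = later.clock_s - start.clock_s
    if elapsed_s < min_elapsed_s:
        return "unknown"  # too little time has passed to tell the scales apart

    elapsed_h = elapsed_s / 3600.0
    delta = later.engine_counter - start.engine_counter
    ratio = delta / elapsed_h  # about 1 for true hours, about 60 for minutes

    # Wide bands (hypothetical values) tolerate counter resolution and jitter.
    if 0.5 <= ratio <= 2.0:
        return "hours"
    if 30.0 <= ratio <= 120.0:
        return "minutes"
    return "unknown"


def to_hours(raw: float, scale: str) -> float:
    """Convert a raw counter value to engine hours once the scale is known."""
    if scale == "hours":
        return raw
    if scale == "minutes":
        return raw / 60.0
    raise ValueError("scale has not been determined yet")


# Usage: two readings taken 30 minutes apart on the trusted clock.
start = Sample(clock_s=0.0, engine_counter=1000.0)
later = Sample(clock_s=1800.0, engine_counter=1030.0)
scale = detect_scale(start, later)
```

Because the two candidate scalings differ by a factor of 60, the acceptance bands can be very wide without overlapping, which is why a short observation window is enough.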

Why does this matter? One of Geotab’s strengths is the ability of the Geotab GO to satisfy most vehicle mandates that require the reporting of engine hours from the on-board engine computer. Through MyGeotab, customers can compare the logged engine hours between two different dates to determine how much the vehicle was operated within the desired range. This value can then be used to satisfy whatever reporting method is required by the relevant oversight authorities governing the vehicle’s commercial use.

Validating Third-Party Data

One aspect of Geotab’s flexibility is its ability to support numerous different Add-Ons through its external IOX adapters. One of these adapters provides a messaging network that carries data to and from the Geotab GO device (a network akin to the one used to communicate with the engine’s computer, but capable of talking with any external device the customer wishes to hook up).

These external devices are often vehicle or equipment sensors that pass their data to the telematics device in order to use its uplink connection to the server. This allows data from devices that have no network access of their own to be logged alongside the normal engine data tracked by the Geotab GO device.

For this to work, the Geotab GO device needs to be able to identify what information it is seeing and what it needs to log. Each message sent to or received from the device has an associated identifying number, commonly referred to as its address.

The problem: Manufacturers have complete freedom in choosing which addresses they use, and since these third parties have no need to consult each other, overlaps are inevitable. Data sent by third-party devices also often has no correlation with anything the Geotab GO device can acquire from the engine computer. In other words, the device can’t compare it to another trusted source of data in order to validate the info it’s receiving.

How this is corrected: The only option left is to watch for data arriving in a pattern that should only exist if the message contains the expected data. By working with our third parties, we know how a valid message should be formatted, and after seeing a set number of valid messages in a row, the source can be validated and trusted moving forward.
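A minimal sketch of this "consecutive valid messages" approach is shown below. Everything in it is hypothetical: the SourceValidator class, the streak length of three, and the example message layout (a 0xA5 header byte, a length byte, a body, and a one-byte checksum) are invented for illustration and do not describe any real Add-On protocol.

```python
from typing import Callable, Dict, Set


class SourceValidator:
    """Trust an address only after N consecutive well-formed messages."""

    def __init__(self, is_valid: Callable[[bytes], bool], required_streak: int = 5):
        self.is_valid = is_valid
        self.required_streak = required_streak
        self.streaks: Dict[int, int] = {}  # address -> current run of valid messages
        self.trusted: Set[int] = set()     # addresses already validated

    def observe(self, address: int, payload: bytes) -> bool:
        """Return True if this message comes from a trusted source."""
        if address in self.trusted:
            return True  # trusted moving forward, as described above
        if not self.is_valid(payload):
            self.streaks[address] = 0  # any malformed message resets the run
            return False
        self.streaks[address] = self.streaks.get(address, 0) + 1
        if self.streaks[address] >= self.required_streak:
            self.trusted.add(address)
            return True
        return False


def example_format_ok(payload: bytes) -> bool:
    """Hypothetical format: 0xA5, body length, body bytes, checksum."""
    if len(payload) < 3 or payload[0] != 0xA5:
        return False
    if len(payload) != payload[1] + 3:  # header + length + body + checksum
        return False
    return sum(payload[:-1]) % 256 == payload[-1]


# Usage: a well-formed message and an unrelated one sharing the same address.
good = bytes([0xA5, 2, 1, 2, 0xAA])
bad = bytes([0x10, 0x20, 0x30])
validator = SourceValidator(example_format_ok, required_streak=3)
results = [validator.observe(0x100, m) for m in (good, bad, good, good, good)]
```

Requiring an unbroken run, rather than a simple count, means that an unrelated device that happens to share the address and only occasionally produces a matching pattern is unlikely to be trusted by accident.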

Why does this matter? Geotab has agreements with a variety of these external manufacturers to support the passthrough of their data. It is prudent to report the same piece of data on the same address every time, so that the listener can understand what it is seeing. Basically, this validation is what makes the third-party data usable and visible alongside other telematics data.

Data Validity: A Telematics Cornerstone

These are just a couple of the many examples of how the Geotab GO device can use a more reliable source of data to validate a piece of information that is not yet trusted, and how it can validate data coming in from third-party sources. By subjecting info streams to these checks, we can eliminate almost all false-positive reports of data. This is part of how Geotab strives to provide the best telematics reporting possible.

Data validity is a fundamental pillar of providing accurate, usable reports to our customers. Regardless of how it is achieved, validation will always be a vital cornerstone to providing dependable telematics data.
