Home alone with data longevity

For many years “Home Alone” has been known as a beloved Hollywood comedy, featuring a schlimazel kid stumbling through a tornado of adventures towards a satisfying victory. No politics or controversy, just pure unadulterated fun.

Then TV networks surreptitiously deleted some of the now-controversial characters from the film. This instantly made the movie a sad symbol of our over-politicized times.

It also woke up a lot of peasants to the obscure topic of data longevity.

As George Orwell speculated in 1984, media integrity can and should be doubted whenever the media is accessed through intermediaries (such as the Ministry of Truth). It is much more reliable to keep originals in your own basement, under lock and key.

Moreover, when dealing with important digital content, mere availability and integrity at present may not be enough. The content must also be stored in a long-term, consumption-ready format, such as PDF for documents and books, and MP4 for videos and movies.

In many industries, especially aerospace and defense, OEM’s type certificate/release data is stored for decades, including both CAD designs and associated project information. This data must be readily presentable for a number of scenarios, e.g. an FAA review.

And just like with the movies, such OEMs must deal with availability, integrity and future interpretation of that data.

Some companies cover their risks by storing tons of paper drawings and other documents in their physical vaults, the same way some people still keep their VHS tapes. Others want to fully enjoy the 3D MBD paradigm, which means storing digital data only. And they face a few challenges.

While it makes absolute sense to use best-of-breed solutions like CATIA or NX during the design phase, storing released CAD data in its native application format is hardly advisable. Besides requiring a significant hardware investment, this also makes the OEM an eternal hostage of the original and often proprietary software in which the data was created. For example, even though the CATIA V4 release data for an F-16 aircraft may be safely stored in a WW3-certified data center, opening and using this data still requires a CATIA V4 AIX environment that is increasingly expensive to maintain.

A solution for the above predicament is to use popular product data exchange and long-term archiving standards.

Here STEP and JT come to mind first. STEP came first, standardized as ISO 10303. It initially focused on the minimum elements that need to be maintained for 3D objects, such as the geometry or B-rep (boundary representation). PMI, or product and manufacturing information, was a major piece added later: it is information related to manufacturing processes, tolerances, annotations, and other manufacturing-specific details that are associated with a product model.

While STEP is a great comprehensive format for long-term archival, it is far from perfect for visualization purposes. This is where JT shines, since it works well with a wide variety of popular viewers. JT became a completely legitimate alternative to STEP for long-term data archival once it was published as an ISO standard in its own right (ISO 14306). Because JT is more lightweight than STEP, it also saves storage space and energy.

There are certainly nuances. For example, not all proprietary 3D CAD features are part of the standards: they may exist in CATIA or NX, but would not be transferred to STEP or JT. Thus, if you want your CAD data to be archivable for the long term, it is important to avoid proprietary features that STEP and JT do not support, starting at the design stage.

Separately, any conversion from the native CAD format to STEP or JT should involve a concerted validation process covering geometry, PMI and more. Trust but verify.
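One common way to make "trust but verify" concrete for geometry is to compute a few summary properties of each solid in the native CAD system, compute the same properties again from the converted STEP or JT file, and compare them within a tolerance. The sketch below is illustrative only: the class name, the function name, the chosen properties and the tolerance values are all assumptions, and extracting the numbers from a real CAD system or converter is left out entirely. PMI would need its own, separate checks.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationProperties:
    """Geometric summary of one solid (hypothetical structure)."""
    volume: float        # e.g. mm^3
    surface_area: float  # e.g. mm^2
    centroid: tuple      # (x, y, z), e.g. in mm


def compare(native, converted, rel_tol=1e-6, centroid_tol=1e-3):
    """Return a list of mismatch descriptions; an empty list means a pass.

    rel_tol and centroid_tol are placeholder values, not recommendations.
    """
    problems = []
    # Volume and area are compared relative to their magnitude.
    if not math.isclose(native.volume, converted.volume, rel_tol=rel_tol):
        problems.append(
            f"volume differs: {native.volume} vs {converted.volume}")
    if not math.isclose(native.surface_area, converted.surface_area,
                        rel_tol=rel_tol):
        problems.append(
            f"surface area differs: {native.surface_area} "
            f"vs {converted.surface_area}")
    # The centroid is compared by absolute distance between the two points.
    shift = math.dist(native.centroid, converted.centroid)
    if shift > centroid_tol:
        problems.append(f"centroid moved by {shift}")
    return problems
```

A conversion job could call `compare(native_props, converted_props)` for every solid and refuse to archive the file whenever the returned list is not empty.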

After dealing with CAD, let’s remember there are vast collections of PDF and MS Office files, from both OEMs and their suppliers. However, not all PDFs and not all MS Office files are equally good for long-term archival. Again, the answer is sticking to the standards and minimizing dependencies on external artifacts like fonts. In that sense, only PDF/A compliant PDFs and ISO/IEC 29500 compliant MS Office docx and xlsx files offer a reliable baseline.
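As a cheap first pass over a large collection, it can help to sort PDFs by whether they even claim to be PDF/A. A PDF/A file declares its level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below only looks for that declaration; the function name is hypothetical, it assumes the XMP packet is stored uncompressed in the file, and a declared level says nothing about whether the file actually conforms.

```python
import re

# XMP may serialize a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); both forms are matched here.
_PART = re.compile(rb"pdfaid:part\s*(?:>|=\s*[\"'])\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance\s*(?:>|=\s*[\"'])\s*([A-Za-z])")


def declared_pdfa_level(pdf_bytes):
    """Return the declared PDF/A level, e.g. '1B', or None if none is found.

    This reads the file's own claim only; it does not validate anything.
    """
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conformance = _CONF.search(pdf_bytes)
    if conformance is not None:
        level += conformance.group(1).decode("ascii").upper()
    return level
```

For a file on disk, `declared_pdfa_level(pathlib.Path("report.pdf").read_bytes())` would return the claimed level, where `report.pdf` is just a placeholder name. Files that return `None` are the obvious first candidates for re-export or conversion.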

Improving on the baseline requires using specialized frameworks like veraPDF – a fantastic open-source effort to validate any PDF document against the entire PDF/A standard.

As for the political climate, I can personally recommend a particular second-hand bookstore in London where you can get many great books in their unaltered form for a fair price. As for the engineering data longevity conversation, the Senticore team is always an email or phone call away. Talk to us, and you will never feel that you are home alone.