Smoke over the water – and old designs
They say it is better not to know how sausages and politics are made. Similarly, when you like a song, sometimes it is better not to know the lyrics. I remember the moment I delved into “Smoke on the Water” and suddenly realized it told a rather primitive story about a casino fire in Switzerland. I tried to forget these lyrics and return to my innocent pleasure of the mesmerizing guitar sounds mapped to a single magic line, to no avail.
However, I am not here to talk about music, but about the present-day pains of dealing with very, very old designs. Some of them may have even witnessed Khrushchev’s famous UN shoe-show and the first space flights. Hardware built using those designs participated in all kinds of 20th century adventures, defending the sacred apple pie against forces of evil.
In an ideal world, it would be possible to move immediately to a new generation of weapon systems across the board, but sometimes even the US government’s budget is not large enough. The government is therefore forced to keep extending older platforms’ service life by another 10-20 years, again and again.
Defense companies are still maintaining designs made with pre-CAD tools. Some use “hard copy” microfiche technology that is 40 years old or worse. Each antique 3×5 microfiche card contains several A-size typed reports with library-type documents, text and images. Some information on these cards is even handwritten.
There is a limit to what can be done with microfiche cards, and that limit directly correlates to the quality of the equipment used to scan the cards. Fortunately, there are a few places around the US where these high-end microfiche batch processing scanners are still available.
Other defense companies are dealing with the consequences of their previous semi-successful attempts to digitize the original paper drawings by scanning them into PDF or TIFF files. While no longer sitting as dead-cold reams of paper weighing many tons, these files are still a far cry from being truly integrated into the modern enterprise. There is no good process to recognize these drawings’ textual data, such as part numbers and FT&A, and integrate it into enterprise search, PLM, ERP, MES and the entire digital thread paradigm.
Extracting textual data from these files is not that easy: scans of stacked calc paper drawings or microfiche show all kinds of ghosting garbage coming from the other side of the thin paper and sometimes from other sheets of paper stacked underneath. Where our hard-wired Neanderthal human brain recognizes the word “CODE” surrounded by a dirty cloud of smoke, an award-winning commercial OCR often sees “eODE” or “e0DE”. Another problem is that modern OCR products are not trained on the specifics of the engineering domain. Drawings typically use engineering block monotype fonts, which do not convert well. OCR products do not recognize special abbreviations that are absent from their standard dictionaries. The spacing of fixed-width characters causes “1234” to be interpreted as “1 2 3 4”.
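To make those two failure modes concrete, here is a minimal Python sketch of a post-processing pass that could run on raw OCR output. It is only an illustration under stated assumptions: the vocabulary `DOMAIN_WORDS`, the confusion table `CONFUSIONS` and all function names are hypothetical, and a real system would derive its dictionaries from the drawings themselves rather than hard-code them.

```python
import re

# Hypothetical engineering vocabulary (assumption, for illustration only).
DOMAIN_WORDS = {"CODE", "DWG", "REV", "SCALE"}

# Hypothetical look-alike substitutions of the kind seen in noisy scans.
CONFUSIONS = {"e": "C", "c": "C", "0": "O", "1": "I", "5": "S"}


def collapse_spaced_digits(text: str) -> str:
    """Join runs of single digits split by single spaces: '1 2 3 4' -> '1234'."""
    return re.sub(
        r"\b\d(?: \d\b)+",
        lambda m: m.group(0).replace(" ", ""),
        text,
    )


def correct_token(token: str) -> str:
    """Apply look-alike substitutions only if the result is a known domain word."""
    if token in DOMAIN_WORDS:
        return token
    candidate = "".join(CONFUSIONS.get(ch, ch) for ch in token)
    # Keep the original token when the guess is not in the vocabulary.
    return candidate if candidate in DOMAIN_WORDS else token


def clean_line(text: str) -> str:
    """Run both repairs on one line of OCR output."""
    text = collapse_spaced_digits(text)
    return " ".join(correct_token(tok) for tok in text.split())


print(clean_line("eODE 1 2 3 4"))
```

Note that the digit rule can also merge genuinely separate single-digit values, which is one reason a rule like this cannot be trusted without a human check.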
Regular OCR applications cannot accommodate these and a vast number of other issues. Teaching them a particular case does not guarantee success with other, slightly different cases. A purely neural-network-based approach is prohibitively expensive because it requires very significant calibration and a massive training data set.
What am I driving at? Human assistance is still required for data extraction, and no amount of purely AI-based OCR will help to mine all of the alphanumeric treasure hiding in the legacy scans. The most productive approach for very old designs is a human-led, AI-assisted intelligent character recognition (ICR) solution. Luckily, there are a number of commercial and open-source libraries that can form a good foundation for such a system.
The system can build new dictionaries and generate a “confidence level” for each particular type of conversion. In the next step, that confidence is used to identify drawings for acceptance or rework without human intervention.
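As a rough illustration of that acceptance/rework step, the Python sketch below routes a drawing based on per-field confidence and feeds reviewer corrections back into a dictionary. Everything here is an assumption made for the example: the `Field` record, the function names and the 0.95 threshold are hypothetical, and in practice the threshold would be calibrated per customer and per type of conversion.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One value extracted from a drawing (hypothetical record layout)."""
    name: str
    text: str
    confidence: float  # 0.0 to 1.0, as reported by the recognition step


def route_drawing(fields, threshold=0.95):
    """Accept a drawing only if every field meets the threshold.

    Returns ("accept", []) or ("rework", [names of low-confidence fields]).
    The 0.95 default is an illustrative assumption, not a recommended value.
    """
    low = [f.name for f in fields if f.confidence < threshold]
    return ("rework", low) if low else ("accept", [])


def learn_from_review(dictionary: set, corrected_text: str) -> None:
    """Add the words a human reviewer confirmed to the domain dictionary."""
    dictionary.update(corrected_text.split())


fields = [
    Field("part_number", "1234", 0.99),
    Field("title", "eODE", 0.80),
]
decision, to_review = route_drawing(fields)
print(decision, to_review)

vocabulary = set()
learn_from_review(vocabulary, "CODE")  # the reviewer fixed the title
```

The design choice is that the human only sees the fields that fell below the threshold, so raising or lowering it trades reviewer time against the risk of accepting a wrong value.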
Based on Senticore’s experience with human-led AI-assisted processes, if done properly, the eventual ICR accuracy rate approaches 99%. We have done successful projects for select clients where this was achieved (to the clients’ elation).
Now, let’s talk about money. Aerospace OEMs focusing on the non-government market must maintain a huge fleet of products that have independent owners. They revisit that old data on an almost daily basis. They need to keep their demanding customers happy, so converting legacy designs (as I once explained for the CATIA V4 scenario) is important for efficiency, customer satisfaction, and the resulting bottom line. It therefore makes sense for them to make the legacy data conversion as complete as possible, with the associated confidence level and accuracy rate as high as possible.
For defense contractors, aircraft and ships have a definite lifetime, and there is only one customer: the government. Budgetary restrictions are the norm, and justifying a drive for efficiency might be a challenge.
When there is a problem with an aircraft or a ship, defense contractors have very competent liaison engineers who can pull a scanned design file and do whatever is needed to resolve the issue. Still, a good extraction and indexing system can drastically reduce the siloed tribal knowledge required to find all the old pieces of data that they need for a given problem. Incidentally, such a system could also aid cross-collaboration between services and ventures.
For these companies, it may make sense to focus on extracting and indexing at least what is most important to critical operations, and leave the rest to history’s paper bins.
Where does Senticore come into this picture? We understand the engineering side of this story, and we are highly proficient with the relevant AI technology stack. We have unique, proven experience in developing successful conversion systems on budget and on time, reaching the desired accuracy and quality. This means calibrating the ICR accuracy rate and the degree of human involvement according to the customer’s priorities.
There is a way out of the legacy design data swamp to the clear waters, and it can be reached without smoke. Talk to us about helping you save money and increase efficiency in maintaining old platforms. You will need that money for new shiny toys to dive to new depths and fly to new heights. After all, Mars is waiting.