Taking a lot of data out of 3DX – quickly

Some people say 3DX is the best thing since sliced bread, while others think there is still room for minor improvements. For example, if you need to extract massive amounts of information from 3DX as part of an ETL process, both metadata and files, and you are dealing with a large database, you may want to speed things up a bit.

For this we borrowed a method from the realm of Big Data: a setup built around Apache Spark. Spark lets you process calls massively in parallel and distribute the load across several computers. It also lets you organize workflows, queues, and sophisticated business logic, and embed blocks of code written in different languages.
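To make the idea of distributing the load more concrete, here is a minimal sketch of the partitioning pattern in Python. It is not our production setup: `fake_fetch` is a hypothetical stand-in for a real 3DX call, the function names are made up for illustration, and a local thread pool plays the role of the cluster so the example runs without Spark installed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Iterator, List


def split_into_partitions(ids: List[str], n: int) -> List[List[str]]:
    """Split ids into n contiguous slices whose sizes differ by at most one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    size, extra = divmod(len(ids), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(ids[start:end])
        start = end
    return parts


def fake_fetch(object_id: str) -> Dict[str, str]:
    """Hypothetical stand-in for a real 3DX metadata call (assumption)."""
    return {"id": object_id, "name": f"object-{object_id}"}


def extract_partition(
    ids: Iterable[str],
    fetch: Callable[[str], Dict[str, str]] = fake_fetch,
) -> Iterator[Dict[str, str]]:
    """Process one partition: fetch a record for every id in the slice."""
    for object_id in ids:
        yield fetch(object_id)


def extract_all(ids: List[str], n_partitions: int = 4) -> List[Dict[str, str]]:
    """Run the partitions concurrently; the pool stands in for the cluster."""
    parts = split_into_partitions(ids, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        # pool.map returns results in the same order as the input partitions.
        chunks = pool.map(lambda part: list(extract_partition(part)), parts)
        return [record for chunk in chunks for record in chunk]


if __name__ == "__main__":
    records = extract_all([str(i) for i in range(10)], n_partitions=3)
    print(len(records), "records extracted")
```

With PySpark available, the same per-partition function could be handed to the cluster instead, along the lines of `sc.parallelize(ids, n_partitions).mapPartitions(extract_partition).collect()`, where `sc` is an existing `SparkContext`. Working per partition rather than per object is a deliberate choice: it gives a natural place to open one connection per slice instead of one per record.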

That way we were able to increase performance dramatically, using Python for text and file processing and Java, MQL, and SQL for accessing 3DX, all within the same framework.

The only drawback of the method is that it is rather complicated to set up and configure. But once the system is in place, it runs like clockwork, churning through large data volumes quickly and reliably.