Forum Discussion

AngelaIp
Ideator I
8 months ago

Digital twins and large data quantities - what are your performance tips to deal with millions of (federated) datasets?

Hi community,

Aras currently posts a lot of digital-twin-related content, and we jumped on the hype train.

Right now we combine EBOM/MBOM with ERP, production and test data. Users can see the data of individual products or run all kinds of reports over the complete data. We are still at the beginning of this concept. At the moment people are mainly interested in reports and calculations.

It's nothing we haven't done before. Certain data was already federated in Innovator in the past. But users had to connect data from the various sources manually or ask an admin to create more sophisticated reports.

The digital twin concept gives users more freedom in how they approach the data. But the bottleneck is currently the test data. We run automated tests of electronic components, and a few hundred tests per individual product are common, so the amount of data is huge (millions of entries and growing). Certain queries and reports that work today might fail in the future as the amount of data increases. It's not an issue right now, but I want to be prepared.

What are the best practices to optimize performance regarding large amounts of federated data?

I checked the official Platform Specification that is available online. It contains many hardware recommendations for a given number of users, Vaults, etc. It's a useful guide, but it doesn't cover federated data and digital twin use cases yet.

What do you recommend? What can we do on database / infrastructure level?

1. Optimize database structures (better query design, indexing, caching, etc.)?
2. Throw more hardware at the problem?
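
To make option 1 a bit more concrete, here is a minimal sketch of two of the ideas (indexing and pre-aggregation). Everything in it is an assumption for illustration: the table and column names (`test_result`, `test_summary`, `serial_no`) are hypothetical, the data is synthetic, and Python's built-in `sqlite3` only stands in for the real database. It is not how Innovator or any federation layer actually stores test data.

```python
import sqlite3

# Hypothetical schema; sqlite3 is only a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_result ("
    " serial_no TEXT, test_name TEXT, passed INTEGER, measured REAL)"
)

# Synthetic data: 1,000 products x 200 tests each = 200,000 rows.
rows = (
    (f"SN{p:05d}", f"T{t:03d}", int((p + t) % 50 != 0), 1.0 + t / 1000)
    for p in range(1000)
    for t in range(200)
)
conn.executemany("INSERT INTO test_result VALUES (?, ?, ?, ?)", rows)

# Idea 1: index the column used for per-product lookups.
conn.execute("CREATE INDEX ix_result_serial ON test_result (serial_no)")

# Idea 2: pre-aggregate once, so reports read one row per product
# instead of scanning every raw test row on each request.
conn.execute(
    "CREATE TABLE test_summary AS "
    "SELECT serial_no, COUNT(*) AS tests, SUM(passed) AS passed_tests "
    "FROM test_result GROUP BY serial_no"
)
conn.commit()


def product_summary(serial_no):
    """Return (tests, passed_tests) for one product from the summary table."""
    return conn.execute(
        "SELECT tests, passed_tests FROM test_summary WHERE serial_no = ?",
        (serial_no,),
    ).fetchone()


def uses_index(sql, params):
    """Check whether the query plan mentions the index created above."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return any("ix_result_serial" in row[-1] for row in plan)
```

A report would then call something like `product_summary("SN00042")` instead of aggregating the raw rows each time. The trade-off is that the summary table has to be refreshed when new test results arrive, for example by a scheduled job.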

This one is not necessarily an Innovator issue. PLM and SQL are designed to deal with huge amounts of data. There are big corporations with thousands of users and endless data vaulted all over the world.

But I would be grateful to anyone who can share additional insights, ideas or recommendations! Many thanks!!

Angela


2 Replies

  • I must admit, I’m sometimes a bit disappointed by this forum’s community.

    If I had asked the same question on Reddit or 4chan, at least I’d have received three memes, two completely unrelated life hacks and a lot of bright ideas to make everything worse, just for the laughs.

    Maybe let’s flip the question: What is the worst thing I could possibly do with external data sources that hold huge amounts of data?

    • AngelaIp
      Ideator I

      Does nobody have digital twins here?

      If your answer is "sorry, not yet": add digital twins to your roadmap! It's one of the most powerful use cases.