Simulation Data Mining


Simulation Data Mining Approach used in the SimCloud Project

The goal of simulation data mining is to apply data mining techniques to simulation results in order to generate knowledge and decision rules. In this context I focus on engineering problems, but the approach is not limited to them. A second topic is the application of data mining techniques to modelling and simulation tasks themselves, for example model synthesis, model diagnostics, or parameter and behavior tuning in numerical methods.

Successful projects already carried out in this area include, for example:

Matilda

Mining and Storing BIG FEM Data (Matilda Project - joint work with the work group of Benno Stein)

Machine learning (ML) and data mining are closely related. While machine learning has a long tradition, the term data mining is more recent. When people talk about data mining, they usually mean the techniques and algorithms used to discover structures and patterns in (large) data sets, drawing among other things on methods from machine learning.

In machine learning we distinguish reinforcement, supervised and unsupervised learning.

Supervised learning is based on sets of given input-output pairs, provided as examples by a "teacher". The challenge is to learn a function that maps inputs to outputs.
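As a minimal sketch of this idea, the following Python snippet learns a straight line from example pairs by ordinary least squares. The function name `fit_line` and the example data (a hypothetical load and the deflection a simulation might report for it) are assumptions made for illustration only; they are not taken from any of the projects mentioned here.

```python
def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept to (x, y) examples."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Spread of the inputs and their co-variation with the outputs.
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "teacher" examples: (load, simulated deflection).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
slope, intercept = fit_line(examples)


def predict(x):
    """The learned function that maps a new input to an output."""
    return slope * x + intercept
```

The learned function can then be applied to inputs that were not among the examples, e.g. `predict(3.0)`. Real simulation data would usually call for richer models, but the principle of learning a mapping from labeled pairs stays the same.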

In reinforcement learning, a program (for example a software agent) interacts with a dynamic environment in order to reach a defined goal. No teacher or examples are available to help it; instead it receives rewards for good actions.
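To make the interaction loop concrete, here is a small sketch using tabular Q-learning, one common reinforcement learning method chosen here purely for illustration. The corridor environment, the reward of 1.0 at the goal, and all parameter values are assumptions of this example.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Environment: move within the corridor, reward 1.0 on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                # Exploit; ties are broken at random.
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            nxt, reward = step(state, action)
            # Move the estimate towards reward plus discounted future value.
            target = reward + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


q = train()
# Greedy policy for every non-goal cell: the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

The agent is never told which action is correct. It only observes the reward at the goal, and the learned values gradually propagate that information back to the earlier cells.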

Supervised learning generally belongs to the field of machine learning proper. The area where data mining and machine learning overlap is mainly unsupervised learning. Here no examples or labeled data are provided to the algorithm, so the approach has to find structures in the input variables on its own. In short, the goal of unsupervised learning is to find structures or patterns in data sets.
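A minimal sketch of such structure finding is k-means clustering, shown below for scalar values. The choice of k-means, the function name `kmeans_1d`, the deterministic initialisation and the sample values are all assumptions made for this illustration.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalar values, returning (centers, clusters)."""
    ordered = sorted(values)
    # Deterministic initialisation: k evenly spaced points of the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each value joins its nearest center.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical unlabeled measurements forming two groups.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = kmeans_1d(data, k=2)
```

No labels are supplied; the two groups emerge only from the distances between the values.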

A clear separation between machine learning and data mining is therefore not always possible. Most people tend to speak of data mining when the data no longer fits into the main memory of an average computer and/or when the data comes from databases. Methods from machine learning are often found in data mining applications, and vice versa.

Here you will find a selection of publications on the subject.

  • An Approach for Load Balancing for Simulation in Heterogeneous Distributed Systems using Simulation Data Mining. Joint work with Irina Bernst, Patrick Bouillon and Christof Kaufmann. Published in the Proceedings of the IADIS International Conference on Applied Computing 2014, pages 254-259. ISBN 978-989-8533-25-8. BibTex, PrePrint version (pdf)
  • Learning Overlap Optimization for Domain Decomposition Methods. Joint work with Steven Burrows, Benno Stein, Michael Völske and Ana Belén Martínez Torres. Published in the Proceedings of the Seventeenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2013), volume 7818 of LNAI, pages 438-449, Gold Coast, Australia, April 2013. Springer. BibTex, PrePrint version (pdf)
  • Simulation Data Mining for Supporting Bridge Design. Joint work with Steven Burrows, Benno Stein, David Wiesner and Katja Müller. Published in Proc. Australasian Data Mining Conference (AusDM 11), Ballarat, Australia, pages 71-79, December 2011. ACM. ISBN 978-1-921770-02-9. BibTex, PrePrint version (pdf), AusDM Online Version

The complete list of publications can be found here.