SU2 DataMiner Configuration Base Class#
SU2 DataMiner uses a configuration class to store important information regarding the data generation, data mining, and manifold generation processes. This page lists some of the important functions of the Config class, which acts as the base class for application-specific configurations such as NICFD and FGM, for which additional settings can be specified.
Storage Location and Configuration Information#
During the various processes in SU2 DataMiner, data are generated, processed, and analyzed. All information regarding these processes is stored in a user-defined location on the current hardware. SU2 DataMiner configurations can be saved locally under different names in order to keep track of multiple data sets and manifolds at once. The following functions can be used to manipulate and access the storage location for fluid data and manifolds of an SU2 DataMiner configuration, and to save and load configurations.
- SetConfigName(self, config_name: str)#
Set the name for the current SU2 DataMiner configuration. When saving the configuration, it will be saved under this name.
- Parameters:
config_name (str) – SU2 DataMiner configuration name.
- GetConfigName(self)#
Get the name of the current SU2 DataMiner configuration.
- Returns:
SU2 DataMiner configuration name.
- Return type:
str
- SaveConfig(self)#
Save the current SU2 DataMiner configuration.
The code snippet below shows how to create multiple SU2 DataMiner configurations with different settings. Two SU2 DataMiner configurations, titled test.cfg and test_2.cfg, are created and stored locally in binary format.
from su2dataminer.config import Config
c = Config()
c.SetConfigName("test")
# define settings #
c.SaveConfig()
c.SetConfigName("test_2")
# adjust settings #
c.SaveConfig()
The following functions regard the location where fluid data and other information are stored during the various steps in the SU2 DataMiner workflow. The current working directory is used by default, and a warning is displayed when the specified directory is inaccessible.
Important
SU2 DataMiner has been configured for Linux operating systems, so the path separator is currently hard-coded as /. Windows compatibility is a work in progress.
- SetOutputDir(self, output_dir: str)#
Define the output directory where all raw and processed fluid data and manifold data are saved.
- Parameters:
output_dir (str) – output directory.
- Raises:
Exception – if provided directory does not exist on the current hardware.
- GetOutputDir(self)#
Get the output directory where raw and processed fluid data and manifold data are stored.
- Raises:
Exception – if the output directory of the current SU2 DataMiner configuration is not present on the current hardware.
- Returns:
output directory.
- Return type:
str
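The documented behavior of the directory check can be mimicked outside the library. Below is a minimal sketch of that validation (the helper name `set_output_dir` is hypothetical, not the actual SU2 DataMiner implementation), assuming the Linux-style `/` separator mentioned above:

```python
import os

def set_output_dir(output_dir: str) -> str:
    """Validate an output directory, mirroring the documented
    Exception behavior of SetOutputDir (illustrative only)."""
    if not os.path.isdir(output_dir):
        raise Exception("Output directory " + output_dir +
                        " does not exist on the current hardware.")
    # Strip a trailing path separator for consistency (Linux-style "/").
    return output_dir.rstrip("/")
```

Passing the result of `os.getcwd()` reproduces the default behavior of using the current working directory.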
For training artificial neural networks and generating tables, SU2 DataMiner accumulates all generated fluid data into comma-separated ASCII files. The header of these files can be specified with the following functions.
- SetConcatenationFileHeader(self, header: str = 'fluid_data')#
Define the file name header of the processed fluid manifold data.
- Parameters:
header (str, optional) – manifold data file header, defaults to “fluid_data”
- GetConcatenationFileHeader(self)#
Get the file name header of the processed fluid manifold data.
- Returns:
fluid manifold data file header.
- Return type:
str
Within the Python API, important settings in the configuration can be displayed in the terminal with the following function, which can also be called from the terminal itself by running
>>> DisplayConfig.py --c <CONFIG FILE NAME>
- PrintBanner(self)#
Print the main banner for the SU2 DataMiner configuration in the terminal.
Training Data Sets and Learning Rate#
The data-driven fluid modeling applications of SU2 DataMiner involve the use of multi-layer perceptrons to calculate the thermodynamic state of the fluid during flow calculations in SU2. The values of the weights and biases, the hidden layer architecture(s) and other parameters needed to train the network can be stored in and retrieved from the SU2 DataMiner configuration.
The fraction of data samples used for training, testing, and validation during machine learning processes can be accessed with the following functions. The training and testing samples are selected by shuffling the unique samples in the complete fluid data set and dividing them into the training, testing, and validation data sets, which are saved locally in separate files. The fraction of samples reserved for the validation data set is calculated as
\[f_\mathrm{val} = 1 - f_\mathrm{train} - f_\mathrm{test}\]
- SetTrainFraction(self, input: float = 0.8)#
Define the fraction of fluid data used for MLP training.
- Parameters:
input (float, optional) – fluid data train fraction, defaults to 0.8
- Raises:
Exception – if provided value lies outside 0-1.
- GetTrainFraction(self)#
Get the fraction of fluid data used for MLP training.
- Returns:
fluid data train fraction.
- Return type:
float
- SetTestFraction(self, input: float = 0.1)#
Define the fraction of fluid data used for MLP prediction accuracy evaluation.
- Parameters:
input (float, optional) – fluid data test set fraction, defaults to 0.1
- Raises:
Exception – if provided value lies outside 0-1.
- GetTestFraction(self)#
Get the fraction of fluid data used for MLP accuracy evaluation.
- Returns:
fluid data test fraction.
- Return type:
float
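The shuffling-and-splitting scheme described above can be sketched in plain NumPy (the function `split_fluid_data` is a hypothetical illustration, not the library's internal implementation); the validation set receives whatever fraction remains after the train and test fractions:

```python
import numpy as np

def split_fluid_data(data: np.ndarray, f_train: float = 0.8,
                     f_test: float = 0.1, seed: int = 0):
    """Shuffle the unique samples and split them into
    training, testing, and validation sets."""
    if not (0.0 < f_train < 1.0) or not (0.0 < f_test < 1.0) \
            or f_train + f_test > 1.0:
        raise Exception("Train and test fractions should lie between 0 and 1.")
    rng = np.random.default_rng(seed)
    samples = np.unique(data, axis=0)   # keep unique samples only
    rng.shuffle(samples)                # shuffle before splitting
    n_train = int(f_train * len(samples))
    n_test = int(f_test * len(samples))
    train = samples[:n_train]
    test = samples[n_train:n_train + n_test]
    val = samples[n_train + n_test:]    # remaining fraction: 1 - f_train - f_test
    return train, test, val
```

With the default fractions of 0.8 and 0.1, ten percent of the unique samples end up in the validation set.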
The following functions are used to specify the parameters used for training the networks on the fluid data. Currently, SU2 DataMiner uses supervised, gradient-based methods for training, where the learning rate follows the exponential decay method
\[r_\mathrm{l}(i) = r_{\mathrm{l,0}}\, d^{\,i/N_\mathrm{d}}\]
where \(r_{\mathrm{l,0}}\) is the value of the initial learning rate, \(d\) the learning rate decay parameter, \(i\) the iteration, and \(N_\mathrm{d}\) the number of decay steps. The number of decay steps is automatically calculated as
\[N_\mathrm{d} = \frac{N_\mathrm{e} N_\mathrm{train}}{N_\mathrm{b}}\]
where \(N_\mathrm{e}\) is the number of epochs, \(N_\mathrm{train}\) the number of samples in the training data set, and \(N_\mathrm{b}\) the number of samples in each training batch.
The value of the initial learning rate is calculated as an exponent with base 10 using
\[r_{\mathrm{l,0}} = 10^{\alpha}\]
and the value of \(\alpha\) can be accessed and specified through the following functions
- SetAlphaExpo(self, alpha_expo_in: float = -1.8269)#
Define the initial learning rate exponent (base 10).
- Parameters:
alpha_expo_in (float, optional) – log10 of initial learning rate, defaults to -1.8269
- Raises:
Exception – if provided value is positive.
- GetAlphaExpo(self)#
Get the initial learning rate exponent (base 10).
- Returns:
log10 of initial learning rate.
- Return type:
float
The value of the learning rate decay parameter \(d\) is accessed through
- SetLRDecay(self, lr_decay_in: float = 0.98959)#
Set the exponential learning rate decay parameter for MLP training.
- Parameters:
lr_decay_in (float, optional) – Exponential learning rate decay parameter, defaults to DefaultProperties.learning_rate_decay
- Raises:
Exception – if the learning rate decay parameter is not within zero and one.
- GetLRDecay(self)#
Get the exponential learning rate decay parameter for MLP training.
- Returns:
Exponential learning rate decay parameter.
- Return type:
float
The networks are trained using batches of training data. The number of samples in each training batch is calculated with
\[N_\mathrm{b} = 2^{b}\]
and the exponent \(b\) can be specified through
- SetBatchExpo(self, batch_expo_in: int = 6)#
Set the mini-batch size exponent (base 2) for MLP training.
- Parameters:
batch_expo_in (int, optional) – Mini-batch size exponent (base 2) used for MLP training, defaults to DefaultProperties.batch_size_exponent
- Raises:
Exception – if provided value is lower than or equal to zero.
- GetBatchExpo(self)#
Get the MLP training mini-batch size exponent.
- Returns:
mini-batch size exponent (base 2)
- Return type:
int
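The three hyperparameters above can be combined into the full learning-rate schedule. The sketch below reproduces the decay formula under the assumption that decay is applied continuously per iteration; the defaults for `alpha_expo`, `lr_decay`, and `batch_expo` mirror the documented values, while `n_epochs` and `n_train` are hypothetical placeholders:

```python
def learning_rate(i: int, alpha_expo: float = -1.8269,
                  lr_decay: float = 0.98959, batch_expo: int = 6,
                  n_epochs: int = 1000, n_train: int = 80000) -> float:
    """Learning rate at training iteration i following the
    exponential decay method described above (illustrative sketch)."""
    r_l0 = 10.0 ** alpha_expo               # initial learning rate: 10^alpha
    n_batch = 2 ** batch_expo               # mini-batch size: 2^b
    n_decay = n_epochs * n_train / n_batch  # number of decay steps
    return r_l0 * lr_decay ** (i / n_decay)
```

At iteration zero the schedule returns the initial learning rate, after which it decays monotonically towards zero.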
Multi-Layer Perceptrons#
The following functions regard the specification of multi-layer perceptrons (MLP) within SU2 DataMiner. The number of nodes in the hidden layers of the network can be specified with
- SetHiddenLayerArchitecture(self, hidden_layer_architecture: list[int] = [20, 20, 20])#
Define the hidden layer architecture of the multi-layer perceptron used for the MLP-based manifold.
- Parameters:
hidden_layer_architecture (list[int], optional) – listed neuron count per hidden layer, defaults to [20,20,20]
- Raises:
Exception – if an empty list is provided or if input contains non-integer data or the number of nodes is less than 1.
- GetHiddenLayerArchitecture(self)#
Get the hidden layer architecture of the multi-layer perceptron used for the MLP-based manifold.
- Returns:
list with number of neurons per hidden layer.
- Return type:
list[int]
Currently, SU2 DataMiner supports the use of a single activation function for the nodes in all hidden layers of the network, while a linear function is automatically applied to the input and output layer. The activation functions currently supported by SU2 DataMiner are:
“linear” (linear function) \(y = x\)
“elu” (exponential linear unit) \(y=\begin{cases}x,\quad x>0\\e^x-1,\quad x\leq 0\end{cases}\)
“relu” (rectified linear unit) \(y=\begin{cases}x,\quad x>0\\0,\quad x\leq 0\end{cases}\)
“tanh” (hyperbolic tangent) \(y=\tanh(x)\)
“exponential” \(y=e^x\)
“gelu” (Gaussian error linear unit) \(y=\frac{x}{2}\left(1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right)\)
“sigmoid” \(y=\frac{e^{x}}{1 + e^{x}}\)
“swish” (sigmoid linear unit) \(y=x\frac{e^{x}}{1 + e^{x}}\)
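For reference, a few of the listed activations written out in plain Python (using the standard Keras definitions; note that ELU returns \(e^x - 1\) for non-positive inputs):

```python
import math

def elu(x: float) -> float:
    """Exponential linear unit: x for x > 0, exp(x) - 1 otherwise."""
    return x if x > 0 else math.exp(x) - 1.0

def gelu(x: float) -> float:
    """Gaussian error linear unit: x/2 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def sigmoid(x: float) -> float:
    """Logistic sigmoid: exp(x) / (1 + exp(x))."""
    return math.exp(x) / (1.0 + math.exp(x))

def swish(x: float) -> float:
    """Sigmoid linear unit: x * sigmoid(x)."""
    return x * sigmoid(x)
```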
- SetActivationFunction(self, activation_function_in: str = 'gelu')#
Define the hidden layer activation function for the MLP-based manifold. See Common.Properties.ActivationFunctionOptions for the supported options.
- Parameters:
activation_function_in (str, optional) – hidden layer activation function name, defaults to “gelu”
- Raises:
Exception – if the provided name does not appear in the list of available activation function options.
- GetActivationFunction(self)#
Get the hidden layer activation function name.
- Returns:
hidden layer activation function name.
- Return type:
str
The weights and bias values are stored in the SU2 DataMiner configuration after training and they can also be manually accessed through the following functions
- SetWeights(self, weights: list[ndarray[float]])#
Store the weight values of the neural network.
- Parameters:
weights (list[np.ndarray[float]]) – weight arrays for the network hidden layers.
- Raises:
Exception – if an empty list is provided or the weight arrays are improperly formatted.
- SetBiases(self, biases: list[ndarray[float]])#
Store the bias values of the neural network.
- Parameters:
biases (list[np.ndarray[float]]) – bias arrays for the network hidden layers.
- Raises:
Exception – if an empty list is provided or the list contains empty arrays.
- GetWeightsBiases(self)#
Return values for weights and biases for the hidden layers in the MLP.
- Returns:
weight arrays, biases arrays
- Return type:
tuple[list[np.ndarray[float]], list[np.ndarray[float]]]
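The stored weight and bias arrays are sufficient to evaluate the network by hand. Below is a minimal forward-pass sketch (the helper `evaluate_mlp` is hypothetical, assuming one weight matrix and bias vector per layer with matrices shaped `[n_in, n_out]`, a single shared hidden-layer activation, and a linear output layer as described above):

```python
import numpy as np

def evaluate_mlp(x, weights, biases, activation=np.tanh):
    """Evaluate an MLP from per-layer weight and bias arrays.
    Hidden layers use the given activation; the output layer is linear."""
    y = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        y = activation(y @ W + b)            # hidden layers: activated
    return y @ weights[-1] + biases[-1]      # output layer: linear
```

Swapping the `activation` argument allows reproducing any of the supported hidden-layer activation functions.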
All settings regarding the training parameters, architecture, weights, and biases of a trained network can be stored automatically after training through the Trainer class
- UpdateMLPHyperParams(self, trainer)#
Retrieve the weights and biases from the MLP trainer class and store them in the configuration class.
- Parameters:
trainer (TrainMLP) – reference to trainer class used to train the network.
To use the network stored in the configuration in SU2 simulations, the network information needs to be written to an ASCII file such that it can be loaded in SU2 through the MLPCpp module. All relevant information about the network is automatically written to a properly formatted ASCII file using
- WriteSU2MLP(self, file_name_out: str)#
Write ASCII MLP file containing the network weights and biases from the data stored in the configuration.
- Parameters:
file_name_out (str) – MLP file name
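To illustrate what serializing the stored arrays to ASCII involves, the sketch below writes weights and biases in a hypothetical simplified layout. This is NOT the file format WriteSU2MLP produces for MLPCpp, which is documented with that module; only the general idea of a sectioned ASCII dump is shown:

```python
import numpy as np

def write_ascii_mlp(file_name_out, weights, biases):
    """Write per-layer weights and biases to a plain ASCII file.
    Hypothetical simplified layout -- NOT the actual MLPCpp format."""
    with open(file_name_out, "w") as fid:
        fid.write("[number of layers]\n%i\n" % (len(weights) + 1))
        for i, (W, b) in enumerate(zip(weights, biases)):
            fid.write("[layer %i weights]\n" % i)
            for row in np.atleast_2d(W):
                fid.write(" ".join("%+.16e" % v for v in row) + "\n")
            fid.write("[layer %i biases]\n" % i)
            fid.write(" ".join("%+.16e" % v for v in np.atleast_1d(b)) + "\n")
```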