Enhanced Reinforcement Learning-Based Shape Optimization with Flow Field Information
Reinforcement Learning (RL) is a subfield of Machine Learning (ML) comprising methods that explore an environment by trial and error via actions representing the Degrees of Freedom (DOFs). After every action, the environment provides the RL algorithm with an observation, i.e., information about the state resulting from that action. In addition, a reward is generated during the learning process, which informs the RL algorithm how well the applied action suited the given state of the environment. RL is used in many applications, including robotics, control theory, and game playing, and it has recently gained traction in the field of design optimization. Viquerat et al. [1] give an overview of several areas of fluid mechanics in which RL is currently applied, including the optimization of shapes inside flow fields. Fricke et al. [2] extend the application of RL to the optimization of flow channels in profile extrusion dies, introducing the ReLeSO framework, which enables its users to perform shape optimization via Reinforcement Learning.

So far, within shape optimization, the observation returned to the RL algorithm after each action remains limited compared to other applications: it is usually restricted to a few key performance metrics such as flow homogeneity or lift/drag coefficients. In addition, the geometry of the object to be optimized can only enter the learning process through the parameterization used for the optimization. Both limitations can be resolved by providing the full flow field to the RL algorithm as an observation. It is beneficial not only to provide the Finite Element solution but also to preprocess the flow field, e.g., by employing a Convolutional Neural Network (CNN)-based feature extractor.

Within our ReLeSO framework, the flow field is converted into an RGB image incorporating up to three flow field variables, which is then provided as an observation to the agent. We apply the proposed method to the second use case presented in [2] and show that the RL algorithm is capable of learning a suitable strategy given the new observations.
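To illustrate the kind of observation described above, the following minimal Python sketch packs three flow field variables into an RGB image. It is not the ReLeSO implementation; the variable names (u, v, p), the structured 64x64 grid, and the per-channel min-max normalization are illustrative assumptions, and any interpolation from an unstructured Finite Element mesh onto such a grid is omitted.

```python
import numpy as np


def flow_field_to_rgb(u, v, p):
    """Map three flow field variables (here assumed to be the velocity
    components u, v and the pressure p, each sampled on the same structured
    grid) to an RGB image with values in [0, 255].

    Each channel is min-max normalized independently so that a CNN-based
    feature extractor receives inputs on a comparable scale.
    """
    channels = []
    for field in (u, v, p):
        field = np.asarray(field, dtype=np.float64)
        lo, hi = field.min(), field.max()
        scale = hi - lo if hi > lo else 1.0      # avoid division by zero
        channels.append((field - lo) / scale)    # normalize to [0, 1]
    rgb = np.stack(channels, axis=-1)            # shape (H, W, 3)
    return (rgb * 255).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic stand-in fields on a 64x64 grid, purely for demonstration.
    x, y = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
    u = np.sin(2 * np.pi * x)    # stand-in velocity component
    v = np.cos(2 * np.pi * y)    # stand-in velocity component
    p = x * y                    # stand-in pressure field
    observation = flow_field_to_rgb(u, v, p)
    print(observation.shape, observation.dtype)  # (64, 64, 3) uint8
```

Such an image-shaped observation can then be fed to a CNN-based feature extractor of the RL agent in place of, or in addition to, scalar performance metrics.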