Now showing 1 - 10 of 51
  • Publication
    Open Access
    Balancing energy and performance: efficient allocation of solver jobs on high-performance computing systems
    (Universitätsbibliothek der HSU/UniBw H, 2025-07-08)
    Many combinatorial optimization methods and related optimization software, particularly those for mixed-integer programming, exhibit limited scalability when utilizing parallel computing resources, whether across multiple cores or multiple nodes. Nevertheless, high-performance computing (HPC) systems continue to grow in size, with increasing core counts, memory capacity, and power consumption. Rather than dedicating all available resources to a single problem instance, HPC systems can be leveraged to solve multiple optimization instances concurrently, a common requirement in applications such as stochastic optimization, policy design for sequential decision making, parameter tuning, and optimization-as-a-service. In this work, we study strategies for efficiently allocating solver jobs across compute nodes, exploring how to schedule multiple optimization jobs across a given number of cores or nodes. Using metrics from performance monitoring and benchmarking tools as well as from metered PDUs, we analyze trade-offs between energy consumption and runtime, providing insights into how to balance computational efficiency and sustainability in large-scale optimization workflows.
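The allocation problem sketched in this abstract can be illustrated with a toy greedy scheduler. This is not the paper's method; the job sizes, node names, and largest-first heuristic below are purely hypothetical stand-ins for the actual scheduling strategies studied.

```python
# Illustrative greedy placement of solver jobs onto HPC nodes.
# Jobs, node names, and capacities are hypothetical examples.

def allocate(jobs, nodes, cores_per_node):
    """Assign each (name, cores) job to the node with the most free cores,
    placing the largest jobs first."""
    free = {n: cores_per_node for n in nodes}
    placement = {}
    for name, cores in sorted(jobs, key=lambda j: -j[1]):
        node = max(free, key=free.get)  # node with most free cores
        if free[node] < cores:
            raise ValueError(f"job {name} does not fit on any node")
        free[node] -= cores
        placement[name] = node
    return placement

# Example: four MIP instances of different widths on two 32-core nodes.
jobs = [("mip_a", 8), ("mip_b", 16), ("mip_c", 8), ("mip_d", 4)]
print(allocate(jobs, ["node1", "node2"], cores_per_node=32))
```

A real study would additionally weigh per-node power draw (e.g. from metered PDUs) against expected runtime, which this sketch deliberately omits.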
  • Publication
    Metadata only
    Node-level performance of adaptive resolution in ls1 mardyn
    (Springer, 2025-07-05)
    Hocks, Alex
    In this work we present a node-level performance analysis of an adaptive resolution scheme (AdResS) implemented in ls1 mardyn. This is relevant for simulations involving a very large number of particles or long timescales, because it lowers the computational effort required to calculate short-range interactions in molecular dynamics. An introduction to AdResS is given, together with an explanation of the coarsening technique used to obtain an effective potential for the coarse molecular model, i.e., the Iterative Boltzmann Inversion (IBI). This is accompanied by details of the implementation in our software package, as well as an algorithmic description of the IBI method and the simulation workflow used to generate results, which will be of interest to practitioners. Results are provided for a pure Lennard-Jones tetrahedral molecule coarsened to a single site, validated by verifying the correct reproduction of structural correlation functions, e.g., the radial distribution function. The performance analysis builds upon a literature-driven methodology, which provides a theoretical estimate for the speedup based on a reference simulation and the size of the full-particle region. Additionally, a strong scaling study was performed at node level: several configurations with vertical interfaces between the resolution regions are tested, and different resolution widths are benchmarked. A comparison between several linked-cell traversal routines provided in ls1 mardyn was performed to showcase the effect of algorithmic aspects on the adaptive resolution simulation and on the estimated performance.
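The IBI method mentioned above follows a standard update rule: the coarse-grained potential is corrected by the logarithmic mismatch between the current and target radial distribution functions, V_{k+1}(r) = V_k(r) + α·kT·ln(g_k(r)/g_target(r)). A minimal sketch of one such step, in reduced units and on tabulated values (the damping factor α and zero-handling are common conventions, not details taken from this paper):

```python
import math

# One Iterative Boltzmann Inversion (IBI) step on tabulated values:
#   V_{k+1}(r) = V_k(r) + alpha * kT * ln(g_k(r) / g_target(r))
# kT defaults to 1 (reduced units); alpha damps the update.

def ibi_update(V, g_current, g_target, kT=1.0, alpha=1.0):
    V_new = []
    for v, gc, gt in zip(V, g_current, g_target):
        if gc > 0.0 and gt > 0.0:
            v = v + alpha * kT * math.log(gc / gt)
        # Where either RDF vanishes, leave the potential unchanged.
        V_new.append(v)
    return V_new
```

When the coarse model reproduces the target RDF, the logarithm vanishes and the potential is converged, which is exactly the structural-correlation check the abstract describes.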
  • Publication
    Metadata only
    Static load balancing for molecular-continuum flow simulations with heterogeneous particle systems and on heterogeneous hardware
    Load balancing in particle simulations is a well-researched field, but its effect on molecular-continuum coupled simulations is comparatively less explored. In this work, we implement static load balancing in the macro-micro-coupling tool (MaMiCo), a software for molecular-continuum coupling, and demonstrate its effectiveness in two classes of experiments by coupling with the particle simulation software ls1 mardyn. The first class comprises a liquid-vapour multiphase scenario, modelling evaporation of a liquid into vacuum and requiring load balancing due to heterogeneous particle distributions in space. The second class considers the execution of molecular-continuum simulations on heterogeneous hardware, whose components run at very different efficiencies. After a series of experiments with balanced and unbalanced setups, we find that our balanced configurations reduce runtime by 44% and 55%, respectively.
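The core idea of static load balancing for both experiment classes can be sketched as a weighted domain split: contiguous cells are assigned to ranks so that each rank's estimated work, scaled by a per-rank speed factor for heterogeneous hardware, is roughly equal. This is an illustrative 1D greedy split, not MaMiCo's actual decomposition scheme:

```python
# Illustrative static 1D domain split: assign contiguous cells to ranks
# so that each rank's load share is proportional to its relative speed.
# Cell loads (e.g. particle counts) and speeds are hypothetical inputs.

def split_domain(cell_loads, speeds):
    """Greedy prefix split of cell indices among ranks."""
    total = sum(cell_loads)
    targets = [total * s / sum(speeds) for s in speeds]  # load share per rank
    parts = [[] for _ in speeds]
    rank, acc = 0, 0.0
    for i, load in enumerate(cell_loads):
        # Advance to the next rank once the current one is full.
        if rank < len(speeds) - 1 and acc + load > targets[rank] and parts[rank]:
            rank, acc = rank + 1, 0.0
        parts[rank].append(i)
        acc += load
    return parts

# A dense liquid region (loads 4) next to vapour (loads 1), with rank 0
# twice as fast as rank 1: both ranks finish in the same estimated time.
print(split_domain([4, 4, 1, 1, 1, 1], [2, 1]))
```

Because the split is computed once from the initial particle distribution and hardware speeds, it remains static during the run, matching the approach described in the abstract.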
  • Publication
    Metadata only
    An improved simulation methodology for nanoparticle injection through aerodynamic lens systems
    (American Institute of Physics, 2025-03-26)
    Samanta, Amit K.; Amin, Muhamed; Küpper, Jochen
    Aerosol injectors applied in single-particle diffractive imaging experiments have demonstrated their potential for efficiently delivering nanoparticles with high density. Continuous optimization of injector design is crucial for achieving high-density particle streams, minimizing background gas, enhancing x-ray interactions, and generating high-quality diffraction patterns. We present an updated simulation framework designed for fast and effective exploration of the experimental parameter space to enhance the optimization process. The framework includes both the simulation of the carrier gas and of the particle trajectories within injectors and their expansion into the experimental vacuum chamber. A hybrid molecular-continuum simulation method [direct simulation Monte Carlo (DSMC)/computational fluid dynamics (CFD)] is utilized to accurately capture the multi-scale nature of the flow. The simulation setup, initial benchmark results of the coupled approach, and the validation of the entire methodology against experimental data are presented. The results of the enhanced methodology show a significant improvement in prediction quality compared to previous approaches.
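Hybrid DSMC/CFD coupling of the kind described above typically switches solvers by flow regime: continuum CFD is valid only where the local Knudsen number is small, while rarefied regions call for DSMC. A minimal sketch of such a selection criterion; the threshold value and mean-free-path inputs are illustrative assumptions, not the paper's actual coupling criterion:

```python
# Hypothetical regime-selection criterion for a hybrid DSMC/CFD setup.
# The 0.05 threshold is a commonly cited continuum-breakdown ballpark,
# used here purely for illustration.

def knudsen(mean_free_path, characteristic_length):
    """Kn = lambda / L, the ratio of mean free path to flow length scale."""
    return mean_free_path / characteristic_length

def choose_solver(kn, threshold=0.05):
    """Continuum CFD where Kn is small; particle-based DSMC otherwise."""
    return "DSMC" if kn > threshold else "CFD"

# Dense carrier gas inside the injector vs. expansion into vacuum.
print(choose_solver(knudsen(1e-6, 1e-3)))  # small Kn
print(choose_solver(knudsen(1e-4, 1e-3)))  # large Kn
```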
  • Publication
    Open Access
    hpc.bw benchmark report 2022–2024
    In the scope of the dtec.bw project hpc.bw, innovative HPC hardware resources were procured to investigate their performance for HSU-relevant compute-intensive software. Benchmarks for different software packages were conducted, and the respective results are reported and documented in the following, covering the Intel Xeon architecture used in the HPC cluster HSUper as well as AMD EPYC 7763 and ARM FX700 systems.
  • Publication
    Open Access
    hpc.bw: an evaluation of short-term performance engineering projects
    Increasing amounts of data and simulations in scientific areas heighten the need for improved software performance. The scientific staff maintaining such software is often not primarily trained for this purpose, or lacks the personnel and time to address software performance issues. A particular aim of the dtec.bw-funded project hpc.bw is to tackle some of these shortcomings. One pillar of the hpc.bw agenda is the offer of low-threshold consultancy and development support focused on performance engineering. This paper provides insight into our related activities. We illustrate the structure of our annual calls for short-term performance engineering projects, outline our results using the example of the performance engineering project “benEFIT - Numerical simulation of non-destructive testing in concrete”, and draw first conclusions about the current procedure.
  • Publication
    Open Access
    xbat: a continuous benchmarking tool for HPC software
    (UB HSU, 2024-12-20)
    Tippmann, Nico; Auweter, Axel
    Benchmarking the performance of one’s application on high-performance computing (HPC) systems is critically important for reducing runtime and energy costs. Yet, accessing the plethora of relevant metrics that impact performance is often challenging, particularly for users without hardware experience. In this paper, we introduce the novel benchmarking tool xbat, developed by MEGWARE GmbH. xbat requires no setup on the user side, and it allows the user to run, monitor and evaluate their application from the tool’s web interface, consolidating the entire benchmarking process in an approachable, intuitive workflow. We demonstrate the capabilities of the tool using benchmark applications of varying complexity and show that it can manage all aspects of the benchmarking workflow in a seamless manner. In particular, we focus on the open-source molecular dynamics research software ls1 mardyn and the closed-source optimisation package Gurobi. Both packages present unique challenges. Mixed-integer programming solvers, such as those integrated in the Gurobi software, exhibit significant performance variability, so that seemingly innocuous parameter changes and machine characteristics can affect the runtime drastically; ls1 mardyn comes with the auto-tuning library AutoPas, which enables the selection of various node-level algorithms to compute molecular trajectories. Focusing on these two packages, we showcase the practicality, versatility and utility of xbat, and share its current and future developments.
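The run-to-run performance variability mentioned for MIP solvers can be quantified with repeated timed runs. This generic sketch uses a stand-in workload; a real study would run the actual solver with varied seeds and parameters, and a tool such as xbat would additionally collect hardware metrics:

```python
import statistics
import time

# Generic sketch of quantifying run-to-run performance variability.
# The workload below is a placeholder, not an actual solver invocation.

def time_run(workload, *args):
    """Wall-clock time of a single workload execution."""
    start = time.perf_counter()
    workload(*args)
    return time.perf_counter() - start

def variability(runtimes):
    """Relative spread: sample standard deviation over mean runtime."""
    return statistics.stdev(runtimes) / statistics.mean(runtimes)

runtimes = [time_run(sum, range(100_000)) for _ in range(5)]
print(f"relative variability: {variability(runtimes):.2%}")
```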