AcuSolve typically solves a given problem on the first attempt. Fully converged solutions are reliably obtained with AcuSolve's efficient steady-state solver, and nonlinear convergence remains strong even as the solution approaches its final state. Two key components contribute to this robustness:
Stable Finite Element Formulation
AcuSolve is based on the Galerkin/Least-Squares (GLS) finite element formulation (see Accuracy). This technology has been shown, both mathematically and in practice, to have excellent stability and accuracy properties. It readily handles difficult industrial problems with under-resolved, distorted, and high-aspect-ratio meshes.
Powerful Iterative Solver
AcuSolve utilizes a unique, proprietary iterative linear equation solver that allows efficient and stable solution of the coupled pressure/velocity equation system arising from linearization of the Navier-Stokes equations. The linear solver was devised from a detailed study of this coupled system. It is highly stable and efficiently handles unstructured finite element meshes with high-aspect-ratio and badly distorted elements, which are commonly produced by automatic mesh generators on complex industrial problems. This practically parameter-free linear solver yields a significant improvement in the robustness and convergence of the linear and nonlinear iterations compared to the segregated solution procedures that are the norm in today's commercial incompressible flow solvers.
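To make the distinction concrete, the following is a minimal Python sketch, not AcuSolve's proprietary solver: a toy saddle-point block system of the kind produced by linearizing the incompressible Navier-Stokes equations is assembled and the velocity and pressure unknowns are solved together in a single Krylov solve. All sizes and matrices here are illustrative assumptions.

```python
# Toy illustration of a coupled pressure/velocity solve (not AcuSolve's solver).
# Linearizing the incompressible Navier-Stokes equations leads to a block system
#   [ A  B^T ] [u]   [f_u]     A: momentum block, B: divergence constraint.
#   [ B   0  ] [p] = [f_p]
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
nu, npr = 60, 20                                          # toy velocity / pressure unknown counts
A = sp.identity(nu) * 4.0 + sp.random(nu, nu, density=0.05, random_state=rng) * 0.1
A = (A + A.T) * 0.5                                       # symmetric, diagonally dominant momentum block
B = sp.random(npr, nu, density=0.3, random_state=rng)     # toy divergence operator
K = sp.bmat([[A, B.T], [B, None]], format="csr")          # fully coupled block matrix
rhs = rng.standard_normal(nu + npr)

# Coupled approach: one Krylov solve over all velocity AND pressure unknowns,
# so the incompressibility constraint is enforced inside the linear solve itself.
x, info = spla.gmres(K, rhs, atol=1e-10, restart=nu + npr, maxiter=200)
print("GMRES info:", info, "| residual:", np.linalg.norm(K @ x - rhs))

# A segregated scheme would instead freeze the pressure, solve the momentum block
# for the velocity, then correct the pressure from the divergence residual, and
# repeat many such sweeps; the pressure/velocity coupling is only recovered iteratively.
```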
AcuSolve achieves fast solutions via three mechanisms:
- Solves the fully coupled pressure/velocity equation system, which yields significant gains in linear and nonlinear convergence speed (see Robustness).
- Architected and implemented from the ground up for vector and cache-based superscalar computers.
- All algorithms are designed for multi-core parallel clusters, using a hybrid distributed/shared-memory (MPI/OpenMP) parallel model. The parallelization is completely transparent to end users.
From the start, AcuSolve was architected, and all of its algorithms were designed, specifically to perform well on coarse-grain parallel computers. Domain decomposition is used to distribute the elements and nodes among processors. The Message Passing Interface (MPI) is used to communicate between distributed-memory computers, and shared-memory data copies are used between subdomains within a single shared-memory box; this ensures minimum communication cost. All I/O during the computation is performed in parallel. The parallelization (and domain decomposition) is completely transparent to the user and is flexible enough that the number of processors (subdomains) may be changed at every problem restart.
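The following is a minimal sketch of the domain-decomposition pattern described above, using mpi4py as a stand-in; AcuSolve's actual partitioning, hybrid MPI/OpenMP communication, and parallel I/O layers are its own implementation. Each rank owns a block of elements, global quantities are formed with a single reduction, and interface data are exchanged only with neighboring subdomains.

```python
# Minimal domain-decomposition sketch with mpi4py (illustrative only).
# Run with, for example:  mpirun -np 4 python decompose_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Pretend global mesh: one million elements, distributed by simple block partitioning.
n_global = 1_000_000
counts = [n_global // size + (1 if r < n_global % size else 0) for r in range(size)]
start = sum(counts[:rank])
my_elems = np.arange(start, start + counts[rank])        # this rank's subdomain

# Each rank computes a local residual contribution over its own elements ...
local_residual = float(np.sum(np.sin(my_elems * 1e-6) ** 2))

# ... and the global residual is obtained with one collective reduction, the kind
# of communication a distributed nonlinear iteration needs at every step.
global_residual = comm.allreduce(local_residual, op=MPI.SUM)

# Halo exchange: interface values are swapped only with neighboring subdomains.
left, right = (rank - 1) % size, (rank + 1) % size
sendbuf = np.full(10, float(rank))
recvbuf = np.empty(10)
comm.Sendrecv(sendbuf, dest=right, recvbuf=recvbuf, source=left)

if rank == 0:
    print(f"{size} subdomains, global residual = {global_residual:.3f}")
```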
The flow solver is based on the Galerkin/Least-Squares (GLS) finite element technology developed at Stanford University during the late 1980s and early 1990s. This technology has since been enhanced to better handle large-scale industrial problems.
Finite Element Formulation
The method is based on the Galerkin weighted residual finite element formulation with equal-order interpolation for all solution fields, including pressure. This not only simplifies coding and integration into scientific and engineering applications, but is also crucial for retaining the accuracy of the underlying scheme.
A least-squares operator is added to the basic method to stabilize the divergence-free constraint and the convective terms. Discontinuity-capturing and nonlinear maximum-principle operators are added to resolve sharp internal and boundary discontinuities and oscillations that are not resolved by the (linear) least-squares operator. These operators are mathematically designed to yield the needed stability without sacrificing the underlying accuracy and conservative nature of the Galerkin formulation. They can be contrasted with the artificial diffusion operators adopted by many commercial CFD packages, where stability is achieved at the expense of accuracy and/or conservation.
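For orientation, the generic GLS statement from the finite element literature (e.g., Hughes, Franca, and Hulbert, 1989) can be sketched as follows; AcuSolve's particular stabilization matrices and discontinuity-capturing operators are not reproduced here.

```latex
% Generic GLS weak form (literature version, not AcuSolve's specific operators):
% find u^h such that, for all admissible test functions w^h,
B\left(w^h, u^h\right) - \left(w^h, f\right)
\;+\; \sum_{e}\int_{\Omega_e}
\left(\mathcal{L}\,w^h\right) \cdot \boldsymbol{\tau}
\left(\mathcal{L}\,u^h - f\right)\, d\Omega \;=\; 0
```

Here B(·,·) is the standard Galerkin form, \mathcal{L} the full differential operator, and \boldsymbol{\tau} an element-level stabilization matrix. Because the added term is weighted by the residual \mathcal{L} u^h - f, it vanishes for the exact solution, which is why stability is gained without sacrificing consistency.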
The method is also designed to maintain local and global conservation of the differential equations. Conservation is achieved on any mesh.
In addition to excellent spatial accuracy, AcuSolve offers a second-order time-accurate option. When combined with the coupled pressure/velocity linear solver (see Robustness), rapid nonlinear convergence is obtained within each time step, which is what makes second-order time-accurate solutions realizable in practice. This may be contrasted with the segregated solution schemes used by many commercial CFD packages, where converging the nonlinear iteration at each time step is typically not feasible and time accuracy is rarely observed.
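A generic toy example, not AcuSolve's time integrator, illustrates why the nonlinear convergence matters: a second-order trapezoidal step retains its formal accuracy only when the nonlinear equation at each step is actually converged, while a single nonlinear sweep per step degrades the observed order to one.

```python
# Toy ODE illustration (not AcuSolve's scheme): du/dt = -u**3, u(0) = 1, exact
# solution u(t) = 1/sqrt(1 + 2t). The implicit trapezoidal step is solved with a
# fixed number of Picard (fixed-point) sweeps per time step.
import math

def step_trapezoidal(u_n, dt, sweeps):
    u = u_n
    for _ in range(sweeps):                      # nonlinear sweeps within one time step
        u = u_n - 0.5 * dt * (u**3 + u_n**3)
    return u

def solve(dt, sweeps, t_end=1.0):
    u = 1.0
    for _ in range(round(t_end / dt)):
        u = step_trapezoidal(u, dt, sweeps)
    return u

u_exact = 1.0 / math.sqrt(3.0)                   # analytic solution at t = 1
for sweeps in (1, 30):                           # single sweep vs. converged nonlinear solve
    errs = [abs(solve(dt, sweeps) - u_exact) for dt in (0.1, 0.05, 0.025)]
    orders = [math.log2(errs[i] / errs[i + 1]) for i in range(2)]
    print(f"sweeps={sweeps}:",
          "errors =", ", ".join(f"{e:.2e}" for e in errs),
          "| observed orders ~", ", ".join(f"{o:.2f}" for o in orders))
```

With one sweep per step the observed convergence rate drops to roughly first order; with the nonlinear iteration converged, the expected second-order rate is recovered.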
The resulting technology has a rich mathematical foundation and, in practice, exhibits a high degree of robustness. It is highly accurate and always conservative.
Backward Facing Step Problem
The turbulent flow over a backward-facing step at a Reynolds number of 40,000 is used to demonstrate the accuracy of the method. The mesh consists of a single slice of 3D elements, with 7K brick elements and 15K nodes, and the Spalart-Allmaras turbulence model is used. The reattachment length is an excellent measure of solution accuracy. The computed reattachment length is 7.05 step heights, in excellent agreement with the experimental result of 7. In addition, the figure below shows a number of particle paths in the separation region behind the step, with the mesh superimposed for reference. The method has captured not only the main separation eddy, but also two secondary eddies in the corner; the smallest eddy is captured within a radius of three elements. This strongly attests to the accuracy of the method.
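As an aside on how such a reattachment length is typically extracted from a solution, the sketch below uses generic post-processing with hypothetical, synthetic wall data rather than the actual AcuSolve results: the reattachment point is where the wall shear stress along the bottom wall changes sign downstream of the step.

```python
# Generic reattachment-length extraction (hypothetical data, not AcuSolve output).
import numpy as np

def reattachment_length(x_over_h, tau_wall):
    """Locate the negative-to-positive sign change of wall shear stress."""
    idx = np.where(np.diff(np.sign(tau_wall)) > 0)[0]
    if idx.size == 0:
        raise ValueError("no reattachment point found in the sampled data")
    i = idx[0]
    # Linear interpolation between the two bracketing wall sample points.
    x0, x1 = x_over_h[i], x_over_h[i + 1]
    t0, t1 = tau_wall[i], tau_wall[i + 1]
    return x0 - t0 * (x1 - x0) / (t1 - t0)

# Hypothetical sampled wall data (x normalized by the step height h):
x = np.linspace(0.0, 12.0, 25)
tau = np.tanh((x - 7.05) / 2.0) * 1e-3           # synthetic profile crossing zero near 7.05
print(f"reattachment length = {reattachment_length(x, tau):.2f} step heights")
```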
The flow over a backward-facing step is used to demonstrate the conservative nature of the method. An arbitrary cut (on element boundaries) is made in the interior of the mesh through the main separation eddy. With this cut, we have a closed loop consisting of Inflow, Top Section, Bottom Section and Interior Cut.
The problem is converged to a tolerance of 10^-5. The table below shows the mass flux and each of the momentum fluxes across each boundary. Note that the total fluxes sum to zero.
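The bookkeeping behind such a check can be sketched as follows; this is a generic example using a divergence-free analytic field on a closed unit cube, not the AcuSolve backward-facing-step data. The flux through each segment of the closed surface is integrated separately, and the segment totals are then summed.

```python
# Generic closed-control-surface flux check (illustrative, not the AcuSolve case).
# For a divergence-free field the net flux through any closed surface is zero,
# so the per-segment fluxes must cancel exactly when summed.
import numpy as np

def u(x, y, z):
    """A divergence-free toy velocity field (div u = 0)."""
    return np.stack([y, z, x], axis=-1)

def face_flux(origin, e1, e2, normal, n=40):
    """Midpoint-rule flux of u through a planar quad face spanned by e1 and e2."""
    s = (np.arange(n) + 0.5) / n
    S1, S2 = np.meshgrid(s, s, indexing="ij")
    pts = origin + S1[..., None] * e1 + S2[..., None] * e2
    dA = np.linalg.norm(np.cross(e1, e2)) / n**2
    vel = u(pts[..., 0], pts[..., 1], pts[..., 2])
    return float(np.sum(vel @ normal) * dA)

# Six faces of the unit cube with outward normals, playing the role of the
# Inflow / Top / Bottom / Interior Cut segments of a closed control surface.
ex, ey, ez = np.eye(3)
faces = {
    "x=0": (np.zeros(3), ey, ez, -ex), "x=1": (ex, ey, ez, ex),
    "y=0": (np.zeros(3), ex, ez, -ey), "y=1": (ey, ex, ez, ey),
    "z=0": (np.zeros(3), ex, ey, -ez), "z=1": (ez, ex, ey, ez),
}
fluxes = {name: face_flux(*f) for name, f in faces.items()}
for name, q in fluxes.items():
    print(f"face {name}: flux = {q:+.6f}")
print(f"     total: {sum(fluxes.values()):+.3e}   (sums to zero, as conservation requires)")
```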