Publications | Maxwell Cai's Homepage

ORCID: https://orcid.org/0000-0002-1116-2705

For an up-to-date list of publications, please click here to check out the NASA/ADS system.

List of Publications

ρ-Diffusion: A diffusion-based density estimation framework for computational physics Cai, Maxwell X.; Lee, Kin Long Kelvin In physics, density ρ(⋅) is a fundamentally important scalar function to model, since it describes a scalar field or a probability density function that governs a physical process. Modeling ρ(⋅) typically scales poorly with parameter space, however, and quickly becomes prohibitively difficult and computationally expensive. One promising avenue to bypass this is to leverage the capabilities of denoising diffusion models often used in high-fidelity image generation to parameterize ρ(⋅) from existing scientific data, from which new samples can be trivially drawn. In this paper, we propose ρ-Diffusion, an implementation of denoising diffusion probabilistic models for multidimensional density estimation in physics, which is currently in active development and, from our results, performs well on physically motivated 2D and 3D density functions. Moreover, we propose a novel hashing technique that allows ρ-Diffusion to be conditioned on an arbitrary number of physical parameters of interest.
Hot Jupiter formation in dense star clusters Benkendorff, L.; Flammini Dotti, F.; Stock, K.; Cai, Maxwell X.; Spurzem, R. Hot Jupiters (HJ) are defined as Jupiter-mass exoplanets orbiting around their host star with an orbital period < 10 d. It is assumed that HJ do not form in-situ but ex-situ. Recent discoveries show that star clusters contribute to the formation of HJ. We present direct N-body simulations of planetary systems in star clusters and analyse the formation of HJ in them. We combine two direct N-body codes: NBODY6++GPU for the dynamics of dense star clusters with 32 000 and 64 000 stellar members and LONELYPLANETS used to follow 200 identical planetary systems around solar mass stars in those star clusters. We use different sets with three, four, or five planets and with the innermost planet at a semimajor axis of 5 or 1 au and follow them for 100 Myr in our simulations. The results indicate that HJs are generated with high efficiency in dense star clusters if the innermost planet is already close to the host star at a semimajor axis of 1 au. If the innermost planet is initially beyond a semimajor axis of 5 au, the probability of a potential HJ ranges between 1.5 and 4.5 per cent. Very dense stellar neighbourhoods tend to eject planets rather than forming HJs. A correlation between HJ formation and angular momentum deficit is not witnessed. Young HJs (t_age < 100 Myr) have only been found, in our simulations, in planetary systems with the innermost planet at a semimajor axis of 1 au.
A hybrid approach for solving the gravitational N-body problem with Artificial Neural Networks Saz Ulibarrena, Veronica; Horn, Philipp; Portegies Zwart, Simon; Sellentin, Elena; Koren, Barry; Cai, Maxwell X. Simulating the evolution of the gravitational N-body problem becomes extremely computationally expensive as N increases since the problem complexity scales quadratically with the number of bodies. In order to alleviate this problem, we study the use of Artificial Neural Networks (ANNs) to replace expensive parts of the integration of planetary systems. Neural networks that include physical knowledge have rapidly grown in popularity in the last few years, although few attempts have been made to use them to speed up the simulation of the motion of celestial bodies. For this purpose, we study the advantages and limitations of using Hamiltonian Neural Networks to replace computationally expensive parts of the numerical simulation of planetary systems, focusing on realistic configurations found in astrophysics. We compare the results of the numerical integration of a planetary system with asteroids with those obtained by a Hamiltonian Neural Network and a conventional Deep Neural Network, with special attention to understanding the challenges of this specific problem. Due to the non-linear nature of the gravitational equations of motion, errors in the integration propagate, which may lead to divergence from the reference solution. To increase the robustness of a method that uses neural networks, we propose a hybrid integrator that evaluates the prediction of the network and replaces it with the numerical solution if considered inaccurate. Hamiltonian Neural Networks can make predictions that resemble the behavior of symplectic integrators but are challenging to train and in our case fail when the inputs differ by ∼7 orders of magnitude. In contrast, Deep Neural Networks are easy to train but fail to conserve energy, leading to fast divergence from the reference solution. The hybrid integrator designed to include the neural networks increases the reliability of the method and prevents large energy errors without increasing the computing cost significantly. For the problem at hand, the use of neural networks results in faster simulations when the number of asteroids is ≳70.
Birth cluster simulations of planetary systems with multiple super-Earths: initial conditions for white dwarf pollution drivers Stock, Katja; Veras, Dimitri; Cai, Maxwell X.; Spurzem, Rainer; Portegies Zwart, Simon Previous investigations have revealed that eccentric super-Earths represent a class of planets that are particularly effective at transporting minor bodies towards white dwarfs and subsequently polluting their atmospheres with observable chemical signatures. However, the lack of discoveries of these planets beyond a few astronomical units from their host stars prompts a better understanding of their orbital architectures from their nascent birth cluster. Here, we perform stellar cluster simulations of three-planet and seven-planet systems containing super-Earths on initially circular, coplanar orbits. We adopt the typical stellar masses of main-sequence progenitors of white dwarfs (1.5–2.5 M⊙) as host stars and include 8000 main-sequence stars following a Kroupa initial mass function in our clusters. Our results reveal that about 30 per cent of the simulated planets generate eccentricities of at least 0.1 by the time of cluster dissolution, which would aid white dwarf pollution. We provide our output parameters to the community for potential use as initial conditions for subsequent evolution simulations.
Neural Symplectic Integrator with Hamiltonian Inductive Bias for the Gravitational N-body Problem Cai, Maxwell X.; Portegies Zwart, Simon; Podareanu, Damian The gravitational N-body problem, which is fundamentally important in astrophysics to predict the motion of N celestial bodies under the mutual gravity of each other, is usually solved numerically because there is no known general analytical solution for N>2. Can an N-body problem be solved accurately by a neural network (NN)? Can a NN observe long-term conservation of energy and orbital angular momentum? Inspired by the symplectic map of Wisdom & Holman (1991), we present a neural N-body integrator that splits the Hamiltonian into a two-body part, solvable analytically, and an interaction part that we approximate with a NN. Our neural symplectic N-body code integrates a general three-body system for 10^5 steps without diverging from the ground truth dynamics obtained from a traditional N-body integrator. Moreover, it exhibits good inductive bias by successfully predicting the evolution of N-body systems that are not part of the training set.
Fast and Credible Likelihood-Free Cosmology with Truncated Marginal Neural Ratio Estimation Cole, Alex; Miller, Benjamin Kurt; Witte, Samuel J.; Cai, Maxwell X.; Grootes, Meiert W.; Nattino, Francesco; Weniger, Christoph Sampling-based inference techniques are central to modern cosmological data analysis; these methods, however, scale poorly with dimensionality and typically require approximate or intractable likelihoods. In this paper we describe how Truncated Marginal Neural Ratio Estimation (TMNRE) (a new approach in so-called simulation-based inference) naturally evades these issues, improving the (i) efficiency, (ii) scalability, and (iii) trustworthiness of the inferred posteriors. Using measurements of the Cosmic Microwave Background (CMB), we show that TMNRE can achieve converged posteriors using orders of magnitude fewer simulator calls than conventional Markov Chain Monte Carlo (MCMC) methods. Remarkably, the required number of samples is effectively independent of the number of nuisance parameters. In addition, a property called local amortization allows the performance of rigorous statistical consistency checks that are not accessible to sampling-based methods. TMNRE promises to become a powerful tool for cosmological data analysis, particularly in the context of extended cosmologies, where the timescale required for conventional sampling-based inference methods to converge can greatly exceed that of simple cosmological models such as ΛCDM. To perform these computations, we use an implementation of TMNRE via the open-source code `swyft`.
Oort cloud Ecology II: The chronology of the formation of the Oort cloud Portegies Zwart, Simon; Torres, Santiago; Cai, Maxwell X.; Brown, Anthony We present a chronology of the formation and early evolution of the Oort cloud by simulations. These simulations start with the Solar System being born with planets and asteroids in a stellar cluster orbiting the Galactic center. Upon ejection from its birth environment, we continue to follow the evolution of the Solar System while it navigates the Galaxy as an isolated planetary system. We conclude that the range in semi-major axis between 100 au and several 10^3 au still bears the signatures of the Sun being born in a 1000 M⊙/pc^3 star cluster, and that most of the outer Oort cloud formed after the Solar System was ejected. The ejection of the Solar System, we argue, happened between 20 Myr and 50 Myr after its birth. Trailing and leading trails of asteroids and comets along the Sun's orbit in the Galactic potential are the by-product of the formation of the Oort cloud. These arms are composed of material that became unbound from the Solar System when the Oort cloud formed. Today, the bulk of the material in the Oort cloud (∼70%) originates from the region in the circumstellar disk that was located between ∼15 au and ∼35 au, near the current location of the ice giants and the Centaur family of asteroids. According to our simulations, this population is eradicated if the ice-giant planets are born in orbital resonance. Planet migration or chaotic orbital reorganization occurring while the Solar System is still a cluster member is, according to our model, inconsistent with the presence of the Oort cloud. About half the inner Oort cloud, between 100 and 10^4 au, and a quarter of the material in the outer Oort cloud, ≳10^4 au, could be non-native to the Solar System but was captured from free-floating debris in the cluster or from the circumstellar disk of other stars in the birth cluster.
Inside-Out Planet Formation: VI. Oligarchic Coagulation of Planetesimals from a Pebble Ring? Cai, Maxwell X.; Tan, Jonathan C.; Portegies Zwart, Simon Inside-Out Planet Formation (IOPF) is a theory addressing the origin of Systems of Tightly-Packed Inner Planets (STIPs) via in-situ formation and growth of the planets. It predicts that a pebble ring is established at the pressure maximum associated with the dead zone inner boundary (DZIB) with an inner disk magnetorotational instability (MRI)-active region. Using direct N-body simulations, we study the collisional evolution of planetesimals formed from such a pebble ring, in particular examining whether a single dominant planet emerges. We consider a variety of models, including some in which the planetesimals are continuing to grow via pebble accretion. We find that the planetesimal ring undergoes oligarchic evolution, and typically turns into 2 or 3 surviving oligarchs on nearly coplanar and circular orbits, independent of the explored initial conditions or form of pebble accretion. The most massive oligarchs typically comprise about 70% of the total mass, with the building-up process typically finishing within ∼10^5 years. However, a relatively massive secondary planet always remains with ∼30−65% of the mass of the primary. Such secondary planets have properties that are inconsistent with the observed properties of the innermost pairs of planets in STIPs. Thus, for IOPF to be a viable theory for STIP formation, it needs to be shown how oligarchic growth of a relatively massive secondary from the initial pebble ring can be avoided. We discuss some potential additional physical processes that should be included in the modeling and explored as next steps.
Deep-learning enhancement of large scale numerical simulations van Leeuwen, Caspar; Podareanu, Damian; Codreanu, Valeriu; Cai, Maxwell X.; Berg, Axel; Portegies Zwart, Simon; Stoffer, Robin; Veerman, Menno; van Heerwaarden, Chiel; Otten, Sydney; Caron, Sascha; Geng, Cunliang; Ambrosetti, Francesco; Bonvin, Alexandre M. J. J. Traditional simulations on High-Performance Computing (HPC) systems typically involve modeling very large domains and/or very complex equations. HPC systems allow running large models, but limits in performance increase that have become more prominent in the last 5-10 years will likely be experienced. Therefore new approaches are needed to increase application performance. Deep learning appears to be a promising way to achieve this. Recently deep learning has been employed to enhance solving problems that traditionally are solved with large-scale numerical simulations using HPC. This type of application, deep learning for high-performance computing, is the theme of this whitepaper. Our goal is to provide concrete guidelines to scientists and others that would like to explore opportunities for applying deep learning approaches in their own large-scale numerical simulations. These guidelines have been extracted from a number of experiments that have been undertaken in various scientific domains over the last two years, and which are described in more detail in the Appendix. Additionally, we share the most important lessons that we have learned.
DeepGalaxy: Deducing the Properties of Galaxy Mergers from Images Using Deep Neural Networks Cai, Maxwell X.; Bédorf, Jeroen; Saletore, Vikram A.; Codreanu, Valeriu; Podareanu, Damian; Chaibi, Adel; Qian, Penny X. Galaxy mergers, the dynamical process during which two galaxies collide, are among the most spectacular phenomena in the Universe. During this process, the two colliding galaxies are tidally disrupted, producing significant visual features that evolve as a function of time. These visual features contain valuable clues for deducing the physical properties of the galaxy mergers. In this work, we propose DeepGalaxy, a visual analysis framework trained to predict the physical properties of galaxy mergers based on their morphology. Based on an encoder-decoder architecture, DeepGalaxy encodes the input images to a compressed latent space z, and determines the similarity of images according to the latent-space distance. DeepGalaxy consists of a fully convolutional autoencoder (FCAE) which generates activation maps at its 3D latent-space, and a variational autoencoder (VAE) which compresses the activation maps into a 1D vector, and a classifier that generates labels from the activation maps. The backbone of the FCAE can be fully customized according to the complexity of the images. DeepGalaxy demonstrates excellent scaling performance on parallel machines. On the Endeavour supercomputer, the scaling efficiency exceeds 0.93 when trained on 128 workers, and it maintains above 0.73 when trained with 512 workers. Without having to carry out expensive numerical simulations, DeepGalaxy makes inferences of the physical properties of galaxy mergers directly from images, and thereby achieves a speedup factor of ∼10^5.
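
A note on the scaling-efficiency figures quoted in the DeepGalaxy abstract: the abstract does not say how the efficiency is defined, so the sketch below assumes one common convention, the measured speedup divided by the ideal (linear) speedup relative to a baseline run. The function name and the timings are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def scaling_efficiency(base_workers, base_time, workers, time):
    """Parallel efficiency relative to a baseline run.

    Assumed definition (not confirmed by the abstract):
    efficiency = measured speedup / ideal linear speedup.
    """
    speedup = base_time / time              # how much faster the larger run is
    ideal_speedup = workers / base_workers  # perfect linear scaling
    return speedup / ideal_speedup


# Hypothetical timings (seconds per epoch), for illustration only.
eff_128 = scaling_efficiency(base_workers=1, base_time=1280.0,
                             workers=128, time=10.75)
print(f"efficiency on 128 workers: {eff_128:.3f}")
```

Under this definition a value of 1.0 means perfect linear scaling, and values below 1.0 reflect overhead such as communication between workers.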