From dark to light: The connection between galaxies and their dark matter haloes through cosmic time
Final Report Abstract
We introduce GalaxyNet, a deep learning framework that uses Wide & Deep Neural Networks (WDNNs) and reinforcement learning to model galaxy formation. Traditional approaches such as hydrodynamic simulations and semi-analytic models (SAMs) rely on simplified prescriptions and calibrations against observations; GalaxyNet instead learns directly from observational data, offering a more data-driven and potentially less biased approach.
GalaxyNet's core contribution is to connect dark matter halo properties to observable galaxy characteristics. Methods for establishing the stellar-to-halo mass (SHM) relation often rely on pre-defined functional forms, which limits their ability to capture the complex, non-linear relationships inherent in galaxy formation. GalaxyNet learns these relationships directly from the data, resulting in a more flexible and potentially more accurate SHM relation, especially at high redshift. The model yields a lower normalization and a shallower low-mass slope for the SHM relation at high redshift than existing empirical models, suggesting that our understanding of star formation and feedback mechanisms at these epochs needs to be revisited. GalaxyNet also provides insights into galaxy quenching, indicating that satellite quenching dominates for low-mass galaxies, while internal processes such as AGN feedback are more important for massive galaxies.
GalaxyNet learns from two primary data sources. The first comprises observed global galaxy statistics: stellar mass functions (SMFs), quenched fractions, the cosmic star formation rate density (CSFRD), specific star formation rates (sSFRs), and galaxy clustering. These data constrain the overall galaxy population and its evolution.
The second source is high-resolution cosmological N-body simulations, which provide detailed dark matter halo catalogs, including halo merger trees and a comprehensive set of halo properties (e.g., mass, concentration, spin, growth rate). These halo properties serve as the input features of the GalaxyNet model.
The architecture of GalaxyNet is a WDNN, which combines the strengths of deep and wide networks. The deep component, consisting of multiple fully connected layers, learns complex, non-linear relationships, while the wide component, which connects the input directly to the output, captures simpler linear relationships. The combined architecture can therefore learn both intricate details and broad scaling relations.
Training uses reinforcement learning: the network acts as an agent that predicts galaxy properties from halo properties and receives a "reward" based on how well the predicted global galaxy statistics match the observational data. This approach avoids biases inherited from simulated galaxies and lets the model learn directly from the real Universe. Optimization algorithms such as Particle Swarm Optimization (PSO) tune the network's parameters to maximize the reward.
One key output of GalaxyNet is its prediction of the instantaneous baryon conversion efficiency, which quantifies how efficiently haloes convert infalling baryons into stars. The model reveals a complex dependence of this efficiency on halo mass and redshift, differing from the linear scaling typically assumed in other models. GalaxyNet finds a roughly constant conversion efficiency at high redshifts, followed by a linear decrease at lower redshifts. This finding suggests a more nuanced interplay of gas accretion, star formation, and feedback processes than previously assumed.
We further demonstrate GalaxyNet's ability to predict galaxy clustering on large scales using the HugeMDPL simulation.
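As an illustration of the wide-and-deep architecture and the reward described above, the following is a minimal NumPy sketch, not the project's actual implementation. The layer sizes, the ReLU activation, the two output quantities, and the chi-square form of the reward are all assumptions made for the example; the function names `init_wdnn`, `wdnn_forward`, and `reward` are hypothetical. The optimizer (e.g. PSO) that would search over the weights is not shown.

```python
import numpy as np

def init_wdnn(n_in, n_hidden, n_out, n_layers=2, seed=0):
    """Random weights for a small wide-and-deep network (sizes are illustrative)."""
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [n_hidden] * n_layers
    deep = [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]
    # The output layer sees the last hidden layer (deep path)
    # concatenated with the raw inputs (wide path).
    w_out = rng.normal(0.0, 0.1, (n_hidden + n_in, n_out))
    b_out = np.zeros(n_out)
    return {"deep": deep, "out": (w_out, b_out)}

def wdnn_forward(params, halo_features):
    """Map halo features (n_haloes, n_in) to galaxy properties (n_haloes, n_out)."""
    h = halo_features
    for w, b in params["deep"]:
        h = np.maximum(0.0, h @ w + b)  # fully connected layer with ReLU (assumed)
    w_out, b_out = params["out"]
    wide_and_deep = np.concatenate([h, halo_features], axis=1)
    return wide_and_deep @ w_out + b_out

def reward(predicted_stat, observed_stat, sigma):
    """Assumed reward: negative chi-square between predicted and observed statistics."""
    return -np.sum(((predicted_stat - observed_stat) / sigma) ** 2)

# Toy input: 5 haloes with 4 features (e.g. mass, concentration, spin, growth rate).
halos = np.random.default_rng(1).normal(size=(5, 4))
params = init_wdnn(n_in=4, n_hidden=8, n_out=2)
galaxies = wdnn_forward(params, halos)  # two outputs per halo, e.g. stellar mass and SFR
```

In a training loop, the predicted galaxy properties would first be reduced to global statistics (such as a stellar mass function) before being passed to `reward`, and the optimizer would adjust `params` to increase the returned value.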
The model accurately reproduces the baryon acoustic oscillation (BAO) signal and predicts the galaxy bias, highlighting its potential for cosmological studies and for the interpretation of future large-scale galaxy surveys. GalaxyNet also finds that active galaxies cluster more strongly than passive galaxies at high masses, potentially indicating a link between galaxy activity and halo growth in dense environments.
Future directions for GalaxyNet include extending its predictions to gas content, morphology, and metallicity. Another option is to incorporate the time evolution of halo and galaxy properties using recurrent neural networks (RNNs), which would let the model track the detailed formation history of individual galaxies. Applying GalaxyNet to alternative cosmologies is also envisioned, which would allow different cosmological models to be tested.
In summary, GalaxyNet offers a new approach to galaxy formation modeling that uses deep learning and reinforcement learning to learn directly from observational data. Its predictions offer new insights into the galaxy-halo connection, galaxy quenching, baryon cycling, and large-scale structure, promising a more accurate and robust framework for understanding galaxy evolution.
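The galaxy bias mentioned in the clustering results above is commonly estimated from the ratio of the galaxy and matter two-point correlation functions, b(r) = sqrt(xi_gg(r) / xi_mm(r)). The sketch below shows that standard estimator on toy data; the power-law correlation function, the separations, and the bias value of 1.5 are illustrative assumptions and are not results from GalaxyNet, and `galaxy_bias` is a hypothetical helper name.

```python
import numpy as np

def galaxy_bias(xi_gg, xi_mm):
    """Scale-dependent bias b(r) = sqrt(xi_gg(r) / xi_mm(r))."""
    xi_gg = np.asarray(xi_gg, dtype=float)
    xi_mm = np.asarray(xi_mm, dtype=float)
    return np.sqrt(xi_gg / xi_mm)

# Toy power-law matter correlation function; all values are illustrative only.
r = np.array([20.0, 40.0, 60.0])   # separations, e.g. in Mpc/h
xi_mm = (r / 5.0) ** -1.8
xi_gg = 1.5 ** 2 * xi_mm           # mock galaxies with a constant bias of 1.5
bias = galaxy_bias(xi_gg, xi_mm)
```

On real data the ratio is only meaningful where both correlation functions are positive, so the estimator is usually restricted to scales away from the zero crossing of xi(r).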
Publications
- Moster, Benjamin P.; Naab, Thorsten; Lindström, Magnus & O’Leary, Joseph A. GalaxyNet: connecting galaxies and dark matter haloes with deep neural networks and reinforcement learning in large volumes. Monthly Notices of the Royal Astronomical Society, 507(2), 2115-2136.
