The universe generates data at an astronomical scale, quite literally. Cosmological simulations that track billions of dark matter particles across cosmic time, galaxy formation models on unstructured meshes spanning megaparsecs, and observational surveys cataloguing millions of celestial objects all produce datasets that can easily overwhelm traditional analysis tools. Enter scida, a Python framework designed to make these large astrophysical datasets tractable.

Built on Dask for parallel computing, scida provides a unified interface that lets the same analysis code run on a laptop, a high-performance computing cluster, or cloud infrastructure. It handles both particle-based simulations (think N-body dark matter evolution) and unstructured mesh data (as used in hydrodynamical galaxy formation simulations), and it supports physical units through Pint integration. Its extensible architecture lets researchers add support for new simulation formats or observational datasets without rebuilding from scratch.
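
To make the scaling idea concrete, here is a minimal, standard-library-only sketch of the chunk-wise reduction pattern that Dask automates: data is processed one chunk at a time and only small partial results are combined, so the full array never has to sit in memory. This is not scida's or Dask's API. The function names (`iter_chunks`, `load_masses`, `mean_mass`) and the synthetic masses are hypothetical and exist purely for illustration.

```python
from typing import Iterator, List, Tuple


def iter_chunks(n_particles: int, chunk_size: int) -> Iterator[range]:
    """Yield index ranges that stand in for on-disk chunks of a particle array."""
    for start in range(0, n_particles, chunk_size):
        yield range(start, min(start + chunk_size, n_particles))


def load_masses(chunk: range) -> List[float]:
    """Hypothetical loader: a real one would read this slice from a snapshot file.

    Here we synthesize masses that cycle through 1.0, 2.0, 3.0, 4.0.
    """
    return [1.0 + (i % 4) for i in chunk]


def partial_sum(chunk: range) -> Tuple[float, int]:
    """Reduce one chunk to a small summary (sum of masses, particle count)."""
    masses = load_masses(chunk)
    return sum(masses), len(masses)


def mean_mass(n_particles: int, chunk_size: int) -> float:
    """Combine per-chunk summaries into a global mean, one chunk at a time."""
    if n_particles <= 0 or chunk_size <= 0:
        raise ValueError("n_particles and chunk_size must be positive")
    total, count = 0.0, 0
    for chunk in iter_chunks(n_particles, chunk_size):
        chunk_total, chunk_count = partial_sum(chunk)
        total += chunk_total
        count += chunk_count
    return total / count


if __name__ == "__main__":
    # The result does not depend on how the data is chunked.
    print(mean_mass(1000, 128), mean_mass(1000, 1000))
```

Because each chunk is reduced independently, the per-chunk step could just as well run on different cores or machines. That independence is what a task scheduler such as Dask exploits.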

Whether you’re analyzing the IllustrisTNG simulations, processing Gaia survey data, or developing custom cosmological models, scida aims to turn time-consuming data wrangling into streamlined, reproducible workflows. Published in the Journal of Open Source Software and already gaining traction in the astrophysics community, it is the kind of infrastructure that enables the next generation of cosmic discoveries, where the bottleneck isn’t computational power but our ability to ask the right questions.


⭐ Stars: 41
💻 Language: Python
🔗 Repository: cbyrohl/scida