Coding for Astronomical Data Analysis

Introduction to Data Analysis in Astronomy

When discoveries and observations raise questions in their minds, scientists turn to the oldest weapon in their arsenal, the scientific method. The same goes for astronomy, but things get a little complicated. Unfortunately, we cannot make a star go supernova to test our models. That is where data analysis comes in. Astronomers do their research with the data they gather from the light signals their equipment receives, but coming to conclusions is where things get spicy. We cannot arrive at conclusions from the gathered data alone. The raw data does not cover the whole timeline of an observed object, only the footprints it has left, so astronomers need imagination and creativity to build theories and make discoveries.

In ancient times, when people observed the skies with the naked eye, the gathered data was so little that the human mind was sufficient to analyze it. As the science evolved over the years, with projects like the Sloan Digital Sky Survey (SDSS), Pan-STARRS and the forthcoming Large Synoptic Survey Telescope (LSST), stellar observation has been moving to a whole new level, and the amount of gathered data keeps increasing as well. This left data analysts in need of help to analyze what is gathered. That is where Information and Communication Technology (ICT) comes into astronomical data analysis. In this article I mainly focus on the ICT used in astronomical data analysis.

How programming is used in data analysis 

Although the observational side of astronomy requires complicated, up-to-date equipment, an analyst's inventory is pretty much limited to a computer. The gathered data is analyzed by computer programs that the analysts write themselves. Popular general-purpose languages such as Java are rarely used for scientific analysis. Instead, languages like IDL, Python and C are used. These languages are therefore quite important for anyone interested in data analysis in any field of science, including astronomy.

There have been many tools designed for the analysis of astronomical data. These were largely divided by wavelength region, with optical astronomers using the Image Reduction and Analysis Facility (IRAF), radio astronomers using the Astronomical Image Processing System (AIPS) and ultraviolet astronomers using the Interactive Data Language (IDL). Now let's focus on some important programming languages and concepts used in astronomical data analysis.


IDL as a programming language

The language is available in three variants:

  1. The Interactive Data Language (IDL) 
  2. The GNU Data Language (GDL)
  3. The Fawlty Language (FL)

We will use the term “IDL” throughout this article, with the understanding that the code presented here will work under any of the 3 variants. 

IDL in astronomy 

As IDL has been used in this field for quite some time, there are many numerical and astronomical libraries available. A few IDL resources for astronomical data analysis:

Adaptive Optics Software 

AIT (Astronomical Institute of Tübingen) IDL software 

ATV Image Display Tool 

Jeremy Bailin’s IDL Utilities (JBIU)

BANYAN (Bayesian Analysis for Nearby Young AssociatioNs)

Although Python is taking over the analysis field in astronomy, even today there are many IDL-based data analysis tools in use by astronomers all around the world.


Python as a programming language

Python is an interpreted, high-level, general-purpose programming language. Its language constructs and object-oriented approach aim to help data analysts write clear, logical code for small and large-scale projects. Python is developed under an OSI-approved open source license, making it freely usable and distributable. That, together with its easily readable syntax, motivates scientists to use Python to build the tool kits they need.

Python in Astronomy 

Python is a great language for science, and specifically for astronomy. It offers various packages such as:

  • NumPy
  • SciPy
  • Scikit-Image
  • Astropy

These are all a great testament to the suitability of Python for astronomy, and there are plenty of use cases. Python packages have evolved to such an extent that it is now fairly easy for anyone to build data reduction scripts that provide high-quality data products. Astronomical data is ubiquitous and, what is more, almost all of it is publicly available.
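To give a feel for what a data reduction script looks like, here is a minimal sketch of two common steps, bias subtraction and flat-field correction, written with NumPy alone. The frames are synthetic and the function name reduce_frame is my own invention for illustration. A real pipeline would read FITS files and handle noise, bad pixels and much more.

```python
import numpy as np

def reduce_frame(raw, bias, flat):
    """Bias-subtract and flat-field a raw frame (2-D arrays of equal shape)."""
    # Remove the electronic offset from the science frame and from the flat.
    science = raw.astype(float) - bias
    flat_corrected = flat.astype(float) - bias
    # Normalise the flat to a mean of 1 so the division keeps the flux scale.
    flat_norm = flat_corrected / flat_corrected.mean()
    return science / flat_norm

# Synthetic example (assumed values): a uniform sky seen through a detector
# whose pixels do not all respond equally.
rng = np.random.default_rng(42)
sensitivity = rng.uniform(0.8, 1.2, size=(4, 4))  # hypothetical pixel response
bias = np.full((4, 4), 10.0)                      # constant offset in counts
flat = 1000.0 * sensitivity + bias                # evenly lit calibration frame
raw = 100.0 * sensitivity + bias                  # science frame of a flat sky

reduced = reduce_frame(raw, bias, flat)
print(reduced.round(2))
```

Because the simulated sky is uniform, every pixel of the reduced frame ends up with the same value: the pixel-to-pixel response of the detector has been divided out.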

Python vs IDL, C and C++

Regardless of what individual astronomers prefer, all these languages have their ups and downs. For instance, while Python is easy to learn, open source and free to share, IDL is not: users must obtain a license to run it. On the other hand, there are many IDL-based tool libraries available to astronomers. Python is not perfect either. It is not as fast as Fortran or C/C++.

Best of both worlds: programmers build hybrids of these languages to raise efficiency in every field of data analysis, including astronomy. AUTO is a hybrid of Fortran and Python. In AUTO, the mathematical model, the ODEs and the initial parameters are supplied through Python, while Fortran does all the computation.
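The same division of labour can be sketched in a few lines. This is not AUTO itself, only an analogy: here NumPy's compiled routines play the role of the fast back end, and the two function names are made up for illustration. Both functions compute the same sum of squares, but the first runs the loop in the Python interpreter while the second hands it to compiled code.

```python
import numpy as np

def sum_of_squares_python(values):
    """Pure-Python loop: every multiplication goes through the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_numpy(values):
    """Same computation, but the loop runs inside NumPy's compiled code."""
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))

data = [float(i) for i in range(1000)]
slow = sum_of_squares_python(data)
fast = sum_of_squares_numpy(data)
print(slow, fast)
```

Python stays in charge of setting up the problem and reading the result, which is the part where readability matters most. On large arrays the compiled version is usually much faster, although the exact gain depends on the machine.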

Machine Learning 

What is Machine Learning 

Machine learning is a subfield of artificial intelligence. The term refers to the ability of IT systems to find solutions to problems independently by recognizing patterns in data. In astronomy, where astronomers have to analyze and process about 1×10^24 stars, machine learning is a life saver.
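To make "recognizing patterns" concrete, here is a minimal sketch of one of the simplest learning methods, a nearest-centroid classifier, in plain Python. Everything in it is an assumption made for illustration: the two labels, the two made-up features per object and the synthetic data. The program learns the average position of each class from labelled examples and then assigns new objects to the closest average.

```python
import random

def centroid(points):
    """Mean position of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def classify(point, centroids):
    """Return the label whose centroid lies closest to the point."""
    def dist2(label):
        cx, cy = centroids[label]
        return (point[0] - cx) ** 2 + (point[1] - cy) ** 2
    return min(centroids, key=dist2)

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample(cx, cy, n):
    """Draw n synthetic objects scattered around (cx, cy)."""
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]

# "Training": learn one centroid per class from labelled examples.
train = {"star": sample(0.0, 0.0, 50), "galaxy": sample(5.0, 5.0, 50)}
centroids = {label: centroid(pts) for label, pts in train.items()}

# "Prediction": label unseen objects and compare with the truth.
test_set = ([("star", p) for p in sample(0.0, 0.0, 20)]
            + [("galaxy", p) for p in sample(5.0, 5.0, 20)])
correct = sum(classify(p, centroids) == label for label, p in test_set)
accuracy = correct / len(test_set)
print(accuracy)
```

The two synthetic classes are deliberately far apart, so this toy problem is easy. Real astronomical classes overlap, which is why more capable models and careful validation are needed.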

Machine Learning in astronomy 

Machine learning is quickly becoming a popular method for analyzing astronomical data. There is a great deal of interest among the astronomical community in the powerful techniques now being developed with machine learning. As stated before, modern telescopes collect a large amount of data. With big datasets come big opportunities. The SDSS, Kepler and K2 datasets, the recently released Gaia DR2, the forthcoming LSST in the optical, ALMA, MWA and SKA in the radio, and SDO in the EUV are perfect illustrations of the power of data to unlock new science. It is important for astronomers, especially data analysts, to be aware of the promise of machine learning and to understand its limitations.

The Hubble Telescope, operating since 1990, gathers around 20 GB of data per week. The Large Synoptic Survey Telescope (LSST), scheduled for 2021, is expected to gather more than 30 terabytes of data every night. But that is nothing compared to the most ambitious project in astronomy, the Square Kilometre Array (SKA), which is expected to produce more than 1 exabyte of data per day. Analyzing this much data would be practically impossible without machine learning.
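A quick back-of-the-envelope calculation shows how far apart these figures are. The sketch below uses the numbers quoted above and assumes decimal units (1 GB = 10^9 bytes) and that one observing night counts as one day.

```python
# Assumption: decimal units, 1 GB = 1e9 bytes, 1 TB = 1e12, 1 EB = 1e18.
GB, TB, EB = 1e9, 1e12, 1e18

hubble_per_day = 20 * GB / 7  # about 20 GB per week, spread over 7 days
lsst_per_day = 30 * TB        # more than 30 TB per night (one night = one day)
ska_per_day = 1 * EB          # more than 1 EB per day

lsst_vs_hubble = lsst_per_day / hubble_per_day
ska_vs_lsst = ska_per_day / lsst_per_day

print(f"LSST / Hubble: {lsst_vs_hubble:,.0f}x")
print(f"SKA / LSST:    {ska_vs_lsst:,.0f}x")
```

On these figures the LSST would produce roughly 10,500 times as much data per day as Hubble, and the SKA roughly 33,000 times as much as the LSST.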

HAIKU for the finisher 

Uncertainty, what keeps us running

Everyone tries to find perfection. But perfection is a never-ending journey, one that makes the world better every step of the way. This goes for astronomers and all other scientists as well. It is important to understand that any discovery could be slightly, moderately or maybe even entirely wrong, no matter how solid it seems at the time. But that's the beauty of it. With every discovery, and with every proof that a discovery was wrong, we step closer to the truth, which is after all the end game that never ends.


Lasana Subasinghe 

Dharmaraja College Kandy 

