Astronomy enters age of virtual reality

Physics World, March 2002

In the future, astronomers will not need telescopes to study the universe. As Tim Chapman explains, they will instead download observational data from a giant computer archive.

The problem facing astronomy is that the amount of data is, quite simply, astronomical. Exponential improvements in telescope, detector and computer technology mean that the amount of observational data is roughly doubling every year.

Much of the data deluge comes from digital sky surveys at different wavelengths, each of which contains several terabytes of raw data. The Sloan Digital Sky Survey at five wavelengths, for example, will contain 40 terabytes when complete. The proposed Large Synoptic Survey Telescope will meanwhile produce up to 10 petabytes a year by 2008 - about the same amount of information as contained in a row of encyclopaedias stretching from London to Los Angeles.

The desire to make the best use of this expensively obtained data has led astronomers and computer scientists to propose a new way to access and analyse it - a "virtual observatory" that will effectively recreate the observed universe within a network of computers.

The US-based National Virtual Observatory (NVO) initiative won a $10m grant from the National Science Foundation in October 2001 to prepare the framework for such a project. A month later the Astrophysical Virtual Observatory (AVO), based at the European Southern Observatory in Germany, secured €4m funding from the European Commission. Japan and Australia are also setting up large online data archives.

Astronomer Alex Szalay of Johns Hopkins University, who leads the NVO project alongside Paul Messina of Caltech, says the five-year NSF grant was an extremely important step.

"The funding enables us to build some demos or prototypes, to collaborate with what's being built in Europe, and to demonstrate this can be done," he says. "There are a lot of sceptics that say astronomers will never come to agreement about this."

The NVO was hailed by the US National Academy of Sciences in its decadal report on research priorities as the most important small project in astronomy. "Small", in the high-capital world of observational astronomy, means that the project has an estimated cost of $60m over 2000-2010.

The NVO aims to pull together data from dozens of telescopes over decades of observing time, making possible extremely large-scale statistical studies. Likely projects range from comparing conditions in the local and distant universe, to searching for extra-solar planets. Mining such vast data sets will also help discover rare and anomalous objects that would not be discovered by chance.

"There will be a fundamental change in the way we do astronomy," Szalay predicts. "Astronomers are still predominantly getting data through their own observations - I think this will be a thing of the past very soon. Astronomers will be getting their data through the databases."

Providing data in the same format at a range of wavelengths will help lift disciplinary boundaries, Szalay says. "We're not just providing raw images but creating meaningful catalogues so it will be much easier for radio astronomers to access optical data, for example. If you can easily get images in each wavelength, then you can concentrate on the physics."
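
One routine operation on such catalogues is cross-matching: pairing each source in one catalogue with the nearest source in another, provided the two lie within a small angle of each other on the sky. The Python sketch below is only an illustration of that idea, not the NVO's software. The catalogue entries, the source names and the 2-arcsecond matching radius are all invented for the example, and the brute-force loop would be far too slow for a real survey.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec=2.0):
    """Pair each source in cat_a with its nearest neighbour in cat_b.

    Catalogues are lists of (name, ra_deg, dec_deg) tuples.
    Returns (matches, unmatched): a dict mapping names in cat_a to names
    in cat_b, and a list of cat_a names with no counterpart within the radius.
    """
    radius_deg = radius_arcsec / 3600.0
    matches, unmatched = {}, []
    for name_a, ra_a, dec_a in cat_a:
        best_name, best_sep = None, radius_deg
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_name, best_sep = name_b, sep
        if best_name is None:
            unmatched.append(name_a)
        else:
            matches[name_a] = best_name
    return matches, unmatched


# Hypothetical catalogues, invented for illustration only.
optical = [("opt-1", 150.0000, 2.2000), ("opt-2", 150.1000, 2.3000)]
radio = [("rad-1", 150.0002, 2.2001), ("rad-9", 151.0000, 3.0000)]

matches, unmatched = cross_match(optical, radio)
```

Here "opt-1" is paired with "rad-1", which lies less than an arcsecond away, while "opt-2" has no radio counterpart. Sources left unmatched in this way are one place to start looking for the unusual objects the virtual observatory hopes to uncover.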

The first steps towards a fully-fledged virtual observatory have already been made by a number of smaller projects. Nasa's Skyview virtual telescope has been providing images of the sky at a variety of wavelengths since the early 1990s, while Caltech's Digital Sky has served as a small-scale technology demonstrator for the NVO.

"We have demonstrated that cross-matching catalogues can find new and unusual kinds of sources," says Roy Williams, a senior member of the NVO team and architect of Virtual Sky, the educational version of Digital Sky. "We've also demonstrated that putting images together on some general grid can greatly improve the sensitivity of existing data."

Much of the work of the virtual observatory initiative is in creating and promoting common standards so that data from many sources can be handled in a single system. "The problem is there's little coherence between the different ways people put data online," Williams says. "We have to make standards that are easy for people to adopt and powerful enough that we can do cool things that will make people want to adopt them."

Persuading the general astronomical community to commit to using common data standards could be the greatest hurdle in developing a virtual observatory, Williams suggests. "Everybody sees potential but they want to see what's in it for them. Everybody's worried that their particular style of doing things will be cramped somehow."

The current systems maintain all their data at a central location. The main technical challenge of the virtual observatory is to keep the vast databases in their native archives, and provide a seamless way of accessing them all through a single interface.
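
A rough Python sketch gives a feel for what this means in practice. It assumes two archives that store positions in different conventions, each wrapped in a small adapter that translates its rows into a shared record, with a single front end that passes the same query to every archive. All the class names, field names and data here are hypothetical and chosen for illustration; none of them comes from the NVO, AVO or AstroGrid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A shared record format that every archive agrees to return (hypothetical)."""
    archive: str
    ra_deg: float
    dec_deg: float


class Archive:
    """An archive that keeps its data in its own native layout."""

    def __init__(self, name, rows, to_source):
        self.name = name
        self._rows = rows            # native rows, in whatever form the archive uses
        self._to_source = to_source  # adapter from a native row to a Source

    def query_box(self, ra_min, ra_max, dec_min, dec_max):
        """Return sources inside a rectangular region of sky (degrees)."""
        hits = []
        for row in self._rows:
            src = self._to_source(self.name, row)
            if ra_min <= src.ra_deg <= ra_max and dec_min <= src.dec_deg <= dec_max:
                hits.append(src)
        return hits


class SingleInterface:
    """One entry point that sends the same query to every registered archive."""

    def __init__(self, archives):
        self._archives = list(archives)

    def query_box(self, ra_min, ra_max, dec_min, dec_max):
        results = []
        for archive in self._archives:
            results.extend(archive.query_box(ra_min, ra_max, dec_min, dec_max))
        return results


# Invented data: one archive stores dicts in degrees, the other stores
# tuples with right ascension in hours (1 hour = 15 degrees).
optical = Archive(
    "optical-survey",
    [{"ra": 10.0, "dec": -5.0}, {"ra": 50.0, "dec": 20.0}],
    lambda name, row: Source(name, row["ra"], row["dec"]),
)
radio = Archive(
    "radio-survey",
    [(0.668, -5.01), (12.0, 45.0)],
    lambda name, row: Source(name, row[0] * 15.0, row[1]),
)

observatory = SingleInterface([optical, radio])
hits = observatory.query_box(9.9, 10.1, -5.1, -4.9)
```

The query returns one source from each archive, although neither archive has moved or reformatted its data. In a real system the archives would sit on different machines and the adapters would have to follow an agreed standard, which is exactly where the projects expect most of the work to lie.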

The UK's contribution to the European AVO targets this problem: the AstroGrid consortium, led by Andy Lawrence at the University of Edinburgh, is investigating distributed data archives.

"It has a slightly different flavour to the other virtual observatory projects in that it's very focused on the technology concerned as much as the astronomy itself," says Lawrence. "The technology we're developing is aimed at being part of the toolbox that the international virtual observatory will use. In the short term we're concentrating on selected key databases in the UK, to make them inter-operable and build tools that allow astronomers to do new things with that data."

AstroGrid works on the principle of distributing computer power, just as the national grid distributes electrical power. Users will be able to use their desktop machines to tap into the power of supercomputers and perform complex queries within the databases. A large part of the technical challenge is in agreeing on data formats, and making sure these work with data management systems. Creating and implementing the algorithms to carry out the queries will also be a continuing challenge.

"There's also the great problem about how you give people external access to your machines to query these databases," Lawrence adds. "You need ways to carry round authorisation and authentication. The software to do that doesn't really exist yet."

The various virtual observatory initiatives around the world are working closely together to solve their many problems, with the ESO hosting a five-day conference in June. "Everybody knows that if you want something that looks like the virtual observatory, then you have to solve it as a global problem," Lawrence says. "It would be nonsense to think of building a UK and an American virtual observatory, because the data you want is all over the place. The international virtual observatory will become a reality in maybe six years."