For the first time there are integrated, easy-to-use tools available that empower organisations and research projects to create, manage and publish the vocabularies that they use to describe their data. The best part is that for Australian users, the service is currently free.
One of the major impediments to re-using shared scientific data is that outside of the circle of people who are involved in collecting data for specific projects, the meaning of the data fragments captured is often not readily understood. One scientist’s Great White Shark might be another’s Carcharodon carcharias! Interpretation of data, without a shared lingua franca, must then rely on communication between people so that a common understanding of the dataset can be achieved, before it can even be used.
With today’s technologies it should be feasible to gather together the vast silos of data that are accumulating on our cheap storage devices and to mine these data to generate insights not previously thought possible from more restricted datasets. But readily harnessing these technologies requires a change in the way scientific data are recorded. Datasets need to contain not just observations, be these numbers, codes, or text, but should also include embedded references to descriptions of these things so that their origin and meaning are clear. Controlled vocabularies are an important part of such descriptions, and when they are expressed using the language of the World Wide Web, i.e., the Resource Description Framework (RDF), we can use this information to automate the process of dataset integration and even dataset transformation. The ANDS tools have now put this possibility within the grasp of all Australian scientists.
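To make the idea of automated integration concrete, here is a minimal Python sketch. It is not the ANDS service or its API: the vocabulary is a hand-written dictionary, the concept URI under `example.org` is hypothetical, and the field names `prefLabel` and `altLabels` are illustrative assumptions that loosely mirror how a published vocabulary pairs a preferred label with alternatives. The point is only that once every local label resolves to one shared concept identifier, records from different projects can be combined without anyone having to ask what a label means.

```python
from collections import defaultdict

# Hypothetical vocabulary: one concept, identified by a made-up URI,
# with a preferred label and the alternative labels people use for it.
VOCABULARY = {
    "http://example.org/vocab/taxon/white-shark": {
        "prefLabel": "Carcharodon carcharias",
        "altLabels": ["Great White Shark", "White Shark"],
    },
}


def build_label_index(vocabulary):
    """Map every known label (case-insensitively) to its concept URI."""
    index = {}
    for uri, concept in vocabulary.items():
        for label in [concept["prefLabel"], *concept["altLabels"]]:
            index[label.casefold()] = uri
    return index


def integrate(datasets, index):
    """Sum observation counts per concept URI across several datasets.

    Each dataset is a list of (label, count) pairs. A label that is not
    in the vocabulary raises KeyError rather than being silently guessed.
    """
    totals = defaultdict(int)
    for records in datasets:
        for label, count in records:
            key = label.strip().casefold()
            if key not in index:
                raise KeyError(f"label not in vocabulary: {label!r}")
            totals[index[key]] += count
    return dict(totals)


if __name__ == "__main__":
    # Two projects recording the same animal under different names.
    dataset_a = [("Great White Shark", 3)]
    dataset_b = [("Carcharodon carcharias", 2)]
    index = build_label_index(VOCABULARY)
    print(integrate([dataset_a, dataset_b], index))
```

In this sketch the two datasets collapse into a single total keyed by the shared concept identifier, which is the step that would otherwise depend on the two scientists talking to each other.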
eMII, the data management facility of Australia’s Integrated Marine Observing System (IMOS), has been working closely with ANDS to develop this vocabulary tool-set so that the software functionality delivered is capable of supporting a real use case such as IMOS. eMII has now created and published a range of controlled vocabularies using the ANDS Vocabulary Service. These vocabularies help to describe data captured from the 10 IMOS Facilities and, importantly, underpin the search and integration technology used in the public IMOS Ocean Portal.
“This is really fundamental work,” says IMOS Director, Tim Moltmann. “People get very excited about the possibilities of big data. But without focused effort on vocabularies, our systems can’t talk to one another, and our scientists can’t realise the full potential of a data-rich environment. Our colleagues at ANDS understand this and have done an excellent job in creating this service. And obviously they see ocean data as an important use case. This is another example of the power of the NCRIS program, bringing platform capability together with discipline-specific need.”
To learn more, visit the new Research Vocabularies Australia website.