Metadata Workshop Seeks to Make Mountains of Data More Accessible
July 11, 1997
By Jon Bashor, JBashor@LBL.gov
Although such information represents a valuable resource, the sheer volume of
data stacking up is making it increasingly difficult for anyone to retrieve
needle-sized files of useful information from these virtual haystacks. Today,
experts in the field will wrap up a four-day workshop organized by the Lab's
Computing Sciences Division and held at UC Berkeley's Clark Kerr Campus.
Participants will make recommendations for standards and practices to improve
access to the data and make it easier for various organizations to share
electronic information.
The issue is so large that it has generated its own terminology. The data created
to describe large piles of information is known as "metadata," and one of the
techniques used to find valuable nuggets is called "data mining."
"There are not only mountains of data to be conquered, but those mountains come
in different varieties," said workshop chairman John McCarthy of the Lab's
Computing Sciences Division. "The problem common to all of these vast libraries
is that it is very difficult to find exactly what you're looking for and to
relate one data set to another. Many organizations still haven't come to grips
with the extent of the problem." McCarthy is one of the researchers credited with
coining the term metadata some 25 years ago.
According to program committee chairman Frank Olken of Berkeley Lab, metadata
can facilitate access, use and sharing of data across cyberspace and time by
systematically describing the content, structure and semantics of data residing
in information systems, databases or files.
The main sponsor of the workshop is the U.S. Environmental Protection Agency
(EPA), which has amassed volumes of environmental data. Most of it was collected
on one specific component, such as air, water or solid waste, making it difficult
to draw together the full picture of environmental conditions for any specific
place. To make and defend policies today, the EPA needs to access data from
many sources, ensure its validity, and integrate many perspectives, such as air
quality, land use, water quality and chemical toxicity.
The workshop is being held under the auspices of the International Organization
for Standardization's Joint Technical Committee on Information Standards. The
wide range of organizations participating in the workshop illustrates the scope
and importance of this issue: the U.S. Census Bureau, Boeing, Xerox, AT&T
Laboratories, the National Institute of Standards and Technology, UC Berkeley,
Stanford University, University of Michigan, Rutgers University, the University
of Maryland, and Lawrence Berkeley and Los Alamos national laboratories.