Dedicated to the research, development, implementation, and standardization of metadata for educational and research mathematics.
AMS Panel discussion: Wednesday, January 19. Ballroom Balcony A, Marriott Wardman Park Hotel. 2:15 - 5:15. Immediately followed by American Mathematics Metadata Task Force Meeting.
As more and more online resources are becoming available, finding ones suitable for specific educational purposes is becoming increasingly difficult. Not only must the subject matter be appropriate and the content be accurate, but the resources must also match the educational level and background of the user. Once a suitable Web site, graphic, applet, or other resource has been located, additional problems must be faced if it is to be integrated into a learning environment. There might be software incompatibilities, legal issues, and questions concerning how it will interface with other components. This article offers a non-technical introduction and overview of metadata, an important and fascinating part of the solution to these problems. It gives the definition and examples of metadata, shows how metadata can help non-experts search for online resources, and explains how metadata can assist in the use and re-use of online pedagogic materials. It ends with a discussion of Learning Object Metadata, the IEEE Learning Technology Standards Committee standard that has recently been accepted by the major pedagogic metadata efforts.
Metadata. The word metadata means data about data. Metadata tells us the contents and properties of a collection of data without requiring us to look through the data themselves. The data, for the purposes of this article, are electronic data, primarily those stored and delivered via the Internet and used for educational purposes. The general term for such data is pedagogic hypermedia. This includes Web pages but also graphics, digitized video, Java applets, and documents written for use with specialized software, e.g., worksheets written within a computer algebra system or multiple choice quizzes that can be used within a particular online learning environment (Looms, 1999).
Two Fundamental Problems. There are two fundamental problems faced by the would-be user of pedagogic hypermedia. First, appropriate material must be found. Second, the material found must function in the user’s environment. The first problem is that of searching. The second is that of reusability.
Experienced Web users are familiar with the problem of searching. It is becoming increasingly difficult to find useful and relevant material on the Web. One problem is that much of the Web is not indexed. This is especially true of educational sites (Thompson, 1999). A second problem is that search engines return large numbers of irrelevant results and often don’t find what is wanted. Metadata cannot help with the first problem but is designed to assist with the second.
The problem of reusability is perhaps less recognized but is equally essential and much harder. Technological requirements and standards pose an obvious barrier to using material found on the Internet. An application written for a Unix platform will usually not run on a Windows machine. Metadata can help with this problem by appropriately labeling software. A more insidious barrier to reusability is that most academic authors write material for local delivery within highly specialized contexts. Simple practices such as hard-coding navigational aids make it impossible to extract a Web page from one site and insert it into another without detailed editing. Awareness of metadata can encourage better practice, but the real contribution of metadata is to support the ability of specialized environments, taken as entities themselves, to communicate effectively with each other.
In this article we will explain, in terms that we hope are accessible to anyone who has used the Internet in a classroom, a little more about how metadata works and what it is intended to do. We will also briefly review the current state of pedagogic metadata without going into too many details. This article is meant as an overview and introduction, not a technical reference.
Library Classification Schemes. A useful and germane analogy to searching for hypermedia on the Web is that of finding reference material in a library. A small personal library can be successfully organized in any way that suits the owner, including not being organized at all. Larger collections would be useless were it not for schemes such as the Dewey decimal system and the Library of Congress classification. These schemes are called subject classifications. Labeling a book with its subject classification is an example of metadata. Other metadata that appear on every book in a public or university library include the title, the authors, the ISBN number, the publisher, and the publication date. Notice that all of this information is available without reading a single word of the book and in fact is available in a catalog (usually electronic but formerly consisting of cards) that is physically and conceptually separate from the books themselves.
Key Points About Metadata. The library catalog is an excellent example of metadata. Thinking about the types and uses of the information contained in a catalog serves as a guide and illuminates some of the key points about metadata in the electronic medium:
The need for metadata is a function of the size of a collection. This observation is particularly relevant to the Web. Estimates of the number of Web pages range from 300 million on publicly available sites (OCLC 1999b) to a total of 800 million (Thompson, 1999). By way of contrast, the Library of Congress (1999) collections include “only” 17 million books, 2 million recordings, 12 million photographs, 4 million maps, and 50 million manuscripts. Moreover, the Web is growing at a rapid exponential rate (Internet Software Consortium, 1999) and projects like JSTOR (1999) are converting legacy print documents to electronic format. The Web is adding at least an order of magnitude to the size of the search problem.
To be useful, metadata must be standardized. The standards need not be perfect, but they must be agreed upon by the user community. It would be a disaster if every library used its own proprietary subject classification system.
Metadata describes many aspects of a resource. The subject classification is just one. Authorship, date and place of publication, copyright ownership, and language are also part of the metadata in a card catalog. For electronic media, technological requirements become important, and for educational applications, metadata addressing the pedagogic aspects of a resource play a central role.
Metadata may be associated with resources but is not necessarily attached directly to them. The library model, as well as the online model, is based on a catalog of pointers to resources. Each pointer is labeled with a description of the resource to which it points. This allows multiple descriptions and multiple cataloging of the same resource. It also permits descriptions to be extended or modified at a later date without changing the resources themselves.
Metadata is needed for intellectual and commercial property rights. Copyrights depend on the ability to identify the copyright owner and bookstores depend on ISBN numbers and other metadata when placing an order. The online situation is more complex (World Wide Web Consortium, 1998 and 1999) because there are multiple business models (e.g., freeware, shareware, commercial ware, and open source software) and copyrights are harder to enforce because of the ease of reproduction, but this makes the existence of appropriate metadata even more important.
Despite all of the other metadata associated with a book, the pieces that are most visibly useful are the title and subject classification. The reason is that these facilitate searching. Searching, especially in the online pedagogic setting, is the topic of the next section.
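The catalog-of-pointers model described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the class names, the field names, and the example URL are hypothetical and are not taken from any metadata standard. The point it shows is that descriptions live in a catalog separate from the resource, so one resource can carry several descriptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataRecord:
    """One catalog entry: a description plus a pointer to the resource."""
    resource_url: str        # pointer to the resource, not the resource itself
    title: str
    authors: tuple
    subject: str             # subject classification label
    language: str = "en"
    rights: str = "unknown"  # copyright or licensing statement


@dataclass
class Catalog:
    """A collection of records kept separately from the resources described."""
    records: list = field(default_factory=list)

    def add(self, record):
        self.records.append(record)

    def describe(self, resource_url):
        """Return every record that points at the given resource."""
        return [r for r in self.records if r.resource_url == resource_url]


catalog = Catalog()
url = "http://example.org/applets/tangent-line"  # hypothetical resource

# Two independent descriptions of the same resource; the resource is untouched.
catalog.add(MetadataRecord(url, "Tangent Line Explorer", ("A. Author",),
                           subject="calculus"))
catalog.add(MetadataRecord(url, "Slopes and Secants", ("A. Author",),
                           subject="analytic geometry"))

for record in catalog.describe(url):
    print(record.subject, "->", record.resource_url)
```

Adding or revising a record changes only the catalog, which mirrors the observation that descriptions can be extended later without changing the resources themselves.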
Search Methods. There are currently two standard methods of searching for online documents. One is a keyword search. A search engine can look for a combination of keywords either in the body of a document or in a separate list of keywords associated with the document. (Keywords associated with a document are a form of metadata.) Generalizations of keyword searches include weighted keyword searches and more sophisticated methods like latent semantic analysis (Deerwester et al. 1990) that reduce a large number of possible keywords to a small set which efficiently describes the search space.
The other standard type of search uses a browse structure. A browse structure is a tree through which one can descend, moving from the more general to the more specific. Familiar subject classifications are examples of browse structures: one might start with “science”, descend to “astronomy”, and then descend to “lunar and planetary astronomy”, and so on. Web search engines like Yahoo (1999) and Infoseek (1999) display chains of categories that constitute parts of a browse structure, as do digital libraries like GEM (1999).
Difficulties in the Pedagogic Domain. Both keyword searches and browse structures suffer from the same drawback in the pedagogic domain. To effectively use a keyword search, or to browse a subject classification, the user must be relatively expert in the vocabulary or logical structure of the field of study. Metadata offers at least a partial solution to this. If a metadata description is associated with a pedagogic resource, then a search engine can look for resources with similar descriptions. The user need not specify (or even see) the metadata used for the search.
The Problem of Context. The lack of subject knowledge is only one aspect of a more general problem, that of the user’s context. Finding suitable hypermedia often depends on parameters such as the user’s educational level, the language in which a resource is written, time available for using a resource, available technology, cost of the resource, the copyright status of the resource, and any number of other variables. Parameters that speak to these issues are included in standardized metadata schemes and can enable search engines or search agents to do a far better job of returning useful information than is now possible.
It is important to point out that parameters such as educational level cannot be utilized unless their values are known for the individual user. This means that the search software must negotiate these values in some way. This is exactly what occurs when a student asks a reference librarian for help. In a typical scenario, a student might make some statements about what type of materials are sought and perhaps mention a class or an assignment. For example, a student might ask for journals on astrophysics. The librarian might then ask for what purpose the journals are needed. The student might respond that they are needed for a report due in a first-year topics in science class. On the basis of this response the librarian might then judge that professional journals are far less appropriate than expository sources, despite the student’s initial request. If the Web is to be a useful repository of information used for educational purposes, it will be necessary to develop online environments that mimic this type of human-human interaction.
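The combination of a keyword search with context parameters can be sketched in a few lines of Python. This is a minimal illustration, not a description of any existing search engine: the list of educational levels, the field names, and the sample records are all assumptions made for the example. It mirrors the librarian scenario, in which a request for astrophysics material from a first-year student is steered away from professional sources.

```python
from dataclasses import dataclass

# Hypothetical ordering of educational levels, lowest to highest.
LEVELS = ["primary", "secondary", "undergraduate", "graduate", "professional"]


@dataclass(frozen=True)
class ResourceMetadata:
    title: str
    keywords: frozenset
    level: str     # one of LEVELS
    language: str
    cost: float    # 0.0 means free


@dataclass(frozen=True)
class UserContext:
    level: str
    language: str
    max_cost: float


def search(records, query_terms, context):
    """Return titles whose keywords match and whose metadata fits the user."""
    wanted = {term.lower() for term in query_terms}
    user_rank = LEVELS.index(context.level)
    hits = []
    for r in records:
        if not wanted & r.keywords:
            continue  # no keyword in common with the query
        if LEVELS.index(r.level) > user_rank:
            continue  # written above the user's educational level
        if r.language != context.language or r.cost > context.max_cost:
            continue  # wrong language or too expensive
        hits.append(r.title)
    return hits


records = [
    ResourceMetadata("Research journal article on accretion disks",
                     frozenset({"astrophysics"}), "professional", "en", 0.0),
    ResourceMetadata("Expository essay on stellar evolution",
                     frozenset({"astrophysics", "stars"}),
                     "undergraduate", "en", 0.0),
]

first_year = UserContext(level="undergraduate", language="en", max_cost=0.0)
print(search(records, ["astrophysics"], first_year))
# prints ['Expository essay on stellar evolution']
```

In this sketch the user context is supplied directly. In practice the values would have to be negotiated with the user, as in the conversation with the librarian.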