Mathematics Metadata Working Group

August Meeting Report

This document reports on work done and decisions made at a meeting held August 10 and August 11, 1999, in Berkeley, California. Please see the last section for acknowledgments and a list of attendees. This document was prepared by Robby Robson.

Comments and Suggestions: Comments and suggestions should be posted to the forum http://forum.swarthmore.edu/dicsussions/math_metadata or sent by email to robby@orst.edu.

Issues covered: The issues covered in the main part of this report are

    1. a mission statement of the Mathematics Metadata Working Group (MMWG)
    2. a brief report on IEEE Learning Object Metadata Standards
    3. the decision to classify mathematics using three separate taxonomies
    4. the detailed structure of these taxonomies
    5. a discussion of other LOM tags and their relevance to the MMWG, including a rough proposal for which tags to use
    6. implementation, work, and cooperation commitments
    7. timelines and commitments for the completion of a first public release of mathematical metadata standards

Attachment: Attached is an annotated version of the LOM 3.5 base scheme.

Requests for comments and notes by the author. Comments on any and all aspects of this document are welcome.

I Mission Statement.

The following mission statement is proposed for the MMWG. It is important to have such a statement for setting future agendas and for explaining MMWG work to potential partners, contributors, and funding agencies. Comments are welcome.

MISSION

The mission of the Mathematics Metadata Working Group (MMWG) is to analyze, formulate, disseminate, and facilitate the implementation and acceptance of metadata standards that support and enhance mathematical research, education, publication, and software development. Standards developed and promoted by the MMWG will be compliant with accepted international standards.

The MMWG is an open group that welcomes the participation and seeks the cooperation of all professional societies, commercial interests, academic institutions, government agencies and individuals who have an interest in its work or a stake in its results.

 

II IEEE Learning Object Metadata Standards

Until recently there have been several organizations developing potentially competing standards for pedagogic metadata. Over the past six months the situation has changed dramatically. The IMS project and ARIADNE have both agreed to use the IEEE Learning Object Metadata, or LOM, as their standard. The latest LOM specifications, together with annotations, appear as an attached document.

It should be pointed out that there are some difficulties in mapping IMS metadata to LOM and that, at least in the opinion of the MMWG, there are portions of the LOM base scheme that are too restrictive or have not anticipated the full range of learners and learning environments. The MMWG struggled with this and eventually decided to follow the lead of NEEDS and to judiciously define the necessary extensions to LOM. These decisions will be explained and communicated to the IEEE Learning Technology Standards Committee and to the IMS project.

The structure of LOM. The LOM base scheme is divided into nine categories defined as follows (source: LOM v3.5):

General. Context-independent features of the resource.

Lifecycle. Authorship, ownership, etc.

Meta-metadata. Describes what metadata scheme(s) are being used.

Technical. Describes the format and the technical requirements needed to use the resource.

Educational. Educational and pedagogical features of the resource.

Rights. Refers to intellectual property rights.

Relation. Describes the relation of the given resource to other resources.

Annotation. Allows for comments on the educational use of the resource.

Classification. Taxonomic classification of the resource. (Could be subject matter, educational objective, accessibility requirements, etc.)
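As an illustration only, the nine categories can be written down as an enumeration so that software can sort dotted element names such as General.Title by category. This is a minimal Python sketch; the names LomCategory and category_of are hypothetical and are not part of LOM.

```python
from enum import Enum


class LomCategory(Enum):
    """The nine categories of the LOM base scheme as listed above."""
    GENERAL = "General"
    LIFECYCLE = "Lifecycle"
    META_METADATA = "Meta-metadata"
    TECHNICAL = "Technical"
    EDUCATIONAL = "Educational"
    RIGHTS = "Rights"
    RELATION = "Relation"
    ANNOTATION = "Annotation"
    CLASSIFICATION = "Classification"


def _normalize(name: str) -> str:
    # Ignore case and hyphens so "MetaMetaData" matches "Meta-metadata".
    return name.strip().lower().replace("-", "")


def category_of(element_path: str) -> LomCategory:
    """Return the category of a dotted element name such as 'General.Title'."""
    prefix = _normalize(element_path.split(".", 1)[0])
    for category in LomCategory:
        if _normalize(category.value) == prefix:
            return category
    raise ValueError(f"unknown LOM category in {element_path!r}")
```

The lookup ignores case and hyphens because this report writes category names in more than one way (for example "Meta-metadata" and "MetaMetaData").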

The MMWG has decided to work only on those portions of this scheme that need special attention for the purpose of cataloging and searching mathematical materials. But other parts of the base scheme must be included in software that implements metadata, e.g., that tags documents with metadata or that uses metadata to retrieve documents. The most immediate implementation will be as part of the SMETE portal under development by NEEDS, and most of the more general tags have already been addressed by NEEDS software that will be modified to handle mathematical resources.

III The Classification of Mathematical Content.

The MMWG proposes to classify the content of mathematical resources using three taxonomies corresponding to three different levels of vocabulary. Whereas it might be tempting to view these taxonomies as corresponding roughly to American grade levels, they should be thought of in terms of vocabulary usage rather than in terms of subject matter.

History of This Decision. The first idea for classifying mathematical content was to extend the MSC to encompass grade school, high school, and college mathematics. This was quickly seen to be a poor option for a number of reasons:

During its June meeting the MMWG abandoned the idea of extending the MSC and proceeded to propose a division of mathematics into four or five separate taxonomies based on progress through the traditional American educational system. Several issues were left open for further consideration: should some of the levels be merged, what were appropriate names for the levels, and what should their internal structure be.

Further Considerations. At this August meeting the MMWG tackled these issues. The following points were made:

IV The Classification Taxonomies.

The Proposal. The MMWG decided to simply name the three taxonomies Level 1, Level 2, and Level 3. Each of these will consist of a controlled vocabulary together with a set of relationships among the terms of that vocabulary. This applies to the MSC, which will be used for Level 3, as well as to the new taxonomies proposed. The taxonomies are described as follows. Comments on these descriptions are welcome.

Level 1.

Controlled Vocabulary: Controlled vocabulary as used and encountered by students and teachers in early mathematics including number sense, arithmetic, and basic geometry and other subjects taught in traditional elementary and middle schools. A rough description of the mathematical level might be "pre-variable mathematics."

Relationships. Each term in the Level 1 controlled vocabulary will be associated to those terms which are specializations, generalizations, and equivalents of the given term. A specialization of a term is one that is more specific, that serves as an example, or that could reasonably be viewed as a subtopic. Examples of specializations are:

A generalization of a term is the opposite of a specialization and consists of a topic under which the given term might fall. Thus the generalizations of pi might include number, circle, geometry, and measurement.

An equivalent term is a term that in many contexts means the same thing. At Level 1 we might want to include both the terms "multiplication table" and "times table", which would be equivalent.
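The three relationships can be pictured as a small data structure. The following Python sketch is only an illustration under assumed names (Term, Vocabulary, add_specialization, and add_equivalent are hypothetical and are not part of any MMWG specification). It records the pi and "times table" examples given above.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One controlled-vocabulary term and its directly related terms."""
    name: str
    specializations: set[str] = field(default_factory=set)
    generalizations: set[str] = field(default_factory=set)
    equivalents: set[str] = field(default_factory=set)


class Vocabulary:
    """A controlled vocabulary together with its relationships."""

    def __init__(self) -> None:
        self.terms: dict[str, Term] = {}

    def term(self, name: str) -> Term:
        # Create the term on first use so that links can be added in any order.
        return self.terms.setdefault(name, Term(name))

    def add_specialization(self, general: str, specific: str) -> None:
        # Generalization is the opposite of specialization, so record both.
        self.term(general).specializations.add(specific)
        self.term(specific).generalizations.add(general)

    def add_equivalent(self, first: str, second: str) -> None:
        self.term(first).equivalents.add(second)
        self.term(second).equivalents.add(first)


level1 = Vocabulary()
for general in ("number", "circle", "geometry", "measurement"):
    level1.add_specialization(general, "pi")
level1.add_equivalent("multiplication table", "times table")
```

Only direct links are stored. Nothing is inferred by chaining one link to another.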

Level 2.

Controlled vocabulary: Controlled vocabulary as used and encountered by students and teachers in mathematics typically encountered after 8th grade and through sophomore level college mathematics courses in North America. Topics include those used for the last year of postsecondary education in the TIMSS benchmarks.

Relationships: As in Level 1. For each term in the Level 2 controlled vocabulary we will identify those terms which are specializations, generalizations, and equivalents of the given term.

Level 3.

Controlled vocabulary: Controlled vocabulary reflecting accepted professional practice and typically encountered in advanced undergraduate, graduate, and research mathematics.

Relationships: As in the MSC, Level 3 will be a two-layer tree structure with topics at the first layer and sub-topics at the second layer. Each leaf of the tree will include a set of related topics.

Level 3 will be identified with the MSC.

Lack of Transitivity. It should be noted that the relationships used in Level 1 and Level 2 are not intended to be transitive (or possibly even anti-symmetric)! As an example, pi might be considered a specialization of circle and number. Circle, in turn, might be considered a specialization of shape, and number might be considered a specialization of counting. But pi would not be considered a specialization of shape or of counting. A reason for this non-transitivity is that the structures being defined contain both semantic and axiomatic associations. The link between pi and circle is axiomatic; the definition of pi involves a circle. The link between circle and shape is semantic.

Another reason is that if we think of a term as defining a region in document space, neither generalization nor specialization requires strict containment. If one region is "mostly contained" in another, then it can be considered a specialization.

Note that anti-symmetry may also be violated by number and counting. These may not be considered equivalent, but it could be argued that each is a specialization of the other in the context of searching for educational material. This again may be attributed to the use of semantic associations which are not constrained by the rules of an axiomatic structure and the fact that containment of associated regions need not be strict.
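The difference between the stored links and a transitive reading of them can be made concrete. The Python sketch below is an illustration only: the table SPECIALIZATION_OF and both function names are hypothetical, and the links are just those mentioned in the examples above.

```python
# Direct "is a specialization of" links taken from the examples in the text.
SPECIALIZATION_OF = {
    "pi": {"circle", "number"},
    "circle": {"shape"},
    "number": {"counting"},
    "counting": {"number"},  # each may be viewed as a specialization of the other
}


def is_specialization(term: str, candidate: str) -> bool:
    """True only if a direct link is recorded; links are never chained."""
    return candidate in SPECIALIZATION_OF.get(term, set())


def transitive_reading(term: str) -> set[str]:
    """What a transitive interpretation would infer, shown for comparison."""
    seen: set[str] = set()
    stack = [term]
    while stack:
        for parent in SPECIALIZATION_OF.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# pi is a specialization of circle, and circle of shape, but not pi of shape.
print(is_specialization("pi", "circle"))      # True
print(is_specialization("pi", "shape"))       # False
print("shape" in transitive_reading("pi"))    # True
```

A transitive reading would attach pi to shape and to counting, which is exactly the inference that is not intended here.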

Consequences of the Lack of Transitivity. Search engines like Infoseek and Yahoo can display paths through a browse structure that lead to a topic being searched. Without transitivity, this is not possible. The MMWG did not view this as an issue since the relationships defined are sufficient to answer the underlying question of where to look to find out more (or more specific) information. Moreover, the metadata associated to an object will be primarily for the use of intelligent engines and agents and not for display.

(Note: What might create a difficulty is the notion of a taxonomic stairway in the IMS metadata specifications. This notion does not appear in LOM, so will be ignored for now. -RR-)

Comment on the Names "Level 1", "Level 2", and "Level 3". The MMWG gave some serious thought to this. Words like "elementary", "intermediate", and "advanced" were rejected because they are likely to mean different things to different people. Using educational levels to name the taxonomies suffered from even more problems: meanings differ with geographic region, educational levels focus on subject matter rather than vocabulary, and educational levels are inaccurate in many of the non-traditional learning contexts. The MMWG settled on neutral terminology with no connotations other than the vague notion that the three levels are ordered in terms of a natural progression through mathematics.

V Other LOM Tags and Classifications

Overview of Work. At the August meeting the MMWG saw a non-enabled set of screens that will be used by NEEDS to associate metadata with objects in their database. NEEDS has agreed to extend these screens to include information needed for tagging mathematical objects. To make this work, it is necessary to decide:

  1. Which LOM tags will be used "as is" and which will be extended.
  2. Which metadata tags will be mandatory for mathematical resources.
  3. Which metadata tags will be optional but available for mathematical resources.
  4. For each tag requiring a "best practices" list, which list will be used or, if necessary, created.
  5. For tags with generic values (e.g. "very low" to "very high") what, if any, definitions or further explanations will be associated with these values in the context of mathematics.
  6. For Classification, which taxonomies (other than content classification) will be used. It might be necessary to invent some!

The following is a first cut at 1 - 3. Much of this was done after the August meeting and represents the opinion of the author and not necessarily of the MMWG.

IMS Core Metadata Elements. The IMS best practices document identifies a set of elements as "Core" elements. It is suggested that these be included as mandatory or optional as follows. Some suggestions/explanations are given. See the appendix for more.

Element                                       Suggestions by RR
General.Identifier                            Mandatory. (What about DOI?)
General.Title                                 Mandatory.
General.CatalogEntry                          Optional. Mandatory in a digital library.
General.Language                              Mandatory. (What happens when an abstract is in English and the resource in another language?)
General.Description                           Optional
LifeCycle.Version                             Optional with 0.0 as a default.
LifeCycle.Contribute (.role, .entity, .date)  Mandatory (but not date). Could be complicated: should require EITHER author OR designer OR . . .
MetaMetaData.Identifier                       Mandatory but assigned by system
MetaMetaData.MasterScheme                     Mandatory but assigned by system
MetaMetaData.Language                         Mandatory but assigned by system (value = EN)
Technical.format                              Mandatory with default. (MIME type)
Technical.location                            Optional. (could be URL) See DOI again?
Rights.Cost                                   Optional. (How does one deal with pricing structures other than a single price?)
Rights.CopyrightOtherRestrictions             Mandatory. (LOM values are "yes" and "no")
Rights.Description                            Mandatory. Suggest NEEDS scheme: freeware, shareware, commercial, etc.
Classification.purpose                        Optional. Needs best practice list
Classification.description                    Optional. Textual description.
Classification.Keywords                       Optional.
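To show how tagging software might apply the mandatory and optional suggestions for the IMS Core elements, here is a minimal Python sketch. It is not the NEEDS implementation: the names USER_MANDATORY, DEFAULTS, missing_elements, and with_defaults are hypothetical, and the sample record is invented for illustration.

```python
# Core elements suggested as "Mandatory" that a person must supply.
# The three MetaMetaData elements are left out because the suggestions
# above say they are assigned by the system.
USER_MANDATORY = (
    "General.Identifier",
    "General.Title",
    "General.Language",
    "LifeCycle.Contribute",
    "Technical.format",  # "mandatory with default", but no default is stated
    "Rights.CopyrightOtherRestrictions",
    "Rights.Description",
)

# Defaults that are stated explicitly in the suggestions.
DEFAULTS = {
    "LifeCycle.Version": "0.0",
    "MetaMetaData.Language": "EN",
}


def missing_elements(record: dict) -> list:
    """Return the mandatory, user-supplied elements that are absent or empty."""
    return [name for name in USER_MANDATORY if not record.get(name)]


def with_defaults(record: dict) -> dict:
    """Fill in the stated defaults without overwriting supplied values."""
    return {**DEFAULTS, **record}


# Hypothetical record used only to exercise the check.
record = {
    "General.Identifier": "example-001",
    "General.Title": "Times table practice",
    "General.Language": "EN",
    "Rights.CopyrightOtherRestrictions": "no",
}
print(missing_elements(record))
# ['LifeCycle.Contribute', 'Technical.format', 'Rights.Description']
```

The check treats LifeCycle.Contribute as a single value. The EITHER/OR requirement on contributor roles would need a more detailed rule.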

Recommended Additional Mandatory or Optional Elements. Most of the following elements were discussed during the August meeting. A few have been added by the author of this document. Only those elements not in the IMS Core (see above) are included.

Element                                 Suggestions by RR
General.Keywords                        Optional
General.Structure                       Possibly make Optional as drop-down list
General.AggregationLevel                Either skip or extend and make optional.
LifeCycle.Status                        Optional (draft, final version, etc.)
Technical.Requirements (.type etc.)     Mandatory with empty set as default
Technical.InstallationRemarks           Optional (textual description)
Technical.Size                          Optional: (size in bytes)
Educational.LearningResourceType        Mandatory with MMWG-suggested list needed.
Educational.PedagogicApproach           Optional: This is an extension of LOM
Educational.SemanticDensity             Mandatory: Needs further description
Educational.IntendedEndUserRole         Optional: Needs best practices list
Educational.LearningContext             Mandatory: Needs best practices list
Educational.TypicalAgeRange             Optional
Educational.Difficulty                  Mandatory
Educational.Description                 Optional
Annotation.Person                       Optional
Annotation.Date                         Optional
Annotation.Description                  Optional
Classification.TaxonPath                Mandatory: Include subject classification.
                                        Optional: Educational Objective as relates to standards, Accessibility classification.

Other LOM elements that will need to be considered in the future. "Optional" means that there will be a place to input this data but that it need not be defined for all documents.

Discussion of Individual Elements. The following is an example of the kind of work that must be done on some elements. The element in question is Educational.SemanticDensity. In mathematics, this might be interpreted as describing the density of specialized notation and jargon, but it is difficult to define in absolute terms. Yet it must be defined, even if only roughly. If left undefined, a high school teacher might consider a college text to be high in semantic density whereas a research mathematician might consider the same text to be low. The following is a draft attempt by RR.

LOM value   LOM description   Characteristics
0           Very Low          No notation. No jargon. Discursive language for non-technical audience.
1           Low               Minimal notation and jargon, mostly explained. "Scientific American" style exposition.
2           Medium            Notation freely used but explained. Language and style typical of textbooks.
3           High              Written using compact notation in a style suitable for a general audience of professional mathematicians. (E.g. expository articles in the AMS Bulletin.)
4           Very High         Written in compact notation in a style suitable for an audience of experts; often found in research journals.

This is the only example of LOM definitions, other than content, given in this report.
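For tagging software, the draft scale for Educational.SemanticDensity could be carried as an enumeration with the characteristics attached. The Python sketch below is only an illustration. The names SemanticDensity, CHARACTERISTICS, and describe are hypothetical, and the characteristics are abbreviated from the draft above.

```python
from enum import IntEnum


class SemanticDensity(IntEnum):
    """LOM values 0 to 4 for Educational.SemanticDensity."""
    VERY_LOW = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


# Draft characteristics for mathematics, shortened from the table above.
CHARACTERISTICS = {
    SemanticDensity.VERY_LOW: "No notation. No jargon. Discursive language.",
    SemanticDensity.LOW: "Minimal notation and jargon, mostly explained.",
    SemanticDensity.MEDIUM: "Notation freely used but explained.",
    SemanticDensity.HIGH: "Compact notation for professional mathematicians.",
    SemanticDensity.VERY_HIGH: "Compact notation for an audience of experts.",
}


def describe(value: int) -> str:
    """Return a line of help text for one LOM value; raises ValueError if unknown."""
    level = SemanticDensity(value)
    label = level.name.replace("_", " ").title()
    return f"{level.value} ({label}): {CHARACTERISTICS[level]}"


print(describe(1))  # 1 (Low): Minimal notation and jargon, mostly explained.
```

Showing the characteristics next to each value when a resource is tagged is one way to keep a high school teacher and a research mathematician on the same scale.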

VI Implementation, Work, and Cooperation Commitments

Cooperation and commitments from digital library projects

VII Timelines

Deadline/Date       Benchmark/Event
October 24 - 30     Metadata Mini-course at WebNet99 in Hawaii
                    Regular WebNet meeting from Oct 25 - 30.
November 17         Deadline for completing draft of content taxonomies (level 1 and level 2)
                    Deadline for prototypes to be displayed at FMC meeting, December 1.
November 18 & 19    Next MMWG meeting, Columbus, Ohio.
December 1 - 5      The Future of Mathematical Communication conference at MSRI
December 31         Deadline for having MMWG proposal for public release
January 19          Panel presentation at joint annual mathematics meetings, Washington, DC
                    Meeting of MMWG
April 12 - 15       NCTM annual meeting, Chicago, Ill.

VIII Attendees and Acknowledgements

The MMWG August meeting was attended by the following persons in alphabetical order. Participation varied from one hour to two days.

Don Albers (Mathematical Association of America)
Joe Buhler (Deputy Director, Mathematical Sciences Research Institute)
David Collinge (Senior Product Manager for Central Media, Pearson Education)
Mary Craven (Department Chair, Punahou School, Honolulu, HI)
Charles Drucker (Headlund Digital Media)
Wade Ellis (West Valley College, Saratoga, CA)
Ann Jensen (Mathematics Librarian, UC Berkeley)
Gene Klotz (Director, the Math Forum)
Brandon Muramatsu (Director, NEEDS project)
Ruth Radetsky (Middle School Teacher, San Francisco)
Robby Robson (Oregon State University)
Len Simutis (Director, Eisenhower National Clearinghouse)
Linda Yamomota (Mathematics Librarian, Stanford University)

Acknowledgements. The Eisenhower National Clearinghouse generously supported the August MMWG meeting through travel grants to some participants. NEEDS and UC Berkeley hosted the meeting. Brandon Muramatsu did a wonderful job of making local arrangements. Brandon Muramatsu and Greg Paschall provided technical assistance as well. Gene Klotz helped greatly in the pre-organization. A note of thanks goes to Mike Hodges at Pearson Education for taking an active interest in our efforts.

Robby Robson

August 17, 1999

APPENDIX

See http://math-classes.orst.edu/metadata/docs/LOM35_annotated.html