Thesaurus
Definition
A hierarchical arrangement of related words and phrases, often displayed in systematized lists of synonyms.
Description
Thesauri are based on concepts and show relationships between terms. Relationships commonly expressed in a thesaurus include hierarchy, equivalence (synonymy), and association. These relationships are generally represented by the notation BT (broader term), NT (narrower term), USED FOR (synonym), and RT (associative or related term). Associative relationships may be more detailed in some schemes. For example, the INIS Thesaurus has defined eight relationships, many of which are associative. Preferred terms for indexing and retrieval are identified. Entry terms (or non-preferred terms) point to the preferred terms to be used for each concept. There are standards for the development of monolingual thesauri (NISO 1998; ISO 1986) and multilingual thesauri (ISO 1985). Many thesauri are large; they may include more than 50,000 terms. Most were developed for a specific discipline or a specific product or family of products. For example, the INIS Multilingual Thesaurus contains over 40,000 terms and is available in seven languages.
There are two ISO standards describing thesaurus structure: [ISO2788] describes monolingual thesauri, while [ISO5964] covers multilingual thesauri. Here, we shall discuss thesauri as they are defined in the ISO standards, while noting that in practice many users extend the structure somewhat and that, in some cases, the term is applied to structures differing substantially from what is described here. Thesauri take taxonomies as described above and extend them to describe the world better: subjects can not only be arranged in a hierarchy, but other statements can also be made about them.
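The way entry terms point to preferred terms for indexing and retrieval can be illustrated with a minimal Python sketch. The vocabulary below is hypothetical (it is not taken from the INIS Thesaurus or any other real thesaurus), and the names `USE`, `PREFERRED`, `to_preferred` and `normalize_query` are assumptions made for illustration only.

```python
# Hypothetical mapping from entry (non-preferred) terms to preferred terms.
USE = {
    "topic navigation map": "topic map",
    "topic navigation maps": "topic map",
}

# Hypothetical set of preferred terms allowed for indexing and retrieval.
PREFERRED = {"topic map", "ontology"}


def to_preferred(term):
    """Return the preferred term for `term`, or None if the term is unknown."""
    key = term.strip().lower()
    if key in PREFERRED:
        return key
    return USE.get(key)


def normalize_query(terms):
    """Map query terms to preferred terms, dropping unknown terms and duplicates."""
    result = []
    for term in terms:
        preferred = to_preferred(term)
        if preferred is not None and preferred not in result:
            result.append(preferred)
    return result
```

With this sketch, a query containing both 'topic navigation maps' and 'topic map' collapses to the single preferred term 'topic map', which is the effect the entry-term mechanism is meant to achieve.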
Typical Thesaurus Structure
[ISO2788] provides the following properties for describing subjects:
BT (Broader Term) - refers to the term above the current one in the hierarchy (term with wider or less specific meaning). In practice some systems allow multiple BTs for one term, while others do not.
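NT (Narrower Term) - the inverse of BT: refers to a term below the current one in the hierarchy (a term with a narrower or more specific meaning). It is implied by the corresponding BT.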
SN (Scope Note) - a string attached to the term explaining its meaning within the thesaurus. This can be useful in cases where the precise meaning of the term is not obvious from context (e.g. a technical solution vs. a solution in chemistry).
USE (use a specific term instead) - refers to another term that is to be used instead of the current term, and implies that the two terms are synonymous. Its inverse property is known as UF (Used For). For example, on 'topic navigation map' we could put a 'USE' property referring to 'topic map'. This would mean that we recognize the term 'topic navigation map', but that 'topic map' means the same thing and we encourage the use of 'topic map' instead. We would then also have a 'UF' property on 'topic map' referring to 'topic navigation map', since this is implied by the 'USE' relationship.
TT (Top Term) - refers to the topmost ancestor of this term. The term at the other end of this property is the one that would be found by following the 'BT' property until you reach a term that has no 'BT'. This property is strictly speaking redundant, in the sense that it doesn't add any information, though it may be convenient.
RT (Related Term) - refers to a term that is related to the current term, without being a synonym or a broader/narrower term. For 'topic map' we could use this to indicate that 'subject-based classification' and 'ontology' are terms related to 'topic map'.
One could say that taxonomies are thesauri that only use the BT/NT properties to build a hierarchy and make no use of the other properties described above, so every thesaurus can be said to contain a taxonomy. In short, thesauri provide a much richer vocabulary for describing terms than taxonomies do, and so are much more powerful tools. Using a thesaurus instead of a taxonomy solves several practical problems both in classifying objects and in searching for them.
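The properties listed above can be sketched as a small in-memory data structure. The Python below is a minimal illustration and not an implementation of [ISO2788]: the class name `Thesaurus`, its method names, and the sample hierarchy are all hypothetical. It stores BT, USE, RT and SN explicitly and derives NT, UF and TT from them, mirroring the observation that those three are implied by the others. Storing RT symmetrically and allowing several BTs per term are assumptions of this sketch.

```python
from collections import defaultdict


class Thesaurus:
    """Hypothetical minimal thesaurus: stores BT, USE, RT, SN; derives NT, UF, TT."""

    def __init__(self):
        self.bt = defaultdict(set)  # term -> its broader terms
        self.use = {}               # non-preferred term -> preferred term
        self.rt = defaultdict(set)  # term -> its related terms
        self.sn = {}                # term -> scope note

    def add_bt(self, term, broader):
        self.bt[term].add(broader)

    def add_use(self, entry_term, preferred):
        self.use[entry_term] = preferred

    def add_rt(self, a, b):
        # Assumption: the associative relationship is recorded in both directions.
        self.rt[a].add(b)
        self.rt[b].add(a)

    def nt(self, term):
        """Narrower terms: every term that lists `term` as a BT."""
        return sorted(t for t, broader in self.bt.items() if term in broader)

    def uf(self, term):
        """Used-for terms: every entry term whose USE points at `term`."""
        return sorted(e for e, p in self.use.items() if p == term)

    def tt(self, term):
        """Top terms: follow BT upward until reaching terms that have no BT."""
        tops, stack, seen = set(), [term], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            broader = self.bt.get(current)
            if broader:
                stack.extend(broader)
            else:
                tops.add(current)
        return sorted(tops)


# Hypothetical sample data, partly reusing the 'topic map' examples above.
t = Thesaurus()
t.add_bt("thesaurus", "controlled vocabulary")
t.add_bt("controlled vocabulary", "knowledge organization system")
t.add_use("topic navigation map", "topic map")
t.add_rt("topic map", "ontology")
t.add_rt("topic map", "subject-based classification")
t.sn["solution"] = "Used here in the chemical sense only."
```

Because NT, UF and TT are computed on demand, they cannot drift out of sync with the stored BT and USE properties; a system that needs fast lookups might instead store them redundantly, which is the convenience the TT property offers.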
International Standards for Thesauri Development
There are several international standards which define the basic rules for thesaurus development:
- UNESCO Guidelines for the establishment and development of monolingual thesauri. 1970 (followed by later editions in 1971 and 1981)
- ISO 2788 Guidelines for the establishment and development of monolingual thesauri. 1974 (revised 1986)
- ISO 5964 Guidelines for the establishment and development of multilingual thesauri. 1985
- ISO 25964 Thesauri and interoperability with other vocabularies. Part 1 - Thesauri for information retrieval published 2011; Part 2 - Interoperability with other vocabularies published 2013.
INIS Multilingual Thesaurus
The INIS Thesaurus is one of the main products of the International Nuclear Information System (INIS) and is the result of a systematic study performed by subject specialists at the INIS Secretariat and in INIS Member States.