Thursday, August 5, 2010

Controlled vocabularies

Adherence to a standardized categorization scheme is widely accepted as a valid and useful way to manage information, particularly its storage and access. The categories in such a scheme are meant to capture all the conceptual categories that make up an information space, that is, the particular conceptual universe under consideration. This is a tall order, and information professionals have developed a variety of schemes to address a multitude of such environments. The Dewey Decimal Classification and the Library of Congress Classification are two of the most popular schemes; their categories reference the conceptual contents of resources, works, or objects, whether physical or electronic.

Applying this principle in practice has proven useful and sound, but it places strong demands on those who implement, maintain, and use these schemes. Categorization systems must account for the rapid growth of resources, the innovation and inventiveness of authors and creators, and the evolution of language. All of these stem from human nature and represent a moving target for the information profession.

These demands are not new, but they have become more visible in recent years due to the explosion in the volume, quality, and variety of information generated. To keep up with this pressure, the information profession developed tools such as dictionaries, indexes, thesauri, lists, pathfinders, and suggested materials.

Some of those tools are referred to as controlled vocabularies, and they expand the classes in the corresponding categorization scheme. They consist of either keywords that represent categories or semantic constructions that relate two or more concepts. It is important to emphasize that controlled vocabularies serve as surrogates that represent the content of information objects, and that they are used for storage and access, as entry points to the documents.

There are also document-like surrogates, such as abstracts and summaries, that are used as entry points in various systems, but they are not considered controlled vocabularies. At the lowest level of a categorization scheme's hierarchical taxonomy are the specific keywords to be used for objects in that category. Those keywords are used alone or in combination and are assumed to be sufficient to describe all of the significant information in the particular domain.
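
The idea of keywords acting as entry points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the categories, keywords, document identifiers, and the function names index_document and search are all hypothetical and do not come from any real scheme or library. The sketch shows a vocabulary that only admits authorized keywords, and a lookup that uses those keywords alone or in combination.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: each category lists its authorized keywords.
VOCABULARY = {
    "information science": {"controlled vocabularies", "thesauri", "indexing"},
    "library classification": {"dewey decimal", "library of congress"},
}

# Flatten the vocabulary into the set of all authorized keywords.
AUTHORIZED = {kw for keywords in VOCABULARY.values() for kw in keywords}


def index_document(index, doc_id, keywords):
    """File doc_id under each keyword; reject terms outside the vocabulary."""
    normalized = [kw.strip().lower() for kw in keywords]
    unknown = [kw for kw in normalized if kw not in AUTHORIZED]
    if unknown:
        raise ValueError(f"terms not in controlled vocabulary: {unknown}")
    for kw in normalized:
        index[kw].add(doc_id)


def search(index, *keywords):
    """Return the documents tagged with every one of the given keywords."""
    matches = [index.get(kw.strip().lower(), set()) for kw in keywords]
    return set.intersection(*matches) if matches else set()


# Usage: the keywords act as entry points to the (hypothetical) documents.
index = defaultdict(set)
index_document(index, "doc1", ["Thesauri", "Indexing"])
index_document(index, "doc2", ["indexing"])
print(search(index, "indexing"))              # both documents
print(search(index, "indexing", "thesauri"))  # only doc1
```

Rejecting unknown terms at indexing time is the "control" in this sketch: a document can only be stored under keywords the scheme already recognizes, which is also why such a vocabulary has to be maintained as language and resources change.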
