Friday, July 30, 2010

Retrieving information, or documents?

The answer depends on whether a single document, taken as a whole, satisfies the whole of the information need; in other words, whether one document completely satisfies the user’s need. Ideally, the document fits the information need perfectly, without adding anything to it.

But information needs come in all shapes and forms, whereas documents are pretty much fixed. This is true even for dynamic documents once they are created: they are ad hoc creatures, yet from that point forward they encapsulate static content.

Most documents are, in fact, compositions that authors put together in response to some urge. However, documents are also composites of integrated elements. In text, these would be chapters, sections, paragraphs, etc. Going to an even deeper level of granularity, one can count words and even letters as atomic elements. Even phonemes may enter the scene as yet another form of information. But in the end, these elements are parts of the document. Current computer-based information systems deliver documents because documents are a convenient unit of analysis.
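To make the levels of granularity concrete, here is a minimal Python sketch that breaks a plain-text document into paragraphs, sentences, words, and letters. The function name `decompose`, the splitting rules, and the sample text are all assumptions made for illustration; real documents would need far more careful segmentation.

```python
import re

def decompose(document: str) -> dict:
    """Break a plain-text document into progressively finer elements."""
    # Paragraphs: blocks of text separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    # Sentences: naive split after '.', '!' or '?' followed by whitespace.
    sentences = [s for p in paragraphs
                 for s in re.split(r"(?<=[.!?])\s+", p) if s]
    # Words: runs of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", document)
    # Letters: alphabetic characters only.
    letters = [c for c in document if c.isalpha()]
    return {"paragraphs": paragraphs, "sentences": sentences,
            "words": words, "letters": letters}

# Illustrative sample text (not taken from any real document).
sample = "Documents are fixed. Needs vary.\n\nText mining finds segments."
parts = decompose(sample)
counts = {level: len(items) for level, items in parts.items()}
print(counts)
```

Each level is just a different way of cutting the same fixed text, which is why the whole document remains the convenient unit to store and deliver.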

However, much activity has been taking place in the area of computer-based processing of information. Text mining is one example: it refers to the identification of segments in a narrative that carry a particular meaning and can be followed up or built upon to construct and support bold statements.
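As a rough illustration of returning segments rather than whole documents, the sketch below picks out the sentences of a narrative that mention a term of interest. Simple keyword matching stands in here for real text-mining techniques, which are considerably more sophisticated; the function `find_segments` and the sample narrative are hypothetical.

```python
import re

def find_segments(narrative: str, terms: set[str]) -> list[str]:
    """Return the sentences of `narrative` that mention any of `terms`.

    `terms` is expected to contain lowercase words.
    """
    # Naive sentence split after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", narrative.strip())
    hits = []
    for sentence in sentences:
        # Lowercased word set of the sentence, for case-insensitive matching.
        tokens = {t.lower() for t in re.findall(r"[A-Za-z']+", sentence)}
        if tokens & terms:
            hits.append(sentence)
    return hits

# Illustrative narrative (invented for this example).
narrative = ("The patient reported a mild fever. The weather was pleasant. "
             "Fever returned after two days.")
segments = find_segments(narrative, {"fever"})
for segment in segments:
    print(segment)
```

The unit delivered here is the matching sentence, not the document that contains it.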
