Shannon and Weaver (S&W) presented an engineering solution to an information problem with something that would eventually be known as information theory. Though highly mathematical, its philosophical roots are intrinsically linked to entropy, a physical property of matter that, in thermodynamics, relates to the thermal energy in a closed system that is unavailable to do work. In another definition, entropy measures the degree of disorder.
S&W equate uncertainty, or information that is not known, with entropy, the degree of disorder in a closed system. They do so by assuming that the natural state of organized information in a closed system is constant and equal to 100%. Thus the goal of their method is to reduce and, ideally, eliminate uncertainty.
It is important to know that S&W addressed the problem of noise reduction in electronic transmissions. Consider two examples. In one case, a message is correctly transmitted and received. There is no loss of information and uncertainty is equal to 0. In another case, the message is garbled and the information is totally lost, so that uncertainty is equal to 1. Probability is bounded by 0 and 1.
The usefulness of S&W appears in all its importance when the message is partially garbled. It is in those cases that the importance of the assumptions used to calculate probabilities emerges. Faster computer processing and inexpensive data storage enabled better corrective solutions. S&W provided the theoretical approach and computer technologies allowed its implementation.
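The two boundary cases, and the partially garbled case between them, can be illustrated with the binary entropy function from information theory. The following is a minimal sketch under a stated assumption, not a reconstruction of S&W's own derivation: it assumes a channel that flips each transmitted bit independently with probability `p_error`, and the function name `binary_entropy` is hypothetical.

```python
import math


def binary_entropy(p_error: float) -> float:
    """Uncertainty, in bits per transmitted bit, for a given flip probability."""
    if p_error in (0.0, 1.0):
        # The outcome is certain, so there is no uncertainty.
        return 0.0
    return -(p_error * math.log2(p_error)
             + (1 - p_error) * math.log2(1 - p_error))


clean = binary_entropy(0.0)     # message received exactly as sent
garbled = binary_entropy(0.5)   # received bit says nothing about the sent bit
partial = binary_entropy(0.1)   # one bit in ten is flipped
```

Under this assumption, total loss corresponds to a flip probability of 0.5, where uncertainty reaches its maximum of 1 bit. The partially garbled case falls strictly between 0 and 1, and its value depends entirely on the probability assumed for the noise.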
But in general, with respect to computer-based information processing, and in particular with respect to information retrieval, there is great temptation to use the theory without realizing that the environment for which it was developed was a closed system. The use of language for communication is an open system.
In other words, information theory has limitations. It behooves the researcher and student of information processing to understand that S&W bounded this framework and environment in a very particular way. The enticing opportunity of information theory should be balanced with the reality of the results.
For these reasons, one should be asking why systems work at all when they use this and other unsuitable theories, rather than trying to gain efficiency in processes that are theoretically flawed.
Saturday, September 18, 2010
Sunday, September 12, 2010
The Information Science Space
Living organisms are consumers of information, which makes it important to ask basic questions about human consumption of information. Basic human activities, such as breathing, require human processing of information.
Processing of information is a cognitive activity that is manifested through thoughts and actions. Cognition is, therefore, an important field of study for Information Science and thus part of the Information Science Space.
The expansion of computer technologies in the 1950s boosted Cognitive Science. Cognitive theories, models and systems could be implemented and tested quickly and easily. As computer information systems gained territory in daily life, their role in society became undeniable.
The field of study known as human-computer interaction (HCI), like several others at the interface of computer technology with humans and with information, has established the bona fide inclusion of computer technology in the Information Science Space.
NOTE: The study of computer technology takes place in a family of fields under different disciplines such as computer science and information systems.
Because human life does not take place in a vacuum but is part of an interactive community, several other sociological aspects emerge. Various components of these rich human-to-human and human-to-information interactions are substantive for multiple disciplines, from the humanities to the social sciences. This is why the study of Information Science is multidisciplinary.
Several fundamental issues:
- Is it true that living organisms are consumers of information?
- If so, one could ask if there are any types of human processes and behavior not immediately related to information.
- One can define information processing very broadly so as to include breathing, or very narrowly so as to include only memorization.
- Information science is not only concerned with information use but also with all processes related to information.
- Understanding the use of information and other information processes remains difficult, in part due to the lack of agreement about how specifically information needs to be defined and operated upon.
Thursday, August 19, 2010
Search interfaces
Information systems can be said to have three major components: technology, information, and the user base. The last century brought several generations of technology, particularly on the hardware side. Miniaturization lowered prices and raised transaction speeds, which increased the availability of inexpensive solutions for businesses and individuals. The growth in number and diversity of software applications resulted in an explosion of possibilities for the creation and the processing of information.
There is software to facilitate writing, reading, storing, analyzing, modifying, publishing, etc. Creations can be in text, images, moving images, sound, etc. Increased connectivity has empowered users, who can now communicate and get information in ways only imagined before. But the transition to a totally automated environment has not been smooth across the information profession landscape.
The most obvious example is the old card catalog, which is now computer based and present in most libraries. It is usually known as the online catalog. Information that used to be available in the 3x5 cards is now available through computer terminals. The replication is so perfect that it reveals an issue: With a few exceptions, online catalogs are for the most part simplistic replicas of the old manual catalogs.
This reality speaks loudly about an even bigger issue: the design of those systems does not exploit the capabilities of the technology to go beyond what is known, to expand human capabilities, to accomplish new tasks, or to find new uses of the information.
Perhaps the online catalogs and their complex options are empowering to some users, but one could ask if they may not also explain the popularity of simple interfaces, such as those offered by Google, Yahoo and other Internet search engines. The multiple access points offered by many library interfaces are presented as separate, individual operations. In the search engines mentioned above, most or all of these access points, such as advanced search capabilities, are consolidated into internal processes and functionality. Although this adds complexity to the internal operations, the main goal seems to be the improvement of the user experience. And the users come back to use them again and again.
The question is, why do libraries settle for interfaces that fail to fully exploit the power of computer systems?
Tuesday, August 10, 2010
Vocabularies to represent information
Documents may include narratives, arguments, sections, and multiple other semantic components that make up an interrelated network of concepts. To represent all of that information at least two types of descriptors are used. One is generated from free language and another is generated from a list of terms, dubbed controlled vocabulary. 
Free language descriptors are also called folksonomies, and for the most part any term is acceptable. A controlled vocabulary is a list of terms (a thesaurus) or a lexical organization (an ontology) that accounts for the most salient, or significant, concepts in the collection. A third type is often left alone as if it were invisible, unnecessary or useless, and perhaps taken for granted because of its immediacy: the actual vocabulary in the text of the documents. In all cases, one important issue stems from the limitations and characteristics of language: deciding which information elements to include and which to exclude.
These three tools are used in the creation of representative surrogates of the documents. All of them have positive and negative characteristics. Their evaluation as sources must consider standardization, scope of coverage, specificity, exhaustivity, length in number of entries, etc. All of them have tradeoffs.
For example, free text representation may be very flexible but lacks standardization across systems. Controlled vocabularies lack flexibility, and capturing the specificity of significant concepts with them may be difficult. The language used in the document set is only one of multiple ways to represent a concept, because alternatives such as synonyms always exist.
Representation of information is a difficult chore.
Monday, August 9, 2010
Document description and categories
Categorization of information, and the use of categories as surrogates of objects, appeared in early history. Documents of one type, such as bills of sale, would generally be kept in the same location. They would also be kept separate from other documents, such as cargo inventories. This principle can still be seen in libraries all over the world. Organizing documents into categories facilitates access to the documents.
As the number and variety of documents grew, the set of categories also grew in size and complexity. The creation of additional categories and subcategories made sense and, in turn, increased the descriptive specificity of the particular scheme. Categories had always been considered access points to the documents, but the increased specificity of new schemes affected the perception of what their role could be. In time, from being access or entry points that described what the documents were about, or aboutness, the subcategories were promoted to being used as representations of the information in the documents.
If documents are analogous to capsules of information, information is inside of documents. Categories and subcategories were useful to describe the capsules, or documents. Today, their use has expanded to describing the capsule’s contents. The problem is that these representational constructions can capture only a limited subset of all the information in the document set.
Users of systems never know what information is not being included in the set of surrogate representations.
This problem is a fundamental failure of document retrieval systems that use this type of representation alone. If some information is not represented, it will not be found.
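The failure described above can be seen in a small sketch. Everything in it is hypothetical and chosen only for illustration: the three-term controlled vocabulary, the two documents, and the helper names `build_surrogate` and `search`. It assumes a system that searches the surrogates alone.

```python
# Hypothetical controlled vocabulary and documents, for illustration only.
CONTROLLED_VOCABULARY = {"shipping", "sales", "inventory"}

documents = {
    "doc1": "receipt for cargo shipping with a note about a storm delay",
    "doc2": "inventory of cargo with sales totals",
}


def build_surrogate(text, vocabulary):
    """Keep only the words of the text that appear in the controlled vocabulary."""
    return {word for word in text.lower().split() if word in vocabulary}


# Each document is represented by its surrogate, not by its full text.
surrogates = {
    name: build_surrogate(text, CONTROLLED_VOCABULARY)
    for name, text in documents.items()
}


def search(term, surrogates):
    """Return the names of documents whose surrogate contains the term."""
    return sorted(name for name, terms in surrogates.items() if term in terms)


found = search("shipping", surrogates)   # represented, so it is found
missed = search("storm", surrogates)     # present in doc1, but not represented
```

In this sketch the word "storm" is in the text of `doc1`, yet a search for it returns nothing, and the user has no way of knowing that the information exists.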
Sunday, August 8, 2010
Controlled Vocabularies and Representation
An information space is defined by a set or collection of documents. A list of terms is built to represent the information elements in the information space so that terms, either alone or in combination, can be used to represent all of the documents in the set. The list of terms is known as the controlled vocabulary. A short list of the terms can be used as a surrogate of each document. The controlled vocabulary is assumed to convey an accurate representation of the information space and of the particular properties of the document set. This implies that a finite list of descriptors can represent all the information in the documents, or at least the most relevant information. In addition, the representation has to accurately capture the purpose, scope, audience, and level of expertise found in the documents.
The parallel with the use of letters is clear. After all, words and their myriad of meanings are the result of combining a finite number of letters. Likewise, a finite number of carefully selected terms can represent the universe of ideas and concepts in the document collection. This powerful argument is behind the creation of lists of keywords as representational building blocks of complex information concepts.
But, is it true that words can represent everything?
Thursday, August 5, 2010
Controlled vocabularies
Adherence to a standardized scheme of categorization is widely recognized and accepted as valid and useful to facilitate information management, particularly its storage and access. The categories in one of these schemes are meant to capture all the conceptual categories that make up an information space, the particular conceptual universe that is being considered. This is a tall order, and information professionals have developed a variety of schemes to address a multitude of such environments. The Dewey Decimal Classification and the Library of Congress Classification are two of the most popular schemes whose categories can reference the conceptual contents of resources, works, or objects, physical or electronic.
The pragmatic application of this principle has proven to be useful and sound, but it places a strong demand on those who implement, maintain and use these schemes. Categorization systems must account for the rapid growth of resources, the innovation and inventiveness of authors and creators, and the evolution of language. All of these are inherent properties of human nature and represent a moving target for the information profession.
These demands are not new but have become more visible in recent years due to the explosion in volume, quality and variety of information generated. To keep up with this pressure, the information profession developed tools such as dictionaries, indexes, thesauri, lists, pathfinders, suggested materials, etc.
Some of those tools are referred to as controlled vocabularies, and they expand the classes in the corresponding categorization scheme. They are formed by either keywords that represent categories, or semantic constructions that relate two or more concepts. It is important to emphasize that controlled vocabularies are used as surrogates to represent content in information objects, and that they are used for storage and access, as entry points to the documents.
There are document-like surrogates, such as abstracts and summaries, that are also used as entry points in various systems, but they are not considered controlled vocabularies. At the lowest level in the hierarchical taxonomy of the categorization schemes are the specific keywords to be used for objects in that category. Those keywords are used alone or in combination and are assumed to be sufficient to describe all of the significant information in the particular domain.
Tuesday, August 3, 2010
Processing text, or word processing?
Text processing and word processing can be confusing because text and words seem to be the same. However, there is an important distinction when they are used in reference to computer-based processing. Text normally refers to plain encoding of alphanumeric characters. Files encoded as text are also known as text files. They are the most basic and common type of computer encoding. 
Word processing adds the formatting of text and other information related to the text. These files are normally encoded differently than text files and are normally referred to as binary files. Users can access a text file directly from the operating system, but binary files require a program to decode the data.
The astute reader can see that a document so defined may consist not only of words but also of any other type of element. This is, in fact, true. A binary file can contain anything, and a program is required to decode the document, identify its elements and process them, whether for display or for any other type of action. In particular, accessing a word processing file requires a program known as a word processor, such as the popular Microsoft Word for Windows.
Summarizing, a binary file can encode all types of documents, including still images or audio. A word processing file is a type of binary file, which may also include format, location and other such information about the file's content. A text file is formed by the alphanumeric characters of the text. Text files are the common communication encoding that different computers can use. For this reason, text is the de facto communication encoding on the web and in computer-to-computer protocols across the Internet.
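A short sketch can make the distinction concrete. It writes a plain text file and then reads it back as raw bytes, assuming ASCII encoding; the file name `note.txt` and the sample sentence are hypothetical. In a text file encoded this way, every byte is one character of the text and nothing else is stored.

```python
import os
import tempfile

text = "Information retrieval"

# Write the text as a plain text file, then read the same file as raw bytes.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "note.txt")
    with open(path, "w", encoding="ascii") as handle:
        handle.write(text)
    with open(path, "rb") as handle:
        raw_bytes = handle.read()

# One byte per character: the file holds the characters and nothing more.
byte_count = len(raw_bytes)

# Decoding needs no special program, only knowledge of the character encoding.
decoded = raw_bytes.decode("ascii")
```

A word processing file saved with the same sentence would hold additional bytes describing fonts, layout and similar properties, which is why a word processor is needed to decode it.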
Sunday, August 1, 2010
Information Retrieval Systems
Applying computer technologies to the number crunching needs of the day was one of the drivers in the implementation and maturity of computer technologies, but text processing was not far behind. After all, mathematics and text converged in areas related to encoding, cryptology and compression.
Engineers were aware that the new technologies of the 1950s might make possible multiple applications, including those to create, edit, and otherwise process text documents. By the 60s the transistor had enabled a revolution in miniaturization. At about the same time, advances in database technology were providing ideas about storage and access. By the 70s, network technology took computers to a larger scale.
During all this time, a small cadre of scientists and practitioners in the little known art of information retrieval had been researching and implementing systems to process text. This included the creation and editing, as well as other supporting tasks such as the storage and retrieval of documents, which were, in their own right, complete, elaborate and complex.
Terms like text processing, word processing and information retrieval were still confusing but were slowly starting to convey distinct types of activities. The idea of a system that processes information was not far-fetched anymore and the dream of Vannevar Bush was now possible. Information systems consolidated networks, databases, and all the newly implemented ideas related to the processing of information. The information retrieval (IR) system would be only one component.
Search engines
The IR system was the component that located a document to satisfy a given query that a user would create. The paradigm assumes that the user query presented to the system takes the form of a template that starts with “I want a document that contains…”, after which the user types a list of words and perhaps some parameters that provide a semantic relationship among the words. These parameters may be Boolean operators (such as AND, OR, NOT) or distance operators (how close together two words should appear), or they may indicate that synonyms or other semantic expansion should be used, etc. The system then identifies a document, or several documents, that match the query and possibly presents them in some type of ranked order.
The only difference between an IR system and a search engine is the name, and maybe also some of the document processing functionality.
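The Boolean and distance parameters can be sketched with set operations over a tiny collection. This is an illustration under assumptions, not how any particular IR system is built: the three documents and the helper names `contains` and `within` are hypothetical, and matching is done on whole lowercase words.

```python
documents = {
    "doc1": "giraffes live in africa",
    "doc2": "a zoo in costa rica keeps giraffes",
    "doc3": "costa rica has many birds",
}


def contains(word):
    """Return the set of document names whose text contains the word."""
    return {name for name, text in documents.items() if word in text.split()}


def within(first, second, max_distance):
    """Return documents where the two words occur at most max_distance words apart."""
    hits = set()
    for name, text in documents.items():
        words = text.split()
        first_positions = [i for i, w in enumerate(words) if w == first]
        second_positions = [i for i, w in enumerate(words) if w == second]
        if any(abs(i - j) <= max_distance
               for i in first_positions for j in second_positions):
            hits.add(name)
    return hits


and_result = contains("giraffes") & contains("costa")   # giraffes AND costa
or_result = contains("giraffes") | contains("birds")    # giraffes OR birds
not_result = contains("costa") - contains("giraffes")   # costa NOT giraffes
near_result = within("costa", "rica", 1)                 # words must be adjacent
```

Each operator maps onto a set operation: AND is intersection, OR is union, and NOT is difference. Ranking the matching documents is a separate step that this sketch leaves out.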
Friday, July 30, 2010
Retrieving information, or documents?
The answer depends on whether one document in its totality satisfies the totality of the information need; in other words, on whether a single document completely satisfies the user’s need. Ideally, the document will perfectly fit the information need without adding anything to it.
But information needs come in all shapes and forms whereas documents are pretty much fixed. This is true even for dynamic documents once they are created. Dynamic documents are ad hoc creatures that nevertheless encapsulate static content from that point forward.
Most documents are, in fact, compositions that authors put together in response to some urge. However, documents are also composites of integrated elements. In text, these would be chapters, sections, paragraphs, etc. Going to an even deeper level of granularity, one can count words and even letters as atomic elements. Even phonemes may enter the scene as yet another form of information. But in the end, these elements are parts of the document. Current computer-based information systems deliver documents because documents are a convenient unit of analysis.
However, much activity has been taking place in areas of computer-based processing of information. Text mining is one of them, and it refers to the identification of segments in a narrative that have a particular meaning and can be followed up or built upon to construct and support bold statements.
Tuesday, July 27, 2010
Information representation
The search process has been greatly simplified throughout the years. A user enters one or several words. The system finds the names of files where the words appear. If the user supplies a combination of words, the system can find them individually or in some type of semantic relationship as in Boolean, or logic, operations. It is possible to ask the equivalent of “I want documents that discuss giraffes in Costa Rica.”
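The Boolean combination described above can be sketched with set operations. This is only an illustration under stated assumptions: the file names, the texts, and the helper `files_containing` are all made up, and matching is a plain case-insensitive substring test rather than anything a real search engine does.

```python
# Hypothetical toy collection; file names and contents are invented.
documents = {
    "safari.txt": "giraffes roam the savanna in Kenya",
    "zoo.txt": "the zoo in Costa Rica keeps giraffes and tapirs",
    "travel.txt": "Costa Rica has beaches and volcanoes",
}


def files_containing(term, docs):
    """Return the set of file names whose text contains the term."""
    needle = term.lower()
    return {name for name, text in docs.items() if needle in text.lower()}


giraffes = files_containing("giraffes", documents)
costa_rica = files_containing("Costa Rica", documents)

both = giraffes & costa_rica           # Boolean AND
either = giraffes | costa_rica         # Boolean OR
only_giraffes = giraffes - costa_rica  # Boolean AND NOT

print(sorted(both))
```

Here the AND query keeps only the file that mentions both terms, which is the closest this symbol-level approach gets to "giraffes in Costa Rica".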
Indexing files form a representation system, a system representing the information in a collection of documents, a computer-based bibliographic representation. But there are other bibliographic representations. Document titles and other types of summaries are good examples of representation systems as well.
The reader will notice that each type of representation provides its own set of mutual semantic relationships, degree of specificity, and other attributes that correspond to its particular interpretation of the original information.
Monday, July 26, 2010
Indexing files
A naïve approach to document retrieval was implemented in the early days of computer-based text processing: simple word match. For this, the computer would read each word of each document, one word and one document at a time. The process included a comparison of each word with the sample word initially supplied by the user. The goal was to match sequences of bits: what the user provided against what was in the documents. The reader understands that this comparison was not at the level of meaning but at the level of symbols. Meaning is at the level of information and symbol is at the level of package. The symbol, or word, or sequence of bits and bytes, is the capsule of information.
The complexity of the operation would increase if two or more words were supplied. Moreover, the semantic relationship of the supplied words was also an issue, or rather, how the words would be combined. In all, the processing of text using the sequential methodology of scanning was terribly expensive.
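The sequential scan can be sketched as follows. This is a minimal illustration, not a reconstruction of any historical system: the function name `sequential_match`, the file names, and the sample texts are assumptions, and words are split on whitespace only.

```python
def sequential_match(sample_word, docs):
    """Read every word of every document and compare it with the sample word.

    Returns (file name, word position) pairs for each exact match.
    """
    hits = []
    for name, text in docs.items():
        for position, word in enumerate(text.split()):
            if word == sample_word:  # comparison of symbols, not of meaning
                hits.append((name, position))
    return hits


# Hypothetical documents, invented for illustration.
docs = {
    "a.txt": "the bank of the river",
    "b.txt": "the bank raised rates",
}

hits = sequential_match("bank", docs)
print(hits)
```

Both documents match "bank" although the word carries a different meaning in each one, which is the symbol-versus-meaning point made above. Every query also rereads the whole collection, which is why the cost grows so quickly.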
This obstacle required a new approach to using computers, one that took advantage of their capabilities. This is the same approach that is at the foundation of many novel computer applications. Special intermediate files were created with a particular organization that encoded the relative position of words in the stored documents. These are the index files, or more specifically, the inverted index files. These files store a list of all the words in all the documents in a collection, including the name of the document (in terms of file name) and the position of each word in the document. This organization allows for all types of automatic, or computer-based, operations, expanding the capabilities of human processors.
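An inverted index of the kind described here can be sketched in a few lines. The sketch rests on simplifying assumptions: the index lives in memory rather than in files, words are lowercased and split on whitespace, and the names `build_inverted_index` and `files_with_all` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to a list of (file name, position) entries."""
    index = defaultdict(list)
    for name, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word].append((name, position))
    return index


def files_with_all(words, index):
    """Return the files that contain every one of the given words."""
    file_sets = [{name for name, _ in index.get(w.lower(), [])} for w in words]
    return set.intersection(*file_sets) if file_sets else set()


# Hypothetical documents, invented for illustration.
docs = {
    "a.txt": "the bank of the river",
    "b.txt": "the bank raised rates",
}

index = build_inverted_index(docs)
print(index["bank"])
print(files_with_all(["bank", "river"], index))
```

The documents are read once, when the index is built. After that, a query is a lookup instead of a scan, and the stored positions make it possible to ask how close two words are to each other.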
Organizing and representing information
There are many types of documents. Also, they come in many formats. They range from small leaflets to volumes of books. In general, a document is a package of information created to preserve some particular focal information and items related to that information. The components are normally organized in some order that may be sequential, hierarchical, or a combination of both.
Packages are convenient containers but also serve for storage purposes. Documents with related content, or the information they carry, are organized in collections. Semantically speaking, in an abstract space of information, related documents would be placed closer to each other than to unrelated documents. The semantic distance is a measure of how similar documents are to each other.
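The post does not say how semantic distance is computed. One common stand-in, used here purely as an assumption, is cosine distance between word-count vectors: 0 for texts with identical word proportions, 1 for texts with no words in common. The function name and the sample texts are made up.

```python
import math
from collections import Counter


def cosine_distance(text_a, text_b):
    """Cosine distance between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty text is treated as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical documents, invented for illustration.
d1 = "giraffes eat acacia leaves"
d2 = "giraffes eat leaves from tall trees"
d3 = "stock markets fell sharply today"

related = cosine_distance(d1, d2)
unrelated = cosine_distance(d1, d3)
print(related < unrelated)
```

Note that this measure still works on symbols: two documents on the same topic that share no vocabulary would come out as far apart as possible.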
Electronic documents have particular characteristics that make them amenable to automatic processing, such as encryption and compression. Although it is important to differentiate information from the package wherein it exists, most processing of information, particularly computer-based information processing, treats the symbol or package as equal to the information it carries. In other words, "paper" is only a word; it is not really paper.
Likewise, computer-based information processing is really processing of words, symbols, bits and bytes, but not meanings and concepts.
Thursday, July 22, 2010
Packages and capsules of information
There is so much information that even with computers we find it difficult to keep track of everything. For this reason alone we usually don’t directly examine the actual information objects but their representation. We go to an electronic bookstore and look at images, reviews and summaries of books. A search engine gives us titles and brief sentences from the web pages, or snapshots of images. To have contact with the actual information objects is expensive and time-consuming. Just 100 years ago we needed to wait months to go from one part of the world to another. Now there are Internet applications that can take us around the globe for a virtual visit in seconds. If we want to get specialized pictures of exotic life forms we can probably find them in a book or in a video. We interact with representations most of the time. Even words are representations.
Representation of information is related to its use, to its storage for later use, to its retrieval by command. It is not only in reference to certain obvious types of media such as print and electronics but also to other more abstract forms as in the discussions of concept representation, organizational learning and knowledge, or mental structures.
Lacking a definition of information, to look at representation one must at least define a unit of information and examine how it is packaged. Examining information as packages may be useful. After all, we are familiar with books, documents, and other packages of information and not with information itself. Packages of information are concrete and information is an abstract construct or phenomenon.
Based on the idea that information answers some information need, and that it exists in some package of information, identifiers of these questions are categorized according to type of question. The set of issues related to this categorization is known as information organization; the materials that are organized are the information packages and the capsules of information at varying degrees of specificity in each package.
Examples of questions answered by different information capsules that may exist in one or several packages of information: Who wrote Tom Sawyer? What books did Mark Twain write? Who is Samuel Langhorne Clemens? All of these questions are related but separate. Each represents something, a capsule of information. Each question is a representation in and of itself. The answers also represent the same thing, in an illustration of how a capsule of information may have multiple representations, and that capsules may be part of multiple packages.
A short linguistic construction that combines all of these related pointers would address all questions, and may be considered a representation. This construction could take the form of a sentence such as: Samuel Langhorne Clemens, also known as Mark Twain, wrote the novel Tom Sawyer. A larger paragraph would expand the idea, convey the same topic and include new pieces of information that were not asked for. Likewise, there are multiple books on the life and work of Mark Twain, which would span the topic even more.
Wednesday, July 21, 2010
Does information have parts, or is it a whole?
George Miller wrote "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information" (The Psychological Review, 1956, vol. 63, pp. 81-97), advancing the idea of limits in the capacity of short-term memory. In general, there seem to be limiting thresholds on the amount of information one can practically receive, decode, and integrate.
Specificity may be one way to encapsulate information and help overcome this limitation. Telephone numbers can be used to illustrate this point. A telephone number has 10 digits separated into 3 parts: area code, exchange, and 4-digit number. It is easier to remember it in terms of chunks rather than by its individual digits. This example also illustrates a second point, the encapsulation of information.
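The chunking of a telephone number is simple enough to write down. A minimal sketch, assuming a 10-digit number given as a string; the function name and the sample number are made up:

```python
def chunk_phone_number(digits):
    """Split a 10-digit number into area code, exchange, and 4-digit number."""
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError("expected exactly 10 digits")
    return digits[:3], digits[3:6], digits[6:]


chunks = chunk_phone_number("5551234567")  # invented number
print(chunks)
```

Ten separate items become three, each of which can be handled as a unit in its own right.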
Observing a surface, there are at least two measures of importance about it: perspective and distance. Perspective is the relative position from which the observation is made and determines what can be seen. Distance determines the detail, or resolution, of what is seen. These two are important when considering ambiguity.
These two measures appear with the same meaning and role in reference to an information package. Perspective is a function of personal inclination and preference, adding bias to the interpretation of the package. Distance affects the semantic size (how narrow or how broad a particular information unit is) of the component in the package. The semantic size is the degree of specificity: what can be discerned as a unit of semantic value in the package.
How a unit of information is coded or stored will determine its processing because it points directly to the degree of specificity of that unit.
NOTE: Documents can be seen as units of information. Libraries and search engines treat documents as units of information. Books can be made up of self-contained chapters that are units in their own right.
Monday, July 19, 2010
Information as whole and part
Educational psychologists recognize different types of learning, but in general they are understood as one of the products of being exposed to information. Other posts have explored some of the characteristics of information. In most cases these effects are proposed as inferences because of the elusive nature of information and the difficulties of directly seeing its behavior. Observable phenomena are due to human reaction to a given unit of information, which in many cases is difficult to compose. A good analogy was presented earlier in which, without defining information, it is treated as the substance encapsulated by a container.
In an attempt to understand the characteristics of information, one can see it as a substance of a certain degree of complexity that can be incorporated into a body of knowledge. From the opposite perspective, information can then be a portion of a body of knowledge, thus a body of knowledge can be seen as a whole, or network, of integrated components that could be defined as information.
In this description, information is a part of the whole and can have any size, shape, or form that is part of that whole. An independent unit in its own right, this component may have subcomponents which may be similar in nature, that is, they may also be information units themselves. The idea of self-similarity can be borrowed from the physical sciences to describe this relationship at some level.
Saturday, July 17, 2010
Graduate studies
Education at the graduate level differs from education at the undergraduate level in a fundamental way. Undergraduate studies emphasize basic skills and memorization. Students at the Master’s level get deep into their subject matter, adhering to sets of best practices. At the Ph.D. level graduate students go into theory and models that are fundamental to their educational interest. The main difference between the two levels of graduate studies is not just the final objective or level of knowledge but the degree of abstraction of the content of study.
A graduate with a Master’s degree is expected to complete practical tasks whereas a graduate with a Ph.D. is expected to work at a theoretical level wherein different tasks can be developed, studied, and evaluated.
Their main difference in training is that a Ph.D. will mostly work at an abstract level whereas a Master’s graduate has the expertise to implement and complete practical tasks within prescribed rules, guidelines or paradigms, with limited regard to the foundational theoretical constructs upon which those are built. Master's graduates apply theories while Ph.D.s develop them.
Disclaimer: This brief discussion only applies to differences in the studies that Master's and Ph.D. students undertake, and not to the actual differences in intellectual capabilities or activities the same students end up doing in their professional life, which may clearly contradict what is presented here.
Friday, July 16, 2010
Physics, information theory
The previous posts describe some of the components and characteristics of an information space. It is clear from the discussions that a definition of information is elusive. In one case the problem of definition is equated to that of light. We can only see light as it reveals an object while remaining invisible itself. Is it possible that information is likewise recognized only as it reveals something else, or is seen only as something else acquires shape? Just as light is, could information be considered a form of energy?
This idea might not be too far-fetched. After all, physicists discuss information as something that is lost, or something that is preserved. If permanence can be expressed as preservation or loss in quantifiable terms, characterizing variation in the amount of information may be possible without having to define information. In other words, measurement is by proxy. Information gradients are problematic but their existence is in principle an accepted fact of physics.
Forcing the topic, one could argue that information also carries a qualitative dimension, which the measure ignores. After all, an observable consequence may be measured but unknowns in the stimuli or in the interaction between object and component stimulus would raise questions about other associated factors. The result is an unknown environment with respect to the qualitative values of information, even as information gradients might be measured.
This terminology is related to a specific disciplinary endeavor known as information theory, which borrows heavily from the physical sciences, in particular from the second law of thermodynamics. This law, important for classical physics, establishes that the entropy of an isolated system does not decrease over time. Claude Shannon would present his idea of information theory by referring to the gain or loss of information by a system. His ideas, mathematical in nature, have been fundamental in the development of digital networks and signal transmission.
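The gain or loss of information can be made concrete with Shannon's entropy measure, H = -sum(p * log2(p)), which gives the average uncertainty of a source in bits. The formula is Shannon's; the sketch around it is only an illustration, and the function name and sample probabilities are made up.

```python
import math


def shannon_entropy(probabilities):
    """Average uncertainty, in bits, of a source with the given outcome probabilities."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


certain = shannon_entropy([1.0, 0.0])   # one outcome is sure: no uncertainty
coin = shannon_entropy([0.5, 0.5])      # fair coin: maximum for two outcomes
biased = shannon_entropy([0.9, 0.1])    # somewhere in between

print(certain, coin, biased)
```

The measure quantifies how unpredictable the symbols are. It says nothing about what they mean, which is the qualitative dimension raised above.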
But as modern physics advanced in the second half of the twentieth century, and into the new millennium, the second law of thermodynamics has been questioned, in particular with respect to subatomic particles. Will this have any theoretical value as a possible new model for information? Maybe it is too early to tell, or maybe it is an unrealistic wish. The fact is that, unlike the many forms of energy that can be stored, channeled, and used, information, along with its nature and a plausible and encompassing definition, remains elusive.
Thursday, July 15, 2010
Meaning construction
When we attempt to make sense of our surroundings, by necessity, we notice only a fraction of phenomena. Otherwise the amount of signals would overwhelm our senses and overtax our attention. This is done by our mental mechanisms, which include filtering processes to direct our attention to what they deem important. This happens at every instant.
We are only aware of a minimum set of prioritized events. Out of that small set we attend to only a few of them, maybe only a couple, while our senses keep receiving multiple signals. Those signals may be pre-coordinated, as in a face-to-face conversation when we hear words but also see gestures, or body language. For post-coordination of signals, the brain couples separate and unsynchronized partial channels. In both cases, the objective is to fit all the stimuli we receive within a coherent framework that explains them as a whole. This whole process is known as sense making.
Sense making and meaning construction are similar cognitive processes. Sense making is semi-internal because it works with external stimuli and organizes them into a coherent entity. Construction of meaning integrates the coherent entity into a structure in memory that already exists, expanding it, or creates a new structure. Any one of these structures may be novel or an ontological replica of another.
Meaning construction results in a greater structural entity than the interpretation of the initial collection of signals and stimuli. It is a purely internal cognitive process of integration or creation that either expands an existing structure or creates a new one.
In a first step, our senses capture signals; these are then cognitively joined to make a temporary whole or conceptual structure in memory that is interpreted and placed into a greater conceptual structure, also in memory. The need to consider these as two separate processes arises from the intermediate need to combine and interpret the separate signals first and to place the result into another structure at a second stage.
Question: Are these structures, or structural entities, information, knowledge?
Question: Where is information?
Wednesday, July 14, 2010
Sense making to construct meaning
As part of our humanity, we attempt to make sense of our surroundings. By necessity, what we notice is only a minimal amount of phenomena, to prevent overwhelming our senses and overtaxing our attention. As a result, our mental mechanisms include filtering processes to direct our attention to what they deem important at every instant. We are only aware of a minimum set that prioritizes events, giving attention to only some of the signals our senses are receiving. The signals may be pre-coordinated, as in a face-to-face conversation where we hear words and calibrate them with the gestures, or body language. Post-coordination of signals is when the brain must couple separate and unsynchronized partial channels. In both cases, the objective is to fit all the stimuli we receive within a coherent framework that explains them as a whole. This whole process is known as sense making.
Sense making and meaning construction are similar cognitive processes. Sense making is semi-internal in that it works with external stimuli. It organizes the external stimuli into a coherent entity. Construction of meaning integrates that coherent entity into a structure in memory that already exists or that is an ad hoc creation. The structure may be novel or an ontological replica of another. Meaning construction results in a greater structural entity than the interpretation of the initial collection of signals and stimuli. It is a purely internal cognitive process of integration or creation that either expands an existing structure or creates a new structure of knowledge.
Question: This sounds like data being transformed into knowledge. Where is information?
Tuesday, July 13, 2010
Scientific Research?
Invalidating research by questioning the immediate applicability of research results is an unsustainable position. To begin with, results from research is primary information. The value of the results is not always explicit, clear or immediate.
Immediate value of research is relative to the disciplinary body of knowledge within which the particular research was placed. It is within that space that the work would need to be examined first, but there is more. Natural phenomenon does not stand alone. It is part of an intrincate and most often invisible network of influencing factors. Boundaries among disciplines are artificial and were forced by organizational needs of institutions and other such enterprises. Everything is connected to everything else in nature.
In this context, the influence of a particular research experiment is normally part of a line of research, or inquiry, with unknown effects on other external disciplines. This status remains so until it is discovered by researchers in those other disciplines. Interdisciplinary crossover is not only possible but desirable.
On the other hand, regardless of applicability, there may be research and lines of inquiry with weak theoretical assumptions. This would be a problem. Unequivocal acceptance of basic assumptions within a theoretical framework leads to weak science, waste of resources and mediocrity. Some researchers would object to this statement, and it may not be their experience or obvious within their own areas of expertise. After all, most researchers receive strong training in research methodologies to critically evaluate possible pitfalls in their own work. This includes a thorough understanding of the assumptions on which those methodologies rest.
The study of properties, critical components, factors or their interrelations in theoretical frameworks is a type of research that routinely receives funding awards because its results either affirm or question the fundamental assumptions of the framework. But readers of reports on such research should beware of the assumptions hidden in the methodology. Readers should judge both the methodology and the conclusions.
It is important that researchers are not only trained in methodologies and their fundamental assumptions, but also trained to question those fundamental assumptions. This is particularly important with respect to their specific expertise and the assumptions in their own line of research.
Any student who is not being trained to ask questions, to think critically even about their education, is missing the point. Conversations with young researchers make me think that the value of strong theories is being minimized, dismissed, or -- worse yet -- unknown to them in their research activities. Trained researchers should be expected to answer for the state of the fundamental assumptions in their theoretical paradigms.
Monday, July 12, 2010
Context and Specificity
Context refers to the relationship between part and whole, so that the part acquires a greater degree of specificity in its interpretation, or meaning, which differs from the one it has when examined as an isolated entity.
Specificity is a property of the scope of meaning: of what the reader understood from what was read, and of what is carried by the symbols, the letters, the words, etc. For example, the expression “brown fox” is less specific than “the small brown fox jumped over the tree”. One expression is more specific than another when it captures more details.
These properties are useful when examining human understanding of texts and documents. A simplistic examination may suggest that meaning and understanding are objective characteristics of texts and documents. After all, grammar provides a solid foundation for combining words to accommodate a variety of meanings.
The understanding of those meanings would have to occur at some standardized level that by necessity would be the lowest level in a universe of multiple levels of understanding and meanings.
Understanding or meaning is derived from the text itself and from how much the reader is able to relate to the content. These relationships form a space, which builds a context for the text. Placing a text in a context is contextualization.
Text and context, once together, form information. This is what is extracted from the expression, or text. Ideally, the writer, or constructor of the expression, has captured some meaning in an expression. If the reader captures the same meaning, and at the same level of specificity intended by the creator, it is said that there has been an accurate interpretation, thus accurate communication has taken place. Accurate contextualization and the placement of the text at the original level of specificity must occur for the correct interpretation and understanding of the text to take place.
Variability in how individuals apply contextualization and specificity levels explains the differences in reading understanding and in meaning construction among them.
Sunday, July 11, 2010
Many Levels of Reading Understanding
Reading provides an excellent field laboratory to explore understanding, particularly when it is pleasure reading. Experiments on reading understanding seem to support the idea that, at least with respect to this type of material, readers agree about the content they have read.
But while readers seem to agree, it is clear that they understand the content at various levels. Children, young adults and college graduates are just some of the categories of people that on average exhibit varying degrees and levels of understanding. In general, the acceptance that there are multiple levels of reading understanding seems to be universal. This raises questions about those levels. For example:
- Are the various levels of reading understanding completely distinct, do they partially overlap, or do they completely overlap, as if organized in layers of understanding?
- Can one say that one level of understanding is greater or higher than another level?
This is not a novel idea but rather one that has been and is constantly examined to gain more understanding of the different factors affecting the reading experience. I am sure that there are many people interested in and knowledgeable about these issues, such as reading specialists, teachers, researchers on communication, and members of other similar groups.
Saturday, July 10, 2010
And then, there is human nature
It seems that some aspects of nature can be observed and the observations recorded. One way to record is in human memory, but it is known to be plagued by bias. Memory seems to accommodate the preferences and perspectives of each individual. Moreover, the senses have limitations, and our personal background supplies context that could distort the acquisition and interpretation of nature’s signals.
The important matter of attention has something to say about the resulting record of an event. Attention is related to individual awareness and interest, and it is directed towards specific areas of the environment in response to some internal reaction to external triggers.
With all of these weak links in the chain of interpretation, it is no wonder that witnesses of the same event come to remember different things about it, including nonexistent segments or components. One may be inclined to argue that anxiety, distractions and unexpected occurrences explain the differences, and thus that more time and a relaxed atmosphere will enable different people to record exactly the same memories about a given experience, event or stimulus.
Documents, information and meaning
It is possible to envision documents as information that has been packaged. A document is an instance of various components interacting with each other, and that together have a certain meaning. Meaning is, therefore, an inherent property of information as well as of the document.
Meaning emerges from the interpretation of phenomena, including documents as human creations. There are multiple versions of arguments that deal with the nature of meaning, of documents and of information, and of how they relate to each other. However, one would be hard pressed to argue that meaning, documents and information are not related.
A document, thus defined, captures an instantiation of information that is understood as a unit. However, it is clear that it is also a bundle of parts, which are themselves units of information in their own right. Therefore, a document is a network of multiple parts packaged as one unit.
This idea becomes clear when we see a multi-letter word, or a multi-word sentence, etc. Each letter, each word, is a unit of information in its own right, yet together they form a whole with its own meaning. A similar example is an image made out of strokes, or of pixels, or of visual elements in a coherent collage. The final meaning depends on how the component elements relate to each other, as well as on how each part relates to the whole.
Meaning and documents
As it relates to documents as expressions of meaning, meaning can either be constructed when creating the document or derived when one interacts with it.
Meaning can be defined as a subjective interpretation of an objective instantiation, namely a document.
A document is further defined as an instantiation that has been captured in some form or medium, which could be void of explicit a-priori meaning. This underlines that meaning can be assigned a posteriori to the creation of a document, as exemplified in the multitude of cases in which documents have served purposes different from those for which they were originally created.
Friday, July 9, 2010
On meaning
Meaning is captured by some form of entity, such as words, individually or combined.
What type of operation takes place when words are combined? Is meaning additive? Some words are declarative (nouns, verbs) and others are modifiers (adjectives, adverbs). The effect of some words on other words is not additive.
What about meaning? What is meaning in relationship to words?
Can meaning be considered absolute?
Can meanings be combined?
If meaning is transmitted, is the sent meaning the same as the received meaning?
Thursday, July 8, 2010
Information is invisible
We often don’t recognize the value and the role of information until such a time when we need it.
What is information?
When is information information and not something else?
What is information before it is information?
Does something carry information?
Is information tangible? How tangible?
Or, is only the carrier of information tangible?
What is the carrier that carries information?
Look up: Semiotics.
In this light, how can information be managed, organized?
And, what makes information valuable?
What is its role?
Look up: Representation and Information Representation.
From the discussion of the previous questions, it is clear that pinpointing a definition of information is slippery at best. It is more like a moving target, or perhaps rather like a bullet that can be seen when it is at rest but not when it is being used as it was naturally conceived. Like light, information can only be perceived indirectly, as it interacts with the environment. Light remains invisible otherwise.