Corazones de Alcachofa

Terminology

*Amanuensis: A person whose occupation is writing by hand, copying or making fair copies of other people's writings, or writing down what is dictated.

*Authenticity: Assurance that certain digital materials are genuine and trustworthy, that is, that they are what they are claimed to be, whether an original object or a faithful, reliable copy of an original produced through fully documented processes.

*Certification: The process of assessing the degree to which a preservation programme complies with a previously agreed set of minimum standards or practices.

*Codex: A book produced before the invention of the printing press.

*Colophon: A note at the end of a book indicating the printer's name and the place and date of printing, or some of these details.

*Rights: The legal faculties or powers held or exercised over digital materials, such as copyright, privacy, confidentiality, and national or corporate restrictions imposed for security reasons.

*Filigree: Work formed of gold and silver threads, joined and soldered with great precision and delicacy.

*Gloss: An explanation or commentary on a text that is obscure or difficult to understand.

*Miniated: Painted or illuminated in miniature.

*Identity of digital objects: The characteristic that allows a digital object to be distinguished from all others, including other versions or copies of the same content.

*Ingest: The operation of storing digital objects and the related documentation in a secure and orderly manner.

*Conceptual objects: Digital objects with which human beings interact in a form they can understand.

*Vellum paper: A smooth, grainless paper of high quality whose surface allows the finest drawings to be reproduced in detail.

*Rosetta Stone: Bears a text in three scripts; its great importance lies in having been the key piece for beginning to decipher the hieroglyphs of the ancient Egyptians. It is a stele of black granite with a bilingual inscription (Greek and Egyptian) of a decree of Pharaoh Ptolemy V, in three forms of writing: hieroglyphic, demotic and uncial Greek (in capital letters); it contains ninety-two lines.

*Digital heritage: The set of digital materials that have sufficient value to be preserved so that they can be consulted and used in the future.

*Digital preservation: Actions aimed at keeping digital objects accessible over the long term.

Source:

http://unesdoc.unesco.org/images/0013/001300/130071s.pdf

November 16, 2009 | Digital Resources

Digital Document vs. Traditional Document

Electronic documents are the files produced with word processors, spreadsheets, database managers, or graphics programs. Digital documents are the systematically integrated sets of text, graphics and images with which “presentations” are built on computers. So-called “Web pages” and the messages transmitted by e-mail are electronic documents… And so are the photographs, sound and video produced with so-called digital instruments (cameras and recorders), which record audiovisual information directly, as positive and negative electrical signals, on an electronic medium.

Digital documents have characteristics that set them apart from traditional documents. For example, a digital document can be read by jumping between pages rather than linearly, as a printed one is. This is a characteristic related to how the document functions. The content of a line, a paragraph or a page of a digital document can be changed without having to change the whole document. This is a characteristic related to the “identity” of the document: to the status of unique document, of scientific or academic “witness”, that a traditional document has when it is used to support the discussion, demonstration or illustration of a thesis, hypothesis or theory. We therefore distinguish two types of characteristics in these documents: functional and identity-related.

Types of Digital Documents:

1.- Digitised print documents: A digital document may be the result of running an originally printed document through a scanner. The result is, in the first place, an image (a digital photograph) of the printed document. That image serves to store the document on an electronic medium or to print it again exactly as it originally was. But as an image it lacks the hypertext capabilities of a “textual” digital document; as an image, moreover, it takes up a great deal of space, which makes digitisation inefficient, above all when the document runs to several pages.

2.- Digital documents meant for printing: A digital document may be produced directly in electronic form, with a word-processing program such as Microsoft Word, but with the aim of printing it afterwards.

3.- Multimedia digital documents: Besides digital documents created from printed documents and digital documents created to be printed, there are documents conceived from the outset to be consulted on a computer, which take full advantage of the characteristics their electronic nature gives them, especially hypertext and multimedia, to shape a new way of communicating.

Retrieved: 11/09/2009

*http://www.msinfo.info/propuestas/documentos/documentos_digitales.html

*http://dialnet.unirioja.es/servlet/articulo?codigo=1071179

November 8, 2009 | Digital Resources

Metadata

Metadata is structured data which describes the characteristics of a resource. It shares many characteristics with the cataloguing that takes place in libraries, museums and archives. The term “meta” derives from the Greek word denoting a nature of a higher order or more fundamental kind. A metadata record consists of a number of pre-defined elements representing specific attributes of a resource, and each element can have one or more values.

Metadata (meta data, or sometimes metainformation) is “data about data”, of any sort in any media. Metadata is text, voice, or image that describes what the audience wants or needs to see or experience. The audience could be a person, group, or software program. Metadata is important because it aids in clarifying and finding the actual data. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, such as a database schema. In data processing, metadata provides information about, or documentation of, other data managed within an application or environment. This commonly defines the structure or schema of the primary data.

Each metadata schema will usually have the following characteristics:

*a limited number of elements
*the name of each element
*the meaning of each element
Typically, the semantics is descriptive of the contents, location, physical attributes, type (e.g. text or image, map or model) and form (e.g. print copy, electronic file). Key metadata elements supporting access to published documents include the originator of a work, its title, when and where it was published and the subject areas it covers. Where the information is issued in analog form, such as print material, additional metadata is provided to assist in the location of the information, e.g. call numbers used in libraries. The resource community may also define some logical grouping of the elements or leave it to the encoding scheme. For example, Dublin Core may provide the core to which extensions may be added.
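
A minimal sketch in Python of such a record, built from a handful of Dublin Core element names; the element names are real Dublin Core terms, but the values are invented examples, and each element may hold one or more values:

    # A toy metadata record: a few Dublin Core elements, each with one or more values.
    # The title, creators and other values below are invented for illustration.
    record = {
        "dc:title": ["Annual Rainfall Report 2008"],
        "dc:creator": ["J. Smith", "National Meteorological Office"],
        "dc:date": ["2008-12-01"],
        "dc:type": ["Text"],
        "dc:format": ["application/pdf"],
        "dc:subject": ["rainfall", "climate statistics"],
    }

    # Print each element with its value(s).
    for element, values in record.items():
        print(element, "=", "; ".join(values))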

Some of the most popular metadata schemas include:

*Dublin Core
*AACR2 (Anglo-American Cataloging Rules)
*GILS (Government Information Locator Service)
*EAD (Encoded Archives Description)
*IMS (IMS Global Learning Consortium)
*AGLS (Australian Government Locator Service)
While the syntax is not strictly part of the metadata schema, the data will be unusable unless the encoding scheme understands the semantics of the metadata schema. The encoding allows the metadata to be processed by a computer program. Important encoding schemes include:

*HTML (Hyper-Text Markup Language)
*SGML (Standard Generalised Markup Language)
*XML (eXtensible Markup Language)
*RDF (Resource Description Framework)
*MARC (MAchine Readable Cataloging)
*MIME (Multipurpose Internet Mail Extensions)
Metadata may be deployed in a number of ways:

*Embedding the metadata in the Web page by the creator or their agent using META tags in the HTML coding of the page
*As a separate HTML document linked to the resource it describes
*In a database linked to the resource. The records may either have been directly created within the database or extracted from another source, such as Web pages.
The simplest method is for Web page creators to add the metadata as part of creating the page. Creating metadata directly in a database and linking it to the resource is growing in popularity as an activity independent of the creation of the resources themselves. Increasingly, it is being created by an agent or third party, particularly to develop subject-based gateways.
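
As a rough sketch of that first and simplest deployment method, the following Python snippet (standard library only) reads the name/content pairs out of the META tags embedded in a small, invented HTML page:

    # Extract embedded metadata (META tags) from an HTML page.
    from html.parser import HTMLParser

    class MetaTagReader(HTMLParser):
        """Collects name/content pairs from <meta> tags."""
        def __init__(self):
            super().__init__()
            self.metadata = {}

        def handle_starttag(self, tag, attrs):
            if tag == "meta":
                attrs = dict(attrs)
                name, content = attrs.get("name"), attrs.get("content")
                if name and content:
                    self.metadata.setdefault(name, []).append(content)

    # An invented page with Dublin Core style META tags in its header.
    sample_page = """
    <html><head>
      <meta name="DC.title" content="Example report">
      <meta name="DC.creator" content="A. Author">
      <meta name="keywords" content="metadata, example">
    </head><body>...</body></html>
    """

    reader = MetaTagReader()
    reader.feed(sample_page)
    print(reader.metadata)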

Retrieved: 11/06/2009

Sources:

*http://en.wikipedia.org/wiki/Metadata
*http://www.library.uq.edu.au/iad/ctmeta4.html
*http://dublincore.org/
*http://archive.ifla.org/II/metadata.htm

October 29, 2009 | Digital Resources

Bookmarking

The so-called “bookmarking” system allows you to save and share links to your favourite stories, tools and communities; it is a sort of menu that you can access from anywhere and that you can allow friends to see.

There are lots of social bookmarking services, and each one is a bit different. But they all work basically the same way: you set up a profile online and save your favourite links to that profile. We can say, then, that a list of bookmarks is the result of human action: we filter the sites we view and list links only to those that provide value.
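
As a very rough sketch of that idea, the following Python snippet models profiles that hold saved links and lets a user share individual links with friends; the user names are invented, and the two URLs are simply the example services listed further down:

    # Toy social bookmarking: each profile holds bookmarks that may be shared.
    profiles = {}

    def save_bookmark(user, url, tags=(), shared_with=()):
        profiles.setdefault(user, []).append(
            {"url": url, "tags": list(tags), "shared_with": list(shared_with)}
        )

    def visible_to(viewer):
        """Links the viewer can see: their own plus links shared with them."""
        seen = []
        for owner, bookmarks in profiles.items():
            for bookmark in bookmarks:
                if owner == viewer or viewer in bookmark["shared_with"]:
                    seen.append(bookmark["url"])
        return seen

    save_bookmark("alice", "http://www.simpy.com/", tags=["bookmarking"], shared_with=["bob"])
    save_bookmark("bob", "http://www.socialmarker.com/")
    print(visible_to("bob"))  # bob sees his own link and the one alice shared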

For users, social bookmarking can be useful as a way to access a consolidated set of bookmarks from various computers, organize large numbers of bookmarks, and share bookmarks with contacts. Libraries have found social bookmarking to be useful as an easy way to provide lists of innovative links to patrons.

The concept of bookmarking dates back to April 1996 with the launch of itList, the features of which included public and private bookmarks. Within the next three years, online bookmark services became competitive as more companies grew up in this field. They provided folders for organizing bookmarks, and some services automatically sorted bookmarks into folders.

Interesting web-sites:

http://www.simpy.com/
http://www.socialmarker.com/

References:

http://www.avert.org/social-bookmarking-help.htm
http://www.statesman.com/search/content/standing/share.html
http://www.ivillage.com/support/free/0,,bxg1c5dp,00.html

October 15, 2009 | Digital Resources, Uncategorized

Artificial and Computational Intelligence

Artificial intelligence is both the intelligence of machines and the branch of computer science which aims to create it. The term artificial intelligence is also used to describe a property of machines or programs: the intelligence that the system demonstrates. AI research uses tools and insights from many fields, including computer science, psychology, philosophy, neuroscience, cognitive science, linguistics, operations research, economics, control theory, probability, optimization and logic.

Subjects in computational intelligence, as defined by the IEEE Computational Intelligence Society, mainly include:

*Neural networks: trainable systems with very strong pattern recognition capabilities.
*Fuzzy systems: techniques for reasoning under uncertainty that have been widely used in modern industrial and consumer product control systems; capable of working with concepts such as ‘hot’, ‘cold’, ‘warm’ and ‘boiling’.
*Evolutionary computation: applies biologically inspired concepts such as populations, mutation and survival of the fittest to generate increasingly better solutions to a problem (a toy sketch follows below).
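
As a toy sketch of the evolutionary idea only (not of any particular system), the following Python snippet evolves a population of candidate numbers towards the maximum of a simple, invented fitness function; the population size, mutation step and number of generations are arbitrary choices:

    # Toy evolutionary computation: mutate a population of candidate solutions
    # and keep the fittest. The problem is invented: maximise f(x) = -(x - 3)^2,
    # whose best possible solution is x = 3.
    import random

    def fitness(x):
        return -(x - 3) ** 2

    population = [random.uniform(-10, 10) for _ in range(20)]

    for generation in range(50):
        # Mutation: every current candidate produces a slightly perturbed offspring.
        offspring = [x + random.gauss(0, 0.5) for x in population]
        # Survival of the fittest: keep the 20 best of parents and offspring.
        population = sorted(population + offspring, key=fitness, reverse=True)[:20]

    print("best solution found:", round(population[0], 3))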

AI research also overlaps with tasks such as robotics, control systems, scheduling, data mining, logistics, speech recognition, facial recognition and many others. Computational intelligence involves iterative development or learning (e.g., parameter tuning in connectionist systems).

Major AI textbooks define artificial intelligence as “the study and design of intelligent agents”, where an intelligent agent is a system that perceives its environment and takes actions which maximize its chances of success. John McCarthy, who coined the term in 1956, defines it as “the science and engineering of making intelligent machines”.

Among the traits that researchers hope machines will exhibit are reasoning, knowledge, planning, learning, communication, perception and the ability to move and manipulate objects. General intelligence (or “strong AI”) has not yet been achieved and is a long-term goal of some AI research.


“Artificial intelligence is the next stage in evolution,” Edward Fredkin said in the 1980s, expressing an idea first proposed by Samuel Butler’s Darwin Among the Machines (1863) and expanded upon by George Dyson in his book of the same name (1998). Several futurists and science fiction writers have predicted that human beings and machines will merge in the future into cyborgs that are more capable and powerful than either. This idea, called transhumanism, has roots in Aldous Huxley and Robert Ettinger and is now associated with robot designer Hans Moravec, cyberneticist Kevin Warwick and Ray Kurzweil.

 

Retrieved 20th August 2008 at 18:29

*Science daily, http://www.sciencedaily.com/articles/a/artificial_intelligence.htm

*http://en.wikipedia.org/wiki/Artificial_intelligence

August 20, 2008 | Human Language Technologies

Distance Education – The Internet

Distance education is a form of education in which students do not need to attend any classroom in person. Normally, the study material (written texts, videos, audio tapes, CD-ROMs) is sent to the student by post, and the student sends back the completed exercises. Nowadays e-mail and other possibilities offered by the Internet are also used, above all virtual classrooms. Learning carried out with the new communication technologies is called e-learning. In some cases students must, or may, come to certain offices on particular occasions to receive tutoring or to sit exams. Distance education exists for every level of study, but it is most commonly offered for university studies.

One of the attractive characteristics of this form of study is its flexible timetable. Students organise their study time themselves, which requires a certain degree of self-discipline. This flexibility is sometimes undermined by courses that require online participation at specific times and/or in specific places.

Among the antecedents of distance education are correspondence courses, which arose from the need to teach pupils in isolated places where it was not possible to build a school. Such courses were offered at primary and secondary level, and in them it was often the parents who supervised the pupil's educational progress.

Its main advantages lie in the possibility of meeting educational demands that conventional, mainstream education leaves unsatisfied. The advantage cited by most people who use this method is being able to access this kind of education regardless of where they live, thereby removing the real difficulties that geographical distance creates. In addition, it respects how people organise their time, accommodating family life and work obligations.

Its disadvantages, on the other hand, concern the distrust generated by the lack of communication between teacher and students, above all in the process of assessing the student's learning. Moreover, active intervention by the tutor is needed to avoid the potential isolation of a student studying in this mode. Another major disadvantage lies in the isolation that can arise between human beings, eliminating face-to-face social interaction.

In recent times, whether because of the impact of the so-called Information and Knowledge Society or simply because people's social and working lives must adapt to new business and personal situations, we are witnessing a change in individual habits that is also being reflected in training processes. Thus, especially as students' age increases and, with it, the responsibilities they carry, the need to offer training systems that overcome the obstacles created by travel or by the lack of time to attend classes becomes ever more evident.

It follows that conventional and virtual systems are bound to come to terms with one another: conventional education is not going to disappear, but it will be transformed.

Retrieved 14 August 2008 at 18:27

*http://es.wikipedia.org/wiki/Educaci%C3%B3n_a_distancia

*Albert Sangrà Morer (Universitat Oberta de Catalunya): http://www.uib.es/depart/gte/edutec-e/revelec15/albert_sangra.htm

August 14, 2008 | Human Language Technologies

Translation Task

The Framework for Machine Translation Evaluation in ISLE (FEMTI) is an attempt to organize the various methods that are used to evaluate MT systems and to relate them to the purpose and context of those systems. To this end, FEMTI is made up of two interrelated classifications, or taxonomies.

The first classification enables evaluators to define an intended context of use for the MT system to be evaluated. Each feature is then linked to relevant quality characteristics and metrics, defined in the second classification; a toy sketch of this linking appears after the list of task types below.

The characteristics of the translation task are:

  • Assimilation: “the ultimate purpose of the assimilation task (of which translation forms a part) is to monitor a (relatively) large volume of texts produced by people outside the organization, in (usually) several languages”.

  • Dissemination: “the ultimate purpose of dissemination is to deliver to others a translation of documents produced inside the organization”.

  • Communication: “the ultimate purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage”.
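
The Python sketch below only illustrates the shape of that link between the two classifications: a context-of-use feature (the task type) points to quality characteristics worth measuring. The particular pairings are invented examples, not the actual FEMTI taxonomy:

    # Invented illustration of linking context-of-use features to quality characteristics.
    context_to_characteristics = {
        "assimilation": ["speed over large text volumes", "gisting adequacy"],
        "dissemination": ["fluency", "terminology consistency", "post-editing effort"],
        "communication": ["robustness to ill-formed input", "turnaround time per dialogue turn"],
    }

    def relevant_characteristics(context_of_use):
        """Return the characteristics (and, in a fuller version, metrics) to evaluate."""
        return context_to_characteristics.get(context_of_use, [])

    print(relevant_characteristics("dissemination"))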

 

The Framework for Machine Translation Evaluation in ISLE. Retrieved July 26, 2008, 17:44

http://www.isi.edu/natural-language/mteval/

July 26, 2008 | Human Language Technologies

Machine Aided Translation

Machine translation is an autonomous operating system with strategies and approaches that can be classified as follows:

  • the direct strategy
  • the transfer strategy
  • the pivot language strategy

The direct strategy, the first to be used in machine translation systems, involves a minimum of linguistic theory. This approach is based on a predefined source language-target language binomial in which each word of the source language syntagm is directly linked to a corresponding unit in the target language with a unidirectional correlation, for example from English to Spanish but not the other way round. The best-known representative of this approach is the system created by the University of Georgetown, tested for the first time in 1964 on translations from Russian to English. The Georgetown system, like all existing systems, is based on a direct approach with a strong lexical component. The mechanisms for morphological analysis are highly developed and the dictionaries extremely complex, but the processes of syntactical analysis and disambiguation are limited, so that texts need a second stage of translation by human translators.
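
As a toy sketch of this word-for-word idea (not of the Georgetown system itself), the following Python snippet substitutes each source word using a tiny, invented English-Spanish dictionary, with no syntactic analysis or disambiguation, which is exactly why its raw output still needs human revision:

    # Toy "direct strategy" translation: dictionary substitution, word by word.
    dictionary = {
        "the": "el",
        "house": "casa",
        "is": "es",
        "white": "blanca",
    }

    def direct_translate(sentence):
        words = sentence.lower().split()
        # Unknown words are passed through untranslated; no reordering, no agreement.
        return " ".join(dictionary.get(word, word) for word in words)

    print(direct_translate("The house is white"))
    # -> "el casa es blanca": grammatically wrong ("la casa..."), showing why a
    #    second stage of human translation is still needed.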

In practice, computer-assisted translation is a complex process involving specific tools and technology adaptable to the needs of the translator, who is involved in the whole process and not just in the editing stage. The computer becomes a workstation where the translator has access to a variety of texts, tools and programs: for example, monolingual and bilingual dictionaries, parallel texts, translated texts in a variety of source and target languages, and terminology databases. Each translator can create a personal work environment and transform it according to the needs of the specific task. Thus computer-assisted translation gives the translator on-the-spot flexibility and freedom of movement, together with immediate access to an astonishing range of up-to-date information. The result is an enormous saving of time.
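
As a small sketch of one such tool, the following Python snippet imitates a translation-memory lookup: given a new sentence, it proposes the stored translation of the most similar previously translated sentence. The memory entries and the similarity threshold are invented examples:

    # Toy translation memory with fuzzy matching from the standard library.
    import difflib

    translation_memory = {
        "The contract enters into force on 1 January.":
            "El contrato entra en vigor el 1 de enero.",
        "The agreement may be terminated by either party.":
            "El acuerdo puede ser rescindido por cualquiera de las partes.",
    }

    def suggest(sentence, cutoff=0.6):
        """Return (stored source, stored translation, similarity) or None."""
        matches = difflib.get_close_matches(sentence, list(translation_memory), n=1, cutoff=cutoff)
        if not matches:
            return None
        best = matches[0]
        ratio = difflib.SequenceMatcher(None, sentence, best).ratio()
        return best, translation_memory[best], round(ratio, 2)

    print(suggest("The contract enters into force on 1 March."))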

There have been basically two overall strategies which researchers have adopted in the design of MT systems. In the first, the system is designed in all its details specifically for a particular pair of languages, e.g. Russian as the language of the original texts (the source language) and English as the language of the translated texts (the target language). Translation is direct from source language (SL) text to target language (TL) text; the vocabulary and syntax of the source language are analysed as little as necessary for acceptable target language output. For example, if a Russian word can be translated in only one way in English it does not matter that the English word may have other meanings or that the Russian might have two or more possible translations in another language. Likewise, if the original Russian word order can be retained in English and give acceptable translated sentences, there is no need for syntactic analysis. In other words, analysis of the source language is determined strictly by the requirements of the target language.

By contrast, in the second strategy, analysis of SL texts is pursued independently of the TL in question. Translation is indirect, via some kind of ‘intermediary language’ or via a transfer component operating upon ‘deep syntactic’ or semantic representations of SL texts and producing equivalent representations from which TL texts can be generated. For example, a Russian passive sentence might be analysed as a deep syntactic form which allows for translation into English as either an active or a passive according to circumstances (e.g. the demands of idiomaticity, constraints on English verb forms, etc.). Likewise, the various Russian expressions for ‘large’, ‘great’, ‘extreme’, etc., which differ in their distribution according to the nouns and verbs with which they occur, might all be represented as (say) Magn and translated in English by whichever is the most appropriate idiomatic form for the corresponding English noun or verb.

It has long been a subject of discussion whether machine translation and computer-assisted translation could convert translators into mere editors, making them less important than the computer programs. The fear of this happening has led to a certain rejection of the new technologies on the part of translators, not only because of a possible loss of work and professional prestige, but also because of concern about a decline in the quality of production. Some translators totally reject machine translation because they associate it with the point of view that translation is merely one more marketable product based on a calculation of investment versus profits. They define translation as an art that possesses its own aesthetic criteria that have nothing to do with profit and loss, but are rather related to creativity and the power of the imagination.

 

Retrieved: 17-07-2008, 18:28

http://www.hutchinsweb.me.uk/JDoc-1978.pdf

http://accurapid.com/journal/29computers.htm

http://en.wikipedia.org/wiki/Computer-assisted_translation


July 17, 2008 | Human Language Technologies

Machine Translation

Machine translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.

Current machine translation software often allows for customisation by domain or profession (such as weather reports) — improving output by limiting the scope of allowable substitutions. This technique is particularly effective in domains where formal or formulaic language is used. It follows then that machine translation of government and legal documents more readily produces usable output than conversation or less standardised text.

Improved output quality can also be achieved by human intervention: for example, some systems are able to translate more accurately if the user has unambiguously identified which words in the text are names. With the assistance of these techniques, MT has proven useful as a tool to assist human translators, and in some cases can even produce output that can be used “as is”. However, current systems are unable to produce output of the same quality as a human translator, particularly where the text to be translated uses casual language.

Machine translation can use a method based on linguistic rules, which means that words will be translated in a linguistic way: the most suitable (orally speaking) words of the target language will replace the ones in the source language.

The Machine Translation (MT) project at Microsoft Research is focused on creating MT systems and technologies that cater to the multitude of translation scenarios today. Data-driven systems, in particular those with a statistical core engine, have proven to be the most efficient, due to their ability to adapt to wide domain coverage and to be trained on new language pairs within a matter of weeks. This team works closely with research and development partners worldwide, making the system accessible to a variety of products and services.

The field of machine translation has changed remarkably little since its earliest days in the fifties. The issues that divided researchers then remain the principal bones of contention today. The first of these concerns the distinction between the so-called interlingual and the transfer approach to the problem. The second concerns the relative importance of linguistic matters as opposed to common sense and general knowledge. The only major new lines of investigation that have emerged in recent years have involved the use of existing translations as a prime source of information for the production of new ones. One form that this takes is that of example-based machine translation, in which a system of otherwise fairly conventional design is able to refer to a collection of existing translations.

Retrieved on the 11th of May 2008

*Microsoft Corporation-Microsoft Research: http://research.microsoft.com/nlp/projects/mtproj.aspx

*Martin Kay, Palo Alto Research Center, Palo Alto, California: http://cslu.cse.ogi.edu/HLTsurvey/ch8node4.html

*Wikipedia-The free encyclopedia, John Hutchins: http://en.wikipedia.org/wiki/Machine_translation

May 11, 2008 | Human Language Technologies

Natural Language

In the philosophy of language, a natural language (or ordinary language) is a language that is spoken, written, or signed by humans for general-purpose communication, as distinguished from formal languages (such as computer-programming languages or the “languages” used in the study of formal logic, especially mathematical logic) and from constructed languages.

Though the exact definition is debatable, natural language is often contrasted with artificial or constructed languages such as Esperanto, Latino sine Flexione, and Occidental.

Linguists have an incomplete understanding of all aspects of the rules underlying natural languages, and these rules are therefore objects of study. The understanding of natural languages reveals much not only about how language works (in terms of syntax, semantics, phonetics, phonology, etc.), but also about how the human mind and the human brain process language. In linguistic terms, ‘natural language’ only applies to a language that has evolved naturally, and the study of natural language primarily involves native (first-language) speakers.

The goal of the Natural Language Processing (NLP) group is to design and build software that will analyze, understand, and generate languages that humans use naturally, so that eventually you will be able to address your computer as though you were addressing another person.

This goal is not easy to reach. “Understanding” language means, among other things, knowing what concepts a word or phrase stands for and knowing how to link those concepts together in a meaningful way. It’s ironic that natural language, the symbol system that is easiest for humans to learn and use, is hardest for a computer to master. Long after machines have proven capable of inverting large matrices with speed and grace, they still fail to master the basics of our spoken and written languages.

There are several major reasons why natural language understanding is a difficult problem. They include:

 

  1. The complexity of the target representation into which the matching is being done. Extracting meaningful information often requires the use of additional knowledge.
  2. The type of mapping: one-to-one, many-to-one, one-to-many, or many-to-many. One-to-many mappings require a great deal of domain knowledge beyond the input to make the correct choice among target representations. So for example, the word tall in the phrase “a tall giraffe” has a different meaning than in “a tall poodle.” English requires many-to-many mappings.
  3. The level of interaction of the components of the source representation. In many natural language sentences, changing a single word can alter the interpretation of the entire structure. As the number of interactions increases, so does the complexity of the mapping.
  4. The presence of noise in the input to the understander. We rarely listen to one another against a silent background. Thus speech recognition is a necessary precursor to speech understanding.
  5. The modifier attachment problem. The sentence “Give me all the employees in a division making more than $50,000” doesn’t make it clear whether the speaker wants all employees making more than $50,000, or only those in divisions making more than $50,000 (see the sketch after this list).
  6. The quantifier scoping problem. Words such as “the,” “each,” or “what” can have several readings.
  7. Elliptical utterances. The interpretation of a query may depend on previous queries and their interpretations, e.g. asking “Who is the manager of the automobile division?” and then saying “of aircraft?”.
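
A minimal sketch of reason 5, with invented employee and division data, showing that the two readings of that sentence really do give different answers:

    # Two readings of "all the employees in a division making more than $50,000".
    employees = [
        {"name": "Ana",  "salary": 60000, "division": "automobile"},
        {"name": "Ben",  "salary": 40000, "division": "automobile"},
        {"name": "Carl", "salary": 45000, "division": "aircraft"},
    ]
    division_revenue = {"automobile": 80000, "aircraft": 30000}

    # Reading 1: "making more than $50,000" modifies the employees.
    reading_1 = [e["name"] for e in employees if e["salary"] > 50000]

    # Reading 2: "making more than $50,000" modifies the division.
    reading_2 = [e["name"] for e in employees if division_revenue[e["division"]] > 50000]

    print(reading_1)  # ['Ana']
    print(reading_2)  # ['Ana', 'Ben']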

Natural Language Processing. Retrieved: 05-05-2008

*http://en.wikipedia.org/wiki/Natural_language

*http://research.microsoft.com/nlp/

*http://www.cs.dartmouth.edu/~brd/Teaching/AI/Lectures/Summaries/natlang.html

May 5, 2008 | Human Language Technologies