Harrods’s Weblog

Posts from May 2008

The differences between the following specialized terms (Q.3)

May 19, 2008 · No comments

These are the differences between the following specialized terms: machine translation, machine-aided translation, multilingual content management and translation technology.

  • Machine translation: Sometimes referred to by the abbreviation MT, it is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its most basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.
  • Computer-assisted translation, computer-aided translation or CAT: A form of translation in which a human translator translates texts using computer software designed to support and facilitate the translation process. Computer-assisted translation is sometimes called machine-assisted, or machine-aided, translation.
  • Multilingual content management: The management of content in several languages. The content consists of information, mostly in the form of more or less structured text documents, but potentially also including audio clips, video clips and images.
  • Translation: The interpretation of the meaning of a text and the subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. The text to be translated is called the source text, and the language it is to be translated into is called the target language; the final product is sometimes called the “target text.”
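
The word-for-word substitution mentioned under machine translation can be sketched in a few lines. This is only a toy illustration, not how any real MT system works: the dictionary entries and the function name substitute_words are assumptions made for the example.

```python
# Toy Spanish-to-English dictionary; the entries are illustrative assumptions.
TOY_DICTIONARY = {
    "la": "the",
    "red": "network",
    "es": "is",
    "mundial": "worldwide",
}


def substitute_words(sentence: str, dictionary: dict[str, str]) -> str:
    """Translate word by word, leaving unknown words unchanged."""
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


print(substitute_words("La red es mundial", TOY_DICTIONARY))
```

A sketch like this ignores word order, agreement and words with several meanings, which is why corpus techniques are needed for more complex translations.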

SOURCES:

  • Wikipedia, 19 May 2008. Retrieved: 19 May 2008, 18:04

http://en.wikipedia.org/wiki/Translation_technology

  • Wikipedia, 15 May 2008. Retrieved: 19 May 2008, 18:05

http://en.wikipedia.org/wiki/Machine_aided_translation

  • Wikipedia, 7 May 2008. Retrieved: 19 May 2008, 18:06

http://en.wikipedia.org/wiki/Machine_translation

  • Wikipedia. Retrieved: 19 May 2008, 18:08

http://www.wikipedia.org/

Categories: General

MT systems: example of translation (Q.3)

May 19, 2008 · No comments

I have used Translendium to translate the following Spanish text into different languages:

  • SPANISH: Internet es un método de interconexión descentralizada de redes de computadoras implementado en un conjunto de protocolos denominado TCP/IP y garantiza que redes físicas hetereogéneas funcionen como una red lógica única, de alcance mundial. Sus orígenes se remontan a 1969, cuando se estableció la primera conexión de computadoras, conocida como ARPANET, entre tres universidades en California y una en Utah, EE.UU.

 

  • GERMAN: Internet ist eine dezentralisiert-Verbindungs-Methode von Computernetzen, in einer TCP/IP genannten Gesamtheit von Protokollen implementiert, und er garantiert, dass heterogene|ungleichartige körperliche|physikalische Netze als ein einzigartiges logisches Netz, von Weltreichweite funktionieren. Seine|Ihre Herkunft geht auf 1969 zurück, als man die erste Computerverbindung, als ARPANET gekannt, zwischen drei Universitäten in Kalifornien und einer in Utah, EE festlegte. UU.

 

  • ENGLISH: Internet is a method of decentralized interconnection of nets|networks of computers implemented in an ensemble|set of protocols named TCP/IP and it|he guarantees that heterogeneous physical nets|networks work as a unique logical net|network, of world scope. Its|His|Her|Their origins go back to 1969, when the first connection of computers, known as ARPANET, between three universities in California and one in Utah, EE, was established. UU.

 

  • FRENCH: Internet est une méthode d’interconnexion décentralisée de filets|réseaux d’ordinateurs exécutée dans un ensemble de protocoles dénommé TCP/IP et il garantit que des filets|réseaux physiques hétérogènes fonctionnent comme un filet|réseau logique unique, de portée mondiale. Ses|Leurs origines remontent à 1969, quand on a établi la première connexion d’ordinateurs, connue comme ARPANET, entre trois universités en Californie et une en Utah, EE. UU.

To sum up, we can see that Translendium had some problems choosing between meanings: for some words it gives more than one candidate translation, separated by “|”, instead of selecting one. For example, “físicas hetereogéneas” was translated into German as “heterogene|ungleichartige körperliche|physikalische”.
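
To see how quickly these unresolved alternatives multiply, the “|” notation in the output can be expanded mechanically. The sketch below is only an illustration: it assumes that alternatives are separated by “|” inside a single whitespace-delimited token, as in the samples above, and the function names are made up for the example.

```python
from itertools import product


def ambiguous_tokens(text: str) -> list[str]:
    """Return the tokens for which the system offered several candidates."""
    return [token for token in text.split() if "|" in token]


def expand_alternatives(text: str) -> list[str]:
    """Return every reading of a text whose tokens may hold 'a|b' alternatives."""
    options = [token.split("|") for token in text.split()]
    return [" ".join(choice) for choice in product(*options)]


sample = "heterogene|ungleichartige körperliche|physikalische Netze"
print(ambiguous_tokens(sample))
for reading in expand_alternatives(sample):
    print(reading)
```

Two ambiguous words with two candidates each already give four possible readings, so a human reviser still has to pick the right one.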

SOURCES:

  • Wikipedia, 18 May 2008. Retrieved: 19 May 2008, 18:50.

          http://es.wikipedia.org/wiki/Acceso_a_Internet

  • Automatic translator. Retrieved: 19 May 2008, 18:51

          http://www.translendium.net:8080/home/text.do

       

Categories: General

Characteristics of a translation task according to the FEMTI report.

May 19, 2008 · No comments

FEMTI, the Framework for Machine Translation Evaluation in ISLE, is a resource that helps MT evaluators define contextual evaluation plans. FEMTI consists of two interrelated classifications or taxonomies: the first one lists possible characteristics of the contexts of use that are applicable to MT systems. The second one lists the possible characteristics of an MT system, along with the metrics that were proposed to measure them.

According to the FEMTI report, the characteristics of the translation task refer to the information flow intended for the output, from the point of view of the agent (human or otherwise) who receives the translation. The main characteristics are the following:

  • Assimilation: The ultimate purpose of the assimilation task (of which translation forms a part) is to monitor a (relatively) large volume of texts produced by people outside the organization, in (usually) several languages.
  • Document routing or sorting: The purpose of document routing / sorting is to scan incoming translated documents quickly in order to send them to the appropriate points for further processing or storage.
  • Information extraction or summarization: The purpose of information extraction or summarization is to extract some portion(s) of the translated text, either manually or automatically, for subsequent processing or storage. Information extraction is typically concerned with filling templates by identifying atomic elements of events. In contrast, summarization aims to provide a self-contained and internally cohesive text which serves as a selective account of the original.
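
The idea of filling a template with atomic elements of an event can be sketched as follows. This is a hedged illustration rather than anything described in the FEMTI report: the template fields (year, name), the regular expressions and the function name fill_event_template are all assumptions chosen to fit the ARPANET sentence used elsewhere on this blog.

```python
import re
from typing import Optional


def fill_event_template(text: str) -> dict[str, Optional[str]]:
    """Fill a tiny two-slot template from a translated sentence."""
    # Hypothetical patterns: a four-digit year and a capitalized name after "known as".
    year = re.search(r"\b(1[89]\d{2}|20\d{2})\b", text)
    name = re.search(r"known as ([A-Z][A-Za-z0-9]+)", text)
    return {
        "year": year.group(1) if year else None,
        "name": name.group(1) if name else None,
    }


translated = (
    "Its origins go back to 1969, when the first connection of computers, "
    "known as ARPANET, was established."
)
print(fill_event_template(translated))
```

Slots that cannot be found stay empty (None), which is one simple way to signal that the translated text did not contain the expected element.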

 

Source:

  • FEMTI: a Framework for the Evaluation of Machine Translation in ISLE. (2002). Retrieved: 19 May 2008, 17:13

         http://www.issco.unige.ch:8080/cocoon/femti/st-home.html

        

Categories: General

Explanation of the topics (Q.2)

May 19, 2008 · No comments

The first topic that I am going to explain is “Corpus-based language modeling”, which is a topic of the Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio).

Corpus linguistics is the study of language as expressed in samples (corpora) of “real world” text. This method represents a digestive approach to deriving a set of abstract rules by which a natural language is governed or else relates to another language. Originally done by hand, corpora are now largely derived by an automated process whose output is then corrected.

Computational methods had once been viewed as a holy grail of linguistic research, which would ultimately manifest a ruleset for natural language processing and machine translation at a high level. Such has not been the case, and since the cognitive revolution, cognitive linguistics has been largely critical of many claimed practical uses for corpora. However, as computation capacity and speed have increased, the use of corpora to study language and term relationships en masse has gained some respectability.

The corpus approach runs counter to Noam Chomsky’s view that real language is riddled with performance-related errors, thus requiring careful analysis of small speech samples obtained in a highly controlled laboratory setting. Corpus linguistics does away with Chomsky’s competence/performance split; adherents believe that reliable language analysis best occurs on field-collected samples, in natural contexts and with minimal experimental interference.
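
As a rough illustration of what corpus-based language modeling can look like in practice, the sketch below counts word pairs (bigrams) in a tiny corpus and turns the counts into probabilities. The sources above do not name a specific model, so the choice of a bigram model, the three-sentence corpus and the function names are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for previous, current in zip(tokens, tokens[1:]):
            counts[previous][current] += 1
    return counts


def bigram_probability(counts, previous, current):
    """Estimate P(current | previous) by relative frequency."""
    total = sum(counts[previous].values())
    return counts[previous][current] / total if total else 0.0


# Tiny made-up corpus, used only for illustration.
corpus = [
    "the network is global",
    "the network is fast",
    "the protocol is open",
]
model = train_bigram_model(corpus)
print(bigram_probability(model, "the", "network"))
```

Every number here comes from observed text rather than from hand-written rules, which is the point of the corpus approach; a realistic model would need a far larger corpus and some smoothing for unseen word pairs.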

The second topic is “Pragmatics”, which is a topic of the Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio).

Pragmatics is the study of the ability of natural language speakers to communicate more than that which is explicitly stated. The ability to understand another speaker’s intended meaning is called pragmatic competence. An utterance describing pragmatic function is described as metapragmatic. Another perspective is that pragmatics deals with the ways we reach our goal in communication. Suppose a person wanted to ask someone else to stop smoking. This can be achieved by using several utterances. The person could simply say, ‘Stop smoking, please!’, which is direct and has a clear semantic meaning; alternatively, the person could say, ‘Whew, this room could use an air purifier’, which implies a similar meaning but is indirect and therefore requires pragmatic inference to derive the intended meaning.

Pragmatics is regarded as one of the most challenging aspects for language learners to grasp, and can only truly be learned with experience.

The last topic is “Speech Recognition”, which is a topic of the Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio).

Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to machine-readable input (for example, to the binary code for a string of character codes). The term voice recognition may also be used to refer to speech recognition, but more precisely refers to speaker recognition, which attempts to identify the person speaking, as opposed to what is being said.

Speech recognition applications include voice dialing (e.g., “Call home”), call routing (e.g., “I would like to make a collect call”), domotic appliance control, content-based spoken audio search (e.g., finding a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or emails), and voice control in aircraft cockpits (usually termed Direct Voice Input).

SOURCES:

  • Wikipedia, 14 May 2008. Retrieved: 19 May 2008, 17:52

          http://en.wikipedia.org/wiki/Corpus_linguistics

  • Wikipedia, 25 April 2008. Retrieved: 19 May 2008, 17:53

          http://en.wikipedia.org/wiki/Pragmatics

  • Wikipedia, 15 May 2008. Retrieved: 19 May 2008, 17:54

         http://en.wikipedia.org/wiki/Speech_recognition

Categories: General