The differences between the following specialized terms (Q.3)

These are the differences between the following specialized terms: machine translation, machine-aided translation, multilingual content management and translation technology.

  • Machine translation: Often abbreviated as MT, it is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its most basic level, MT performs simple substitution of words in one natural language with words in another (see the sketch after this list). Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition and translation of idioms, as well as the isolation of anomalies.
  • Computer-assisted translation, computer-aided translation or CAT: A form of translation in which a human translator translates texts using computer software designed to support and facilitate the translation process. Computer-assisted translation is sometimes called machine-assisted or machine-aided translation.
  • Multilingual content management: The management of multilingual content, mostly in the form of more or less structured text documents, but potentially also including audio clips, video clips and images.
  • Translation: The interpretation of the meaning of a text and the subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. The text to be translated is called the source text, and the language it is to be translated into is called the target language; the final product is sometimes called the «target text».
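
As mentioned in the first item above, basic MT can be pictured as dictionary-driven word substitution. The following minimal Python sketch (with an invented toy dictionary) only illustrates that idea; real systems add morphology, reordering, idiom handling and corpus statistics on top of it:

    # Minimal, purely illustrative word-for-word "translation" by dictionary lookup.
    # The dictionary below is invented for the example; real MT systems also handle
    # morphology, word order, idioms and ambiguity.
    toy_dictionary = {"the": "la", "house": "casa", "is": "es", "white": "blanca"}

    def word_for_word(sentence: str) -> str:
        words = sentence.lower().split()
        # Unknown words are left untranslated, as crude substitution systems do.
        return " ".join(toy_dictionary.get(w, w) for w in words)

    print(word_for_word("The house is white"))  # -> "la casa es blanca"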

SOURCES:

  • Wikipedia, 19 May 2008. Retrieved: 19 May 2008, 18:04

http://en.wikipedia.org/wiki/Translation_technology

  • Wikipedia, 15 May 2008. Retrieved: 19 May 2008, 18:05

http://en.wikipedia.org/wiki/Machine_aided_translation

  • Wikipedia, 7 May 2008. Retrieved: 19 May 2008, 18:06

http://en.wikipedia.org/wiki/Machine_translation

  • Wikipedia. Retrieved: 19 May 2008, 18:08

http://www.wikipedia.org/

MT systems: example of translation (Q.3)

I have used Translendium to translate the following text into different languages:

  • SPANISH: Internet es un método de interconexión descentralizada de redes de computadoras implementado en un conjunto de protocolos denominado TCP/IP y garantiza que redes físicas hetereogéneas funcionen como una red lógica única, de alcance mundial. Sus orígenes se remontan a 1969, cuando se estableció la primera conexión de computadoras, conocida como ARPANET, entre tres universidades en California y una en Utah, EE.UU.

 

  • GERMAN: Internet ist eine dezentralisiert-Verbindungs-Methode von Computernetzen, in einer TCP/IP genannten Gesamtheit von Protokollen implementiert, und er garantiert, dass heterogene|ungleichartige körperliche|physikalische Netze als ein einzigartiges logisches Netz, von Weltreichweite funktionieren. Seine|Ihre Herkunft geht auf 1969 zurück, als man die erste Computerverbindung, als ARPANET gekannt, zwischen drei Universitäten in Kalifornien und einer in Utah, EE festlegte. UU.

 

  • ENGLISH: Internet is a method of decentralized interconnection of nets|networks of computers implemented in an ensemble|set of protocols named TCP/IP and it|he guarantees that heterogeneous physical nets|networks work as a unique logical net|network, of world scope. Its|His|Her|Their origins go back to 1969, when the first connection of computers, known as ARPANET, between three universities in California and one in Utah, EE, was established. UU.

 

  • FRENCH: Internet est une méthode d’interconnexion décentralisée de filets|réseaux d’ordinateurs exécutée dans un ensemble de protocoles dénommé TCP/IP et il garantit que des filets|réseaux physiques hétérogènes fonctionnent comme un filet|réseau logique unique, de portée mondiale. Ses|Leurs origines remontent à 1969, quand on a établi la première connexion d’ordinateurs, connue comme ARPANET, entre trois universités en Californie et une en Utah, EE. UU.

To sum up, we can see that Translendium has had some problems with word meanings, because for some words it gives more than one alternative. For example, «físicas hetereogéneas» has been translated as «heterogene|ungleichartige körperliche|physikalische».
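
Since Translendium joins these alternatives with a vertical bar, they are easy to pick out mechanically. As a rough, purely illustrative Python sketch (not part of Translendium itself), one could list the ambiguous tokens in an output sentence like this:

    # Rough illustration (not part of Translendium): list the tokens for which the
    # MT output offers several alternative translations separated by "|".
    def ambiguous_tokens(mt_output: str) -> list[list[str]]:
        tokens = mt_output.split()
        return [token.split("|") for token in tokens if "|" in token]

    sample = "heterogene|ungleichartige körperliche|physikalische Netze"
    for alternatives in ambiguous_tokens(sample):
        print(alternatives)   # e.g. ['heterogene', 'ungleichartige']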

SOURCES:

  • Wikipedia, 18 May 2008. Retrieved: 19 May 2008, 18:50.

          http://es.wikipedia.org/wiki/Acceso_a_Internet

  • Translendium automatic translator. Retrieved: 19 May 2008, 18:51

          http://www.translendium.net:8080/home/text.do

       

Characteristics of a translation task according to the FEMTI report.

FEMTI stands for the Framework for Machine Translation Evaluation in ISLE, a resource that helps MT evaluators define contextual evaluation plans. FEMTI consists of two interrelated classifications or taxonomies: the first lists possible characteristics of the contexts of use that are applicable to MT systems; the second lists the possible characteristics of an MT system, along with the metrics that have been proposed to measure them.

According to the FEMTI report, the characteristics of the translation task refer to the information flow intended for the output, from the point of view of the agent (human or otherwise) who receives the translation. The main characteristics are the following:

  • Assimilation: The ultimate purpose of the assimilation task (of which translation forms a part) is to monitor a (relatively) large volume of texts produced by people outside the organization, in (usually) several languages.
  • Document routing or sorting: The purpose of document routing / sorting is to scan incoming translated documents quickly in order to send them to the appropriate points for further processing or storage.
  • Information extraction or summarization: The purpose of information extraction or summarization is to extract some portion(s) of the translated text, either manually or automatically, for subsequent processing or storage. Information extraction is typically concerned with filling templates by identifying atomic elements of events. In contrast, summarization aims to provide a self-contained and internally cohesive text which serves as a selective account of the original.

 

Source:

  • FEMTI – A Framework for the Evaluation of Machine Translation in ISLE (2002). Retrieved: 19 May 2008, 17:13

         http://www.issco.unige.ch:8080/cocoon/femti/st-home.html

        

Explanation of the topics (Q.2)

The first topic that I am going to explain is «Corpus-based language modeling», which belongs to the Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio).

Corpus linguistics is the study of language as expressed in samples (corpora) of «real world» text. This method represents a digestive approach to deriving a set of abstract rules by which a natural language is governed or else relates to another language. Originally done by hand, corpora are now largely derived by an automated process whose output is then corrected.

Computational methods had once been viewed as a holy grail of linguistic research, which would ultimately manifest a ruleset for natural language processing and machine translation at a high level. Such has not been the case, and since the cognitive revolution, cognitive linguistics has been largely critical of many claimed practical uses for corpora. However, as computation capacity and speed have increased, the use of corpora to study language and term relationships en masse has gained some respectability.

The corpus approach runs counter to Noam Chomsky‘s view that real language is riddled with performance-related errors, thus requiring careful analysis of small speech samples obtained in a highly controlled laboratory setting. Corpus linguistics does away with Chomsky’s competence/performance split; adherents believe that reliable language analysis best occurs on field-collected samples, in natural contexts and with minimal experimental interference.
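
To make the idea of corpus-based language modeling (the topic named above) concrete, here is a minimal Python sketch of a bigram model estimated by counting over a tiny invented corpus; real models use far larger corpora plus smoothing, and the sentences below are made up for the example:

    # Minimal bigram language model estimated from a toy corpus (sentences invented
    # for the example): P(next_word | word) is approximated by relative frequency.
    from collections import Counter, defaultdict

    corpus = ["the cat sat on the mat", "the dog sat on the rug"]

    bigram_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current_word, next_word in zip(words, words[1:]):
            bigram_counts[current_word][next_word] += 1

    def probability(word: str, next_word: str) -> float:
        total = sum(bigram_counts[word].values())
        return bigram_counts[word][next_word] / total if total else 0.0

    print(probability("the", "cat"))  # 0.25: "the" is followed by "cat" once out of four times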

The second topic is «Pragmatics», which belongs to the Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio).

Pragmatics is the study of the ability of natural language speakers to communicate more than that which is explicitly stated. The ability to understand another speaker’s intended meaning is called pragmatic competence. An utterance describing pragmatic function is described as metapragmatic. Another perspective is that pragmatics deals with the ways we reach our goal in communication. Suppose a person wanted to ask someone else to stop smoking. This can be achieved by using several utterances. The person could simply say, ‘Stop smoking, please!’, which is direct and with clear semantic meaning; alternatively, the person could say, ‘Whew, this room could use an air purifier’, which implies a similar meaning but is indirect and therefore requires pragmatic inference to derive the intended meaning.

Pragmatics is regarded as one of the most challenging aspects for language learners to grasp, and can only truly be learned with experience.

The last topic is «Speech Recognition», which belongs to the Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio).

Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to machine-readable input (for example, to the binary code for a string of character codes). The term voice recognition may also be used to refer to speech recognition, but more precisely refers to speaker recognition, which attempts to identify the person speaking, as opposed to what is being said.

Speech recognition applications include voice dialing (e.g., «Call home»), call routing (e.g., «I would like to make a collect call»), domotic appliance control and content-based spoken audio search (e.g., find a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or emails), and in aircraft cockpits (usually termed Direct Voice Input).
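
As a hedged illustration of how an application might call a recognizer for simple speech-to-text, here is a short Python sketch; it assumes the third-party SpeechRecognition package and a hypothetical audio file speech.wav, neither of which comes from the sources cited below:

    # Sketch assuming the third-party SpeechRecognition package
    # (pip install SpeechRecognition) and a hypothetical recording "speech.wav";
    # it converts spoken words to machine-readable text via a web-based recognizer.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile("speech.wav") as source:   # hypothetical audio file
        audio = recognizer.record(source)        # read the whole file into memory

    try:
        print(recognizer.recognize_google(audio))  # plain-text transcription
    except sr.UnknownValueError:
        print("The recognizer could not understand the audio.")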

SOURCES:

  • Wikipedia, 14 May 2008. Retrieved: 19 May 2008, 17:52

          http://en.wikipedia.org/wiki/Corpus_linguistics

  • Wikipedia, 25 April 2008. Retrieved: 19 May 2008, 17:53

          http://en.wikipedia.org/wiki/Pragmatics

  • Wikipedia, 15 May 2008. Retrieved: 19 May 2008, 17:54

         http://en.wikipedia.org/wiki/Speech_recognition

Research topics on HLT (Q2)

In this article we are going to look at the most recent research mentioned on several Human Language Technology sites.

Referring to the German Research Center for Artificial Intelligence (DFKI), we find that the following themes are elaborated in research, development and commercial projects:

  • exploiting – and automatically extending – ontologies for content processing
  • tighter integration of shallow and deep techniques in processing
  • enriching deep processing with statistical methods
  • combining language checking with structuring tools in document authoring
  • document indexing for German and English
  • automatically associating recognized information with related information and thus building up collective knowledge
  • automatically structuring and visualizing extracted information
  • processing information encoded in multiple languages, among them Chinese and Japanese.

The Edinburgh Language Technology Group works in the following areas:

  • Combining Shallow Semantics and Domain Knowledge (EASIE).
  • Text Mining for Biomedical Content Curation (TXM).
  • Cross-lingual Multi-Agent Retail Comparison (CROSSMARC).
  • Smart Qualitative Data: Methods and Community Tools for Data Mark-up (SQUAD).
  • Machine Learning for Named Entity Recognition (SEER).
  • Integrated Models and Tools for Fine-Grained Prosody in Discourse (Synthesis).
  • Joint Action Science and Technology (JAST).
  • AMI consortium projects that are developing technologies for meeting browsing and for assisting people who participate in meetings from a remote location.
  • Study of how pairs collaborate when planning a route on a map (Collaborating using diagrams).

The Common Language Resources and Technology Infrastructure (CLARIN) reports that the CLARIN Kick-Off meeting will take place on 17/18/19 March to start its pan-European research infrastructure work. A number of goals are to be achieved during these three days:

  • We need a broad and deep understanding of the goals of CLARIN by everyone involved. Yet we cannot assume that the knowledge is already sufficiently spread.
  • We need to start the interaction with everyone involved and interested and to take up the comments and ideas from all the experts.
  • We need to spread the relevant messages about the different layers of the work involved in setting up a research infrastructure, in particular since it involves aspects that have not yet been a topic of the general discussions in our field.
  • We need to create a positive atmosphere and an enthusiasm which will be important to meet our challenging goals.
  • We need to start the actual work in the working groups and invite all experts to participate.
  • Of course those who are partners in the EC funded project need to understand the rules of the game. In particular the double funding scheme – national and EC funding – needs careful attention from all of us. Other members need to be informed about the national groups.

The Association for Computational Linguistics and Natural Language Processing (Columbus, Ohio) invites student researchers to submit their work to the workshop on topics such as:

  • Pragmatics, discourse, semantics, syntax and the lexicon.
  • Phonetics, phonology and morphology.
  • Linguistic, mathematical and psychological models of language.
  • Information retrieval, information extraction, question answering.
  • Summarization and paraphrasing.
  • Speech recognition, speech synthesis.
  • Corpus-based language modeling.
  • Multi-lingual processing, machine translation, translation aids.
  • Spoken and written natural language interfaces, dialogue systems.
  • Multi-modal language processing, multimedia systems.
  • Message and narrative understanding systems.

Sources:

Hans Uszkoreit and European centres for Human Language Technologies (Q1)

Hans Uszkoreit is Professor of Computational Linguistics at Saarland University. At the same time he serves as Scientific Director at the German Research Center for Artificial Intelligence (DFKI), where he heads the DFKI Language Technology Lab. By co-optation he is also a Professor in the Computer Science Department.

Hans Uszkoreit studied Linguistics and Computer Science at the Technical University of Berlin and the University of Texas at Austin. During his time in Austin he also worked as a research associate in a large machine translation project at the Linguistics Research Center. In 1984 Uszkoreit received his Ph.D. in linguistics from the University of Texas. From 1982 until 1986 he worked as a computer scientist at the Artificial Intelligence Center of SRI International in Menlo Park, CA. During this time he was also affiliated with the Center for the Study of Language and Information at Stanford University as a senior researcher and later as a project leader. In 1986 he spent six months in Stuttgart on an IBM Research Fellowship at the Science Division of IBM Germany. In December 1986 he returned to Stuttgart to work for IBM Germany as a project leader in the project LILOG (Linguistic and Logical Methods for the Understanding of German Texts). During this time he also taught at the University of Stuttgart. In 1988 Uszkoreit was appointed to a newly created chair of Computational Linguistics at Saarland University and started the Department of Computational Linguistics and Phonetics. In 1989 he became the head of the newly founded Language Technology Lab at DFKI. He has been a co-founder and principal investigator of the Special Collaborative Research Division (SFB 378) «Resource-Adaptive Cognitive Processes» of the DFG (German Science Foundation). He is also co-founder and professor of the «European Postgraduate Program Language Technology and Cognitive Systems», a joint Ph.D. program with the University of Edinburgh.

Uszkoreit is a Permanent Member of the International Committee on Computational Linguistics (ICCL), a Member of the European Academy of Sciences, Past President of the European Association for Logic, Language and Information, a Member of the Executive Board of the European Network of Language and Speech, a Member of the Board of the European Language Resources Association (ELRA), and serves on several international editorial and advisory boards. He is a co-founder and board member of XtraMind Technologies GmbH, Saarbruecken, acrolinx gmbh, Berlin, and Yocoy Technologies GmbH, Berlin. Since 2006 he has served as Chairman of the Board of Directors of the international initiative dropping knowledge.

His current research interests are computer models of natural language understanding and production, advanced applications of language and knowledge technologies such as semantic information systems, cognitive foundations of language and knowledge, grammar formalisms and their implementation, syntax and semantics of natural language, and the grammar of German.

Among his recent publications we can find:

  • Uszkoreit, H., F. Xu, Weiquan Liu, J. Steffen, I. Aslan, J. Liu, C. Müller, B. Holtkamp, M. Wojciechowski (2007)
    A Successful Field Test of a Mobile and Multilingual Information Service System COMPASS2008. In Proceedings of HCI International 2007, 12th International Conference on Human-Computer Interaction, Beijing, 2007.
  • Uszkoreit, H., F. Xu, J. Steffen and I. Aslan (2006): The pragmatic combination of different cross-lingual resources for multilingual information services. In Proceedings of LREC 2006, Genova, Italy, May 2006.
  • Uszkoreit, H., U. Callmeier, A. Eisele, U. Schäfer, M. Siegel, J. Uszkoreit (2004): Hybrid Robust Deep and Shallow Semantic Processing for Creativity Support in Document Production. In Proceedings of KONVENS 2004, Vienna, Austria.

Regarding Hans Uszkoreit’s research positions, we can find:

  • Research Assistant at the Center for Advanced Study in the Social and the Behavioral Sciences at Stanford. (1981-82).
  • Research Associate in the project METAL at Linguistics Research Center in Austin, Texas. (1977-80).
  • Research Assistant at the Cognitive Science Program at Stanford University.

These are some of the European research centres for Human Language Technologies:

  • National Centre for Language Technology (NCLT): Dublin, Ireland; it «carries out basic research and develops applications».
  • OFAI Language Technology Group: an Austrian centre that «conducts research in modelling and processing human languages, especially for German».
  • Edinburgh Language Technology Group (LTG): the LTG has been working since the 1990s, «building practical solutions to real problems in text processing».
  • Language Technology Documentation Centre in Finland: developed «in order to make speech-to-speech translation real».

Sources:

  • Hans Uszkoreit, February 2007. Retrieved: 21:05, 1 April 2008.

http://www.coli.uni-saarland.de/~hansu/publ.html

http://www.coli.uni-saarland.de/~hansu/bio.html

http://www.coli.uni-saarland.de/~hansu/pc.html

  • European research centres for Human Language Technologies. Retrieved: 22:41, 1 April 2008.

http://www.ling.helsinki.fi/filt/projects/index-en.shtml

http://www.ltg.ed.ac.uk/

http://www.ofai.at/research/nlu/

http://www.nclt.dcu.ie/

Human Language Technology (Q1)

There are many definitions of Human Language Technologies to be found on the Net. The first definition can be found on Wikipedia, where it appears under Natural Language Processing (NLP):

«Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages.

Natural-language-generation systems convert information from computer databases into normal-sounding human language. Natural-language-understanding systems convert samples of human language into more formal representations that are easier for computer programs to manipulate.»

Another definition, referring to human language processing, is the one given by Hans Uszkoreit:

«Language technology — sometimes also referred to as human language technology — comprises computational methods, computer programs and electronic devices that are specialized for analyzing, producing or modifying texts and speech. These systems must be based on some knowledge of human language. Therefore language technology defines the engineering branch of computational linguistics».

In Hans Uszkoreit’s book «Language Technology: A First Overview», he says:

«Language Technologies are information technologies that are specialized for dealing with the most complex information medium in our world: human language. Therefore these technologies are also subsumed under the term Human Language Technology. Human language occurs in spoken and written form. Whereas speech is the oldest and most natural mode of language communication, complex information and most of human knowledge is maintained and transmitted in written texts. Speech and text technologies process or produce language in these two modes of realization. But language also has aspects that are shared between speech and text, such as dictionaries, most of grammar and the meaning of sentences. Thus large parts of language technology cannot be subsumed under speech and text technologies. Among those are technologies that link language to knowledge. We do not know how language, knowledge and thought are represented in the human brain. Nevertheless, language technology had to create formal representation systems that link language to concepts and tasks in the real world. This provides the interface to the fast growing area of knowledge technologies.

In our communication we mix language with other modes of communication and other information media. We combine speech with gesture and facial expressions. Digital texts are combined with pictures and sounds. Movies may contain language in spoken and written form. Thus speech and text technologies overlap and interact with many other technologies that facilitate processing of multimodal communication and multimedia documents.»

Sources:

  • Hans Uszkoreit’s book «Language Technology: A First Overview». Retrieved: 16:15, 1 April 2008.

http://www.dfki.de/~hansu/LT.pdf

  • German Research Center for Artificial Intelligence, «Language Technology Lab». Retrieved: 17:25, 1 April 2008.

http://www.dfki.de/lt/lt-general.php

  • Wikipedia, 18 March 2008. Retrieved: 18:23, 1 April 2008.

http://en.wikipedia.org/wiki/Natural_language_processing

PHP

PHP is a computer scripting language originally designed for producing dynamic web pages. The name PHP is a recursive initialism for PHP: Hypertext Preprocessor.

PHP is used mainly in server-side scripting, but can be used from a command line interface or in standalone graphical applications. Textual user interfaces can also be created using ncurses. The most recent version of PHP is 5.2.5, released on 8 November 2007. It is considered to be free software by the Free Software Foundation.

PHP is a widely-used general-purpose scripting language that is especially suited for Web development and can be embedded into HTML. PHP generally runs on a web server, taking PHP code as its input and creating Web pages as output. However, it can also be used for command-line scripting and client-side GUI applications. PHP can be deployed on most web servers and on almost every operating system and platform free of charge. The PHP Group also provides the complete source code for users to build, customize and extend for their own use.

PHP primarily acts as a filter. The PHP program takes input from a file or stream containing text and special PHP instructions and outputs another stream of data for display.
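
To make this filter model concrete without writing PHP itself, here is a minimal, hypothetical Python sketch of the same idea: text with embedded instructions goes in, rendered text comes out. The placeholder syntax and the render helper are invented for the example and do not mirror real PHP syntax:

    # Minimal, hypothetical sketch (not PHP): a "filter" that reads text containing
    # embedded instructions and outputs plain text, analogous to how a PHP processor
    # turns a script into a web page.
    import re

    def render(template: str, context: dict) -> str:
        # Replace each {{ name }} placeholder with the corresponding context value.
        return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                      lambda m: str(context.get(m.group(1), "")),
                      template)

    page = "<html><body><p>Hello, {{ user }}! Today is {{ day }}.</p></body></html>"
    print(render(page, {"user": "Ana", "day": "Monday"}))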

From PHP 4, the PHP parser compiles input to produce bytecode for processing by the Zend Engine, giving improved performance over its interpreter predecessor. PHP 5 uses the Zend Engine II.

SERVER-SIDE SCRIPTING

Originally designed to create dynamic web pages, PHP’s principal focus is server-side scripting. While running the PHP parser with a web server and web browser, the PHP model can be compared to other server-side scripting languages such as Microsoft’s ASP.NET system, Sun Microsystems’ JavaServer Pages, and mod_perl, as they all provide dynamic content to the client from a web server. To more directly compete with the «framework» approach taken by these systems, Zend is working on the Zend Framework, an emerging (as of June 2006) set of PHP building blocks and best practices; other PHP frameworks along the same lines include CakePHP, PRADO and Symfony.

The LAMP architecture has become popular in the Web industry as a way of deploying inexpensive, reliable, scalable, secure web applications. PHP is commonly used as the P in this bundle alongside Linux, Apache and MySQL, although the P can also refer to Python or Perl. PHP can be used with a large number of relational database management systems, runs on all of the most popular web servers and is available for many different operating systems. This flexibility means that PHP has a wide installation base across the Internet; as of April 2007, over 20 million Internet domains were hosted on servers with PHP installed. The number of installations is different from the number of sites actually using those installations, but this statistic does reflect the popularity of PHP.

Examples of popular open source server-side PHP applications include phpBB, WordPress, and MediaWiki.

COMMAND-LINE SCRIPTING

PHP also provides a command line interface SAPI for developing shell and desktop applications, daemons, log parsing, or other system administration tasks that have traditionally been the domain of Perl, Python, awk, or shell scripting.

CLIENT-SIDE GUI APPLICATIONS

PHP provides bindings to GUI libraries such as GTK+ (with PHP-GTK), Qt with PHP-Qt and text mode libraries like ncurses in order to facilitate development of a broader range of cross-platform GUI applications.

SOURCES:

  • Wikipedia, 12 February 2008. Retrieved: 12 February 2008.

http://en.wikipedia.org/wiki/PHP

THE XML LANGUAGE

The metalanguage known as XML represents a different, more advanced way of doing things. Its main novelty is that it allows the data we work with to be shared at all levels, by all applications and media, since it tends towards globalization and compatibility between systems: it is the technology that will allow information to be shared in a secure, reliable and easy way. XML should not be confused with the already well-known and widespread HTML. There are several differences, but the main one is that HTML is concerned with formatting data, and that is what its tags are for, whereas XML is concerned with structuring the information it is meant to store; the structure is determined by the internal logic of the information itself.

Goals and uses of XML. XML was created to meet several objectives:
  • To be identical to HTML when it comes to serving, receiving and processing information, in order to take advantage of all the technology already deployed for the latter.
  • To be formal and concise from the point of view of the data and the way they are stored.
  • To be extensible, so that it can be used in every field of knowledge.
  • Ease of reading and editing was also key.
  • Finally, to be easy to implement, program and apply to different systems.

XML can be used for countless tasks and brings many advantages in a wide range of scenarios:

  • Data communication. If information is transferred in XML, any application could write a plain-text document with the data it is handling in XML format, and another application could receive this information and work with it (see the sketch after this list).
  • Data migration.
  • Web applications. With XML we have a single application that handles the data, and for each browser or device we can have a style sheet, or something similar, to apply the appropriate presentation.
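
As a small illustration of the data-exchange idea in the first bullet above, here is a minimal Python sketch using the standard xml.etree.ElementTree module; the element names and values are invented for the example:

    # Minimal sketch of XML as a data-exchange format (element names invented for
    # the example): one application writes structured data as XML text, another
    # application parses it and works with the structure.
    import xml.etree.ElementTree as ET

    # Application A: build an XML document and serialize it to a plain-text string.
    book = ET.Element("book")
    ET.SubElement(book, "title").text = "Language Technology: A First Overview"
    ET.SubElement(book, "author").text = "Hans Uszkoreit"
    xml_text = ET.tostring(book, encoding="unicode")

    # Application B: receive the text, parse it, and read the structured data.
    parsed = ET.fromstring(xml_text)
    print(parsed.find("title").text)   # -> Language Technology: A First Overview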

This is XML in broad strokes. One could go much deeper into its origins and into the language itself, but that is not the point here; the point is to be aware of the relevance it has been gaining lately, as well as its consequences, such as the one we already knew about: Web 2.0.

Sources:

Wikipedia

As we know, on 15 January 2008 Wikipedia turned seven years old, and for that reason I would like to devote an article to it.

Wikipedia is a multilingual free encyclopedia based on wiki technology. Wikipedia is written collaboratively by volunteers, allowing the vast majority of its articles to be modified by anyone with access via a web browser. The project was started on 15 January 2001 by Jimbo Wales, with the help of Larry Sanger, as a complement to Nupedia, an encyclopedia written by experts. It now depends on the non-profit Wikimedia Foundation. In December 2007 Wikipedia registered more than 9 million articles, including more than 2 million in its English edition, and at the end of February 2006 it reached 1,000,000 registered users. Wikipedia currently has editions in more than 253 languages, but only 137 of them are active.

Wikipedia’s motto is «The free encyclopedia that anyone can edit», and the project is described by its co-founder Jimmy Wales as «an effort to create and distribute a free encyclopedia of the highest possible quality to every single person on the planet, in their own language», in order to achieve «a world in which every single person on the planet has free access to the sum of all human knowledge». It is developed on the Wikipedia.org website using wiki software, a term originally used for the WikiWikiWeb.

There are three characteristics that define Wikipedia and the way it works on the web:

  1. It is an encyclopedia, understood as a medium that allows information to be collected, stored and transmitted in a structured way.
  2. It is a wiki, so, with minor exceptions, it can be edited by anyone.
  3. Its content is open and it uses the GFDL license.

In March 2000 Jimbo Wales created Nupedia, a free encyclopedia project based on an ambitious peer-review process, designed to make its articles of a quality comparable to that of professional encyclopedias thanks to the participation of scholars who were invited to collaborate on an unpaid basis.

Owing to the project’s slow progress, in 2001 a wiki linked to Nupedia was created, whose initial purpose was to speed up the creation of articles in parallel, before they entered the expert review system.

Wikipedia has a series of policies that are established by the project’s own participants. Some of these policies are:

  1. Owing to the diversity and number of participants and ideologies, coming from all parts of the world, Wikipedia tries to build its articles in the most exhaustive way possible.
  2. A number of conventions are followed with respect to the naming of articles, preferring the version most commonly used in the respective language.
  3. Discussions about the content and editing of an article take place on the talk pages and not in the article itself.
  4. There are a number of topics that are excluded from Wikipedia because, strictly speaking, they do not constitute encyclopedic articles. For example, Wikipedia does not contain dictionary definitions, which can be found in Wiktionary.

Wikipedia is being edited by thousands of people all over the world. The people who edit Wikipedia are known as Wikipedians, and its contributors always act on a voluntary basis. As of today, the Spanish edition has 612,865 registered users, of whom 111 are administrators and 62 are the project’s own automatic maintenance bots.

Sources:

  • Wikipedia, 26 January 2008. Retrieved: 27 January 2008.

http://es.wikipedia.org/wiki/Wikipedia