Computer-assisted translation

Computer-assisted translation, computer-aided translation, or CAT is a form of translation in which a human translator translates texts using computer software designed to support and facilitate the translation process.


Computer-assisted translation is sometimes called machine-assisted, or machine-aided, translation.

Computer-assisted translation vs. Machine translation

Although the two concepts are similar, computer-assisted translation should not be confused with machine translation (MT).


In computer-assisted translation, the computer program supports the translator, who translates the text and makes all the essential decisions involved. In machine translation the roles are reversed: the computer program translates the text, and the output is then edited by a translator or left unedited. Difficulties with such unedited output are described under machine translation.


Overview

Computer-assisted translation is a broad and imprecise term covering a range of tools, from the fairly simple to the more complicated. These can include:

  • Spell checkers, either built into word processing software, or add-on programs;
  • Grammar checkers, again either built into word processing software, or add-on programs;
  • Terminology managers, which allow translators to manage their own terminology bank in electronic form. This can range from a simple table created in the translator's word processing software or spreadsheet, to a database created in a program such as FileMaker Pro or Alpha Five, to more robust (and more expensive) specialized software packages such as LogiTerm, MultiTerm, Termex, etc.
  • Dictionaries on CD-ROM, either unilingual or bilingual
  • Terminology databases, either on CD-ROM or accessible through the Internet (such as The Open Terminology Forum, TERMIUM or the Grand dictionnaire terminologique from the Office québécois de la langue française)
  • Full-text search tools (or indexers), which allow the user to query already translated texts or reference documents of various kinds. In the translation industry one finds such indexers as Naturel, ISYS Search Software and dtSearch.
  • Concordancers, which are programs that retrieve instances of a word or an expression in a monolingual, bilingual or multilingual corpus.
  • Bitexts, a fairly recent development, the result of merging a source text and its translation, which can then be consulted using a full-text search tool.
  • Translation memory managers (TMM), tools consisting of a database of text segments in a source language and their translations in one or more target languages.
  • Systems that are nearly automatic as in machine translation, but allow user decisions for ambiguous cases. These are sometimes called human-aided machine translation.
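
As a rough illustration of what a concordancer from the list above does, the following minimal sketch prints each occurrence of a term with some surrounding context (a "keyword in context" view). The function name `concordance`, the `width` parameter and the sample corpus are hypothetical choices for this example, not features of any tool named here.

```python
import re


def concordance(corpus, term, width=30):
    """Return one keyword-in-context line per occurrence of `term`.

    `corpus` is a list of strings (for example, sentences of reference texts).
    Matching is case-insensitive and restricted to whole words.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for text in corpus:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the hits line up in a column.
            lines.append(f"{left:>{width}}[{match.group(0)}]{right}")
    return lines


if __name__ == "__main__":
    sample = [
        "The translator opens the file.",
        "A translator edits the segment.",
    ]
    for line in concordance(sample, "translator", width=10):
        print(line)
```

Running the same search over a bitext would show each hit alongside its stored translation; that step is left out here to keep the sketch short.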


Translation memory software

Translation memory (TM) programs store previously translated source texts and their equivalent target texts in a database and retrieve related segments during the translation of new texts.


Such programs split the source text into manageable units known as "segments." A source-text sentence or sentence-like unit (a heading, a title or an element in a list) may be considered a segment, or texts may be segmented into larger units such as paragraphs or smaller ones such as clauses. As the translator works through a document, the software displays each source segment in turn and, if it finds a matching source segment in its database, offers the previous translation for re-use. If it does not, the program allows the translator to enter a translation for the new segment. After the translation for a segment is completed, the program stores the new translation and moves on to the next segment. The translation memory is, in principle, a simple database whose fields contain the source-language segment, the translation of the segment, and other information such as the segment creation date, last access, translator name, and so on.
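
The workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class and function names (`TMEntry`, `TranslationMemory`, `segment`), the sentence-splitting rule and the 0.75 similarity threshold are invented for the example, and the similarity score comes from Python's standard `difflib` module rather than from the matching method of any particular TM product.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from difflib import SequenceMatcher


@dataclass
class TMEntry:
    """One record: source segment, its translation, and some metadata."""
    source: str
    target: str
    created: date = field(default_factory=date.today)
    translator: str = ""


def segment(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class TranslationMemory:
    def __init__(self):
        self.entries = {}  # source segment -> TMEntry

    def store(self, source, target, translator=""):
        self.entries[source] = TMEntry(source, target, translator=translator)

    def lookup(self, source, threshold=0.75):
        """Return (entry, score) for the most similar stored segment,
        or None if nothing reaches the (hypothetical) threshold."""
        best, best_score = None, 0.0
        for entry in self.entries.values():
            score = SequenceMatcher(None, source, entry.source).ratio()
            if score > best_score:
                best, best_score = entry, score
        if best is not None and best_score >= threshold:
            return best, best_score
        return None


if __name__ == "__main__":
    tm = TranslationMemory()
    tm.store("Open the file.", "Ouvrez le fichier.", translator="A. N. Other")
    for seg in segment("Open the file. Save the document now."):
        hit = tm.lookup(seg)
        if hit:
            print(seg, "->", hit[0].target, f"(score {hit[1]:.2f})")
        else:
            print(seg, "-> no match; translator enters a new translation")
```

An exact match scores 1.0; a lower threshold would return more partial matches for the translator to adapt, at the cost of more irrelevant suggestions.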


Some translation memory programs function as standalone environments, while others function as an add-on or macro to commercially available word-processing or other business software programs. Add-on programs allow source documents from other formats, such as desktop publishing files, spreadsheets, or HTML code, to be handled using the TM program.


Terminology management software

Terminology management software gives the translator a means of automatically searching a given terminology database for terms appearing in a document, either by automatically displaying terms in the translation memory software interface window or through the use of hot keys to view the entry in the terminology database. Some programs have other hot key combinations that allow the translator to add new terminology pairs to the terminology database on the fly during translation.
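
A minimal sketch of such a lookup, assuming the terminology database is simply a Python dictionary mapping source terms to target terms. The function name `find_terms` and the sample entries are hypothetical; real terminology managers store much richer entries.

```python
import re


def find_terms(segment, termbase):
    """Return (source term, target term) pairs from `termbase` that occur
    in `segment` as whole words, ignoring case."""
    hits = []
    # Check longer terms first so multi-word terms are listed before shorter ones.
    for term in sorted(termbase, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", segment, re.IGNORECASE):
            hits.append((term, termbase[term]))
    return hits


if __name__ == "__main__":
    termbase = {
        "translation memory": "mémoire de traduction",
        "database": "base de données",
        "segment": "segment",
    }
    print(find_terms("The translation memory stores each segment.", termbase))

    # Adding a new pair "on the fly" is just another dictionary entry here.
    termbase["hot key"] = "touche de raccourci"
    print(find_terms("Press the hot key.", termbase))
```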


Alignment software

Alignment programs take completed translations, divide both source and target texts into segments, and attempt to determine which segments belong together in order to build a translation memory database with the content. The resulting TM can then be used for future translations.
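
The idea can be illustrated with a deliberately naive sketch that pairs segments by position and flags pairs whose lengths differ suspiciously. The names `split_sentences` and `align` and the `max_ratio` value are assumptions made for this example; actual alignment programs use more robust methods and typically let the user correct the result.

```python
import re


def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def align(source_text, target_text, max_ratio=2.0):
    """Pair source and target segments one-to-one by position.

    Returns (pairs, leftover): each pair is (source, target, plausible),
    where `plausible` is False when one segment is more than `max_ratio`
    times longer than the other; `leftover` holds unpaired segments.
    """
    src = split_sentences(source_text)
    tgt = split_sentences(target_text)
    pairs = []
    for s, t in zip(src, tgt):
        ratio = max(len(s), len(t)) / min(len(s), len(t))
        pairs.append((s, t, ratio <= max_ratio))
    leftover = src[len(tgt):] or tgt[len(src):]
    return pairs, leftover


if __name__ == "__main__":
    pairs, leftover = align(
        "Open the file. Save it.",
        "Ouvrez le fichier. Enregistrez-le.",
    )
    for s, t, ok in pairs:
        print(f"{s!r} <-> {t!r}" + ("" if ok else "  (check manually)"))
    print("unpaired:", leftover)
```

Pairs that pass the check could then be stored as translation memory entries, while flagged pairs and leftovers would be reviewed by hand.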


Comparison of different CAT tools

(Alphabetical order, free software first, proprietary solutions second.)

Tool | Supported File Formats | OS | Price | License
ForeignDesk | HTML, C Source Files, Java Source Code, Microsoft Help File Sources (HPJ, HHC, HHK, CNT), Trados | Windows |  | Open Source
Okapi Framework | PO, Windows RC, TMX, Wordfast, Trados, Java Properties, Regular-expression-based text, Illustrator, INX, ResX, Table-type files, XML | Windows (.NET) |  | LGPL
OmegaT+ | HTML, DocBook, Plain Text, PO, JavaHelp, Java Resource Bundles, OpenOffice.org, StarOffice | Multiplatform (Java) |  | Open Source
OmegaT | HTML, XHTML, DocBook, Plain Text, PO, JavaHelp, Java Resource Bundles, OpenDocument (ODF), OpenOffice.org, StarOffice, HTML Help Compiler (HCC), INI files | Multiplatform (Java) |  | Open Source
Transolution | HTML, StarOffice/OpenOffice.org, XLIFF, DocBook | Multiplatform (Python) |  | GPL
AidTransStudio | OpenOffice.org, MS Word XML | Windows (.NET) | Basic: Free | Proprietary
Cafetran | HTML, XML, OpenOffice.org, AbiWord, KWord, MS Word | Multiplatform (Java) | 180 Euro | 
Déjà Vu (DVX) | XML, Plain Text, OpenOffice.org, Adobe FrameMaker, Adobe PageMaker, ASP, Interleaf/Quicksilver, InDesign, Help Content, SGML, MS Access, MS Excel, MS PowerPoint, MS Word, QuarkXPress, RTF, Resource files, C/C++/Java source files, Java Properties, JavaScript, VBScript, GNU gettext | Windows | 490 Euro | Proprietary
Heartsome Translation Suite | HTML/XHTML, XML, Plain Text, OpenOffice.org, StarOffice, AbiWord, PO/POT (GNU Gettext), SVG, Adobe FrameMaker (MIF), Adobe InDesign, DocBook, DITA, Java Properties, JavaScript, RTF, Tagged RTF, Trados TTX, MS Office 2003 XML, ResX (Windows .NET Resources), RC (Windows C/C++ Resources), MS Office 2007 (beta) | Multiplatform (Java) | See price list. | Proprietary
LogiTerm | ? | ? | ? | ?
MemoQ | HTML, plain text, MS Word (plus RTF and other Word documents), Excel, PowerPoint, Trados TTX, proprietary bilingual format | Windows | 4Free: Freeware; Translator Pro: 390 Euro; LSP 5: 1490 Euro | Proprietary
MetaTexis | HTML, XML, Resource files, MS Word (all kinds of text files that can be imported by MS Word), MS Excel, MS PowerPoint, Adobe FrameMaker, Adobe PageMaker, QuarkXPress | Windows | Lite: 29 Euro; Pro: 79 Euro; .NET/Office: 109 Euro | Proprietary
MLTS | ? | ? | ? | 
MultiCorpora MultiTrans | ? | ? | ? | 
MultiLing Fortis Translation Suite | ? | ? | ? | 
Rainbow | HTML, XHTML, Scripts, Photoshop, etc. | Windows (.NET) | Freeware | Proprietary
SDLX | ? | ? | ? | 
STAR Transit | Text ANSI / ASCII / Unicode for Windows, Text for Apple Macintosh, Corel WordPerfect, HTML, XML (ASP.NET, ASP, JSP, XSL), SGML, SVG (Scalable Vector Graphics), MS Word for Windows, MS Excel, MS PowerPoint, RTF and RTF for WinHelp, RC, QuarkXPress, Adobe FrameMaker, Adobe PageMaker, Interleaf/Quicksilver, Adobe InDesign, XGate for QuarkXPress, AutoCAD | Windows |  | Proprietary
Trados Translators Workbench | MS Word (installation automatically adds a template to MS Word); Excel and PowerPoint through TagEditor, which is installed with Workbench | ? | ? | 
Tr-aid | ? | ? | ? | 
Wordfast | MS Word | Microsoft Office Word macro | 180 Euro | Proprietary
WordFisher | MS Word | WordBasic / MS Office Word macro | Free | Licence
Similis | HTML, PDF, Word, Trados | Windows | 295 Euro (single-user) | Proprietary
Open Language Tools | HTML/XHTML, XML, DocBook SGML, ASCII, StarOffice/OpenOffice/ODF, .po (gettext), .properties, .java (ResourceBundle), .msg/.tmsg (catgets) | Multiplatform (Java) | Free | CDDL
POedit | software | Multiplatform | Free | 

See also

  • Computer-assisted reviewing

The Wikipedia article included on this page is licensed under the GFDL.