ARCHIVED 4.6.3. Term-Extraction Tools

 



Computer-assisted and automated term-extraction tools reduce some of the repetitive work involved in highlighting terms manually. Some software lets you insert a selected term directly into a term record, which eliminates the typographical errors caused by re-keying.

These tools can also produce lists of the words or word sequences found in a text or corpus. Each suggested candidate can then be examined to determine whether it is a bona fide term worth recording.
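As a rough illustration of how such a candidate list might be produced, the following Python sketch counts recurring word sequences in a text. It does not represent any particular product: the function name `candidate_terms`, the tiny stop list, and the thresholds `max_len` and `min_count` are all assumptions made for the example, and the tokenizer only handles unaccented Latin letters.

```python
import re
from collections import Counter

# Small illustrative stop list (an assumption, not a standard resource).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "are",
              "for", "on", "that", "be", "can", "or"}

def candidate_terms(text, max_len=3, min_count=2):
    """Return (sequence, count) pairs for word sequences that recur in text."""
    counts = Counter()
    # Split on punctuation so that no sequence spans a clause boundary.
    for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
        words = re.findall(r"[a-z][a-z-]*", segment)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                seq = words[i:i + n]
                # Skip sequences that start or end with a stop word.
                if seq[0] in STOP_WORDS or seq[-1] in STOP_WORDS:
                    continue
                counts[" ".join(seq)] += 1
    ranked = [(s, c) for s, c in counts.items() if c >= min_count]
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

sample = ("Term extraction tools list candidates. "
          "A term extraction tool helps. Term extraction is useful.")
for sequence, count in candidate_terms(sample):
    print(count, sequence)
```

The output is only a list of candidates. Frequency alone cannot tell a bona fide term from a recurring phrase, so the list still has to be reviewed by a person.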

Automated or computer-assisted term extraction is an effective way of "mining" existing documents and then building terminology database records from legacy content.

Automated Term-Extraction Tools

Automated term-extraction tools identify candidate terms in a text, and some do so without human intervention. Because the system can extract terms only according to the instructions it is programmed with, people must often still remove pseudo-terminological units from the resulting files. With the help of an indexing function, the half-records can then be paired automatically. Some products also include a term-extraction function that can be applied to bilingual texts with identical content, so that potentially equivalent terminology units can be proposed automatically. Some product suites include a text-alignment tool, a record-creation module, and a translation memory, which facilitate management of the terminology collected.
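The pairing of half-records through an index can be pictured with a short Python sketch. The source does not describe how the indexing function works, so everything here is an assumption for illustration: each half-record is a dictionary with a shared `key` (for example, the identifier of an aligned text segment), a `lang` code, and a `term`, and the function name `pair_half_records` is hypothetical.

```python
from collections import defaultdict

def pair_half_records(half_records, lang_a="en", lang_b="fr"):
    """Pair half-records from two languages that share the same index key."""
    by_key = defaultdict(dict)
    for rec in half_records:
        # Group terms by index key, then by language.
        by_key[rec["key"]][rec["lang"]] = rec["term"]

    paired, unpaired = [], []
    for key in sorted(by_key):
        terms = by_key[key]
        if lang_a in terms and lang_b in terms:
            paired.append({"key": key, lang_a: terms[lang_a], lang_b: terms[lang_b]})
        else:
            # Only one language present: leave for manual follow-up.
            unpaired.append(key)
    return paired, unpaired

half_records = [
    {"key": 1, "lang": "en", "term": "term record"},
    {"key": 1, "lang": "fr", "term": "fiche terminologique"},
    {"key": 2, "lang": "en", "term": "data bank"},
]
paired, unpaired = pair_half_records(half_records)
print(paired)
print(unpaired)
```

Pairs produced this way are proposals only. A key shared by two half-records shows that the terms occur in corresponding places, not that they are true equivalents.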

Computer-Assisted Term-Extraction Tools

A terminologist who wishes to select terms to be retained for further study may use a computer-assisted method such as YVANHOÉ©, a product developed by a Translation Bureau terminologist for in-house use by colleagues responsible for managing a large terminological data bank. It extracts tagged terms from an electronic document and copies them to individual records, together with their context, source reference and page number. The resulting file is later retrieved using data-recording software (such as LATTER© or TERMICOM©) so that the records can be completed, merged and improved as research is pursued. The finalized records are then automatically transferred to the TERMIUM Plus® data bank, to publishing software, or to both.
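The general idea of copying tagged terms to individual records can be sketched in Python. This is not how YVANHOÉ© itself works, and its tagging format is not described here. The sketch assumes, purely for illustration, that terms are marked with `<term>...</term>` tags, that pages are separated by form-feed characters, and that the context is the sentence containing the term. The function name `extract_tagged_terms` and the record fields are hypothetical.

```python
import re

# Assumed tagging convention: <term>...</term> around each selected term.
TAG = re.compile(r"<term>(.*?)</term>")

def extract_tagged_terms(document, source):
    """Build one record per tagged term, with context, source and page."""
    records = []
    # Assumption: pages are separated by form-feed characters.
    for page_number, page in enumerate(document.split("\f"), start=1):
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", page.strip()):
            for match in TAG.finditer(sentence):
                records.append({
                    "term": match.group(1),
                    # Context is the sentence with the tags removed.
                    "context": TAG.sub(r"\1", sentence),
                    "source": source,
                    "page": page_number,
                })
    return records

doc = ("A <term>term bank</term> stores records. Nothing is tagged here.\f"
       "The <term>half-record</term> is monolingual.")
for record in extract_tagged_terms(doc, source="SAMPLE-2024"):
    print(record)
```

Each record keeps the source reference and page number alongside the term, so that it can be completed or merged with other records later without returning to the original document.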