Summary of VOCALL Project.

1. Description of the Project

This report details work done on the VOCALL project, no. 3483, funded under the Leonardo Programme, for the period 01.12.1995 - 30.11.1996. The VOCALL project is coordinated by Dublin City University (DCU), Dublin, Ireland. Its other constituent partners are ILTEC, Lisbon, Portugal; ILSP, Athens, Greece; top Schulung, Hamburg, Germany; and FAS, Dublin, Ireland.

The aim of the project is to build language learning tools for vocationally-oriented learners in the areas of computers, business administration, and construction. The multimedia product, which will be identical in all languages of the project, will be marketed as a self-learning tool for FL learners, as well as L1-disadvantaged learners, in vocational and professional training in the areas mentioned.

The reasons why such a tool is sorely needed are many, and include:

  1. promoting linguistic skills as part of vocational training
  2. fostering methods of self-learning in the workplace
  3. furthering access to vocational training and promoting equality of opportunity
  4. adapting to industrial changes

As the membership of the consortium suggests, a primary focus of the VOCALL project is less widely used and taught languages (LWUTLs), namely Portuguese, Greek and Irish. In these cases, their maintenance as first language (L1) in vocationally-oriented language learning (VOLL), especially for disadvantaged users, cannot be separated from their promotion as foreign languages (FLs) for immigrant populations and learners in other member states. Written language resources such as lexica and terminology banks (let alone spoken language resources) are not well developed as learner aids in the case of LWUTLs. VOCALL addresses these concerns by developing a standardised series of learner aids for the improvement of linguistic skills in vocational contexts.

With this in mind, the project plans to source, develop where necessary, and make available in digital format (disk, CD-ROM) a multilingual glossary of technical terms in the areas mentioned above, for the languages of the partners. The glossary will use widely available technology (e.g. MS-Access, MS-Windows) and will form part of a self-learning tool encompassing multimedia technology (sound, video, graphics, text).

As well as the envisaged product being used as a self-learning tool in VOLL centres, we anticipate it being of interest to peripheral regions (in which FL competence is limited), small businesses, and in distance-learning centres. It will be of benefit to learners in an adult or lifelong context with a disadvantage in previous L1 or FL education, young learners in first-chance vocational education, as well as immigrant groups in member states whose national languages are LWUTLs.

The principal impact will be a raising of the status of LWUTLs in the vocational training of their linguistically disadvantaged citizens, whether in first-chance or adult and continuing education. This will impact on all the sectors chosen. We hope the provision of efficient, innovatory teaching aids will raise the quality of learning and teaching, thus leading to improvements in all such areas as well as wider mobility of European citizens, particularly in the vocational context.

2. Terminological Issues

The plan as outlined in the original project proposal for the first year was as follows:

  1. 0: Meeting at DCU (March)
  2. 4: Resource Survey completed (end July)
  3. 6: Meeting at ILSP (early Sept)
    1. Workplans
    2. Discuss tools, software, hardware issues.
    3. Discuss lexical issues.
  4. 11: Contact User Groups established
  5. 12: 1st lexicon finished (Feb)

It was agreed at the inaugural meeting of the VOCALL project that the field of Computer Skills would be used for the pilot study. The corpus for this pilot study would be the FAS material.

2.1 Sourcing of Terms

2.11 Dublin City University (DCU) and FAS.

English Terms: First, a corpus of texts was compiled in the area of Computer Skills, consisting of teaching materials supplied by FAS. A list of terms was extracted from this corpus, and this list was sent to ILSP, ILTEC and top Schulung in May 1996.

Further teaching materials were supplied by FAS in August 1996. The first list was then revised and a second, more complete list was sent to ILSP, ILTEC and top Schulung in August 1996.

Irish Terms: It was decided that in the case of the computer terms, it would not be possible to build up a corpus in Irish as there would not be enough material to do so. Equivalent terms in Irish exist for approximately 50% of the terms included in the DCU English term list. New terms are being created by DCU for the remaining 50%. The newly created terms will be submitted to An Coiste Tearmaiochta (The Terminology Committee for the Irish language) for approval and standardisation, this being the accepted practice for Irish term creation. The needs of the project have already been discussed with the terminologists of An Coiste Tearmaiochta and we have their full support.

2.12 ILSP

ILSP initially encountered difficulties in sourcing teaching materials in Greek in order to build up a Greek corpus. The teaching materials used by vocational training colleges in Greece are owned by the teachers and not by the institutes in question, and the teachers were not willing to give their notes to ILSP.

ILSP compiled a complementary list of English terms, which they sourced from two introductory manuals and from four computer lexica, three printed and one computerised. They also consulted the Help menus of the most commonly used software packages (e.g. Windows 95, Word 7.0, Excel for Windows, Access for Windows).

Since the meeting in Athens, ILSP has made contact with the OAED (Organization of Manpower Employment), Greece. They hope to sign a "Cooperation Agreement" with them by the end of this year, which will enable ILSP to be supplied with teaching material by OAED. A further possibility is that the OAED will serve as a test site for ILSP.

2.13 ILTEC

ILTEC consulted vocational training colleges in Portugal, some of which provided them with their training material. Their corpus also consisted of handbooks on secretarial and office skills, and some terminologies.

ILTEC then made a comparative analysis of the first list of terms from DCU and their own sources. ILTEC have now sourced Portuguese equivalents for almost all of the terms contained in the DCU revised list (containing 615 terms), the ILSP list of complementary English terms (containing 425 terms), and another list of complementary English terms which ILTEC have compiled (containing 738 terms). All of these lists are now part of the merged list (see 4.2.3).

2.14 top Schulung

top researched the teaching materials of their various training centres, using the English list provided by DCU as a basic reference source. They also used the English-language versions of Word and Excel, and a number of dictionaries, including one on-line dictionary, as reference materials. They have sourced German terms for most of the terms contained in the revised DCU list. top also approached the Technical University in Hamburg to cross-check the terms. top are currently compiling a list of the terms most commonly used in their training centres.

2.2 Criteria for inclusion of terms in list/size of list

DCU sent a copy of the corpus of FAS teaching materials to each partner. All lists of terms created up to that time by DCU, ILSP and ILTEC were collated by DCU into a merged list. The terms contained in the merged list were then tagged automatically to record their source (DCU, ILSP, ILTEC, or a combination). This merged list was then sent to all partners. Based upon all of the above material (FAS corpus, merged list, job descriptions and other end-user profile information provided by FAS), each partner was able to specify a number of user profiles.
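
As an illustration only, the collation and source-tagging step could be sketched in Python as follows. The function name, the normalisation rule (case and spacing are ignored when comparing terms) and the sample terms are assumptions made for this sketch; they are not taken from the project's own tooling or term lists.

```python
from collections import defaultdict


def merge_term_lists(lists_by_source):
    """Merge per-partner term lists and record which partners supplied each term."""
    sources = defaultdict(set)
    for source, terms in lists_by_source.items():
        for term in terms:
            # Assumed normalisation: lower-case and collapse whitespace,
            # so that "Hard Disk" and "hard disk" count as one term.
            key = " ".join(term.lower().split())
            sources[key].add(source)
    # One (term, tag) pair per term; the tag joins the contributing partners.
    return [(term, "+".join(sorted(srcs))) for term, srcs in sorted(sources.items())]


# Hypothetical sample data, not the actual VOCALL lists.
sample = {
    "DCU": ["hard disk", "spreadsheet", "cursor"],
    "ILSP": ["Hard Disk", "modem"],
    "ILTEC": ["spreadsheet", "modem", "toolbar"],
}

merged = merge_term_lists(sample)
for term, tag in merged:
    print(f"{term}\t{tag}")
```

Terms supplied by more than one partner receive a combined tag such as DCU+ILSP, which keeps the origin of every entry visible after merging.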

Following an on-line discussion of the end user profile, the following categories were decided upon:

  1. (a) secretarial/administration
  2. (b) data entry/telemarketing (telesales)
  3. (c) systems maintenance
  4. (d) working with a small company, therefore needing a variety of skills.

Users in categories (a), (b) and (d) would all be using packages such as Word, Excel and Access, or special software particular to an individual company. No programming work would be involved.

2.3 Tagging of List

ILSP proposed the following criteria for tagging the list:

  1. A: secretarial/administration, data entry, telesales
  2. B: systems maintenance
  3. C: working in small company in a number of areas
  4. D: all basic computer terms
  5. G: all general words

The merged list was coded by different members of each site to ensure agreement (see Appendix A2).
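
The report does not state how agreement between coders was checked. Purely as a sketch, one simple possibility is to count the terms on which all coders chose the same tag and to set aside the rest for discussion. The function name, the data layout and the sample codings below are hypothetical; only the tag letters A, B, C, D and G come from the criteria above.

```python
from collections import Counter

# Tag letters from the ILSP proposal above.
VALID_CODES = {"A", "B", "C", "D", "G"}


def summarise_codes(codings):
    """codings maps each term to the list of tags given by the individual coders."""
    agreed, disputed = {}, {}
    for term, codes in codings.items():
        unknown = set(codes) - VALID_CODES
        if unknown:
            raise ValueError(f"unknown code(s) {sorted(unknown)} for {term!r}")
        if len(set(codes)) == 1:
            agreed[term] = codes[0]          # every coder chose the same tag
        else:
            disputed[term] = Counter(codes)  # keep the vote counts for discussion
    rate = len(agreed) / len(codings) if codings else 0.0
    return agreed, disputed, rate


# Hypothetical codings by three coders, for illustration only.
sample = {
    "spreadsheet": ["A", "A", "A"],
    "motherboard": ["B", "B", "D"],
    "file": ["D", "D", "D"],
    "open": ["G", "D", "G"],
}

agreed, disputed, rate = summarise_codes(sample)
print(f"unanimous on {rate:.0%} of terms; to discuss: {sorted(disputed)}")
```

This is plain unanimous agreement, not a chance-corrected statistic, and it is not the evaluation metric of the project.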

Discussion is ongoing about the automatic evaluation metric that will be used to define the final list.

Following a proposal by the Greek partners, it was decided to distinguish between terms per se and the general language vocabulary used with such terms. The first group will be subject to the coding methodology as outlined above and below, while the general language vocabulary will simply be listed with equivalent translations.

It was agreed that the terms themselves (not the general language vocabulary) will need to contain at least the following information:

  1. Term
  2. Part of Speech
  3. Gender
  4. Pronunciation (audio clip, not phonetic transcription)
  5. Term Number (to link translations)

and that the following may be required:

  1. Examples
  2. Definition
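
As a sketch of how such a term record might be represented, the fields listed above map naturally onto a small data structure. The class name, field names, the language code field and the sample entries are assumptions made for illustration; the actual database layout has not been fixed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermEntry:
    term_number: int               # shared by all translations of one concept
    language: str                  # assumed field: language code, e.g. "en"
    term: str
    part_of_speech: str
    gender: Optional[str]          # None where gender is not marked
    audio_clip: str                # path to a sound file, not a transcription
    examples: List[str] = field(default_factory=list)  # optional
    definition: Optional[str] = None                   # optional


def translations(entries, term_number):
    """Return {language: term} for every entry sharing the given term number."""
    return {e.language: e.term for e in entries if e.term_number == term_number}


# Illustrative entries only; file names and numbering are hypothetical.
entries = [
    TermEntry(1, "en", "printer", "noun", None, "audio/en/0001.wav"),
    TermEntry(1, "de", "Drucker", "noun", "m", "audio/de/0001.wav"),
    TermEntry(1, "pt", "impressora", "noun", "f", "audio/pt/0001.wav"),
    TermEntry(2, "en", "keyboard", "noun", None, "audio/en/0002.wav"),
]

print(translations(entries, 1))
```

Linking translations through a shared term number means a new language can be added without altering the existing entries.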

3. Summary of results to date

The principal results to date concern work done on the first sublanguage area (computer terms), and implementation issues.

Measured against the original plan outlined at the March meeting, progress has been made in all areas. The terminological database survey has begun, and will be completed once certain institutions supply the results requested by VOCALL partners.

The terminology work has continued to the point where the first lexicon will be finished, as planned, early in the new year (as mentioned above). The other lexica have been decided upon, as has the order in which they will be treated.

Software and hardware concerns are being borne in mind, and we are closely observing current trends in the computing field, especially the move from 16-bit to 32-bit operating systems.

A pre-prototype of the tool has been developed and approved by members of the project. The next stage will be to include the finished first lexicon in the tool, at which point the first testing phase can begin.

Contact groups have been established in all countries, and teaching material has been gathered in all countries except Greece, where discussions are ongoing.

4. Work Programme for 2nd Period

  1. 1: Assembly of final version of first list (Jan.)
  2. 2: Integrate list into prototype of tool; Begin second list (Feb.); Meeting Lisbon, 28th Feb and 1st March 1997.
  3. 6: Send tool for field-testing (June)
  4. 8: Write renewal report for continued funding of project; Finish second list (August)
  5. 9: Meeting Hamburg.
  6. 12: Write final report (Dec.)

Of course, developmental work on both the tool and further lexica will continue throughout the period outlined here. During this period, the product will begin to be marketed with a view to its being completed by December 1998 (subject to further funding).

January 1997