ACL3 Project
2000/1
Title: Analysis of Email Tandem Learning
Claire Healy
98284363
Supervisor: Christine Appel
Section: Description:
1.0 Introduction
2.0 Foreign / Second Language Learning
3.0 CALL
4.0 Tandem Language Learning
5.0 Email Tandem Learning
5.1 Advantages of Email Tandem Learning
5.2 Disadvantages of Email Tandem Learning
6.0 NLP and CALL
7.0 Tagging
8.0 Mini Implementation
8.1 Tools
9.0 Further Developments
Bibliography
URLS
Appendix A - Text, Input File
Appendix B - Code
Appendix C - Results
1.0 Introduction
When I began researching for
this project, my main focus was on CALL. I then became interested in Tandem
Learning via email, and I believe it to be an important tool for foreign
language learners. Because email tandem
learning is still relatively new, and pilot projects between universities are
still ongoing, there have not been many tools created to help students using
this medium. My main aim of this project was to analyse email tandem learning,
and create a useful tool that could be incorporated as part of a tandem site.
2.0 Foreign / Second Language Learning
Second language learning and
foreign language learning are quite similar, but there is a distinction. Foreign
language learning normally occurs in the classroom. Second language learning
takes place in a natural environment, usually a country where the (second)
language is spoken. Second language acquisition implies that acquisition (as
opposed to learning) of a language occurs. The process of acquiring the second
language occurs in a more natural situation where the language is adopted. Research in
both second and foreign language learning examines how language learning occurs and
how it best succeeds. Both strive for the best results from their different
methods.
The use of computers to aid
language learning is relatively new. Computer Assisted Language Learning (CALL)
- developed in the 1960s – has played and still plays an active role in the
language learning process. Second language learning materials and methods have
been employed, with the use of the computer, to encourage language learning
outside the classroom. But as Levy (1997, page 153) points out, CALL should be
regarded as a “tool” in language learning rather than a “tutor”, or a
substitute for a teacher.
3.0 CALL
Computer Assisted Language
Learning, according to Levy (1997, page 1), is “the search for and study of
applications of the computer in language teaching and learning”. It seeks to
employ computers to improve language learning.
CALL is seen as an
attractive alternative to traditional methods of language learning and language
teaching (Nerbonne, 2001, chapter 36). In the beginning (1960s and ‘70s) CALL
was predominantly behaviouristic (“drill and practice”). The computer delivered
instructional materials to the student. Its direction changed during the 1970s
and ‘80s and CALL reached its second phase, basing itself on the communicative
approach to teaching. It moved away from the “drill and kill” method, and
focused more on the involvement of student choice, control and interaction.
With the rapid technological developments of the late 1980s and
‘90s, CALL became the focus of two important areas of this technology –
Multimedia computers and the Internet. Multimedia, mainly CD-ROMs, opened up a
new and exciting field for CALL. It allowed a variety of media (graphics,
animation, sound, and video) to present a new method of language learning using
the computer. A more authentic learning environment was created, combining
listening and seeing (reading, writing, speaking and listening) in a single
activity. This new method motivated the students and also allowed them to work
and learn at their own pace. Although Multimedia was regarded as an exciting
breakthrough with CALL, it did not appeal to all areas of the learning
community. The industrial sector
invested in this technology often at considerable expense. Schools and
universities also invested in multimedia software, but some were and still are
on tight budgets and had uncertain hardware and software infrastructures.
As a result, language learners and teachers turned towards the Internet (Warschauer, 1996,
page 4), a move also encouraged by the poor quality of Multimedia material.
Today, the Internet is the
most widely used resource in both educational institutions and industrial
bodies. The World Wide Web (WWW) can be used to search through millions of
files throughout the world, to locate materials such as newspaper articles,
radio broadcasts, books etc. With regard to language learning, the Internet is
seen as a more efficient and less expensive medium (in comparison to multimedia
software). It offers a store of information and resources that teachers can use
to expose students to authentic language use (Gitsaki & Taylor, 1999, page
47). The expansion of the Internet has more recently brought with it a wide
variety of opportunities for exploitation in the language learning field. This
is contributing to the redefining of CALL and opening up new horizons which
were barely conceivable a decade ago (Woodin, 1997, page 30).
Computer–Mediated
Communication (CMC) has existed in primitive form since the 1960s (Warschauer,
1996, page 5), but has only become widespread in the last decade, thanks to the
Internet and the widespread availability of the Personal Computer (PC). Taking
all computer applications to date into account, it has probably had the
greatest impact on language learning and teaching. The use of electronic mail
(e-mail)/chat is one of the most efficient methods of communication. It has
become a reliable means of communication in everyday life in the last five
years. E-mail has widened the scope for language learners and become an
important tool in today’s methods of language teaching and learning.
4.0 Tandem Language Learning
Tandem learning first developed
as face-to-face meetings between learners with different mother tongues, both
of whom wanted to learn the other’s native language. Learners teamed up with a
native speaker, such as an exchange student, and learned each other’s language,
while being supported by a framework of counselling sessions, collaborative
tasks and activities, etc. Tandem learning occupies a place somewhere between
classroom learning and self-instruction (Schwienhorst, 1999, page 2).
There are three principles
of tandem learning. The first principle is reciprocity. Both learners must
benefit equally from the partnership. Each learner depends on the other’s help,
and expects to receive as much help as he/she gives. Each learner relies on the
other’s support to make the partnership a success. The second principle is
bilingualism. Each partner should contribute equal amounts of L1 and L2 to the
conversation. The third principle is learner autonomy. Tandem partners are
responsible for their own learning process. There is a mutual responsibility to
make the partnership as beneficial as possible to each other. With the support
of one another, both partners avail of the opportunity to communicate in their
target language, as frequent involvement in purposeful communication plays a crucial
role in the development of oral proficiency. Another benefit of this type of
learning is that learners develop new perspectives on their own and their
target language, precisely because they communicate with their partner
bilingually (Little & Ushioda, 1998, page 96). Unfortunately, face-to-face
tandem learning is not as widespread or as available as we would like it to be.
Organising two native speakers of different languages, wanting to learn each
other’s language in the same place at the same time is more difficult than it
seems. Tandem learning via email has become a more common medium in the last
few years, but that is not to say that face-to-face tandem learning is a thing
of the past.
5.0 Email Tandem Learning
For a number of years email
has been used as a tool for second language learning, formally and informally.
Formally, various projects have involved email to link language classrooms in
different countries, and informally, language learners with individual email
accounts have sought ‘pen’-friendships with native speakers of their target
language (Little & Ushioda, 1998, page 95). But in more recent years, mainly thanks to the International
Email Tandem Network (http://www.slf.ruhr-uni-bochum.de/email/), interest within CALL
has turned towards email, which has been integrated into Tandem Language
Learning. The three principles of
Tandem Learning (discussed above) are applied to Tandem Language Learning via
email. Email is an exclusively written medium, but it is accepted that email discourse is a
combination of written and spoken discourse (Murray, 1991, cited by Woodin,
1997, page 29). Tandem learning via email not only develops writing skills but
it also develops oral proficiency and linguistic and metalinguistic awareness.
Writing and reading provide an analytic insight into how language is
constructed.
Email Tandem Learning is
generally between two partners, who exchange one-to-one emails, but some
institutions also make discussion forums available to the learners, where every
message that is sent is automatically received by all members of the group.
With one-to-one emailing, each student is teamed up with a native speaker of
his/her target language. The topics of discussion can be agreed on and are
generally much more meaningful than classroom discussions. Tandem learning is
mainly used as part of a curriculum. It is integrated within the existing
course structure, and students from both universities are generally set similar
projects, based on their target language or target culture. This provides
ready-made topics of discussion for shy students. Although this seems to
be an exciting venture, it can be somewhat daunting for learners. Participants
need to form a relationship with a native speaker of a foreign language whom
they have never met before and cannot see. They must negotiate a series of
exchanges and obtain information of interest to them, write in the foreign
language without the support of non-verbal clues and maintain the relationship
once the initial excitement has disappeared. They need to be more explicit than
in face-to-face conversations and yet be tactful so as not to offend (Woodin
1997, page 29). But the students generally have the support and counselling of
their teachers.
Students email their
partners in both L1 and L2. This can be done by writing firstly in their native
language and then in their target language, or vice versa, but as the partners
become more acquainted and more comfortable with each other they can experiment
by mixing the two languages within paragraphs, sentences, etc. On the principle
of reciprocity, both partners should help each other, especially with regard
to error correction. Because the partners’ teachers are not explicitly
involved, they must rely on each other’s corrections and suggestions to improve
their language learning. This is difficult at first, as students do not want to
correct all their partner’s mistakes, and therefore must decide which are the
most important ones to correct. It is not motivating for the students if they receive
their emails back, with numerous corrections inserted. Students must be clear
when giving feedback. Native speakers generally base their corrections on
intuition, rather than knowledge of grammatical rules. Re-using their partner’s
corrections, and idioms and expressions from their emails shows an active
involvement in the learning process.
5.1 Advantages of Email Tandem Learning
1. Partners can write to each other any time.
2. It is more relaxed and informal than classroom learning.
3. When writing messages, learners can think about what they are going to write, and have time to write it correctly.
4. Partners can teach each other colloquial idioms, which are generally not taught in the classroom.
5. Learners can receive first-hand information about the target language, country and culture.
6. Each student has a record of all emails and corrections, and so can review them any time.
5.2 Disadvantages of Email Tandem Learning
1. Some students don’t view their partner as a real person, and consider them more as a machine (according to a study by Woodin, 1997, page 27).
2. Because it is not synchronous communication, the response from the partner is slower than in face-to-face tandem learning, and so the learning process may be viewed as slower. It may be argued, however, that this benefits the learner, because he/she has time to reflect on corrections and to take in new concepts that he/she has learned from his/her partner.
3. Because the participants are not teachers, errors may be left uncorrected, either because they go unnoticed or because they are considered unimportant. One could argue, however, that a native speaker has more insight into what is important or unimportant, because it is his/her native tongue.
6.0 NLP and CALL
Natural Language Processing
(NLP) uses computers to process language – analyse, store, sort and search it. NLP
technology uses many tools to better aid us in the task of studying and
processing Natural Language. It is only natural for these technologies to be
applied to language learning and teaching. Below are some NLP applications
which have sought to contribute to CALL:
Concordancing
Text-Alignment
Speech Recognition and Synthesis
Machine Translation
Syntactic Processing
Morphological Processing
These technologies are
applied to illustrate linguistic structures, make language comprehensible,
provide varied exercise material and spot and correct errors. For the more
advanced language learners, corpora are an invaluable source of information and
authentic material. A corpus allows students to appreciate linguistic patterns
and distinctions. Bilingual corpora provide similar information to monolingual
corpora and also a convenient translation into a known language. A bilingual
corpus is of little practical value to language learners and teachers without
the application of text alignment. One of the most convenient and useful tools
applied to corpora is a part-of-speech tagger.
7.0 Tagging
Part-of-Speech (POS) tagging is the process of assigning a POS or other lexical marker to each word in a corpus. Knowledge of the different POS in a text plays an important role in NLP. The input to a tagging algorithm is a string of words and a specified tagset. The output is a single best tag for each word. Take for example, the sample sentences below, taken from the ATIS corpus of dialogues about air-travel reservations.
VB   DT   NN
Book that flight.

VBZ  DT   NN     VB    NN
Does that flight serve dinner?

(DT = Determiner; NN = Noun, singular or mass; VB = Verb, base form; VBZ = Verb, 3rd person singular present)
(see http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQP-HTMLDemo/PennTreebankTS.html for the full Penn Treebank Tagset)
The sentences above are simple and unambiguous as a whole, but individually some of the words are ambiguous. Most words in English have only one meaning (are unambiguous) and therefore have only a single tag. But many of the most commonly used words in English have more than one meaning (are ambiguous, when context is not taken into account) and so have at least two tags. In the first example above, “Book” is a verb, but it can also be a noun (“The book is on the shelf”). The problem of POS tagging is to resolve these ambiguities, choosing the proper tag for the context. Many ambiguous tokens are easy to disambiguate. This is because the various tags associated with a word are not equally likely.
Most tagging algorithms fall into one of two classes: rule-based taggers and stochastic taggers. Rule-based taggers generally involve a large database of hand-written disambiguation rules which specify, for example, that an ambiguous word is a noun rather than a verb if it follows a determiner. Stochastic taggers generally resolve tagging ambiguities by using a training corpus to compute the probability of a given word having a given tag in a given context. These taggers are based on the Hidden Markov Model or HMM (see Jurafsky & Martin 2000, Chapter 7). For a given sentence or word sequence, HMM taggers choose the tag sequence that maximises the following formula:
P(word | tag) * P(tag | previous n tags)
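The formula can be illustrated with a minimal Perl sketch. It is not how the TreeTagger or any particular HMM tagger is implemented: a real HMM tagger maximises over the whole tag sequence, whereas this sketch only scores a single ambiguous word given one known previous tag (n = 1). All probability values, and the names best_tag, %p_word_given_tag and %p_tag_given_prev, are invented here purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical values for P(word | tag), invented for this example.
my %p_word_given_tag = (
    'book' => { 'VB' => 0.002, 'NN' => 0.004 },
);

# Hypothetical values for P(tag | previous tag), invented for this example.
my %p_tag_given_prev = (
    'DT' => { 'NN' => 0.50, 'VB' => 0.01 },
    'TO' => { 'NN' => 0.02, 'VB' => 0.60 },
);

# Return the candidate tag with the highest
# P(word | tag) * P(tag | previous tag), or undef for an unknown word.
sub best_tag {
    my ($word, $prev_tag) = @_;
    my $candidates = $p_word_given_tag{ lc($word) } or return undef;
    my ($best, $best_score) = (undef, -1);
    for my $tag (sort keys %$candidates) {
        my $transition = $p_tag_given_prev{$prev_tag}{$tag} // 0;
        my $score      = $candidates->{$tag} * $transition;
        if ($score > $best_score) {
            ($best, $best_score) = ($tag, $score);
        }
    }
    return $best;
}

print best_tag('book', 'DT'), "\n";   # after a determiner ("the book")
print best_tag('book', 'TO'), "\n";   # after "to" ("to book")
```

With these made-up numbers, “book” is scored as a noun after a determiner (0.004 * 0.50 against 0.002 * 0.01) and as a verb after “to” (0.002 * 0.60 against 0.004 * 0.02), which is the kind of context-dependent choice the formula expresses.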
The transformation based tagger or Brill tagger shares features of both tagging architectures outlined above.
As part of my mini implementation, a piece of text has been tagged in advance by a language-independent POS tagger called the TreeTagger. The TreeTagger was developed within the Text Corpora (TC) project (http://www.ims.uni-stuttgart.de/projekte/tc/) at the Institute for Computational Linguistics at the University of Stuttgart. This tagger has been successfully used to tag English, German, French and Italian text. The TreeTagger is a probabilistic POS tagger. It avoids the problems that Markov Model based taggers face when they have to estimate transition probabilities from sparse data. The performance of the TreeTagger was tested on data from the Penn-Treebank corpus. Some 2 million words were used for training and 100,000 words from a different part of the Penn-Treebank corpus for testing. The TreeTagger achieves 96.36% accuracy on Penn-Treebank data, which is better than that of a trigram tagger (96.06%) on the same data (for more information on the tagger see http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html). The reasons for choosing this particular tagger are discussed in Section 8.1 (Tools).
8.0 Mini Implementation
My implementation presents a text together with its tags. The text has been previously tagged by the TreeTagger (discussed above), so the POS of each word is available. The text should be viewable, and each of the main POS (verb, noun, adjective, adverb, preposition, determiner, conjunction, pronoun) should be selectable in turn and shown in its own colour. The idea of the program is to equip learners of English with a better understanding of English POS, and the construction of sentences.
The
implementation is aimed at Foreign/Second Language Learners of English, at a
high intermediate level (probably Leaving Certificate equivalent / 1st
year university students). It could also be used by language teachers and
researchers. Teachers can use it in a language lab environment to show
students the different POS and their role in the language. As
outlined above, it should facilitate the students’ understanding of the POS in
English and English grammar in general. The students will be able to
choose which POS they want to see
highlighted. For example, if they choose to look at all the nouns in any text,
they will be highlighted in RED, or similarly verbs in BLUE. Viewing each
highlighted POS separately in a text will create a greater awareness of that
particular POS within that piece of text.
Learners should be able to see more clearly the structure of English
sentences, and how each POS plays a
role in this structure, i.e. the order of subjects, adjectives etc. Language learners
often translate sentences literally from their native language into
their target language, and do not take into account that sentence structure
varies from language to language. This
tool should help them to become more aware of their target language in both its spoken and
written forms. An
alternative would be to show the tag labels of each word, for example e-mail
<NN>, as in a corpus, but
as discussed earlier, corpora are for more advanced learners. Studying texts
and their tags is a time-consuming process.
8.1 Tools
The use of colours should keep the student
interested, focused and motivated. Using colours is an important factor in the psychology
of learning. Colour brings black and white text alive. According to Rhodes and
Thame (1988, page 122), ‘colour’ highlighting of this kind is an ingenious tool
for revealing structure in written material, which is precisely the aim of the
implementation. The allocation of colours to the POS is as follows:
NOUN         RED
VERB         DARK BLUE
PRONOUN      ORANGE
ADJECTIVE    GREEN
ADVERB       LIGHT BLUE
CONJUNCTION  ILLUMINOUS GREEN
DETERMINER   PURPLE
PREPOSITION  YELLOW
Some of the POS are closely linked, for example, Verb and Adverb (modifies the verb)
and Noun and Pronoun (refers to some noun phrase or entity or event), and so
similar colours have been chosen to show the relationship between them.
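The colour allocation could be held in a single lookup table rather than spread through the program. The following is only a sketch under stated assumptions: the tag prefixes (NN, IN, DT, JJ, RB, CC, VB, PP) are the ones tested in the program in Appendix B, the name colour_of is hypothetical, and the sketch matches a prefix at the start of the tag, which is stricter than the pattern match used in Appendix B.

```perl
use strict;
use warnings;

# Tag prefix => colour, in the order in which the prefixes are tested.
# The colours follow the allocation table above.
my @colour_for = (
    [ 'NN' => 'RED' ],               # noun
    [ 'IN' => 'YELLOW' ],            # preposition
    [ 'DT' => 'PURPLE' ],            # determiner
    [ 'JJ' => 'GREEN' ],             # adjective
    [ 'RB' => 'LIGHT BLUE' ],        # adverb
    [ 'CC' => 'ILLUMINOUS GREEN' ],  # conjunction
    [ 'VB' => 'DARK BLUE' ],         # verb
    [ 'PP' => 'ORANGE' ],            # pronoun
);

# Return the colour for a tag, or undef if the tag belongs to
# none of the eight main POS (e.g. punctuation or numbers).
sub colour_of {
    my ($tag) = @_;
    for my $pair (@colour_for) {
        my ($prefix, $colour) = @$pair;
        return $colour if index($tag, $prefix) == 0;
    }
    return undef;
}

for my $tag ('NNS', 'VBZ', 'RBR', 'SENT') {
    my $colour = colour_of($tag);
    print "$tag: ", (defined $colour ? $colour : 'no colour'), "\n";
}
```

Keeping the allocation in one table means that a colour can be changed, or a further POS added, without touching the logic that prints the words.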
The programming language chosen for this implementation is PERL, as it is well
suited to dealing with texts. PERL was designed to write short
programs to process text files, using its elaborate and powerful pattern
matching techniques, and then produce results of that processing.
There
are many taggers freely available online for commercial and personal use. But I
chose the TreeTagger, as it has already been used to successfully tag German,
English, Italian and French texts, and it is easily adaptable to other
languages if a lexicon and a manually tagged training corpus are available.
This tagger is ideal for my project, because I hope to use it further as part
of my 4th year project, and incorporate texts in another language
(probably French – see Further Developments).
8.2 Perl Program
- The program reads in the Input File (see Appendix A), which contains the tags for the text used in this implementation.
- The Input File consists of three columns: Word, Tag and Lemma. (A lemma is the base form shared by a set of related word forms, e.g. “provides”, “provided” and “providing” all have the lemma “provide”.)
- Each line of the Input File is split into these three elements.
- These elements are processed in different ways, depending on the task (“action”) specified on the command line (see Appendix C).
- “plain” prints out the text in a form similar to its original format. It simply takes each Word (1st element of the input line) and prints it to a file with a space in between each one.
- “tags” uses the 1st and 2nd elements of the input line, and “assigns” each Tag (POS) to its Word.
- “colour” uses a similar process to “tags”, as it prints out the Word with its “Colour Tag” beside it.
This program is very basic. It will need to be implemented in CGI (Common Gateway
Interface), which will allow me to have a more user-friendly interface, using
menus and different screens. CGI is used along with HTML, and so is ideal for
interactive websites. CGI allows the user to interact with the website, i.e. it
prompts the user for input and processes this information. By programming this implementation in CGI,
I can use colours to highlight the words, rather than the primitive “Colour
Tags” used in this Perl program. Also, the user will have a choice as to which
POS he/she wants to view. This allows the
user to have more control of the tool, and so the user can learn at his/her own
pace.
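As a rough sketch of what the CGI version might produce, the Perl fragment below turns tagged input lines into HTML in which only the POS chosen by the user is coloured. It is an illustration under assumptions, not the planned implementation: the names %pos_info, escape_html and highlight, and the HTML colour names, are hypothetical, and only two POS are listed. A real CGI script would also print a “Content-type: text/html” header and read the user's choice from the submitted form.

```perl
use strict;
use warnings;

# Hypothetical mapping from the POS chosen by the user to a tag
# prefix and an HTML colour name.
my %pos_info = (
    noun => { prefix => 'NN', colour => 'red' },
    verb => { prefix => 'VB', colour => 'darkblue' },
);

# Escape the characters that have a special meaning in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Take the chosen POS and a list of "word<TAB>tag<TAB>lemma" lines;
# return the words as one HTML string with the chosen POS coloured.
sub highlight {
    my ($chosen, @lines) = @_;
    my $info = $pos_info{$chosen} or die "Unknown POS: $chosen\n";
    my @out;
    for my $line (@lines) {
        chomp $line;
        my ($word, $tag) = split /\t/, $line;
        my $html = escape_html($word);
        if (index($tag, $info->{prefix}) == 0) {
            $html = qq{<span style="color: $info->{colour}">$html</span>};
        }
        push @out, $html;
    }
    return join ' ', @out;
}

my @sample = ("Book\tVB\tbook", "that\tDT\tthat", "flight\tNN\tflight");
print highlight('noun', @sample), "\n";
```

Because the chosen POS is a parameter, the same routine serves every menu choice, which fits the aim of letting the user decide which POS to view.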
9.0 Further Developments
My main area of interest in
the Computational Linguistics field is CALL, more precisely Computer Mediated
Communication (CMC). I have focused my attention on Tandem Language Learning,
as I see this area as an important tool in assisting the development of foreign
language learning. My mini implementation in this project is a basis for my 4th
year project implementation. Tandem Language Learners learning via email need
as much support and as many tools as possible to be made available to them,
since it is an “outside the classroom” activity. I propose to incorporate my
POS highlighting program (programmed in CGI and Perl) into an email tandem website, as a tool to aid language learners
and also their supervisors. The texts to be processed would not be pre-chosen.
They would be the students’ emails that they have sent and received. The emails
could be viewed, with the chosen POS highlighted (using this project’s
implementation). But because the language to be processed will be a natural
language (English/French), written by the learners of that language, problems
will be encountered. Spelling mistakes, slang words, metaphors, colloquial idioms
etc. will be some of the main obstacles.
I hope to get around these obstacles (perhaps by accessing a spell
checker and dictionary) and create a useful tool for foreign language learners
using email tandem learning.
Bibliography
Christiansen, T & Torkington, N 1998. Perl Cookbook. Cambridge: O'Reilly.
Gitsaki, C & Taylor, R P 1999. Internet-based activities for the ESL classroom. RECALL Journal. Vol.11 No.1 May 1999.
Jurafsky, D & Martin, J 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall Inc.
Levy, M 1997. Computer-Assisted Language Learning: Context and Conceptualization. Oxford: Oxford University Press.
Little, D & Ushioda, E 1998. Designing, implementing and evaluating a project in tandem language learning via e-mail. RECALL Journal. Vol.10 No.1 May 1998.
Murray, D 1991. Conversation for action: the computer terminal as a medium of communication. Amsterdam: John Benjamins.
Nerbonne, J 2001. Natural Language Processing in Computer-Assisted Language Learning. Chapter 36 in Ruslan Mitkov (ed.) Handbook of Computational Linguistics. Oxford: Oxford University Press.
Rhodes, J & Thame, S 1988. The Colours of Your Mind. Collins.
Schwienhorst, K 1997. Talking on the MOO: Learner autonomy and language learning in tandem. Paper presented at the CALLMOO: Enhancing Language Through Internet Technologies, Bergen, Norway.
Sebesta, R W 1999. A Little Book on Perl. Upper Saddle River, NJ: Prentice Hall.
Warschauer, M 1996. Computer-Assisted Language Learning: An introduction. In S. Fotos (Ed.) Multimedia language teaching. Tokyo, Japan: Logos International.
Woodin, J 1997. Email tandem learning and the communicative curriculum. RECALL Journal. Vol.9 No.1 May 1997.
URLS
http://www.slf.ruhr-uni-bochum.de/email/ last visited: 27/08/01
http://www.ims.uni-stuttgart.de/projekte/tc/ last visited: 20/08/01
http://wilde.cs.tcd.ie:2222/d-tandem.html last visited: 11/08/01
http://llt.msu.edu/ last visited: 30/08/01
http://www.activestate.com last visited: 24/08/01
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQP-HTMLDemo/PennTreebankTS.html last visited: 18/10/01
Appendix A - Text, Input File
Text
For a number of years e-mail has
been used to support second language learning both formally and informally. At
the formal end of the spectrum there have been projects of various kinds
linking language classrooms in different countries (see, e.g., Eck et al. 1995); and at the informal end,
language learners with individual e-mail accounts have sought pen-friend-ships
with native speakers of their target language. More recently there has been a
surge of interest in the use of e-mail for tandem language learning, thanks
largely to the work of the International E-Mail Tandem Network, co-ordinated by
Helmut Brammerts at the R u h
r-Universität Bochum (see Little and Brammerts 1996). Inevitably, the Network’s
first concern has been to establish reliable infrastructures so that tandem
language learning by e-mail can actually take place. But members of the Network
recognize that long-term progress depends on elaborating appropriate theories,
using the theories to shape pedagogical experiments, and subjecting those
experiments to empirical evaluation. T
h i s paper is a preliminary contribution to that process. It first explores
some of the central issues of principle that arise from the concept of tandem
language learning in general and its e-mail version in particular, and then reports
on the pilot phase of an e-mail tandem project involving Irish university
students learning German and German university students learning English.
Input File
For IN for
a DT a
number NN number
of IN of
years NNS year
e-mail NN mail
has VBZ have
been VBN be
used VBN use
to TO to
support VB support
second JJ second
language NN language
learning VBG learn
both DT both
formally RB formally
and CC and
informally RB informally
. SENT .
At IN at
the DT the
formal JJ formal
end NN end
of IN of
the DT the
spectrum NN spectrum
there EX there
have VBP have
been VBN be
projects NNS project
of IN of
various JJ various
kinds NNS kind
linking VBG link
language NN language
classrooms NNS classroom
in IN in
different JJ different
countries NNS country
( ( (
see VB see
, , ,
e.g. NP e.g.
, , ,
Eck NP Eck
et NP et
al NN al
. SENT .
1995 CD @card@
) ) )
; : ;
and CC and
at IN at
the DT the
informal JJ informal
end NN end
, , ,
language NN language
learners NNS learners
with IN with
individual JJ individual
e-mail NN mail
accounts NNS account
have VBP have
sought VBN seek
pen-friend-ships NNS ship
with IN with
native JJ native
speakers NNS speaker
of IN of
their PP$ their
target NN target
language NN language
. SENT .
More RBR more
recently RB recently
there EX there
has VBZ have
been VBN be
a DT a
surge NN surge
of IN of
interest NN interest
in IN in
the DT the
use NN use
of IN of
e-mail NN mail
for IN for
tandem JJ tandem
language NN language
learning NN learning
, , ,
thanks NNS thanks
largely RB largely
to TO to
the DT the
work NN work
of IN of
the DT the
International NP International
E-Mail NP Mail
Tandem NP Tandem
Network NP Network
, , ,
co-ordinated VBN co-ordinated
by IN by
Helmut NP Helmut
Brammerts NP Brammerts
at IN at
the DT the
R NP R
u NP u
h NN h
r-Universität NP r-Universität
Bochum NP Bochum
( ( (
see VBP see
Little NP Little
and CC and
Brammerts NP Brammerts
1996 CD @card@
) ) )
. SENT .
Inevitably RB inevitably
, , ,
the DT the
Network's NP Network's
first JJ first
concern NN concern
has VBZ have
been VBN be
to TO to
establish VB establish
reliable JJ reliable
infrastructures NNS infrastructures
so IN so
that DT that
tandem JJ tandem
language NN language
learning VBG learn
by IN by
e-mail NN mail
can MD can
actually RB actually
take VB take
place NN place
. SENT .
But CC but
members NNS member
of IN of
the DT the
Network NP Network
recognize VBP recognize
that DT that
long-term JJ long-term
progress NN progress
depends VBZ depend
on IN on
elaborating VBG elaborate
appropriate JJ appropriate
theories NNS theory
, , ,
using VBG use
the DT the
theories NNS theory
to TO to
shape VB shape
pedagogical JJ pedagogical
experiments NNS experiment
, , ,
and CC and
subjecting VBG subject
those DT those
experiments NNS experiment
to TO to
empirical JJ empirical
evaluation NN evaluation
. SENT .
T NP T
h JJ h
i NN i
s NN s
paper NN paper
is VBZ be
a DT a
preliminary JJ preliminary
contribution NN contribution
to TO to
that DT that
process NN process
. SENT .
It PP it
first RB first
explores VBZ explore
some DT some
of IN of
the DT the
central JJ central
issues NNS issue
of IN of
principle NN principle
that WDT that
arise VBP arise
from IN from
the DT the
concept NN concept
of IN of
tandem JJ tandem
language NN language
learning VBG learn
in IN in
general NN general
and CC and
its PP$ its
e-mail NN mail
version NN version
in IN in
particular JJ particular
, , ,
and CC and
then RB then
reports NNS report
on IN on
the DT the
pilot NN pilot
phase NN phase
of IN of
an DT an
e-mail NN mail
tandem NN tandem
project NN project
involving VBG involve
Irish JJ Irish
university NN university
students NNS student
learning VBG learn
German JJ German
and CC and
German JJ German
university NN university
students NNS student
learning VBG learn
English. JJ English.
Appendix B – Code
# program.pl
# Input: File containing
Part-of-Speech tags of a tagged text,
# specified on the
command line
# Output: The text on its own, the
text with its POS tags, and the text with
# colour tags assigned.
# Claire Healy
# CL3
# 98284363
# 31.08.01
#! c:/perl/bin/perl.exe
($InFileName,
$OutFileName, $action) = @ARGV; # the 3 arguments on the
# command line-input file
#
output file, action
open (IN, $InFileName); # open input file
$OutFileName = ">" . $OutFileName; # allows output file to be
open (OUT, $OutFileName);
# open output file
#
written to
if ($action eq "tags")   # if the 3rd argument on the command line is "tags", proceed
{
    $wordCount = 0;   # set word count to 0
    while ($line = <IN>)   # while the end of the file hasn't been reached yet
    {
        # read in the line and split it into 3 elements
        ($word, $tag, $lemma) = split /\t/, $line;
        # print (to the output file) the word and its tag
        print OUT "$word<$tag> ";
        $wordCount++;   # increment word count
        # when the number of words on a line reaches 6, print a newline
        # (move onto a new line) and set the word count back to 0
        # - this ensures words are not separated from their tags
        if ($wordCount == 6)
        {
            print OUT "\n";
            $wordCount = 0;
        }#endif
    }#endwhile
}#endif
elsif ($action eq "plain")   # if the 3rd argument on the command line is "plain", proceed
{
    $wordCount = 0;
    # same as above, except it only prints out the words,
    # so plain text is created
    while ($line = <IN>)
    {
        ($word, $tag, $lemma) = split /\t/, $line;
        print OUT "$word ";
        $wordCount++;
        if ($wordCount == 10)   # the number of words per line is 10
        {
            print OUT "\n";
            $wordCount = 0;
        }#endif
    }#endwhile
}#endelsif
elsif ($action eq "colour")   # if the 3rd argument on the command line is "colour", proceed
{
    while ($line = <IN>)
    {
        ($word, $tag, $lemma) = split /\t/, $line;
        # print out the corresponding colour for each tag, beside the word
        if ($tag =~ /NN/i)
        {
            print OUT "$word<RED> ";
        }#endif
        elsif ($tag =~ /IN/i)
        {
            print OUT "$word<YELLOW> ";
        }#endelsif
        elsif ($tag =~ /DT/i)
        {
            print OUT "$word<PURPLE> ";
        }#endelsif
        elsif ($tag =~ /JJ/i)
        {
            print OUT "$word<GREEN> ";
        }#endelsif
        elsif ($tag =~ /RB/i)
        {
            print OUT "$word<LIGHT BLUE> ";
        }#endelsif
        elsif ($tag =~ /CC/i)
        {
            print OUT "$word<ILLUMINOUS GREEN> ";
        }#endelsif
        elsif ($tag =~ /VB/i)
        {
            print OUT "$word<BLUE> ";
        }#endelsif
        elsif ($tag =~ /PP/i)
        {
            print OUT "$word<ORANGE> ";
        }#endelsif
        else
        {
            # print out the word on its own, if it does not have one
            # of the above 8 tags
            print OUT "$word ";
        }#endelse
    }#endwhile
}#endelsif
close OUT; # close the output file
close IN; # close the input file
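The colour branch above tests each tag with an unanchored pattern, so /DT/ also matches WDT and /PP/ also matches PP$. This is why that<WDT> in tag.txt appears as that<PURPLE> in colour.txt (Appendix C). The same mapping can be written as a lookup table. The following is a sketch only, not the program used to produce the results: the names @COLOUR_RULES and colour_for are hypothetical, and the sketch deliberately keeps the original test order and substring matching so that it assigns the same colours.

```perl
use strict;
use warnings;

# Hypothetical table (not in the original program): tag pattern => colour,
# listed in the same order as the if/elsif chain in the colour branch.
my @COLOUR_RULES = (
    [ 'NN' => 'RED' ],
    [ 'IN' => 'YELLOW' ],
    [ 'DT' => 'PURPLE' ],
    [ 'JJ' => 'GREEN' ],
    [ 'RB' => 'LIGHT BLUE' ],
    [ 'CC' => 'ILLUMINOUS GREEN' ],
    [ 'VB' => 'BLUE' ],
    [ 'PP' => 'ORANGE' ],
);

# Return the colour for a tag, or undef if none of the 8 patterns match.
sub colour_for {
    my ($tag) = @_;
    return unless defined $tag;
    for my $rule (@COLOUR_RULES) {
        my ($pattern, $colour) = @$rule;
        # Case-insensitive substring match, as in the original chain.
        return $colour if $tag =~ /\Q$pattern\E/i;
    }
    return;
}

# Three word/tag pairs taken from the tagged text.
for my $pair ([ 'that', 'WDT' ], [ 'its', 'PP$' ], [ 'to', 'TO' ]) {
    my ($word, $tag) = @$pair;
    my $colour = colour_for($tag);
    print defined $colour ? "$word<$colour> " : "$word ";
}
print "\n";
```

Keeping the rules in one table means a colour can be changed or a tag added in a single place. Matching whole tags instead of substrings would need anchored patterns and would change which words are coloured.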
Appendix C – Results
Command line prompt:
H:\>perl program.pl pos.txt plain.txt plain
Result:
plain.txt
For a number of years e-mail has been used to
support second language learning both formally and informally . At
the formal end of the spectrum there have been projects
of various kinds linking language classrooms in different countries (
see , e.g. , Eck et al . 1995 )
; and at the informal end , language learners with
individual e-mail accounts have sought pen-friend-ships with native speakers of
their target language . More recently there has been a
surge of interest in the use of e-mail for tandem
language learning , thanks largely to the work of the
International E-Mail Tandem Network , co-ordinated by Helmut Brammerts at
the R u h r-Universität Bochum ( see Little and
Brammerts 1996 ) . Inevitably , the Network's first concern
has been to establish reliable infrastructures so that tandem language
learning by e-mail can actually take place . But members
of the Network recognize that long-term progress depends on elaborating
appropriate theories , using the theories to shape pedagogical experiments
, and subjecting those experiments to empirical evaluation . T
h i s paper is a preliminary contribution to that
process . It first explores some of the central issues
of principle that arise from the concept of tandem language
learning in general and its e-mail version in particular ,
and then reports on the pilot phase of an e-mail
tandem project involving Irish university students learning German and German
university students learning English.
Command line prompt:
H:\>perl program.pl pos.txt tag.txt tags
Result:
tag.txt
For<IN> a<DT> number<NN> of<IN> years<NNS> e-mail<NN>
has<VBZ> been<VBN> used<VBN> to<TO> support<VB> second<JJ>
language<NN> learning<VBG> both<DT> formally<RB> and<CC> informally<RB>
.<SENT> At<IN> the<DT> formal<JJ> end<NN> of<IN>
the<DT> spectrum<NN> there<EX> have<VBP> been<VBN> projects<NNS>
of<IN> various<JJ> kinds<NNS> linking<VBG> language<NN> classrooms<NNS>
in<IN> different<JJ> countries<NNS> (<(> see<VB> ,<,>
e.g.<NP> ,<,> Eck<NP> et<NP> al<NN> .<SENT>
1995<CD> )<)> ;<:> and<CC> at<IN> the<DT>
informal<JJ> end<NN> ,<,> language<NN> learners<NNS> with<IN>
individual<JJ> e-mail<NN> accounts<NNS> have<VBP> sought<VBN> pen-friend-ships<NNS>
with<IN> native<JJ> speakers<NNS> of<IN> their<PP$> target<NN>
language<NN> .<SENT> More<RBR> recently<RB> there<EX> has<VBZ>
been<VBN> a<DT> surge<NN> of<IN> interest<NN> in<IN>
the<DT> use<NN> of<IN> e-mail<NN> for<IN> tandem<JJ>
language<NN> learning<NN> ,<,> thanks<NNS> largely<RB> to<TO>
the<DT> work<NN> of<IN> the<DT> International<NP> E-Mail<NP>
Tandem<NP> Network<NP> ,<,> co-ordinated<VBN> by<IN> Helmut<NP>
Brammerts<NP> at<IN> the<DT> R<NP> u<NP> h<NN>
r-Universität<NP> Bochum<NP> (<(> see<VBP> Little<NP> and<CC>
Brammerts<NP> 1996<CD> )<)> .<SENT> Inevitably<RB> ,<,>
the<DT> Network's<NP> first<JJ> concern<NN> has<VBZ> been<VBN>
to<TO> establish<VB> reliable<JJ> infrastructures<NNS> so<IN> that<DT>
tandem<JJ> language<NN> learning<VBG> by<IN> e-mail<NN> can<MD>
actually<RB> take<VB> place<NN> .<SENT> But<CC> members<NNS>
of<IN> the<DT> Network<NP> recognize<VBP> that<DT> long-term<JJ>
progress<NN> depends<VBZ> on<IN> elaborating<VBG> appropriate<JJ> theories<NNS>
,<,> using<VBG> the<DT> theories<NNS> to<TO> shape<VB>
pedagogical<JJ> experiments<NNS> ,<,> and<CC> subjecting<VBG> those<DT>
experiments<NNS> to<TO> empirical<JJ> evaluation<NN> .<SENT> T<NP>
h<JJ> i<NN> s<NN> paper<NN> is<VBZ> a<DT>
preliminary<JJ> contribution<NN> to<TO> that<DT> process<NN> .<SENT>
It<PP> first<RB> explores<VBZ> some<DT> of<IN> the<DT>
central<JJ> issues<NNS> of<IN> principle<NN> that<WDT> arise<VBP>
from<IN> the<DT> concept<NN> of<IN> tandem<JJ> language<NN>
learning<VBG> in<IN> general<NN> and<CC> its<PP$> e-mail<NN>
version<NN> in<IN> particular<JJ> ,<,> and<CC> then<RB>
reports<NNS> on<IN> the<DT> pilot<NN> phase<NN> of<IN>
an<DT> e-mail<NN> tandem<NN> project<NN> involving<VBG> Irish<JJ>
university<NN> students<NNS> learning<VBG> German<JJ> and<CC> German<JJ>
university<NN> students<NNS> learning<VBG> English.
<JJ>
Command line prompt:
H:\>perl program.pl pos.txt colour.txt colour
Result:
colour.txt
For<YELLOW> a<PURPLE>
number<RED> of<YELLOW> years<RED> e-mail<RED>
has<BLUE> been<BLUE> used<BLUE> to support<BLUE>
second<GREEN> language<RED> learning<BLUE>
both<PURPLE> formally<LIGHT BLUE> and<ILLUMINOUS GREEN>
informally<LIGHT BLUE> . At<YELLOW> the<PURPLE>
formal<GREEN> end<RED> of<YELLOW> the<PURPLE>
spectrum<RED> there have<BLUE> been<BLUE> projects<RED>
of<YELLOW> various<GREEN> kinds<RED> linking<BLUE>
language<RED> classrooms<RED> in<YELLOW>
different<GREEN> countries<RED> ( see<BLUE> , e.g. , Eck et
al<RED> . 1995 ) ; and<ILLUMINOUS GREEN> at<YELLOW>
the<PURPLE> informal<GREEN> end<RED> , language<RED> learners<RED>
with<YELLOW> individual<GREEN> e-mail<RED>
accounts<RED> have<BLUE> sought<BLUE>
pen-friend-ships<RED> with<YELLOW> native<GREEN>
speakers<RED> of<YELLOW> their<ORANGE> target<RED>
language<RED> . More<LIGHT BLUE> recently<LIGHT BLUE>
there has<BLUE> been<BLUE> a<PURPLE> surge<RED>
of<YELLOW> interest<RED> in<YELLOW> the<PURPLE>
use<RED> of<YELLOW> e-mail<RED> for<YELLOW>
tandem<GREEN> language<RED> learning<RED> ,
thanks<RED> largely<LIGHT BLUE> to the<PURPLE>
work<RED> of<YELLOW> the<PURPLE>
International E-Mail Tandem Network , co-ordinated<BLUE> by<YELLOW>
Helmut Brammerts at<YELLOW> the<PURPLE> R u h<RED>
r-Universität Bochum ( see<BLUE> Little and<ILLUMINOUS GREEN>
Brammerts 1996 ) . Inevitably<LIGHT BLUE> , the<PURPLE>
Network's first<GREEN> concern<RED>
has<BLUE> been<BLUE> to establish<BLUE> reliable<GREEN>
infrastructures<RED> so<YELLOW> that<PURPLE>
tandem<GREEN> language<RED> learning<BLUE> by<YELLOW>
e-mail<RED> can actually<LIGHT BLUE> take<BLUE>
place<RED> . But<ILLUMINOUS GREEN>
members<RED> of<YELLOW> the<PURPLE> Network
recognize<BLUE> that<PURPLE> long-term<GREEN>
progress<RED> depends<BLUE> on<YELLOW>
elaborating<BLUE> appropriate<GREEN> theories<RED> ,
using<BLUE> the<PURPLE> theories<RED> to shape<BLUE>
pedagogical<GREEN> experiments<RED> , and<ILLUMINOUS GREEN>
subjecting<BLUE> those<PURPLE> experiments<RED> to
empirical<GREEN> evaluation<RED> . T h<GREEN> i<RED>
s<RED> paper<RED> is<BLUE> a<PURPLE>
preliminary<GREEN> contribution<RED> to that<PURPLE>
process<RED> . It<ORANGE>
first<LIGHT BLUE> explores<BLUE> some<PURPLE>
of<YELLOW> the<PURPLE> central<GREEN> issues<RED>
of<YELLOW> principle<RED> that<PURPLE> arise<BLUE>
from<YELLOW> the<PURPLE> concept<RED> of<YELLOW>
tandem<GREEN> language<RED> learning<BLUE> in<YELLOW>
general<RED> and<ILLUMINOUS GREEN> its<ORANGE>
e-mail<RED> version<RED> in<YELLOW> particular<GREEN> ,
and<ILLUMINOUS GREEN>
then<LIGHT BLUE> reports<RED> on<YELLOW>
the<PURPLE> pilot<RED> phase<RED> of<YELLOW>
an<PURPLE> e-mail<RED> tandem<RED> project<RED>
involving<BLUE> Irish<GREEN> university<RED>
students<RED> learning<BLUE> German<GREEN> and<ILLUMINOUS GREEN>
German<GREEN> university<RED> students<RED>
learning<BLUE> English.
<GREEN>