French interlanguage oral corpora

Project start date: 2003-03
Project end date: 2004-02
First language (L1) acquisition research has used digital technologies for over 20 years, in the shape of the CHILDES system, a powerful suite of software tools for the transcription, analysis and storage of L1 oral learner data that is now used as standard. By contrast, second language (L2) acquisition research has been very slow to take advantage of the computerised technologies now available. This one-year project aimed to (1) apply and adapt the CHILDES tools to French L2 oral data, (2) construct a database of French Learner Language Oral Corpora (FLLOC), and (3) make it available to the research community on the web via an informative, user-friendly and searchable interface. The database contains five oral corpora, representing a total of 6400 oral tasks and a range of ability levels, from complete beginners to university-level students. Each corpus comprises digital sound files and accompanying transcripts formatted using the CHILDES software tools. Additionally, most transcripts have been tagged for parts of speech, enabling powerful searches to be carried out directly on the morphosyntactic output. The database is available on the project website (www.flloc.soton.ac.uk), which contains full information about each corpus (e.g. learners, tasks used), about the CHILDES tools and the adaptations made to them, and a search engine that enables users to target and download files according to specified criteria (e.g. task, learner). The project staff also provided training to the L2 research community in the use of the CHILDES tools for L2 research.
Subject domains: 
Era(s): 
Country/region(s): 
Methods used (method: category):
Coding and standardisation: Data structuring and enhancement
Collocating: Data analysis
Indexing: Data analysis
Content analysis: Data analysis
Sound recording: Data capture
Text encoding - descriptive: Data structuring and enhancement
Parsing: Data analysis
Searching and querying: Data analysis
Sound editing: Data structuring and enhancement
Topic Detection and Tracking: Data analysis
Statistical analysis: Data analysis
Use of existing digital data: Data capture
Manual input and transcription: Data capture
Funding sources: 
Arts and Humanities Research Council (AHRC), Economic and Social Research Council (ESRC)
Content types created: 
Dataset/structured data, Sound, Text
Software tools used: 
CHILDES software
Source material used:  
The corpora included in the database are as follows:
Progression corpus: a longitudinal study of 60 secondary school learners of French in the UK, performing 13 oral tasks each at termly intervals over a period of 2¼ years (years 7, 8 and 9). 650 transcripts and accompanying sound files. Source: Southampton research team.
Linguistic development corpus: a cross-sectional study of 60 secondary school learners performing 4 oral tasks each (20 learners in each of years 9, 10 and 11). Some of the tasks are repeated from the Progression project above. 240 sound files and accompanying transcripts. Source: Southampton research team.
Salford corpus: a longitudinal study of 12 undergraduate learners performing 23 oral tasks over the course of their undergraduate course, some of them repeated before and after the year abroad. 300 transcripts and sound files. Source: ESRC-funded project directed by Prof Towell (Salford) and Prof Hawkins (Essex).
Brussels corpus: a cross-sectional study of 150 Dutch-speaking adolescents learning French (year 6 in the Dutch schooling system; age 18). The corpus included in the database contains 111 sound files and transcripts of these learners performing a narrative task. Source: Alex Housen, Free University, Brussels.
Reading corpus: recordings of 34 learners performing their GCSE oral examination. 26 French-speaking teenagers performing the same task are also included for comparison purposes. 60 transcripts (no sound files are included, for ethical reasons, as permission had not been granted). Source: Brian Richards, Reading University.
Digital resource created:  
The FLLOC database contains 5 substantial corpora of oral learner French, representing classroom learners at different stages of acquisition, from complete beginners to final-year undergraduates. It contains a total of 1335 sound files and the accompanying 1335 transcripts in CHAT format (the CHILDES transcription conventions; see original bid). For each of these corpora (except the Reading corpus, which does not include sound files), the audiocassettes have been digitised in wav format and anonymised, and all the transcripts have been formatted following the CHAT transcription system, which is part of the CHILDES suite of software tools. Additionally, all transcripts for the Progression and Linguistic Development corpora (a total of 890 files) have been tagged morphosyntactically, giving rise to separate analysis files on which researchers can run searches directly on the morphosyntactic output. A web interface has been created from which corpora can be downloaded, either by going directly to the corpus concerned or by using the search facilities that accompany each corpus. These allow criteria such as task, learner, year group and sex to be selected in order to access a specific subset of the corpus. The website also contains a detailed description of the CHILDES system, details of the modifications we have made to the CHAT transcription system in order to address some SLA-specific transcription issues, details of the search mechanisms available, as well as full details of the elicitation procedures, of the learners and of any project-specific adaptations to the CHAT transcription system.
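As a rough illustration of what a search on the morphosyntactic output involves, the sketch below reads a CHAT-style transcript and picks out utterances containing a given part of speech. It assumes the usual CHAT line layout (header lines starting with @, speaker lines starting with *, dependent tiers such as %mor starting with %). The sample transcript, the speaker code L01, the simplified tags and the function names parse_chat and utterances_with_pos are all invented for illustration; they are not taken from the FLLOC files or from the CHILDES tools, and continuation lines are not handled.

```python
# Hypothetical sample in CHAT-style layout; the utterances and the
# simplified part-of-speech tags are invented for illustration only.
SAMPLE = """\
@Begin
@Participants:\tL01 Learner, INV Investigator
*INV:\tqu' est-ce que tu vois ?
*L01:\tje vois un chat .
%mor:\tpro|je v|voir det|un n|chat .
@End
"""


def parse_chat(text):
    """Split a CHAT-style transcript into header lines and utterances.

    Each utterance is a dict holding the speaker code, the main-tier text,
    and any dependent tiers (e.g. 'mor') that follow it.
    Continuation lines (lines starting with a tab) are ignored here.
    """
    headers = []
    utterances = []
    for line in text.splitlines():
        if line.startswith("@"):
            headers.append(line[1:])
        elif line.startswith("*"):
            speaker, _, words = line[1:].partition(":")
            utterances.append(
                {"speaker": speaker, "text": words.strip(), "tiers": {}}
            )
        elif line.startswith("%") and utterances:
            name, _, content = line[1:].partition(":")
            utterances[-1]["tiers"][name] = content.strip()
    return headers, utterances


def utterances_with_pos(utterances, pos):
    """Return utterances whose %mor tier contains a token tagged `pos`."""
    hits = []
    for utt in utterances:
        mor = utt["tiers"].get("mor", "")
        # Tokens look like 'tag|lemma'; punctuation has no '|' and is skipped.
        tags = [tok.split("|", 1)[0] for tok in mor.split() if "|" in tok]
        if pos in tags:
            hits.append(utt)
    return hits


headers, utterances = parse_chat(SAMPLE)
for utt in utterances_with_pos(utterances, "v"):
    print(utt["speaker"], utt["text"])
```

In practice the CHILDES programs themselves would be used for this kind of query; the sketch only shows why tagging the transcripts makes searches by grammatical category possible.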
Access to digital resource:  
Open Access
We are currently working on making all the data XML-compatible.
Publications:  
MYLES, F., 2008: Investigating learner language development with electronic longitudinal corpora: Theoretical and methodological issues. In Ortega, L. & Byrnes, H. (eds.) The longitudinal study of advanced L2 capacities. Hillsdale: Lawrence Erlbaum, pp. 58-72.

MYLES, F., 2007: Using electronic corpora in SLA research. In Ayoun, D. (ed.) Handbook of French Applied Linguistics. Amsterdam: John Benjamins, pp. 377-400.

MYLES, F., 2007: Complexité, exactitude et fluidité : le rôle que jouent les séquences préfabriquées dans l’interlangue des débutants. In Van Daele, S., Housen, A., Kuiken, F., Pierrard, M. & Vedder, I. (eds.) Complexity, accuracy and fluency in second language use, learning and teaching. Brussels: KVAB, pp. 167-81.

MYLES, F., 2005: The emergence of morpho-syntax in French L2. In J-M. Dewaele (ed.) Focus on French as a foreign language: multidisciplinary approaches. Clevedon, Multilingual Matters, pp. 88-113.

MYLES, F. 2005. Interlanguage corpora and second language acquisition research. Second Language Research, 21(4), 373-391.

MYLES, F. & MITCHELL, R. (2004): Using information technology to support empirical SLA research. Journal of Applied Linguistics, 1.2, pp. 169-196.

MARSDEN, E., MYLES, F., RULE, S. & MITCHELL, R., 2002: Oral French Interlanguage Corpora: Tools for Data Management and Analysis. Centre for Language in Education Occasional Papers no. 58. University of Southampton.

MYLES, F. 2003: The early development of L2 narratives: a longitudinal study. Marges Linguistiques, 5, pp. 40-55.

MYLES, F. (2004): From data to theory: the over-representation of linguistic knowledge in Second Language Acquisition. In Hawkins, R. & Towell, R. (guest editors) ‘Empirical evidence and theories of representation in current research into second language acquisition’, Special Issue of Transactions of the Philological Society. pp. 139-168.

RULE, S. 2004: French interlanguage oral corpora: recent developments. Journal of French Language Studies, 14, 3, pp. 343-356.

MARSDEN, E., MYLES, F., RULE, S., & MITCHELL, R. (2003). Using CHILDES tools for researching second language acquisition. In S. Sarangi & T. van Leeuwen (Eds.), Applied Linguistics and Communities of Practice (Vol. 18, pp. 98-113). London: British Association for Applied Linguistics /Continuum.

MITCHELL, R. Scaffolding and microgenesis in classroom learner French: a corpus based study. In: Formal and Functional Approaches to Second Language Acquisition: Proceedings of 13th Annual Meeting of the European Second Language Association (EUROSLA), Universities of Edinburgh/Heriot-Watt, 2003, p. 74.

RULE, S., MARSDEN, E., MYLES, F. & MITCHELL, R. 2003: Constructing a database of French interlanguage oral corpora. In Archer, D., Rayson, P., Wilson, E. & McEnery, T. (eds) Proceedings of the Corpus Linguistics 2003 Conference, UCREL Technical Papers no. 16, pp. 669-77, University of Lancaster.

MYLES, F., 2002: The development of verb morphosyntax in L2 French. University of Surrey (talk/workshop).

MYLES, F., 2003: Corpus d’interlangue française orale: outils de gestion et d’analyse. Université de Chambéry (whole day workshop).

MYLES, F., 2003: Utiliser CHILDES pour une analyse morphosyntaxique de la négation en français langue étrangère. Journée scientifique, Université de Paris (talk/workshop).

MYLES, F., 2003: Rethinking language research methods for a digital age. University of the West of England (talk/workshop).

MYLES, F. 2003: Rote-learned chunks and interlanguage development. Vrije Universiteit, Brussels (talk/workshop).

MYLES, F. 2003: Theoretical approaches to Second Language Acquisition research. Vrije Universiteit, Brussels (talk/workshop).

MYLES, F. 2003 (September): Second Language Acquisition research. North West Centre for Linguistics’ Research Training programme, University of Manchester (talk/workshop).

MYLES, F. 2004 (February): Using corpora for second language acquisition research: the case of verb morphosyntax. University of Lancaster (talk/workshop).

MYLES, F. 2004 (September): Using CHILDES for French Second Language Acquisition research. Workshop, AFLS Conference, Birmingham, UK (talk/workshop).

MYLES, F. 2005 (March): Longitudinal corpora: theoretical and methodological issues. Georgetown University Round Table. Washington DC, USA (talk/workshop).

MYLES, F. 2005 (March): Plenary address. UNTELE Conference: Usage des nouvelles technologies dans l’enseignement des langues étrangères. Compiègne, France (talk/workshop).

RULE, S. 2003 (January): Using CHILDES for researching SLA. University of Cambridge (talk/workshop).

RULE, S. 2004 (July): Using CHILDES in second language acquisition research. VARG Conference, University of Swansea (talk/workshop).

MARSDEN, E., MITCHELL, R., MYLES, F. & RULE, S. (September 2002). Using CHILDES tools for researching second language acquisition. British Association for Applied Linguistics. Cardiff, Wales (conference presentation).

MITCHELL, R., MYLES, F. & RULE, S. (April 2004). The use of corpora in second language acquisition research. Symposium (4 papers); American Association of Applied Linguistics; Portland, Oregon.

MYLES, F. (August 2002). The role of the verb phrase in L2 narratives: a longitudinal study. Association for French Language Studies. St Andrews, Scotland (conference presentation).

MYLES, F. (September 2002). The development of verb morphosyntax in L2 French. European Second Language Association. Basel, Switzerland (conference presentation).

MYLES, F. (September 2003). Nouvelles méthodologies pour l’informatisation de corpus d’interlangue orale française. Association for French Language Studies. Tours, France (conference presentation).

MYLES, F. (April 2004). The development of verb morphosyntax in French L2. American Association of Applied Linguistics; Portland, Oregon (conference presentation).

MYLES, F. (September 2004). The syntax-morphology interface in early French L2. AFLS Conference, Birmingham, UK (conference presentation).

MYLES, F. (September 2004). The development of verb morphosyntax in French L2. EUROSLA Conference; San Sebastian, Spain (conference presentation).

MYLES, F., MARSDEN, E., RULE, S. & MITCHELL, R. (May 2002). Corpus d’interlangue française: outils de gestion et d’analyse. Journée d’étude de l’ATALA: Constitution et exploitation de corpus du français parlé. Paris, France (conference presentation).

MYLES, F. & MITCHELL, R. (March 2003). Rote-learned chunks and interlanguage development: a corpus-based study. American Association of Applied Linguistics. Washington, USA (conference presentation).

RULE, S. (June 2000) A cross-sectional study of French Interlanguage in an instructional setting. Durham Postgraduate Linguistics Conference (conference presentation).

RULE, S. (August 2000). A cross-sectional study of French Interlanguage in an instructional setting. AFLS Conference, Québec (conference presentation).

RULE, S. (August 2002). The Development of Negatives in the French L2 Classroom. AFLS Conference, St Andrews, Scotland (conference presentation).

RULE, S. (September 2002). The acquisition of negatives by classroom learners of French. European Second Language Association. Basel, Switzerland (conference presentation).

RULE, S., MARSDEN, E., MYLES, F. & MITCHELL, R. (March 2003). Constructing a database of French interlanguage oral corpora. Corpus Linguistics conference. Lancaster, UK (conference presentation).

RULE, S. & MYLES, F. (September 2002). The acquisition of negatives by classroom learners of French. European Second Language Association. Basel, Switzerland (conference presentation).

Institutions affiliated with this project: 

UK HE institutions involved:
University of Southampton
University of Newcastle upon Tyne

Project staff and expertise: 

Principal staff member: Professor Florence Myles; Professor Rosamond Mitchell
Other staff: Computing officer(s) / Technical supporter(s), Postdoctoral researcher(s) / Research assistant(s)
External expertise:


Metadata on this arts-humanities.net record
Author(s) of record: Florence Myles
Title: French interlanguage oral corpora
Record created: 2005-11-07
Record updated: 2011-01-14 16:19
URL of record: http://www.arts-humanities.net/node/2073
Citation of record: Florence Myles: French interlanguage oral corpora. <http://www.arts-humanities.net/node/2073>, created: 2005-11-07, last updated 2011-01-14 16:19