Linguistics and Education 51 (2019) 69–78

Which instructional programme (EFL or CLIL) results in better oral communicative competence? Updated empirical evidence from a monolingual context

Juan de Dios Martínez Agudo
University of Extremadura (Spain)

Article info
Article history: Received 22 January 2018; Received in revised form 2 April 2019; Accepted 26 April 2019; Available online 10 May 2019
Keywords: CLIL; EFL; Oral skills; Extramural exposure to English

Abstract
This cross-sectional study examines the impact of CLIL programmes on Primary and Secondary Education learners’ oral abilities in the monolingual community of Extremadura (Spain). The evolution of the bilingual (CLIL) and non-bilingual (EFL) strands from Primary Education to Compulsory Secondary Education to Baccalaureate is traced through the administration of post- and delayed post-tests. Results indicate that the experimental group (CLIL) obtains better results in both oral abilities than the control group (EFL), with such differences becoming much more noticeable with time and experience, especially in speaking. Contrary to what might be thought, the extramural exposure intensity factor does not operate with the expected impact on CLIL learners’ developing oral competence in terms of statistical significance. © 2019 Elsevier Inc. All rights reserved.

1. Literature review

Content and Language Integrated Learning (CLIL) appeared on the European scene towards the end of the last century as an effective alternative to the Communicative Language Teaching (CLT) approach (Coyle, Hood, & Marsh, 2010; Dalton-Puffer, 2011). This new educational approach aims at overcoming the perceived communicative limitations of traditional educational approaches, thereby contributing to improving learners’ overall target language competence.
Certainly, such language expectations, which are fuelled by an overall dissatisfaction with the observable language learning outcomes (Dalton-Puffer, Nikula, & Smit, 2010), are somewhat ambitious as well as unrealistic (Cenoz, Genesee, & Gorter, 2014). Perhaps the potential of CLIL in terms of language benefits, as rightly stated by Dallinger, Jonkmann, Hollm, and Fieg (2016), has been overestimated. Undoubtedly, what is today presented as ‘THE’ answer or solution to language teaching in the 21st century indeed presents serious limitations and shortcomings, and that is why CLIL programmes “should only be introduced if the conditions to make it successful are met” (Lasagabaster, 2008, p. 35) and only if “programmes are carefully designed and developed in each school context” (Roquet & Pérez-Vidal, 2015, p. 20), something we completely subscribe to. Contrary to most expectations and in view of the shortcomings recently reported by CLIL research, what becomes clear is that CLIL is not a panacea for language teaching (Dalton-Puffer, 2011) as “there is a wide delivery gap between what is provided in teaching, and what comes out in terms of learning” (Dentler, 2007, p. 170). Although the results are very promising provided CLIL programmes occur under certain conditions, Meyer, Coyle, Halbach, Schuck, and Ting (2015) suggest that CLIL instruction may perhaps not be reaching its full potential. In the same vein, Bruton (2013) recognises that CLIL does not produce the expected FL benefits. Hence, there is a clear need for a ‘reality check’, as suggested by Dalton-Puffer (2009), so as to confirm the purported linguistic benefits of CLIL classrooms.

E-mail address: [email protected]
https://doi.org/10.1016/j.linged.2019.04.008
0898-5898/© 2019 Elsevier Inc. All rights reserved.
What characterises CLIL more than anything is the existing diversity of CLIL programme formats and practices (Cenoz et al., 2014; Dalton-Puffer et al., 2010; Hüttner & Smit, 2014) because this new educational approach is understood in different ways (Cenoz et al., 2014; Mehisto, Marsh, & Frigols, 2008). Such diversity or flexibility evidently leads to serious challenges in CLIL research, thus preventing generalised conclusions about CLIL effectiveness (Marsh, 2008; Nikula, Dalton-Puffer, & Llinares, 2013). Over the last two decades the linguistic potential of CLIL has been extensively discussed and reported by the research literature (Admiraal, Westhoff, & de Bot, 2006; Coral, Lleixà, & Ventura, 2016; Coyle, 2007; Dalton-Puffer, 2008; Dalton-Puffer et al., 2010; Gallardo del Puerto & Gómez, 2013; Lorenzo, Casal, & Moore, 2009; Meyer et al., 2015; Navés, 2011; Pérez-Cañado & Lancaster, 2017; Rumlich, 2014; Zydatiß, 2012), although perhaps such claims “are all too often made without substantial empirical evidence” (Cenoz et al., 2014, p. 256), hence the need to be more cautious about the purported linguistic benefits of CLIL programmes. Second Language Acquisition (SLA) research, as Roquet and Pérez-Vidal (2015, p. 1) remind us, has established associations between differences in linguistic outcomes and differences in learning contexts, investigating in particular “what such contexts offer in terms of language exposure and opportunities for practising the target language (DeKeyser, 2007)”. This is the main approach of the present study. According to Dalton-Puffer (2011), the theoretical justifications for CLIL have been explained in terms of the Input (Krashen, 1985), Interaction (Long, 1996) and Output (Swain, 1995) Hypotheses.
According to SLA-inspired research, CLIL classrooms are expected to provide more input and exposure to the target language as well as plenty of opportunities for communicative practice, thus creating the optimal conditions for language acquisition to take place by offering a linguistically more challenging environment, resulting in improved L2 competence (Eurydice, 2006; García-Mayo & Basterrechea, 2017; Lasagabaster & López, 2015; Roquet & Pérez-Vidal, 2015). Perhaps such increased and continued exposure to the target language input is likely the key variable for the success of CLIL programmes in terms of language benefits, as rightly recognised by Cenoz et al. (2014). Certain language aspects or areas are developed more than others in CLIL programmes when compared to conventional foreign language classrooms, with vocabulary being the only linguistic aspect which is explicitly treated in CLIL lessons, as pointed out by Dalton-Puffer (2008) and Ruiz de Zarobe (2011). Specifically, Dalton-Puffer (2008) concludes that research on CLIL shows that the receptive skills in particular (listening and reading) are positively affected by CLIL education. While Dalton-Puffer (2008) reports that productive language skills (especially speaking) do not seem to be promoted in the CLIL classrooms, Ruiz de Zarobe (2010) argues, in contrast, that CLIL has a clear impact preferably on oral communicative competence since this new educational approach provides more opportunities for students to practise oral skills when using the target language, which is conceived as the medium of communication in the classroom and not as an end in itself. In the same vein, Marsh (2008) argues that CLIL theoretically boosts risk-taking and linguistic spontaneity (talk), among other aspects. Hüttner and Smit (2014, p. 166) also make it clear that CLIL programmes promote communicative interaction in providing “occasion and communicative need to students”. 
While it is true that the CLIL impact on the reading ability seems evident as a consequence of continued exposure to written input, positive CLIL-effects on listening are in contrast less clear-cut. Contrary to what has traditionally been sustained in bilingual education, Pérez-Cañado and Lancaster (2017) conclude that it is productive, as opposed to receptive, oral skills which are more positively affected by CLIL in the medium- and long term. That is why pronunciation, an aspect on which CLIL has been reported to have little impact (Dalton-Puffer, 2008; Gallardo del Puerto & Gómez, 2017; Rallo & Jacob, 2015; Ruiz de Zarobe, 2011), indeed calls for further investigation. In short, what actually emerges from current research is that CLIL programmes contribute to the improvement of overall target language competence. However, Gallardo del Puerto and Gómez (2013, 2017) remind us that the purported language benefits of CLIL instruction may not be such because certain intervening variables have not been sufficiently controlled for in CLIL research (Bruton, 2011). Divergent results could in fact be ascribed to the differential effects of input-related variables such as the amount of in-class and extramural exposure to the target language. Although the pro-CLIL arguments only address the alleged potential or benefits of this educational approach, Bruton (2013, p. 590), however, reminds us that “there are other aspects of the CLIL arguments that are much more questionable, or at least debatable”, as for example the effects of intensity or amount of language exposure as in some classes the FL is used extensively while in others minimally. 
Since CLIL learners are expected to receive greater in-school exposure to the target language as a consequence of their participation in CLIL programmes, recent studies have also controlled for the effects of extramural exposure to English in particular (in the form of TV series and movies, songs, the Internet and social networks, videogames, books and magazines, private lessons and visits to English-speaking countries) on oral competence development (Olsson & Sylvén, 2015; Sylvén, 2006), which is the main focus of this investigation.

2. Research questions

Overall, longitudinal CLIL research reports somewhat contradictory as well as inconclusive results, perhaps due to the variability of CLIL implementation, which negatively affects the comparability of CLIL studies in Europe, as well as the methodological shortcomings which could question the validity of the results obtained (Bruton, 2011, 2013, 2015; Dallinger et al., 2016; Paran, 2013; Pérez-Cañado & Lancaster, 2017). While it is true that oral comprehension and production constitute the least researched language skills in CLIL research (Pérez-Cañado & Lancaster, 2017), the few existing studies of the oral abilities in CLIL contexts to date have certainly offered arbitrary results (especially concerning the speaking skill), as Gallardo del Puerto and Gómez (2017) point out. Accordingly, this research paper aims to add further updated empirical evidence to the existing body of findings by investigating the potential effects of CLIL programmes on learners’ oral communicative competence, controlling in particular for the differential effects, if any, of the extramural L2 exposure contextual variable, thus addressing the following research questions:

RQ1: How do different instructional contexts (CLIL and EFL classrooms) affect oral communicative competence?
RQ2: Which oral skills, if any, does CLIL benefit the most?
RQ3: How do CLIL and EFL students’ oral communication skills evolve from Primary Education to Baccalaureate?
RQ4: How does continued exposure to the target language (e.g., beyond the classroom) interact with CLIL? What is the differential effect of the amount of extramural exposure to English on CLIL and EFL learners’ oral communicative competence?

3. Method

This study is framed within a broader research project focusing on a three-year large-scale evaluation of CLIL programmes carried out in those Spanish monolingual communities with the least tradition in bilingual education (Andalusia, Extremadura, and the Canary Islands), after exactly ten years of CLIL implementation. The main aim of this quantitative study is to ascertain the impact of CLIL education on learners’ oral competence at the end of Primary (6th grade) and Compulsory Secondary (4th grade) Education. The evolution of the bilingual (CLIL) and non-bilingual (EFL) learners’ oral communication skills from Compulsory Secondary Education to Baccalaureate is also traced through the administration of delayed post-tests. Dependent (oral competence results), independent (CLIL programmes) and intervening (extramural exposure to English) variables have been taken into consideration in this study so as to determine whether CLIL is truly responsible for the potential differences observed or whether the aforementioned intervening contextual variable can account for a greater proportion of the variance. Inter-group comparisons were carried out in each educational stage (in Primary and Compulsory Secondary Education), namely between the experimental (CLIL) and control (EFL) groups.

3.1. Context and participants

The present study was conducted within the monolingual autonomous community of Extremadura, which is situated in the south-west of Spain on the border with Portugal and which has very little tradition in bilingual education (from 2004 onwards). At the present time there are 293 bilingual sections set up in 253 schools in Extremadura at Primary and Secondary Education stages. Education in Spain, which is relatively decentralised from the central government and thus transferred to each of the 17 autonomous communities, is compulsory from ages 6 to 16, which covers 6 years of Primary Education followed by 4 years of Compulsory Secondary Education. Indeed, English is the most frequently chosen (first) foreign language both in Primary and Secondary Education, and it is taught three to four hours a week from the very beginning of pre-primary education. The start of bilingual education in Spain can be traced back to the academic year 1996–1997 as a result of the agreement signed between the Spanish Ministry of Education and the British Council. Since Spain is clearly below the average for the European Community concerning foreign language ability according to the results of the Eurobarometer survey (European Commission, 2012), the Spanish Ministry of Education has committed to promoting bilingual education over the last two decades. Bilingual education programmes are in fact being fostered with the overall objective of improving second language competence. As mentioned above, bilingual education in Spain is mainly characterised by the existing diversity of CLIL programme formats and practices (Cenoz et al., 2014), mainly due to the aforementioned decentralisation of our educational system, which transfers educational powers to each autonomous community, as pointed out by Fernández (2009). The fact is that CLIL implementation initiatives are certainly developing at different paces depending on the Spanish autonomous community we refer to.
In fact, there do exist large differences across regions in relation to the number of nonlanguage subjects included in bilingual programmes and the total amount of exposure to the target language. While some regions recommend certain non-language subjects, but allow schools to choose from a selection, other regions, by contrast, leave it up to each school to decide for itself. For example, the specific subject of mathematics is excluded in the region of Madrid while it is currently taught in Extremadura. Depending on the available teachers’ competence profile, each school can decide the non-language subjects taught through the foreign language, although at least one specific subject must correspond to the area of Natural and Social Sciences. The amount of time or exposure to the target language also varies, with most Spanish regions setting the minimum and maximum proportions. For example, in most Spanish regions 20–50% of the timetable should be taught in the target language, including English language lessons, but it can be more. In particular, in Extremadura the minimum proportion required is 20% of weekly timetable, including both foreign language and non-language subjects classes. Additionally, the foreign language must be used for at least one session a week in each non-language subject considered in the programme. In short, such differences in the number of non-language subjects taught in the bilingual programme and the amount of time that should be taught in the target language provide an idea of the existing diversity of the bilingual programmes in Spain, as advocated by Cenoz et al. (2014). 
Despite such differences, certain common features can also be identified concerning bilingual education programmes in both Primary and Secondary Education throughout Spain: students generally volunteer to join bilingual education programmes; bilingual cohorts co-exist with mainstream groups who only receive L2 input in their FL classes; the weekly schedule and contents of non-language subjects are the same as those followed by the rest of non-CLIL students; the number of content subjects that can be taught through a foreign language in bilingual programmes can range from two to four subjects, to mention only a few. It should also be added that, in most Spanish regions, bilingual schools are allowed to establish selection procedures to incorporate students in the programme, only if demand exceeds the number of available places, including their previous experience of bilingual education and language level. Such selection criteria will be determined by the staff members involved in the programme. B2 level as established in the CEFR is the minimum required by educational authorities to teach in a bilingual section almost everywhere in Spain, except in Madrid and Navarra where a C1 level is demanded. CLIL teachers’ language proficiency level is accredited by a University degree in the foreign language or external certificates issued by other institutions (Cambridge University Proficiency test, Official Spanish School of Languages, etc.). CLIL teachers are also encouraged and supported by the educational administration to improve their overall target language competence and methodological knowledge abroad. To this end, both CLIL teachers and learners can benefit from study visits abroad and language immersion programmes. Among the teachers involved in the bilingual programme, a foreign language specialist teacher will be in charge of coordinating the development of the bilingual programme in the school.
Schools involved in these bilingual programmes are also provided with native assistant teachers who serve as linguistic models and cultural ambassadors in bilingual classrooms. The CLIL approach was launched in Extremadura in the academic year 2004–2005 with the implementation of the so-called bilingual section projects in three different foreign languages (English, French or Portuguese). In 2008, the Government of Extremadura, in line with European initiatives on language policy, promoted the Plan Linguaex to immerse this traditionally monolingual region in plurilingualism with particular actions aimed at improving learners’ overall second language competence. Since the 2017–2018 academic year, bilingual education programmes in this Spanish region exist at all educational levels and are implemented gradually from the first year of every educational stage (from pre-primary education to Baccalaureate). In the context of the study, only four non-language subjects (Social Sciences, Natural Sciences, Art and Physical Education) are taught in the target language in Primary Education, while a larger number of content subjects are studied in Compulsory Secondary Education (History, Geography, Biology, Physics, Chemistry, Mathematics, Physical Education, Music and Technology). However, this offer of CLIL subjects may be extended in the rest of the schools of the region, mainly depending on the availability of qualified teachers. Given the diversity of CLIL practices suggested by Cenoz et al. (2014) and Hüttner and Smit (2014), Bruton (2013), for example, refers to the fact that in some classes the foreign language is used extensively while in others minimally, depending mainly on CLIL teachers’ language competence and the complexity of subject contents taught at schools, among other aspects. Although many teachers have sufficient command of the foreign language, more specific training in CLIL pedagogy is really needed.
The sample in this study is made up of 318 students from 10 randomly selected schools (8 public bilingual schools and 2 charter non-bilingual ones) located in both urban and rural areas of the monolingual autonomous community of Extremadura. Of these, half of the participants were 6th grade primary education students, whose ages ranged between 11 and 12 years with a mean age of 11.4 years, while the other half were 4th grade compulsory secondary education students, aged 15–16 years (mean age 15.6 years). The control group (EFL) consists of 162 learners while the remaining 156 learners form the experimental group (CLIL). Primary education learners have been receiving CLIL instruction for two years, while secondary education learners have been involved in bilingual programmes for at least six years. It is a fact that CLIL learners accumulate much more time of English language exposure than their EFL counterparts because of their participation in bilingual programmes. The sampling procedure employed in this study was probability sampling (random selection). From the outset of the investigation, homogeneity has been guaranteed in our sample with student cohorts with the same verbal intelligence, motivation, and English level for the sake of comparability, which undoubtedly contributes to the validity of the study. Those schools which revealed the greatest homogeneity in terms of the variables considered were in fact selected as the final sample for the study. Additionally, there is a perfect balance in terms of student cohorts (CLIL/EFL) and educational levels (primary and secondary education). It is worth mentioning that no private school participated in the present study, so the comparison with this type of school has not been possible in Extremadura. Nor do we have any data from charter schools at the Baccalaureate stage.
Below, Table 1 provides an outline of the participating sample in this study.

3.2. Data collection instrument and procedure

The cross-sectional data were gathered through the administration of pre- and post-tests to both student cohorts at the end of Primary and Compulsory Secondary Education, as well as delayed post-tests in 1st grade of Baccalaureate. In order to guarantee the above-mentioned homogeneity and, accordingly, comparability of the participating sample groups in terms of verbal intelligence, motivation and English level, the pre-tests employed for the first two variables were previously validated and tried-and-tested instruments in the field of psychology or language teaching research, while participants’ English grades were collected to address the third. Specifically, the verbal intelligence test, which comprises multiple-choice items involving analogies, antonyms and odd-one-out exercises, was part of the Santamaría, Arribas, Pereña, and Seisdedos (2016) battery of tests aimed at a factorial evaluation of intellectual aptitudes. To measure learners’ motivation (in terms of willingness, self-commitment, lack of interest and anxiety), Pelechano’s (1994) MA test was used to isolate motivational factors of achievement and anxiety. Both tests were applied in the same session in each of the participating schools in February-March 2015. The tests were analysed by a psychologist hired to this end. At a preliminary stage of the investigation, an initial questionnaire was also administered to the participants, collecting personal data and information on their parents’ educational background, which was taken as a proxy for socioeconomic status (SES), and their extramural exposure to the English language. Parents’ educational level is identified as the SES indicator with the greatest influence (Golberg, Paradis, & Crago, 2008).
The SES variable was exclusively measured by the educational attainment of parents, thus establishing three levels in this regard: low (no studies/Primary Education), medium (Secondary Education – Vocational Training) and high (Tertiary Education). This information was provided by students themselves once the meaning of each educational level had been made clear through illustrative examples to ensure respondents’ understanding. Specifically, the weekly amount of extramural exposure to English (in the form of TV series and movies, songs, the Internet and social networks, videogames, books and magazines, private lessons and visits to English-speaking countries) was calculated on the basis of respondents’ own perceptions, using a short survey designed to that end. With this in mind, the Sundqvist and Sylvén (2014) questionnaire was employed, which is based on a language diary where the respondents must reflect on their exposure to English outside of school, offer examples and record the number of hours devoted to specific activities per week. The data reported below were gathered through the administration of post-tests (English language tests) to both student cohorts at the end of Primary and Compulsory Secondary Education in May-June 2015, as well as delayed post-tests to the same students who were previously in 4th grade of Compulsory Secondary Education and who were now in 1st grade of Baccalaureate (in a programme with significantly less exposure to English), six months after the completion of the bilingual programme, in December 2015. Specifically, the listening and speaking tests for each educational stage were carefully designed and validated for the purpose of this study (see Madrid, Bueno, & Ráez, 2019, for a detailed explanation of their internal reliability and validity properties).
With this in mind and based on the Common European Framework of Reference (CEFR), an extensive examination of the national Decrees and the regional orders pertinent to both specific educational grades as well as a careful selection of textbooks designed for these particular levels of education were conducted in order to ensure content validity and material suitability. Additionally, a pilot procedure was conducted, with the tests being scrutinised by external experts who critically assessed their level of difficulty, clarity and length, among other aspects. Inter-group comparisons were in fact carried out across both types of schools (public-charter) in both educational stages (primary and compulsory secondary education) and social settings (urban and rural areas). The experimental and control groups were given the same language tests (one for 6th grade of Primary Education and another one for 4th grade of Compulsory Secondary Education). Tests were distributed during class time in every participating school under the researcher’s supervision. Participants’ oral communication skills were assessed via two English language competence tests: a listening test and an oral interview. While the listening test was administered to the whole group in one sitting under the same conditions, the speaking test was, however, applied to a much smaller number of students who, besides taking the listening task, were organised into pairs and selected according to their level of language proficiency (grades A, B and C). Overall, the listening test was mainly designed to assess participants’ oral comprehension in the target language in which they must deduce meanings and draw inferences from brief dialogues. Specifically, this test consisted of different dialogues containing true/false, matching and multiple choice questions. The recording was heard twice. 
The oral interview task consisted of an individual speaking exercise (about students’ personal lives as well as a picture description) and a spoken interaction activity (topic discussion, particularly for Compulsory Secondary Education students) so as to assess communicative functions such as describing, giving and justifying opinions, expressing preferences, making suggestions, agreeing and disagreeing, and making comparisons. In order to gauge their oral production, respondents were shown several pictures and asked to describe them and then answer several questions. The secondary education students were engaged in a discussion in which two topics were chosen to debate from a selection of four. In the oral interviews, the respondents were organised into pairs in a quiet room by the researcher. Respondents were asked to interact with each other. The 10-minute oral interviews were recorded for subsequent evaluation and scoring. A native English-speaking teacher and a non-native teacher with wide experience teaching English evaluated the recordings. While the scoring of the listening test was completely objective by following clear-cut criteria, the speaking test was in contrast more subjective in its scoring, thus following a holistic approach by considering essential aspects such as pronunciation, fluency, vocabulary, grammar accuracy and appropriacy of response (Pérez-Cañado & Lancaster, 2017). To this end, a rubric was specifically designed and validated for the assessment of speaking performance, comprising the aforementioned criteria. The mean scores gathered in each test (on a scale from 0 to 10, with 10 being the top mark in the Spanish educational system) are displayed in the tables below. Accordingly, higher scores indicate greater language proficiency.

The data were analysed statistically using SPSS (version 21.0). Means and standard deviations were calculated while differences between group means were compared statistically. To address the research questions, a one-way repeated measures analysis of variance (ANOVA) and paired samples t tests were employed to ascertain the existence of statistically significant differences between and within groups. A p-value less than 0.05 was regarded as statistically significant. Lastly, Cohen’s d coefficient was employed to calculate effect sizes using G*Power 3.1, as recommended by the statistician hired to this end.

Table 1
The research sample. Values are number of students (N), with the percentage of the total sample in parentheses.

| Variable | Category | CLIL | EFL | Total |
|---|---|---|---|---|
| Student cohorts | N | 156 (49.1%) | 162 (50.9%) | 318 (100.0%) |
| Gender | Male | 90 (28.3%) | 84 (26.4%) | 174 (54.7%) |
| Gender | Female | 66 (20.7%) | 78 (24.5%) | 144 (45.2%) |
| Educational stage | Primary (6th grade) | 82 (25.7%) | 80 (25.1%) | 162 (50.9%) |
| Educational stage | Secondary (4th grade) | 74 (23.2%) | 82 (25.7%) | 156 (48.9%) |
| Educational stage | Baccal. (1st grade) | 67 | | 67 |
| Socioeconomic status (SES) | Low | 47 (14.7%) | 43 (13.5%) | 90 (28.3%) |
| Socioeconomic status (SES) | Medium | 53 (16.6%) | 65 (20.4%) | 118 (37.1%) |
| Socioeconomic status (SES) | High | 56 (17.6%) | 54 (16.9%) | 110 (34.5%) |
| Type of school | Public | 156 (49.1%) | 96 (30.1%) | 252 (79.2%) |
| Type of school | Charter | 0 (0.0%) | 66 (20.7%) | 66 (20.7%) |
| Setting | Urban | 57 (17.9%) | 56 (17.6%) | 113 (35.5%) |
| Setting | Rural | 99 (31.1%) | 106 (33.3%) | 205 (64.4%) |

3.3. Findings and discussion

As explained above, the main goal of the present study was to analyse the impact of CLIL programmes as opposed to traditional EFL teaching on the oral comprehension and production skills, providing updated empirical evidence of the purported beneficial linguistic effects of CLIL. The tables below provide an overview of the descriptive statistics. Both RQ1 (How do different instructional contexts (CLIL and EFL classrooms) affect oral communicative competence?) and RQ2 (Which oral skills, if any, does CLIL benefit the most?) will be jointly addressed below.
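As a brief illustration of the effect-size measure named in the data analysis, the following minimal sketch computes Cohen's d in its pooled-standard-deviation form. The source does not state which variant of d was computed in G*Power, so this form is an assumption. The function name `cohens_d` and the two score lists are hypothetical and are not data from the study. The negative signs of the effect sizes reported below would be consistent with subtracting the CLIL mean from the EFL mean, although the order is not stated in the source.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation.

    This is one common definition; the study does not say which variant it used.
    """
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance returns the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores on the 0-10 scale, for illustration only.
efl_scores = [5.0, 6.0, 7.0]
clil_scores = [6.0, 7.0, 8.0]

# EFL minus CLIL: a negative value means the CLIL group scored higher.
d = cohens_d(efl_scores, clil_scores)
print(d)
```

By Cohen's conventional benchmarks, absolute values of about 0.2, 0.5 and 0.8 are usually read as small, medium and large effects respectively.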
Overall, the resulting data confirm that the experimental group (CLIL learners) significantly outperforms the control group (mainstream EFL learners) in both oral skills at the end of Primary and Compulsory Secondary Education and in the first grade of Baccalaureate, although the greatest gains are undoubtedly observed in the speaking skill, contrary to what has traditionally been sustained in CLIL research. As can be seen below in Table 2, both cohorts surprisingly obtain very similar scores in the listening test at the end of Primary Education, as Cohen’s d is extremely low (−0.122). However, the experimental group (CLIL) scores higher on the oral production ability than the control group (EFL), and this difference turns out to be statistically significant, with a larger Cohen’s d (−1.184). When finishing their Compulsory Secondary Education studies, statistically significant differences (at the highest level of statistical significance: p < 0.001) emerge in favour of bilingually educated students in both oral skills, with the difference being much more noticeable in the speaking skill, which shows a much larger Cohen’s d (−1.489). In short, the results obtained in the present study confirm that the positive CLIL effects on oral competence are mainly visible with time. When examining the effects of CLIL instruction, Lasagabaster (2008) confirms statistical significance for the listening skill in favour of CLIL learners. Based on their findings, Brevik and Moe (2012), Rumlich (2014) and Dallinger et al. (2016) also report positive CLIL-effects on the listening skill. By contrast, Mattheoudakis, Alexiou, and Laskaridou (2014), Pérez-Vidal and Roquet (2015) and Prieto-Arranz, Rallo, Calafat-Ripoll, and Catrain (2015) conclude that receptive skills are positively affected by CLIL but only to a certain extent, given the lack of evidence of substantial gains in listening in statistical terms.
Contrary to most expectations that the listening skill would be significantly improved by CLIL instruction as a result of the continued exposure to oral input (Dalton-Puffer, 2008), the longitudinal study by Pladevall and Vallbona (2016) surprisingly reveals that the control group (EFL-only exposure) significantly outperforms the experimental group (EFL + CLIL exposure) in listening. In relation to this, Pladevall and Vallbona (2016) conclude that in contexts of minimal and equal exposure to the target language, CLIL has no remarkable effects. In the same vein, Pérez-Vidal and Roquet (2015) found no differences between the two cohorts in listening competence, since reading, but not listening, improved significantly. All in all, the evidence for gains in listening competence remains inconclusive in recent CLIL research. The results obtained in the present study do corroborate to some extent the assumption made by Ruiz de Zarobe (2008), who concluded, with reference to the continued exposure or input provided by CLIL classrooms, that the more exposure to the foreign language, the better the oral outcomes. However, they contrast with the findings of the previous studies mentioned above.

The situation is different for the speaking skill, contrary to what has traditionally been claimed in bilingual education, as statistically significant differences between the two cohorts are detected in favour of CLIL learners at each educational stage. This finding mirrors previous research in which oral production is more developed than oral comprehension, as for example the study by Gassner and Maillat (2006), who report substantial improvements in CLIL learners’ spoken production. The studies by Hüttner and Rieder-Bünemann (2010) and Zydatiß (2012) also reveal a substantial difference in CLIL students’ oral proficiency.
In the same vein, Gallardo del Puerto and Gómez (2013, 2017) also confirm the benefits of CLIL in terms of oral abilities in the Spanish context. The pilot study by Czura and Kotodynska (2015) also reveals the beneficial effects of CLIL-based instruction on oral communicative competence (in terms of fluency, pronunciation and vocabulary) in a primary school setting in Poland. Similarly, Delliou and Zafiri (2016) provide positive evidence of the impact of CLIL on the development of the speaking skill. Lastly, in line with the longitudinal study by Nieto (2016), in which substantial differences in favour of CLIL learners were found in oral production and interaction in Primary Education but not in oral comprehension, the recent study by Pérez-Cañado and Lancaster (2017) also concludes that productive, as opposed to receptive, oral skills are more positively affected by CLIL programmes in the medium and long term. In short, this finding is in line with what has been reported by Dalton-Puffer (2011, p. 189): “the area where a difference between CLIL students and mainstream learners is most noticeable is their spontaneous oral production”. By contrast, it diverges from the finding of Rallo and Jacob (2015), who concluded that the uniformity of results in both cohorts in terms of pronunciation achievement seriously questions the effectiveness of CLIL programmes in enhancing learners’ oral skills, and from the study by Rumlich (2014), who reported an insignificant effect on productive English skills.
Table 2
Achievement mean scores and intergroup comparisons in the listening and speaking tests at each educational stage.

Educational stage               Skill      Group  N   Mean   Standard deviation  Cohen’s d  p value
Primary Education               Listening  EFL    80  6.132  1.625               −0.122     0.443
                                           CLIL   82  6.295  0.981
                                Speaking   EFL    11  4.273  1.992               −1.184     0.004*
                                           CLIL   20  7.075  2.540
Compulsory Secondary Education  Listening  EFL    82  5.331  2.057               −0.786     <0.001*
                                           CLIL   74  7.065  2.363
                                Speaking   EFL    17  6.588  2.418               −1.489     <0.001*
                                           CLIL   9   9.556  0.464
Baccalaureate                   Listening  EFL    29  4.236  3.250               −0.659     0.010*
                                           CLIL   38  6.616  3.865
                                Speaking   EFL    4   7.625  1.652               −2.045     0.007*
                                           CLIL   8   9.625  0.443

* The level of significance was set at p < 0.05.

Fig. 1. Mean scores on the listening test.

Fig. 2. Mean scores on the speaking test.

Regarding RQ3 (How do CLIL and EFL students’ oral communication skills evolve from Primary Education to Baccalaureate?), the results gathered confirm the relatively long-lasting effects of CLIL education after the completion of the programme, as evidenced by significant differences in favour of CLIL learners in both skills in the first grade of Baccalaureate, especially in speaking, where the effect size is much larger (Cohen’s d = −2.045). In other words, the differential effects exerted by CLIL instruction on learners’ oral communicative competence endure even after the completion of the programme, in the first grade of Baccalaureate. This result is in line with the findings by Mattheoudakis et al. (2014), who concluded that the effect of CLIL instruction on learners’ productive skills takes more time to become evident. It also corroborates the findings of Pérez-Cañado and Lancaster (2017), who report that productive, as opposed to receptive, oral skills are more positively affected by CLIL in the medium and long term.
In the same vein, Pladevall and Vallbona (2016) suggest that language gains in CLIL begin to be noticed with time and experience, that is, positive CLIL effects on both receptive skills might only be observable in the long run with more intensive exposure. However, this finding is not congruent with that obtained by Admiraal et al. (2006), who suggest that the gap between CLIL and non-CLIL groups does not widen over time, which means that potential differences between both cohorts may become invisible with time. In short, in the present study statistically significant differences emerge between both cohorts over time, as the mean scores obtained by the CLIL students are significantly higher than those of the non-CLIL students in both oral abilities, especially in the speaking skill, as shown in Table 2. A graphic overview of the testing results is provided in Figs. 1 and 2. The vertical axis represents the ability scores: 10 is the maximum test score and 0 the minimum. Positive values refer to increasing competence and negative values to decreasing competence. The horizontal axis refers to the development of learners’ underlying oral competence at each educational stage. The solid line represents the experimental group (CLIL learners), and the dashed line the control group (EFL learners). Figs. 1 and 2 show the curves for the listening and speaking skills, respectively. What is most striking is that the control group (EFL) tends to have lower listening scores as learners progress through their schooling years. Perhaps the lower amount of both in-school and out-of-school exposure to English when compared to their CLIL counterparts could be one of the reasons behind this result. However, it is surprising that in Baccalaureate the EFL cohort shows a larger gain in speaking than the CLIL group relative to the end of Compulsory Secondary Education (from 6.588 to 7.625, as against 9.556 to 9.625 in Table 2), even though its mean score remains lower.
In short, the two curves for the speaking test are more or less parallel, which means that both cohorts behave in a similar way, although the experimental group (CLIL) is superior in both oral skills at each educational stage. All in all, the statistical analysis allows us to conclude that bilingual learners’ English oral comprehension and production abilities are positively affected by CLIL programmes. Specifically, the results reveal that the effects of the independent variable (CLIL instruction) on the dependent variable (English oral competence results, particularly speaking) are substantial, especially at the end of Compulsory Secondary Education and in the first grade of Baccalaureate. This finding thus suggests that positive CLIL effects remain visible with time and experience. It tallies with previous studies reported in the literature review, endorsing for example the findings by Pladevall and Vallbona (2016) and Pérez-Cañado and Lancaster (2017).

We turn now to the last research question of our study (RQ4: How does continued exposure to the target language (e.g., beyond the classroom) interact with CLIL? What is the differential effect of the amount of extramural exposure to English on CLIL and EFL learners’ oral communicative competence?). As shown in Table 3, no statistically significant differences were found between both cohorts in Primary Education. A closer look at the results, however, shows that Secondary Education CLIL students appear to be more exposed to English outside of school than their EFL counterparts, which is congruent with the findings gathered by Sylvén (2006) and Olssen and Sylvén (2015).
As can be seen in Tables 4 and 5, the effects of the amount of out-of-school exposure to English on both oral abilities are hardly noticeable in either cohort at the end of Primary Education and, accordingly, results are inconclusive in terms of statistical significance. At the end of Compulsory Secondary Education, however, the differential effects of the extramural L2 exposure variable on both oral skills become somewhat more visible in both cohorts: in listening for CLIL learners and, surprisingly, in speaking for EFL learners, where the difference approaches statistical significance (p = 0.055) with a much larger effect size (Cohen’s d = −0.993). This finding is partially congruent with that obtained by Olssen and Sylvén (2015), who concluded in particular that extramural English does not seem to have any significant impact on the progress of academic vocabulary over time.

Table 3
Number of hours of extramural English per week according to educational level.

Educational level               Group  Mean   Standard deviation  p value
Primary Education               EFL    10.30  22.67               0.102
                                CLIL   11.03  14.59
Compulsory Secondary Education  EFL    17.75  29.57               0.006*
                                CLIL   22.08  26.14

* The level of significance was set at p < 0.05.

Table 4
Listening results in terms of extramural exposure to English according to educational level and group.

Educational level               Group  Extramural exposure to English   N   Mean  Standard deviation  Cohen’s d  p value
                                       (average hours per week)
Primary Education               CLIL   ≤9 hours                         51  6.18  1.18                −0.324     0.159
                                       >9 hours                         31  6.49  0.48
                                EFL    ≤9 hours                         60  5.95  1.70                −0.461     0.078
                                       >9 hours                         20  6.69  1.25
Compulsory Secondary Education  CLIL   ≤9 hours                         22  6.30  2.78                −0.469     0.069
                                       >9 hours                         52  7.39  2.11
                                EFL    ≤9 hours                         40  5.04  2.07                −0.281     0.207
                                       >9 hours                         42  5.61  2.03

* The level of significance was set at p < 0.05.

Table 5
Speaking results in terms of extramural exposure to English according to educational level and group.

Educational level               Group  Extramural exposure to English   N   Mean  Standard deviation  Cohen’s d  p value
                                       (average hours per week)
Primary Education               CLIL   ≤9 hours                         12  7.13  2.58                0.048      0.918
                                       >9 hours                         8   7.00  2.66
                                EFL    ≤9 hours                         9   4.28  2.14                0.013      0.987
                                       >9 hours                         2   4.25  1.77
Compulsory Secondary Education  CLIL   ≤9 hours                         1   9.00
                                       >9 hours                         8   9.63  0.44
                                EFL    ≤9 hours                         9   5.56  2.71                −0.993     0.055*
                                       >9 hours                         8   7.75  1.44

* The level of significance was set at p < 0.05.

4. Conclusion

Given that current CLIL research offers contradictory and inconclusive results, with considerable variability in its findings (Bruton, 2015), and in view of the scarcity of research into the impact of CLIL programmes on oral competence, this cross-sectional study aims to add updated empirical evidence to the existing body of research by reporting on the potential effects of CLIL instruction on learners’ developing oral competence. More specifically, the differential effects of two instructional contexts (EFL and CLIL programmes) on learners’ oral comprehension and production competence have been examined and contrasted. Based on the findings, it can be concluded that greater exposure to the foreign language as a result of participation in CLIL programmes results in a more developed oral competence, which is in line with Ruiz de Zarobe (2008).
In relation to this, Gallardo del Puerto and Gómez (2017) made it clear that the exposure intensity factor might be operating. Overall, the results show that the experimental group (CLIL) obtains better results than the control group (EFL) in both oral communication skills, especially speaking. These differences are statistically significant and, most importantly, become more noticeable with time and experience, mostly at the end of Compulsory Secondary Education and even at follow-up six months after the completion of the programme (in the first grade of Baccalaureate). Contrary to what has been traditionally assumed, this finding rejects the hypothesis that CLIL instruction develops listening more than speaking as a result of continued exposure to oral input in the target language. The results obtained in the present study for the speaking skill may nevertheless be inconclusive, mainly because its scoring is generally more subjective; the findings in terms of statistical significance are therefore relatively questionable and should be treated with caution. What is most striking about this finding, at least in the context examined in this study, is that while success in oral skills is mainly ascribed to the potential of CLIL programmes per se (in terms of continued exposure to L2 input inside the classroom), the extramural exposure intensity factor does not operate with the expected impact on CLIL learners’ developing oral competence in terms of statistical significance. Accordingly, the contradictory results of current CLIL research may lead us to conclude that the particularity of CLIL contexts and the conditions under which such programmes are actually implemented may account for the differences observed between both cohorts regarding the linguistic benefits.
In other words, the contradictory results obtained from the different studies to date are probably related to the variability in the implementation of CLIL programmes (Bruton, 2015). The findings from this quantitative cross-sectional study should be interpreted with caution, not only because of its limitations but also because other factors that may have affected the results have not been considered. Given the limited sample size (in particular, the Baccalaureate sample), the geographical area that this study covers and the idiosyncrasies of the CLIL approach in Extremadura, the findings might not be transferable to other educational settings. Despite these limitations, this study provides updated empirical evidence of the positive effects of CLIL programmes on learners’ developing oral competence, whilst filling a void in the existing literature on CLIL by expanding its discussion to an area which has hardly been explored so far: the relationship or interaction between CLIL education and input-related variables such as extramural exposure to English. The effectiveness of CLIL instruction depends largely on many factors, for example the intensity of exposure to the target language and CLIL teachers’ language proficiency, to mention just a few. In my opinion, a higher level of language proficiency should be required to teach in bilingual programmes. With this in mind, future studies should address the effects of out-of-school exposure on learners’ language competence in more depth, in particular the intensity and the conditions of exposure to the target language. In other words, the quantity and quality of the oral input learners are exposed to should be examined in detail in future investigation (Rallo & Jacob, 2015). Lastly, as Cenoz et al. (2014, p.
256) rightly suggest, “more balanced reflection on both the strengths and shortcomings or gaps in our understanding of CLIL and its effectiveness in diverse contexts” is actually needed so as to better understand the differential impact of the CLIL approach on the overall target language competence.

Funding

This study derived from two governmentally-funded research projects financed by the Spanish Ministry of Economy and Competitiveness [research grant number FFI2012-32221] and the Government of Andalusia (Spain) [research grant number P12HUM-2348] (Project: The Effects of Content and Language Integrated Learning in Monolingual Communities: A Large-Scale Evaluation).

References

Admiraal, W., Westhoff, G., & de Bot, K. (2006). Evaluation of bilingual secondary education in the Netherlands: Students’ language proficiency in English. Educational Research and Evaluation, 12(1), 75–93.
Brevik, L. M., & Moe, E. (2012). Effects of CLIL teaching on language outcomes. In D. Tsagari, & I. Csépes (Eds.), Collaboration in language testing and assessment (pp. 213–227). Bern: Peter Lang.
Bruton, A. (2011). Is CLIL so beneficial, or just selective? Re-evaluating some of the research. System, 39, 523–532.
Bruton, A. (2013). CLIL: Some of the reasons why ... and why not. System, 41, 587–597.
Bruton, A. (2015). CLIL: Detail matters in the whole picture. More than a reply to J. Hüttner and U. Smit (2014). System, 53, 119–128.
Cenoz, J., Genesee, F., & Gorter, D. (2014). Critical analysis of CLIL: Taking stock and looking forward. Applied Linguistics, 35(3), 243–262.
Coral, J., Lleixà, T., & Ventura, C. (2016). Foreign language competence and content and language integrated learning in multilingual schools in Catalonia: An ex post facto study analysing the results of state key competences testing. International Journal of Bilingual Education and Bilingualism, 1–12.
Coyle, D. (2007). Content and language integrated learning: Towards a connected research agenda for CLIL pedagogies. The International Journal of Bilingual Education and Bilingualism, 10, 543–562.
Coyle, D., Hood, P., & Marsh, D. (2010). CLIL: Content and language integrated learning. Cambridge: Cambridge University Press.
Czura, A., & Kotodynska, A. (2015). CLIL instruction and oral communicative competence in a primary school setting. In K. Ożańska-Ponikwia, & B. Loranc-Paszylk (Eds.), Cross-cultural perspectives on bilingualism and bilingual education (pp. 123–153). Wydawnictwo Naukowe Akademi Techniczno-Humanistycznej.
Dallinger, S., Jonkmann, K., Hollm, J., & Fieg, Ch. (2016). The effects of content and language integrated learning on students’ English and history competences – Killing two birds with one stone? Learning and Instruction, 41, 23–31.
Dalton-Puffer, C. (2008). Outcomes and processes in Content and Language Integrated Learning (CLIL): Current research from Europe. In W. Delanoy, & L. Volkmann (Eds.), Future perspectives for English language teaching (pp. 139–157). Heidelberg: Universitätsverlag Winter.
Dalton-Puffer, C. (2009). Communicative competence and the CLIL lesson. In Y. Ruiz de Zarobe, & R. M. Jiménez (Eds.), Content and language integrated learning: Evidence from research in Europe (pp. 197–214). Bristol: Multilingual Matters.
Dalton-Puffer, C. (2011). Content-and-language integrated learning: From practice to principles? Annual Review of Applied Linguistics, 31, 182–204.
Dalton-Puffer, C., Nikula, T., & Smit, U. (2010). Language use and language learning in CLIL: Current findings and contentious issues. In C. Dalton-Puffer, T. Nikula, & U. Smit (Eds.), Language use and language learning in CLIL classrooms (pp. 279–292). Amsterdam: John Benjamins.
Delliou, A., & Zafiri, M. (2016). Developing the speaking skills of students through CLIL: A case of sixth grade Primary School students in Greece. EIIC – Electronic International Interdisciplinary Conference, 5(1).
Dentler, S. (2007). Sweden. In A. Maljers, D. Marsh, & D. Wolff (Eds.), Windows on CLIL: Content and language integrated learning in the European spotlight (pp. 166–171). The Hague: The European Platform for Dutch Education.
European Commission. (2012). Europeans and their languages. Special Eurobarometer 386. Brussels: European Commission.
Eurydice. (2006). Content and language integrated learning (CLIL) at school in Europe. Brussels: Eurydice.
Fernández, A. (2009). Spanish CLIL: Research and official actions. In Y. Ruiz de Zarobe, & R. M. Jiménez Catalán (Eds.), Content and language integrated learning. Evidence from research in Europe (pp. 3–21). Bristol: Multilingual Matters.
Gallardo del Puerto, F., & Gómez, E. (2013). English oral skills in CLIL and non-CLIL learners: An attempt to control for exposure. In Applied Linguistics Perspectives on Content and Language Integrated Learning (ALP-CLIL) in Madrid, Spain.
Gallardo del Puerto, F., & Gómez, E. (2017). Oral production outcomes in CLIL: An attempt to manage amount of exposure. European Journal of Applied Linguistics, 5(1), 31–54.
García-Mayo, M. P., & Basterrechea, M. (2017). CLIL and SLA. Insights from an interactionist perspective. In A. Llinares, & T. Morton (Eds.), Applied linguistics perspectives on CLIL (pp. 33–50). Amsterdam: John Benjamins.
Gassner, D., & Maillat, D. (2006). Spoken competence in CLIL: A pragmatic take on recent Swiss data. In C. Dalton-Puffer, & T. Nikula (Eds.), Current research on CLIL. VIEWZ, Vienna English working papers, 15(3) (pp. 15–22).
Golberg, H., Paradis, J., & Crago, M. (2008). Lexical acquisition over time in minority first language children learning English as a second language. Applied Psycholinguistics, 29, 1–25.
Hüttner, J., & Rieder-Bünemann, A. (2010). A cross-sectional analysis of oral narratives by children with CLIL and non-CLIL instruction. In C. Dalton-Puffer, T. Nikula, & U. Smit (Eds.), Language use and language learning in CLIL classrooms (pp. 61–80). Amsterdam: John Benjamins.
Hüttner, J., & Smit, U. (2014). CLIL (Content and Language Integrated Learning): The bigger picture. A response to A. Bruton. 2013. CLIL: Some of the reasons why ... and why not. System, 44(2), 160–167 (System 41 (2013): 587–597).
Krashen, S. (1985). The input hypothesis. Issues and implications. London, UK: Longman.
Lasagabaster, D. (2008). Foreign language competence in content and language integrated courses. The Open Applied Linguistics Journal, 1, 30–41.
Lasagabaster, D., & López, R. (2015). The impact of type of approach (CLIL versus EFL) and methodology (book-based versus project work) on motivation. Porta Linguarum, 23, 41–57.
Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W. C. Ritchie, & T. K. Bahtia (Eds.), Handbook of second language acquisition (pp. 413–468). New York: Academic Press.
Lorenzo, F., Casal, S., & Moore, P. (2009). The effects of Content and Language Integrated Learning in European education: Key findings from the Andalusian bilingual sections evaluation project. Applied Linguistics, 31(3), 418–442.
Madrid, D., Bueno, A., & Ráez, J. (2019). Investigating the effects of CLIL on language attainment: Instrument design and validation. In M. L. Pérez Cañado (Ed.), Content and Language Integrated Learning in monolingual settings: New insights from the Spanish context. Amsterdam: Springer (in press).
Marsh, D. (2008). Language awareness and CLIL. In J. Cenoz, & N. H. Hornberger (Eds.), Encyclopedia of language and education, Vol. 6: Knowledge about Language (pp. 233–246). New York: Springer.
Mattheoudakis, M., Alexiou, T., & Laskaridou, C. (2014). To CLIL or not to CLIL? The case of the 3rd experimental primary school in Evosmos. In N. Lavidas, T. Alexiou, & A. M. Sougari (Eds.), Major trends in theoretical and applied linguistics: Selected papers from the 20th International Symposium from Theoretical and Applied Linguistics (pp. 215–233). London: DeGruyter Versitas Publications.
Mehisto, P., Marsh, D., & Frigols, M. J. (2008). Uncovering CLIL: Content and Language Integrated Learning in bilingual and multilingual education. Oxford: Macmillan Education.
Meyer, O., Coyle, D., Halbach, A., Schuck, K., & Ting, T. (2015). A pluriliteracies approach to content and language integrated learning – Mapping learner progressions in knowledge construction and meaning-making. Language, Culture and Curriculum, 28(1), 41–57.
Navés, T. (2011). How promising are the results of integrating content and language for EFL writing and overall EFL proficiency? In Y. Ruiz de Zarobe, J. Sierra, & F. Gallardo del Puerto (Eds.), Content and foreign language integrated learning (pp. 103–128). Bern: Peter Lang.
Nieto, E. (2016). The impact of CLIL on the acquisition of L2 competences and skills in primary education. International Journal of English Studies, 16(2), 81–101.
Nikula, T., Dalton-Puffer, C., & Llinares, A. (2013). CLIL classroom discourse. Research from Europe. Journal of Immersion and Content-Based Language Education, 1(1), 70–100.
Olssen, E., & Sylvén, L. K. (2015). Extramural English and academic vocabulary. A longitudinal study of CLIL and non-CLIL students in Sweden. Apples – Journal of Applied Language Studies, 9(2), 77–103.
Paran, A. (2013). Content and language integrated learning: Panacea or policy borrowing myth? Applied Linguistics Review, 4(2), 317–342.
Pelechano, V. (1994). Prueba MA. Análisis y Modificación de la Conducta (Vol. 20, pp. 71–72).
Pérez-Cañado, M. L., & Lancaster, N. K. (2017). The effects of CLIL on oral comprehension and production: A longitudinal case study. Language, Culture and Curriculum, 30(3), 300–316.
Pérez-Vidal, C., & Roquet, H. (2015). CLIL in context: Profiling language abilities. In M. Juan-Garau, & J. Salazar-Noguera (Eds.), Content-based language learning in multilingual educational environments (pp. 237–255). Amsterdam: Springer.
Pladevall, E., & Vallbona, A. (2016). CLIL in minimal input contexts: A longitudinal study of primary school learners’ receptive skills. System, 58, 37–48.
Prieto-Arranz, J. I., Rallo, L., Calafat-Ripoll, C., & Catrain, M. (2015). Testing progress on receptive skills in CLIL and non-CLIL contexts. In M. Juan-Garau, & J. Salazar-Noguera (Eds.), Content-based language learning in multilingual educational environments (pp. 123–137). Amsterdam: Springer.
Rallo, L., & Jacob, K. (2015). Does CLIL enhance oral skills? Fluency and pronunciation errors by Spanish-Catalan learners of English. In M. Juan-Garau, & J. Salazar-Noguera (Eds.), Content-based language learning in multilingual educational environments (pp. 163–177). Amsterdam: Springer.
Roquet, H., & Pérez-Vidal, C. (2015). Do productive skills improve in content and language integrated learning contexts? The case of writing. Applied Linguistics, 1–24.
Ruiz de Zarobe, Y. (2008). CLIL and foreign language learning: A longitudinal study in the Basque country. International CLIL Research Journal, 1(1), 60–73.
Ruiz de Zarobe, Y. (2010). Written production and CLIL: An empirical study. In C. Dalton-Puffer, T. Nikula, & U. Smit (Eds.), Language use and language learning in CLIL classrooms (pp. 191–210). Amsterdam: John Benjamins.
Ruiz de Zarobe, Y. (2011). Which language competencies benefit from CLIL? An insight into Applied Linguistic research. In Y. Ruiz de Zarobe, J. Sierra, & F. Gallardo del Puerto (Eds.), Content and foreign language integrated learning (pp. 129–153). Bern: Peter Lang.
Rumlich, D. (2014). Gauging the CLIL effect: Results from a large-scale longitudinal study on CLIL programmes at German secondary schools. In Paper presented at the AILA world conference of Applied Linguistics in Brisbane.
Santamaría, P., Arribas, D., Pereña, J., & Seisdedos, N. (2016). EFAI. Evaluación Factorial de las Aptitudes Intelectuales. Madrid: TEA Ediciones.
Sundqvist, P., & Sylvén, L. K. (2014). Language-related computer use: Focus on young L2 English learners in Sweden. ReCALL, 26(1), 3–20.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook, & B. Seidlhofer (Eds.), Principle and practice in Applied Linguistics (pp. 125–144). Oxford: Oxford University Press.
Sylvén, L. K. (2006). How is extramural exposure to English among Swedish school students used in the CLIL classroom? Vienna English Working Papers, 15(3), 47–53.
Zydatiß, W. (2012). Linguistic thresholds in the CLIL classrooms: The threshold hypothesis revisited. International CLIL Research Journal, 1(4), 17–28.