Innovations of Corpus Linguistics in Asia: An Analysis
By Eulalia C. Tolentino (MATEL)
De La Salle University
The proliferation of corpus systems and techniques has enabled researchers worldwide to conduct research in their own geographical location with minimal hindrance. Over the years, corpus techniques have transformed the landscape of empirical research in linguistic studies and language education. It has become increasingly common for researchers to apply corpus techniques in their research and to compile their own corpora for specific purposes (Xie, 2013).
Corpus linguistics is the study of language using naturally occurring language samples. It employs specialized computer software to analyze various aspects of language. Hence, it helps researchers obtain and analyze data quantitatively and qualitatively rather than relying on purely theoretical studies of language (Aswini & Srinivasan, 2016). It is one of the fastest-evolving approaches to language teaching in modern applied linguistics. Several studies have examined the effectiveness of using corpus linguistics to understand the dynamics of language learning and to harness the benefits of learning a language through a corpus-based approach (Aswini & Srinivasan, 2016). It has much to offer on linguistic issues such as language use, grammar, and trends. There is a great deal of research on corpus linguistics, but its findings are seldom put into practice, perhaps because the results tend to satisfy curiosity rather than yield teachable materials. For example, when a result differs entirely from our teaching practice, we may be surprised, yet our way of teaching remains the same. So the question is how these findings could eventually be adopted, so that the results are readily accepted because the language books we use are themselves based on corpora.
This paper aims to analyze the innovations of corpus linguistics in Asia by highlighting the significant findings of each study, in order to help researchers, teachers, and students who use corpora.
Criteria for Selection
This paper uses only research articles published from 2008 to 2018. The articles must be set in an Asian context and report significant findings in corpus linguistics.
Significant Findings in Corpus Linguistics in Asia
a. Corpus-Based Teaching
Teaching collocations through corpus-based tools significantly helped students’ retention and learning of collocations. The study of Akbari et al. (2015) points out that the experimental group (taught collocations and lexical chunks using corpus-based tools) outperformed the control group (taught through a traditional method, without any instructional tools, innovative materials, or instruments) on both the post-test and the delayed post-test. The study of Ucar & Yukselir (2015), which examines the impact of corpus-based activities on verb-noun collocation learning in EFL classes, reports a similar result. It was also carried out with two groups, experimental and control, each consisting of 15 students. Throughout the study, the experimental group was taught verb-noun collocations through corpus-based materials taken from the Corpus of Contemporary American English (COCA), and the control group was taught through a conventional method. The results showed a statistically significant difference between the experimental and control groups according to the type of treatment, which indicates that corpus-based activities have a significant impact on the learning of verb-noun collocations in EFL classes.
Based on these two studies, teaching collocations with corpus-based materials produces markedly different outcomes from teaching them by conventional methods.
Chen (2016) discusses corpus-based evaluation that focuses on text difficulty as a crucial factor in building linguistic competence. The study concludes that corpus-based evaluation of difficulty development in a textbook series may not only correspond to the local needs of the Taiwan senior high school curriculum but also provide a common framework for the assessment of ELT materials in other global contexts.
Textbook difficulty is not the only concern of corpus-based evaluation. It can also address the accuracy and fluency of EFL learners’ language use in comprehending sentences when reading and listening and in generating sentences when writing and speaking (pronunciation), and it can serve as a language resource for the statistical development of various basic modules of a computer-assisted language teaching (CALT) system. The study of Kotani et al. (2016) implies that learner corpus studies should examine different linguistic skills from different perspectives. Because EFL learners are still in the process of learning, their linguistic skills are not yet stable. If a learner corpus study compiles language use data on different linguistic skills from different perspectives, the contribution of learner corpus data to the development of CALT systems will be further enhanced.
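The corpus-based materials described in these studies typically take the form of concordance lines, in which a search word is shown with its surrounding context so that learners can notice recurring collocates. The following is a minimal sketch of how such keyword-in-context (KWIC) lines could be produced. It is not the procedure used in either study: the function name kwic, the width parameter, and the sample sentences are all hypothetical and serve only as illustration.

```python
import re


def kwic(text, node, width=30):
    """Return keyword-in-context lines for every occurrence of `node`.

    `width` is the number of characters of context kept on each side
    (an illustrative default, not a standard value).
    """
    lines = []
    pattern = r"\b" + re.escape(node) + r"\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the node words line up.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Hypothetical sample text; a real lesson would draw on a corpus such as COCA.
sample = ("Students make a decision quickly. Teachers make progress when "
          "learners make an effort to notice patterns.")

for line in kwic(sample, "make"):
    print(line)
```

Aligning the search word in a single column is what lets learners scan the words to its right (a decision, progress, an effort) and infer the verb-noun collocations for themselves.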
b. Corpus in Grammar and Vocabulary Teaching
Lexicogrammar is a term used in systemic functional linguistics (SFL) to emphasize the interdependence of, and continuity between, vocabulary (lexis) and syntax (grammar). The term lexicogrammar (literally, lexicon plus grammar) was introduced by linguist M.A.K. Halliday (thoughtco, 2017). Liu & Jiang’s (2009) study shows that the use of corpora and lexicogrammar can enhance students’ language awareness, improve their command of lexicogrammatical rules and usage patterns, increase their appreciation of context in language use and their critical understanding of grammar, and promote discovery learning, thus making learning more effective. It also reveals some challenges of corpus-based lexicogrammar learning, including the daunting difficulty many students experience in corpus analysis. The study also identifies some variables influencing learners’ experience of the approach, such as course content, student learning styles, and learning settings. In their corpus-based approach to online materials development for writing research articles, Chang & Kuo (2011) found that analyses relating the lexico-grammatical features of a genre, such as research articles (RAs), to its information structure, particularly the different communicative purposes or rhetorical functions of its sections, are linguistically illuminating and pedagogically useful. Results from such analyses provide data and information for contextualized learning materials. Since RAs are a conventionalized genre, non-native EAP learners need to be explicitly taught these generically distinctive features.
It is evident that grammar and vocabulary are connected and can greatly affect each other. These studies suggest that effective grammar teaching requires learners to have an improved vocabulary. Researchers are looking for ways to teach the two simultaneously.
Some researchers wanted to deviate from traditional grammar and vocabulary teaching. Grammar, like vocabulary, should be taught within the learners’ context so that they can easily grasp and apply it. One example is the phrasal verb. Phrasal verbs are word combinations often used by native speakers in conversation because of their colloquial tone (Biber, Johansson, Leech, Conrad, & Finegan, 1999). Dita & Ella (2017) determine the most common forms of phrasal-prepositional verbs (PPVs) in Philippine English using the ICE-PHI and describe their syntactic and semantic features, following Quirk et al.’s (1985) framework. Thirty-nine of the forty-eight items on the lists of Quirk et al. (1985) and Biber et al. (1999) were found in the corpus using AntConc 3.4. Results show that come up with, get out of, look forward to, come out with, hold on to, and catch up with are the PPVs most frequently used by Filipinos. These PPVs are intransitive, inseparable, and occur in the active voice. Findings further reveal that the meanings of the PPVs match the single-word verb meanings provided by online dictionaries of phrasal verbs, and that those single-word verbs can replace the PPVs. Hence, they are idiomatic. The study implies that Filipinos use a minimal number of PPVs. They appear to be conservative in their choice of PPV structure but generally show proficiency in using PPVs in their utterances.
A similar pattern was reported four years earlier by Ryoo (2013). That study sheds light on one of the most creative, and thus challenging, grammatical classes of the English language, namely phrasal verbs (PVs), in a corpus of Korean EFL students’ writing. It found that the top 4 most frequent verbs (e.g. GO, COME) and adverbial particles (e.g. up, out) in both corpora were almost identical, and that more than half of the top 20 verbs overlapped. These findings provide evidence that Korean EFL students lack formulaic competence in PVs. The implications of the findings for English language teaching and learning are also discussed.
The two studies differ because the first concerns Filipinos, who are ESL (L2) learners, while the second concerns Koreans, who are EFL learners. ESL learners have more contact, and a much wider variety of contact, with the English language (Longcope, 2009).
Asian countries have produced varied research in CL because of its potential for teaching language in authentic situations. The study of Lai et al. (2013) shows a correlation between verbal tense and aspectual adverbs in the colloquial register of Singapore English (SCE), where the Optional Omission of Past Tense (OPT) is substantially triggered by the use of perfective adverbs in SCE sentences. In Singapore English sentences that contain the perfective aspectual adverbs already or yesterday, the OPT phenomenon is present 68.2% of the time. Although this phenomenon is also found in Hong Kong English, it is significantly more prominent in Singapore English.
Another study (Huang, 2012) investigates whether corpus-based instruction could deepen EFL learners’ knowledge of the periphrastic causatives make, cause, and let. The results indicated that the experimental group improved and significantly outperformed the control group in the post-test. The questionnaire results confirmed that the instruction was effective in increasing students’ knowledge of the three causatives. However, the field notes revealed learners’ difficulties in using certain causatives.
In a corpus-based study of Asian learners’ use of English linking adverbials conducted in Japan, Ishikawa (2011) finds that the gap between native speakers (NS) and non-native speakers (NNS) is rather limited in terms of the quantity of English linking adverbials (LA), which play an important role in the logical cohesion of a text, used in essay writing. The study also finds that Asian NNS tend to favor additive types of LA (intensification of meaning) while underusing LA items concerning the introduction of parallel information and the sequential introduction of information. Finally, Japanese learners of English (JLE), Chinese learners of English (CLE), and NS use major LA items in their own ways, which are hardly influenced by L2 proficiency.
It can be noticed that a single feature of English is used differently by different speakers of the language. Further research points out that the distinctiveness of a variety of English depends on its speakers. In their study A Learner Corpus Investigation of Filipino L2 English Article Use: The Way Forward for Language Teachers, Crosthwaite & Choy (2016) present a learner corpus analysis of L2 English definite article use by L1 Tagalog speakers, with data collected from the International Corpus Network of Asian Learners of English (ICNALE; Ishikawa, 2013), totalling 24,253 words from 94 essays. The results show that overall article accuracy is higher than that reported for L2 English learners from article-less L1s in Author (2016). Thus, L1 Tagalog/Filipino speakers certainly have an advantage over speakers of true ‘article-less’ languages.
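A frequency search of this kind, which the study ran in AntConc 3.4, can be approximated in a few lines of code. The sketch below is only an illustration of the general idea, not the authors’ procedure: the inflection table PPV_FORMS, the function count_ppvs, and the sample sentence are hypothetical, and a real study would search the full ICE-PHI texts with a complete list of verb forms.

```python
import re
from collections import Counter

# Hypothetical inflection table: each PPV maps to the verb forms to match
# and to the fixed particle + preposition sequence that follows the verb.
PPV_FORMS = {
    "come up with": ("come|comes|came|coming", "up with"),
    "look forward to": ("look|looks|looked|looking", "forward to"),
    "catch up with": ("catch|catches|caught|catching", "up with"),
}


def count_ppvs(text):
    """Count occurrences of each PPV in `text`, including inflected forms."""
    counts = Counter()
    for ppv, (verb_forms, particles) in PPV_FORMS.items():
        pattern = rf"\b(?:{verb_forms})\s+{particles}\b"
        counts[ppv] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Hypothetical sample text standing in for corpus data.
sample = ("She came up with a plan. We look forward to the results, "
          "and he is looking forward to catching up with them.")

print(count_ppvs(sample).most_common())
```

Grouping the inflected forms under one headword is what allows came up with and coming up with to be counted as the same PPV, which is necessary before frequencies can be ranked.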
Using corpora in language classrooms has proven to be an effective tool in teaching vocabulary, grammar and language use to learners of English as a second/foreign language. However, many EFL teachers find integrating corpus-based activities in their classrooms a challenging teaching practice.
Vocabulary is one of the concerns of EFL learners because of their limited contact with the English language. Researchers have sought the easiest route to learning the target language. The pedagogical value of corpora for ELT has been widely acknowledged and exploited, but their direct application in classroom teaching has entailed many difficulties.
Paker & Ozcan (2017) conducted a study in Turkey on the effectiveness of corpus-based vocabulary teaching activities and on students’ attitudes towards concordance-based materials when corpus-based tasks are used in English vocabulary learning. The statistical analysis indicated that the corpus-based vocabulary tasks were more effective than the tasks in the textbook. Additionally, the findings indicated that students’ attitudes towards the use of corpus-based vocabulary tasks were positive. Furthermore, Shi (2017) investigates the impact of the pedagogical application of corpora on the vocabulary ability of intermediate-level ESL learners in mainland China. Findings indicate that the pedagogical application of a corpus with adequate instruction is more effective and efficient in improving learners’ vocabulary. Likewise, in A Corpus-based Study of the Size and the Level of the Vocabulary Used by Japanese Learners of English at Different Proficiency Levels, Ishikawa (2017) analyses topic-controlled speeches and writings by Japanese learners of English (JLE) and English native speakers (ENS) to observe the size and the level of the vocabulary they use. The study shows that the number of lemmas is smaller in speeches than in writings, and for JLE than for ENS. Hence, the Vocabulary Level Index (VLI) is lower in speeches than in writings, for JLE than for ENS, and for the speeches of novice JLE than for those of intermediate and advanced JLE. Learners’ lexical development is more salient in speeches than in writings.
In Creating a Corpus-Based Daily Life Vocabulary for Teaching English to Young Learners (TEYL), Chujo et al. (2011) create a list of children’s everyday vocabulary in English that will provide a foundation of daily life vocabulary for Japanese elementary school students and will complement and augment the English vocabulary currently taught in Japanese junior and senior high schools. It was found that the identified words are at the appropriate grade level (grades 1 to 3), that the semantic content areas are grade-appropriate and complement the semantic categories of junior and senior high school (JSH) vocabulary, and that this vocabulary supplements JSH vocabulary in text coverage over 18 activities. Furthermore, Li et al. (2011) have shown high performance in text categorization by drawing on the semantic relations of terms in two kinds of thesauri, a corpus-based thesaurus (CBT) and WordNet (WN). The highest level of performance was obtained when CBT and WN were combined. Thus, exploiting the semantic relations of terms is critical for obtaining satisfactory results in text categorization.
The integration of corpora into L2 learning and teaching has become increasingly appealing in recent years (Phoocharoensil, 2012). The ability to integrate language skills such as vocabulary, grammar, writing, and reading simultaneously makes corpus consultation an attractive complement to traditional methods of language teaching and learning.
c. Corpus in Academic Writing
Using corpora in academic writing has received very positive feedback all over the world. Almutairi’s (2016) study of law students’ personal statements, one of the most important requirements for admission to university programs, finds that the use of personal pronouns in this genre of academic writing is strongly favored because it serves the purpose of self-promotion. The grammatical features, in turn, show the dominance of the past tense in listing academic qualifications. In addition, the passive voice is commonly used because the focus should be on the student and not on whoever the subject is. The findings are very timely because they can help in designing classroom activities that equip students with the academic vocabulary and grammatical features appropriate to the personal statement genre. The goal of these classroom activities is to encourage students to present themselves, their achievements, goals, and qualifications positively and favorably in academic language, so as to sell themselves and win a place in the desired school.
Another remarkable finding from the use of corpora on written text comes from the study of Yuanyuan (2017). The research finds a considerable number of similarities of usage, as well as remarkable differences, between corpora of English writing by Chinese learners of English (CLE) and by native speakers of English (NSE). CLE prefer shorter sentences and the active voice, while NSE prefer longer sentences and the passive voice; CLE prefer verbs, while NSE prefer nouns, adjectives, and prepositions, which indicates that CLE pay more attention to people, while NSE concentrate on nature and reason. Some words are overused or used inappropriately in CLE’s English writing, such as the, I, to, and, a, my, in, was, of, it, that, is, we, you, me, for, but, he, on, with, and so on, and some CLE cannot make full use of pronouns and prepositions. These differences could result from mastery of the language itself, awareness of writing skills, transfer from Chinese in second language learning, different thinking patterns, or cultural factors.
Kobayashi & Abe’s (2016) A Corpus-based Approach to the Register Awareness of Asian Learners of English makes a remarkable discovery in investigating the impact of learners’ L1s and proficiency levels on their written production. The results suggest that learners’ L1s affect the degree of their register awareness. Hong Kong learners display a set of stylistically appropriate features, such as nominalizations, predictive modals, and conjuncts, in their academic prose, whereas Japanese learners exhibit many informal features, such as first person pronouns, private verbs, and independent clause coordination, in their written production. Korean and Taiwanese learners, moreover, show several features typical of speech, including second person pronouns, in their writing. In addition, this study demonstrates the effectiveness of Biber’s list of linguistic features in studying the spoken nature of L2 writing.
Furthermore, Lee et al. (2015) find that it is much more common for students to use the present simple tense where the past was needed than the reverse. This tendency may be influenced by the students’ L1: since Chinese verbs are not inflected, students preferred by analogy the uninflected form in English, which happens to be the present simple. If true, this would point to the need for more detailed feedback on error categories involving complex grammatical constructions. Additionally, McCrostie & Bunka (2010) find that academic essays written by Japanese learners contain far more writer/reader visibility features than similar writing by native English speakers.
Another comparison of data is made in the study of Wei (2013). It finds that the two groups of English learners exhibit more similarities than differences in topical Theme choices. The Chinese English learners and the Swedish English learners perform similarly not only in all three types of topical Themes but also in two of the five elements in informational Themes, two of the three elements in interactional Themes, and all three elements in discoursal Themes. The results of the study also accord with past research findings in that both Chinese English learners and Swedish English learners deviate from native speakers in topical Theme choices. This is a reminder to us teachers that Themes contribute greatly to the method of development of texts.
d. Movements in Corpus Building
Corpus linguistics is valuable in the language classroom because it is highly relevant in this digital age. However, one of the challenges in using corpus linguistics is the need to use computers with ease and to have the skill to exploit the nuances of online corpora in the classroom (Aswini & Srinivasan, 2016). In L2 corpus construction, it is important to define the goals and objectives of the project and to decide on the source of the data and how the data will be collected. It is also important for the corpus designer to study existing corpora, available recording devices, annotation tools and their documentation, database management systems and their interface designs, and tools for transcription and other annotation, with a view to the nature and expertise of the workforce readily available for the project. Researchers who are interested in building a learner corpus should be prepared to secure enough funding for the various aspects of the project at its various stages. Lastly, corpus builders should be aware of the real-life trade-offs among the factors involved in data collection and data sharing for learner corpora (Huang et al., 2010).
In the article “The Necessities, Feasibilities and Principles for EFL Teachers to Build A Learner-oriented Mini-corpus for Practical Classroom Uses”, Zhang (2008) advises that it is necessary and feasible for EFL teachers, following some basic principles, to build a learner-oriented mini-corpus to compensate for the shortcomings of established corpora in EFL teaching. The paper also points out that an EFL teacher should endeavor to use various teaching methods or measures to meet EFL learners’ diverse needs, including the use of corpora, whether self-built, established, or a combination of the two.
However, Lee’s (2011) Challenges of Using Corpora in Language Teaching and Learning concludes that the appropriate and effective use of corpora in the classroom is partly a technical issue but primarily a pedagogical one. Unless the use of corpora in the classroom is extensively discussed and researched in order to develop a pedagogical blueprint for their integration, the outcomes that a number of corpus linguists have expected may not accrue to learners and teachers.
In 2015, a group of Malaysian researchers set out to build a corpus that caters to ESP research and teaching in Malaysia. The project was motivated by the fact that Malaysia had not developed any specialized corpora covering the English used by specific discourse communities in the country. According to Aziz et al. (2015), Malaysian ESP corpora need to be developed for the following reasons:
Specialized corpora such as the English for Specific Purposes (ESP) corpora are not easily accessible. To date, only a small number of specialized corpora can be publicly accessed online. They include the Michigan Corpus of Academic Spoken English (MICASE), the Corpus of London Teenage Language (COLT), the Hong Kong Engineering Corpus (HKEC) and the Hong Kong Financial Services Corpus (HKFSC).
The majority of other specialized corpora (for instance the Cambridge Business English Corpus, the Cambridge Legal English Corpus, and the Cambridge Academic English Corpus) are only accessible through purchase or subscription (Warren, 2010).
The exposure to genuine language use within its contexts can enrich learners’ understanding and repertoire in the use of the target language.
Another reason that makes CiP more interesting and attractive is the computer-based nature of corpus consultation. The use of computers gives L2 learners greater exposure to the target language and allows greater opportunities for interaction with it.
The completion of the Malaysian Corpus of Financial English (MaCFE) would mark an important chapter in the corpus linguistics field in Malaysia. It will be the first specialized corpus to be built in Malaysia and will be among the very few in the world that are freely accessible online. It will also pave the way for the creation of other specialized corpora that would contribute to the building of a larger and more comprehensive Malaysian Corpus of English for Specific Purposes (MaCESP).
In Japan, Ishikawa’s (2015) Contribution of Learner Corpus Studies for Dictionary Making: Identification of Deviant L2 Vocabulary Use by Asian Learners identified a series of words overused and underused by four groups of Asian learners: Japanese, Chinese, Taiwanese, and Koreans. The results show that some of those words are specific to a particular learner group, while others are common to several learner groups in the area. Contrastive interlanguage analysis based on a learner corpus can be applied to “a wide range of linguistic features” (orthographic, lexical, grammatical, phraseological, stylistic, and pragmatic) to find “interesting patterns of overuse, underuse, and misuse” by learners. Such findings undoubtedly merit inclusion in future EFL dictionaries, which are expected to serve as a good learning guide for learners as well as a reliable database of the target language.
Furthermore, in Saudi Arabia, Alfaifi et al. (2014) introduce the Arabic Learner Corpus (ALC), which is being developed at Leeds University and comprises 282,732 words collected from learners of Arabic in Saudi Arabia. The corpus includes written and spoken data produced by 942 students of 67 different nationalities studying at pre-university and university levels. The paper focuses on two aspects of this corpus: the design criteria and the content. The goal of the ALC is to provide an open source of data for linguistic research areas related to Arabic language learning and teaching. The corpus data is therefore available for download in TXT and XML formats, along with the hand-written sheets in PDF format and the audio recordings in MP3 format.
In the Philippines, there is also progress in building online corpora for Philippine languages. Dita & Roxas (2011) report updates on the Philippine Languages Online Corpora (PLOC); as Dita et al. (2009) had reported, there were many things to consider in conceptualizing the first phase of the project. The second phase has so far completed a 2-million-word corpus of the eight major Philippine languages (Tagalog, Cebuano, Ilocano, Hiligaynon, Bicol, Kapampangan, Pangasinense, and Waray). In summary, the present PLOC now contains 250,000 words of written text for each of the eight major languages in the Philippines.
In China, Bond & Wang (2014) present preliminary results from an on-going project to construct large-scale sense-tagged parallel corpora. They divide the annotation scheme into two phases: monolingual sense annotation and multilingual concept alignment. They mainly discuss a breadth-first approach, in which they try to increase the coverage uniformly to cover all words. They also use the corpora as a test-bed to look in detail at individual phenomena of interest, including the use of Chinese traditional idiomatic expressions (成语 chéngyǔ), English possessive idioms (X loses X’s head), and the differences in pronoun distribution across languages.
In Taiwan, Hsu’s (2009) study aims to create a corpus of General English (GE) reading textbooks used in universities in Taiwan to form the basis of an analysis. The operational measures for comparison were vocabulary size, vocabulary levels (distribution among the British National Corpus 1st–14th 1,000 high-frequency word families), and text coverage. It may be useful in preparing learners for an intermediate GEPT by covering 24.55% to 65% of the vocabulary involved in the test. It is hoped that the indices examined in this study will help English teachers take vocabulary size and levels into account in curriculum design.
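Overuse and underuse are usually established by comparing how often a word occurs in a learner corpus with how often it occurs in a reference corpus of comparable texts. The sketch below uses the log-likelihood statistic, a common choice in corpus linguistics for this kind of comparison; the source does not state which statistic Ishikawa (2015) used, so this is an assumed stand-in. The function names and all frequencies are hypothetical.

```python
import math


def log_likelihood(freq_learner, size_learner, freq_ref, size_ref):
    """Log-likelihood for one word's frequency in two corpora.

    Larger values indicate a larger difference in relative frequency.
    """
    total_size = size_learner + size_ref
    total_freq = freq_learner + freq_ref
    expected_learner = size_learner * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_learner > 0:
        ll += freq_learner * math.log(freq_learner / expected_learner)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


def usage_direction(freq_learner, size_learner, freq_ref, size_ref):
    """Label a word as overused or underused relative to the reference."""
    rel_learner = freq_learner / size_learner
    rel_ref = freq_ref / size_ref
    if rel_learner > rel_ref:
        return "overuse"
    if rel_learner < rel_ref:
        return "underuse"
    return "same"


# Hypothetical counts: a word occurring 150 times in 100,000 learner tokens
# and 100 times in 100,000 native-speaker tokens.
score = log_likelihood(150, 100_000, 100, 100_000)
print(usage_direction(150, 100_000, 100, 100_000), round(score, 2))
```

A score above 3.84 corresponds to p < 0.05 with one degree of freedom, so in this invented example the word would count as significantly overused by the learner group.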
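Text coverage, one of the measures named above, is the percentage of running words in a text that belong to a given vocabulary list. The following is a minimal sketch of the calculation under simplifying assumptions: it matches exact word forms, whereas studies of this kind normally count word families, and the word list, function names, and sample sentence are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def text_coverage(text, word_list):
    """Return the percentage of tokens in `text` found in `word_list`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in word_list)
    return 100 * known / len(tokens)


# Hypothetical high-frequency list; a real study would use frequency bands
# such as the BNC 1,000-word-family levels.
high_frequency = {"the", "students", "read", "a", "book", "in", "class"}
sample = "The students read a difficult book in the class."

print(round(text_coverage(sample, high_frequency), 2))
```

Here eight of the nine running words are on the list, so coverage is about 88.89%. Applying the same calculation to each frequency band in turn shows how much vocabulary a learner needs before a text becomes readable.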
e. Challenges of using Corpora in Language Teaching and Learning
Many corpus linguists have identified the advantages of using corpora in language teaching. It has also been contended that corpus-based language teaching has the potential to motivate learners and promote learner autonomy, both of which are highly valued in pedagogy (Aijmer, 2009). However, the challenges and limitations of the use of corpora have not been extensively discussed, and without a critical examination of the use of corpora in language pedagogy it seems premature to urge teachers to use them in their classrooms (Lee, 2011). This also leads to Lee’s (2011) question of why teachers rarely use corpora in their classrooms despite readily available computers and some undeniable merits of corpora in language teaching. The following questions may arise when using corpora:
Are the corpora authentic?
Flowerdew (2009) suggested that the authentication of corpora can be assisted by including contextual information (e.g., MICASE has been marked up with socio-cultural information such as the gender, age, and academic position or role of the interlocutor).
Are the corpora relevant to the learners?
There are a number of corpora (The Birmingham Collection of English Texts, The British National Corpus, The Brown Corpus, The Helsinki Corpus of English Texts: Diachronic and Dialectal, (ICE) (SEC) (LOB)) in different languages, and accessibility has improved now that they are readily available on the Web. Even so, corpora, despite their potential as resources and language tools, have had little impact on language pedagogy. One of the fundamental problems stems from how, and for what purposes, corpora were created. For example, a large corpus is essential in lexicography because lexicographers need a sufficient number of occurrences of lexical or structural items in order to compare their relative frequencies (Lee, 2011).
f. Advantages of using Corpus-Based Approach
For learners to benefit from the use of corpora, language teachers must first be equipped with a sound knowledge of the corpus-based approach. The teacher then sets students the task of exploring patterns in a corpus and has them formulate the usage rules for the form in question (Dazdarevic et al., 2015). The approach has two main advantages:
Corpora give authentic evidence of changes in a language through time.
Corpora are also a great tool for studying language use and variation across different types of speakers (Essays, UK, 2013).
g. Disadvantages of using Corpus-Based Approach
Only 10% of the corpus is based on spoken language so there is not much information about it (Essays, UK, 2013).
The second disadvantage is that a corpus will never tell you what is grammatically or syntactically wrong or right (Essays, UK, 2013).
Given the different innovations of Corpus Linguistics in teaching and learning process, there may be far greater discoveries using CL. Because of these discoveries, the building of corpora in every country in Asia would be a possibility. Also, language books writers would cater the idea of using corpora as basis in conceptualizing language books. And because of these trends, seminars after seminars about corpus-based teaching would be implemented as well as the use of concordance with ease on the teachers’ part. The use of corpora in English Language Teaching within the frameworks of progressive educational practices and the use of ICT in education is deemed to be largely successful. In conclusion, CL would be the future of language teaching due to its “naturalness” in terms of language and the application of this in the given context of the learner. Aside from the given the disadvantages being mention in using CL, these would be in minimal consideration than the overwhelming efficacy of the use of CL in language teaching and learning.
Aijmer, K. (2009). Introduction: Corpora and language teaching. In K. Aijmer (Ed.),
Corpora and language teaching (pp. 1-10). Amsterdam: John Benjamins.
Akbari, J., Haghverdi, H., & Biria, R. (2015). Instructional efficacy of corpus-based
tools in teaching collocations to Iranian University students with different
majors. Journal of Applied Linguistics and Language Research, 2(8), 218-229.
Retrieved from www.jallr.com
Alfaifi, A. Y. G., Atwell, E., & Hedaya, I. (2014, June). Arabic learner corpus (ALC) v2: a
new written and spoken corpus of Arabic learners. In Proceedings of Learner
Corpus Studies in Asia and the World 2014 (Vol. 2, pp. 77-89). Kobe
International Communication Center.
Almutairi, N. (2016). The effectiveness of corpus-based approach to language
description in creating corpus-based exercises to teach writing personal
statements. English Language Teaching, 9(7). Retrieved from
http://dx.doi.org/10.5539/elt.v9n7p103
Aswini, P., & Srinivasan, R. (2016). Corpus-based studies – Some perspectives.
International Journal of Applied Engineering Research, 11(4), 2340-2342.
Retrieved from http://www.ripublication.com
Aziz, R. A., Nordin, N. M., Ismail, M. R., Baharum, N. D., & Sadjirin, R. (2015). Building
the Malaysian Corpus of Financial English (MaCFE).
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Grammar of
spoken and written English. Harlow: Longman.
Bond, F., & Wang, S. (2014). Issues in building English-Chinese parallel corpora with
WordNets. In Proceedings of the Seventh Global Wordnet Conference (pp. 391-
Chang, C., & Kuo, C. (2011). A corpus-based approach to online materials development
for writing research articles. English for Specific Purposes, 30, 222–234.
Chen, A. C. H. (2016). A critical evaluation of text difficulty development in ELT textbook
series: A corpus-based approach using variability neighbour clustering.
System, 58, 64-81.
Chujo, K., Oghigian, K., Utiyama, M., & Nishigaki, C. (2011). Creating a corpus-based
daily life vocabulary for TEYL. Asian EFL Journal, 49, 30-59.
Dita, S., & Roxas, R. E. (2011). Philippine Languages Online Corpora: Status, issues,
and prospects. In Proceedings of the 9th Workshop on Asian Language
Resources (pp. 59-62).
Dazdarevic, S., Zoranic, A., & Fijuljanin, F. (2015). Benefits of corpus-based approach
to language teaching. Balkan Distance Education Network – BADEN Newsletter,
3(7). Retrieved from http://www.badennet.org/
Ella, J., & Dita, S. (2017). The Phrasal-Prepositional Verbs in Philippine English: A
Corpus-based Analysis. In Proceedings of the 31st Pacific Asia Conference on
Language, Information and Computation (pp. 34-41).
Essays, UK. (November 2013). Two Disadvantages Of Using Corpora For Language
English Language Essay. Retrieved from
Flowerdew, L. (2009). Applying corpus linguistics to pedagogy: A critical evaluation. International Journal of Corpus Linguistics, 14(3), 397-417.
Hsu, W. (2009). College English textbooks for general purposes: A corpus-based
analysis of lexical coverage. Electronic Journal of Foreign Language Teaching,
Huang, C. R., Cheung, W., Harada, Y., Hong, H., Skoufaki, S., & Chen, H. K. (2010).
English learner corpus: global perspectives with an Asian focus. In A New Look
at Language Teaching and Testing: English as Subject and Vehicle: Selected
Papers from the 2009 LTTC International Conference on English Language
Teaching and Testing March (pp. 6-7).
Huang, L. (2012). The effectiveness of a corpus-based instruction in deepening EFL
learners’ knowledge of periphrastic causatives. TESOL Journal, 6, 83-108.
Retrieved from http://www.tesol-journal.com
Ishikawa, S. I. (2017). A Corpus-based Study of the Size and the Level of the
Vocabulary Used by Japanese Learners of English at Different Proficiency Levels.
Ishikawa, S. I. (2011). A corpus-based study on Asian learners’ use of English linking
adverbials. Themes in Science and Technology Education, 3(1-2), 139-157.
Kobayashi, Y., & Abe, M. (2016). A Corpus-Based Approach to the Register Awareness
of Asian Learners of English. Journal of Pan-Pacific Association of Applied
Linguistics, 20(2), 1-17.
Kotani, K., Yoshimi, T., Nanjo, H., & Isahara, H. (2016). A Corpus of Writing,
Pronunciation, Reading, and Listening by Learners of English as a Foreign
Language. English Language Teaching, 9(9), 139.
Lai, E. Y. M., Tan, L., Wong, V., Loke, L. T. T., & Bond, F. (2013). The OPT-ional
Phenomenon in Singapore English: A Corpus-based Approach Using Time
Annotated Corpora. Procedia-Social and Behavioral Sciences, 95, 431-441.
Lee, J., Yeung, C. Y., Zeldes, A., Reznicek, M., Lüdeling, A., & Webster, J. (2015). CityU
corpus of essay drafts of English language learners: a corpus of textual revision
in second language writing. Language Resources and Evaluation, 49(3), 659-683.
Lee, S. (2011). Challenges of using corpora in language teaching and
learning: Implications for secondary education. Linguistic Research, 28(1),
159-178.
Li, C. H., Yang, J. C., & Park, S. C. (2012). Text categorization algorithms using
semantic approaches, corpus-based thesaurus and WordNet. Expert Systems
with Applications, 39(1), 765-772.
Liu, D., & Jiang, P. (2009). Using a corpus-based lexicogrammatical approach to
grammar instruction in EFL and ESL contexts. The Modern Language Journal,
Longcope, P. (2009). Differences between the EFL and the ESL language learning
context. Studies in Language and Culture, 30(2), 203-320.
McCrostie, J., & Bunka, D. (2009). Writer visibility in EFL learner academic writing: A
corpus-based study. ICAME Journal, 32.
Paker, T., & Ozcan, Y. (2017). The effectiveness of using corpus-based materials in
vocabulary teaching. International Journal of Language Academy, 5(1), 62-81.
Retrieved from http://dx.doi.org/10.18033/ijla.3494
Phoocharoensil, S. (2012). Cross-linguistic influence: Its impact on L2 English
collocation production. English Language Teaching, 6(1), 1.
Ryoo, M. L. (2013). A Corpus-based Study of the Use of Phrasal Verbs in Korean EFL
Students’ Writing. The Journal of AsiaTEFL, 10(2), 63-89.
Shi, J. (2017). Biting off More Than They Can Chew? The Impact of Pedagogical
Application of Corpus on Vocabulary Ability of Intermediate-Level ESL Learners
in Mainland China: A Quasi-Experimental Study. English Language
Teaching, 10(9), 232.
Thought.co, 2017. https://www.thoughtco.com › … › English Grammar › Glossary of Key
Ucar, S., & Yukselir, C. (2015). The effect of corpus-based activities on verb-noun
collocations in EFL classes. The Turkish Online Journal of Educational
Wei, J. (2013). Corpus-based research on topical theme choices in Chinese and
Swedish English learner writings. Theory and Practice in Language Studies,
Xie, Q. (2013). Corpus Linguistics and Corpus-Based Research in Hong Kong: A State-
of-Art Review. English Language and Literature Studies, 3(3), 48.
Yuanyuan, H. (2017). A Corpus-Based Contrastive Study on Brown a & b and WECCL.
In Asia International Symposium on Language, Literature and Translation (p. 170).
Zhang, S. (2008). The Necessities, Feasibilities and Principles for EFL Teachers to
Build A Learner-oriented Minicorpus for Practical Classroom Uses. Asian EFL
Journal, 29, 1-15. http://asian-efl-journal.com/pta_July_08.pdf