The Realization of Lexical Cohesion in EFL Students’ Explanatory Texts Across Two Levels of Proficiency

Lexical cohesion is the most prominent resource of cohesion, a property usually associated with writing quality. Around forty to fifty percent (Hoey, 1991; Kafes, 2012) or even two-thirds (Witte & Faigley, 1981) of the cohesion in texts is lexical, regardless of proficiency level. This research investigated how lexical cohesion (involving repetition, synonymy and collocation) is realized in the explanatory texts written by two groups of participants (high and low achievers) and whether or not a denser realization of lexical cohesion is positively related to writing quality. The results of the analyses, conducted largely qualitatively, showed that repetition was the most frequently exploited sub-class of lexical cohesion, followed by collocation and synonymy. Unlike collocation and synonymy, repetition contributed negatively to writing quality, although complex repetition, one sub-type of repetition, contributed positively, as synonymy and collocation did. Surprisingly, taken together as lexical cohesion, the three sub-classes in their percentages of occurrence in the corpus did not have a positive effect on writing quality. Therefore, denser lexical cohesion, when it involved repetition, was not always an indicator of good writing. Thus, this study suggests that, in relation to writing quality, discussing each cohesive sub-class as a separate entity is more reliable than discussing (lexical) cohesion as a superordinate. The study also recommends making use of available or self-made exercises to build up students’ skills in using synonymy instead of repetition and in creating well-formed collocations.


INTRODUCTION
Cohesion in a text is created by a set of linguistic resources, which are also called cohesive devices. The resources are represented in various lexico-grammatical forms to build semantic relations and are grouped into lexical and grammatical cohesion. For example, the relation of reference, which is subsumed under grammatical cohesion, is realized in a text by the lexico-grammar of personal pronouns or the definite article the. As another example, the relation of reiteration, which is a sub-class of lexical cohesion, is represented by the lexico-grammar of repetition/repeated words and (near) synonymy.
Owing to those lexico-grammatical forms, a text has its textual cohesion or becomes cohesive.
Some studies (Allard & Ulatowska, 1991; Liu & Braine, 2005; Witte & Faigley, 1981) suggested that there is a close connection between the number of cohesive resources and writing quality. Other studies (Alarcon & Morales, 2011; Chen, 2008; Green, 2012) claimed that no significant relationship is found between cohesion and writing proficiency. In still other studies, it was surprisingly found that advanced student writers use fewer cohesive resources than students with lower proficiency (McNamara, 2011, in Green, 2012), and greater use of cohesive resources was found to correlate with poor writing (Green et al., 2000).
To the best of the researcher's knowledge, just two studies (Green, 2012; Saudin, 2017) have attempted to offer an empirical account of the inconsistency. Green suggested that the contradictory findings can be partially explained by the construct of cohesion, which has been defined differently across studies. Topic fronting, logical connectors, words that merely sound similar or are built from the same morphemes, and even subject-verb agreement are included as cohesive resources in some studies but not in others. Saudin (2017) argued that the inconsistency also resulted from the fact that cohesion was investigated in a less complete construct, focusing on only a few cohesive resources or even just one. This research differs in several ways from other studies (Allard & Ulatowska, 1991; Liu & Braine, 2005; Saudin, 2017) that also investigate the resource of lexical cohesion. One striking difference is that this research breaks down repetition, one prominent member of lexical cohesion, into simple and complex repetition in its analysis, which the other studies did not. As far as the researcher is aware, this research is the first to do so. Another important difference is the corpus used. In this study, the corpus consists of division and classification essays, which belong to the explanatory or informative text type. Other studies, for example Hasan's (1985) and Saudin's (2017), used Narrative and Argumentative texts as their corpora.
This research aims to investigate how lexical cohesion (involving its major sub-classes: repetition, synonymy and collocation) is applied in the explanatory texts written by students of the English Department at Politeknik Negeri Bandung across two proficiency levels: high and low. Further, this research reports whether or not a denser realization of lexical cohesion has a positive relationship with writing quality and, lastly, offers practical suggestions as to how to improve learners' writing proficiency. The suggestions are made by taking into consideration the characteristics of the uses of the three sub-classes of lexical cohesion in the explanatory/informative texts written by these two groups of learner participants.

Lexical Cohesion
Lexical cohesion, along with grammatical cohesion, is known as "lexicogrammatical resources of cohesion" (Halliday & Matthiessen, 2004, p. 532). These resources make a text cohesive as they help link information in a piece of writing and help it flow and hold together (Halliday & Hasan, 1976; Knapp & Watkins, 2005).
Through these resources, a text becomes an interpretable whole rather than a set of unconnected sentences: they direct us to relate one item to another for its full interpretation, since a relation of dependence holds between the two. Unlike grammatical cohesion, which exploits grammar to build cohesion in a text by using the resources of reference, conjunction, ellipsis and substitution, lexical cohesion creates cohesion through the choice of lexical items or vocabulary. Lexical relations such as repetition, synonymy, antonymy, and superordinate/hyponymy consistently connect the text to its area of focus (Droga & Humphrey, 2003, in Emilia et al., 2018). According to Halliday and Hasan (1976), lexical cohesion has the resources of reiteration and collocation. Each resource has several members subsumed under it.
Collocation, the association of lexical items that regularly co-occur (Halliday & Hasan, 1976), is illustrated by associated words such as boy and girl or basement and roof.
In the further development of the theory of lexical cohesion, the sub-classes have undergone changes (see Eggins, 2004; Paltridge, 2006; Salkie, 1995). It is generally accepted that lexical cohesion directly comprises such sub-classes as repetition (repeated words), antonymy (words having a contrasting meaning), synonymy (similar meanings shared by lexical items), hyponymy (class-member relation between lexical items) including co-hyponymy (member-member relation), meronymy (whole-part relation) including co-meronymy (part-part relation), and related words, words which "are hard to say precisely what the relationship between the words is, but it is clear that they come from the same general area of vocabulary …" (Salkie, 1995, p. 28), and lastly collocation or word partnership.
In the following part, the three sub-classes of lexical cohesion and the text-type under study are further discussed.

Repetition
Repetition is the most prominent sub-class of lexical cohesion. It is defined as a semantic relation in which lexical items in a text are (1) simply repeated or (2) repeated with a change in inflection for tense, verb form, or number, or with a change in part of speech (Paltridge, 2006; Malah, 2015). Halliday and Hasan (1985) and Hoey (1991) term the former simple repetition and the latter complex repetition. For the same two concepts, Taboada (2004, in Malah, 2015) uses the terms exact repetition and inexact repetition. This sub-class is known as a very frequently used type of lexical cohesive tie (Hoey, 1991; Kafes, 2012; Palmer, 1999). However, repetition is an indicator of poor writing quality if heavily relied on, as suggested by Witte and Faigley (1981) and Wu (2010).
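The distinction between simple and complex repetition can be sketched in code. The following is a minimal illustration only, not the procedure used in this study: the function name classify_repetition, the hand-made ROOTS mapping from word forms to a shared base word, and the sample words are all assumptions introduced here.

```python
# Hypothetical, hand-made mapping from word forms to a shared base word.
# In practice this information would come from the analyst's own judgment.
ROOTS = {
    "explain": "explain",
    "explained": "explain",
    "explanation": "explain",
    "type": "type",
    "types": "type",
}

def classify_repetition(first, second, roots):
    """Label a pair of word forms as 'simple' or 'complex' repetition.

    Returns None when the two forms are not repetitions of each other.
    """
    a, b = first.lower(), second.lower()
    if a == b:
        # Same form repeated: simple (exact) repetition.
        return "simple"
    if a in roots and b in roots and roots[a] == roots[b]:
        # Same base word, different inflection or part of speech:
        # complex (inexact) repetition.
        return "complex"
    return None

if __name__ == "__main__":
    print(classify_repetition("types", "types", ROOTS))
    print(classify_repetition("explain", "explanation", ROOTS))
    print(classify_repetition("type", "explain", ROOTS))
```

The sketch treats capitalization as irrelevant and leaves the decision about which forms share a base word to the analyst, which mirrors the interpretive nature of this kind of annotation.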

(Near) Synonymy
As another sub-class of lexical cohesion, synonymy is a semantic relation established when different lexical items similar in meaning are used at different points in a text. It is, however, possible for two lexical items that differ in their dictionary meanings to be considered synonymous when they are used in a text. Learners of English are advised to use synonymy instead of repetition, since reliance on repetition affects the quality of writing negatively, whereas reliance on synonymy affects it positively. Competence in deploying synonymy can show learners' level of vocabulary diversity or lexical variation (LV), which has been shown to correlate positively with good writing, as suggested by Grobe (1981, in Juanggo, 2018) and Laufer and Nation (1995).

Collocation
Collocation, the term first coined by J.R. Firth (1950), is a major sub-class of lexical cohesion. Halliday & Hasan (1976) refer to collocation as a linguistic device used to achieve cohesion through the association of lexical items that regularly co-occur. The most common and well-known definition of collocation is the tendency of a lexical item to co-occur with one or more other words (Hsu, 2007). Benson, Benson and Ilson (1997) define two major types of collocation, namely lexical collocation (combinations of content words) and grammatical collocation (combinations between content words and grammatical/function words).

Explanatory Writing
This specific genre, often also called informative text, aims to inform the reader of a subject rather than ask the reader to take a position on an issue as an Argumentative text attempts to do (www3.wayne.kyschools.us/userfiles/250/Classes/11798/Informative.pptx). The act of explaining is a fundamental language process in providing learners with new understandings of the world and how it operates (Knapp & Watkins, 2005). Through the genre, learners can accumulate knowledge about the world and demonstrate that knowledge. Explanatory writing usually has three parts or stages. The first takes the form of an introductory paragraph, which generally has the function of classifying and describing a particular subject or topic. The second stage is the body, which may consist of more than one paragraph, each of which functions to provide an explanatory sequence. Lastly, the third stage, in the form of a concluding paragraph, usually offers evaluative judgement and interpretation.
There are several patterns by which explanatory writing is organized. The common patterns are Cause/Effect, Division and Classification, Comparison/Contrast, Definition, and Procedure (see www3.wayne.kyschools.us/userfiles/250/Classes/11798/Informative.pptx). The corpus of this research consists of Division and Classification essays, in which a broad topic is discussed by breaking it down into individual parts and then classifying them into groups that have something in common.

METHODOLOGY
The study adopts a qualitative method based on the interpretive belief that knowledge and meaning are acts of interpretation. On this view, there is no objective knowledge independent of thinking, reasoning humans (Holliday, 2007). This study was conducted in a regular English class where students and their teacher were interacting in a process of teaching and learning. The study also applied the inductive method in analyzing its corpus, the student participants' written texts, to arrive at general findings and conclusions. To present its findings in a more understandable and manageable way, this qualitative study used tables to display them.

Research Site and Participants
The research site is the English Department of Politeknik Negeri Bandung. The participants of this study were 56 second-year students from two English classes. In general, their English proficiency was at the intermediate level. The participants were asked to write an explanatory text of around 300-400 words within 50 minutes in their mid-semester test. In the text, they had to explain the kinds/types of a subject and discuss the common characteristics of each type.

Research Corpus and Its Analysis
After the 56 texts were collected, six pieces of writing were selected purposively: three representing high-quality writing and three representing low-quality writing. Altogether, the high achiever learners produced 1,537 words and the low achiever learners 1,024 words. A holistic scoring guide known as the Test of Written English (TWE, which is also used for the TOEFL Writing Test) was adopted to classify the text quality. To make the classification more valid, the researcher cross-checked it against the writers' English grade point average (GPA) in the records of their academic achievement available in the administrative section of the English Department.
In analyzing the corpus, first the tokens involved in the formation of each sub-class of lexical cohesion (repetition, synonymy and collocation) were identified. To identify them, red was used to mark tokens of repetition, green for (near) synonymy and yellow for collocation. The identified tokens for each sub-class were generally one lexical item (in particular a content word), except for collocation, because it typically consists of two words (sometimes even three). In identifying collocation, a particular word pairing was counted just once even if the pairing appeared more than once in one piece of writing. This research thus treated the variety of collocations as more important than their raw number of occurrences. In the process of identifying collocation, the study consulted the Oxford Collocations Dictionary (McIntosh, 2009) to avoid inaccuracy.
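The counting rule described above, in which a collocational pairing is counted once per text while other tokens are counted at every occurrence, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study, which relied on manual color marking. The annotation format, the function name count_tokens, and the sample data are all invented for illustration.

```python
def count_tokens(annotations):
    """Count marked tokens per sub-class for one text.

    annotations: list of (sub_class, item) pairs, an assumed format in which
    item is a word for repetition/synonymy and a word pairing for collocation.
    """
    counts = {"repetition": 0, "synonymy": 0, "collocation": 0}
    seen_pairings = set()
    for sub_class, item in annotations:
        if sub_class == "collocation":
            key = item.lower()
            # A pairing is counted once per text, however often it recurs.
            if key in seen_pairings:
                continue
            seen_pairings.add(key)
        counts[sub_class] += 1
    return counts

# Invented sample annotations, for illustration only.
SAMPLE = [
    ("repetition", "music"),
    ("repetition", "music"),
    ("synonymy", "kinds"),
    ("collocation", "listen to"),
    ("collocation", "listen to"),
    ("collocation", "heavy rain"),
]

if __name__ == "__main__":
    print(count_tokens(SAMPLE))
```

The sketch counts distinct pairings; whether each pairing is then tallied as one token or by its number of words is left open here.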

RESULTS AND DISCUSSION
This part discusses the results of the analyses of the research corpus: six pieces of learners' explanatory writing at two proficiency levels, representing high and low quality of writing.
Prior to the discussion, the data resulting from the analyses are displayed in Table 1. The data are then elaborated on to address the aims of the research: to show how the resources of repetition, synonymy and collocation are realized across the two proficiency levels and to reveal whether or not greater use of the resources is closely related to writing quality. A further aim is to offer practical suggestions on how to improve writing proficiency, taking into consideration the characteristics of the uses of the three sub-classes of lexical cohesion in the explanatory/informative texts written by the two groups of learner participants.
The data tabulated show that repetition comes first as the most frequently exploited sub-class of lexical cohesion across the two proficiency levels. The high-achieving writers used 122 tokens on average (24%) to contribute to textual cohesion.
Their low-achieving peers relied more heavily on repetition to create cohesion (using 103 tokens or 30.2%). This finding is expected and in line with other studies (Witte & Faigley, 1981; Wu, 2010), which suggest that the poorer the writing quality is, the more it uses repetition. However, the realization of complex repetition across the proficiency levels shows a fascinating phenomenon. The realization of this complex type of repetition increases in number in the texts written by the learners of higher proficiency, unlike that of simple repetition. The group of high achievers used 44.7 tokens of complex repetition on average (36.5%), while the group of low achievers deployed just 31 tokens on average (30%). This finding indicates that the former have greater language competence than the latter in using a variety of expressions to convey ideas.

Table 1 (column headings): Sub-class of Lexical Cohesion; # Tokens in High Achievers' Texts; # Tokens in Low Achievers' Texts
The data in Table 1 also show that the sub-class that comes next across levels is collocation. As seen, the high achiever writers deployed 75.6 words (14.8%) and the low achiever writers 37.6 words (11%) on average. This finding indicates that collocation is found more in texts of higher quality. The finding is in line with the results of other studies (Ghadessy, 1998; Hsu, 2007; Zhang, 1993).
Further, Table 1 shows the four types of collocation that were realized most often at each of the two proficiency levels. In the high achiever learners' texts, the four most used types, in descending order, are v+prep (16.7 tokens or 22.1% on average), v+n (16 tokens or 21.1%), prep+n (13 tokens or 17.2%) and adj+n (10 tokens or 13.2%). In the low achievers' texts, on the other hand, the four most used types are v+n (11.7 tokens on average or 31%), adj+n (6 tokens or 16%), prep+n (6 tokens or 16%) and n+prep (4.7 tokens or 12.5%).
The data on the collocational realizations across learners' levels also present one interesting phenomenon: v+prep/phrasal verb appears to be the type of collocation most closely related to writing quality. It ranks first in the high achievers' texts but is not among the four most used types in the low achievers' texts. This claim needs to be tested further by future research, since, to the best of the researcher's knowledge, the present research is the first to report the sequential realizations of collocational types.
Now, the discussion turns to the realization of synonymy. This sub-class of lexical cohesion occupies the third and last position in terms of its contribution to textual cohesion. This is indicated by the 15.7 tokens/words (3.1%) used by the high achievers and the 8 words (2.3%) used by the low achievers on average. Therefore, the study reveals that as learners' proficiency increases, the number of synonyms they deploy also increases, as is the case with collocation. The finding suggests that the application of more varied vocabulary (synonymy) is closely related to text quality. This corresponds to the results reported by other researchers (Grobe, 1981, in Juanggo, 2018; Laufer & Nation, 1995).
When the realization of the three sub-classes is considered altogether as that of lexical cohesion, a surprising finding is revealed. The total percentage of lexical cohesion in the low achiever learners' texts (43.5%) is slightly higher than that in the high achiever learners' texts (41.9%). This finding that lexical cohesion is less dense as proficiency increases conflicts with the reports of other studies (Allard & Ulatowska, 1991; Liu, 2000; Liu & Braine, 2005). The finding, however, is in line with some other studies (Alarcon & Morales, 2011; Chen, 2008; Green, 2012), which claim that cohesion is not significantly related to writing proficiency. In the present corpus, cohesion is even less dense in the texts of higher quality.
The finding might result from the fact that the present study did not investigate lexical cohesion in its complete construct. This study did not include other sub-classes of lexical cohesion such as antonymy, hyponymy, meronymy and related words/lexical set.
However, this finding suggests that there is also inconsistency in the results of the studies that relate lexical cohesion to writing quality. The case is just like the inconsistent results, mentioned earlier, reported by research that connects cohesion with text quality.
Therefore, a more comprehensive study on the connection between (lexical) cohesion and writing proficiency needs to be conducted in the future.
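The totals compared in this section are simple sums of the three reported percentages. A minimal sketch of that arithmetic follows; the dictionary layout and the function name total_density are illustrative assumptions, while the percentages are those reported above for each group.

```python
# Average percentages per sub-class as reported in this section.
REPORTED = {
    "high": {"repetition": 24.0, "collocation": 14.8, "synonymy": 3.1},
    "low": {"repetition": 30.2, "collocation": 11.0, "synonymy": 2.3},
}

def total_density(group, exclude=()):
    """Sum the reported sub-class percentages for one group.

    exclude: names of sub-classes to leave out of the sum (hypothetical option).
    """
    values = [v for name, v in REPORTED[group].items() if name not in exclude]
    # Round to one decimal place to avoid floating-point noise.
    return round(sum(values), 1)

if __name__ == "__main__":
    for group in ("high", "low"):
        print(group, total_density(group),
              total_density(group, exclude=("repetition",)))
```

Summing all three sub-classes gives the slightly higher total for the low achievers, whereas leaving repetition out of the sum reverses the ordering, which illustrates why the sub-classes are better discussed separately.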

Pedagogical Implication
As previously stated, learners tend to rely heavily on repetition to create cohesion in their writing regardless of their level of proficiency. It is important for a writing teacher to remind students that a heavy reliance on repetition is an indicator of poor writing quality, because it suggests that the writers have no more ideas to add and possess a limited English vocabulary. Instead of using the same words or expressions to make meaning, learners should practise using synonymous words, which are regarded as having positive effects on their writing quality. If it is unavoidable to use the same item or word again, the teacher can ask them to make sure that the word is used in a different or new lexical environment.
It is advisable for a writing teacher to search the Internet for exercises related to vocabulary enrichment in general, or to practice in identifying synonymous expressions in particular. It would be much better if the teacher himself designs exercises for his learners to widen their repertoire of synonymous expressions within one particular topic that the learners are going to write about. Also, reading material, which is often used in writing classes, can be exploited by, for example, supplementing it with additional exercises in synonymy to increase the learners' competence in lexical variation.
Learners should also be encouraged to increase their competence in collocation.
As this cohesive link is consistently regarded as closely related to text quality, learners need more practice in using it appropriately. Exercises related to the effective use of this cohesive tie are relatively plentiful. English textbooks for learners now mostly include exercises in collocation. There is also a lot of information related to collocation online: not only exercises to enrich and practise learners' collocational knowledge but also tests in collocation to measure learners' competence in it. Further, a writing teacher can obtain a collocations dictionary, where he can find extensive information on how natural-sounding word combinations are formed.
The dictionary also provides exercises for students to pair words appropriately to form acceptable collocations. A writing teacher can make use of all those available learning materials and facilities to help his learners gain more collocational knowledge and improve their writing quality.

CONCLUSION
In this study, it is shown that repetition comes first as the most frequently exploited sub-class of lexical cohesion across the two proficiency levels, followed successively by collocation and synonymy. The results are expected since they are in line with other studies. Further, it is also revealed that the density of lexical cohesion does not have a positive relation to writing quality. This seems to be due to the fact that the lexical cohesion under study is partial, including only three of its sub-classes (repetition, (near) synonymy and collocation) and excluding other sub-classes such as antonymy, hyponymy, meronymy and associative words/lexical sets.
There is another possible reason, though: repetition (whose heavy use is known to be an indicator of poor writing quality) is greatly relied on by the low achiever participants of this study. This may result from the research corpus, namely informative texts, in which the learner participants tend to be wordy and repetitive in expressing ideas.
Though the percentages of the uses of the other two sub-classes show an increase in the high achievers' texts, that increase does not offset the high percentage of repetition in the low achievers' texts. This study, however, has a claim to make: more repetition of the complex type, not of the simple one, does have a positive effect on writing quality. Therefore, learners' skill in changing the form of a word, not simply repeating the word, is essential for better writing quality.
As to synonymy and collocation, both are indicated in this research to contribute positively to the quality of texts. The findings come as expected. Since synonymy refers to words with similar meanings and to expressions involving more than a single word, competence in it indicates learners' broad knowledge of language variation. As to collocation, its realization is found more in the higher achievers' explanatory writing. Therefore, collocation is shown to be connected with writing quality. Further, the frequency of phrasal verb collocation appears to be the most reliable gauge of whether a piece of writing is of high quality or not. Another finding is that, in addition to v+n and adj+n collocation, known as the most used types, prep+n collocation needs to be considered as another major type of collocation.
Considering the significant roles of cohesive links in making a text high in quality, it is necessary that the teaching of various cohesive ties be integrated into the teaching of English as a foreign and second language. Learners should be shown that greater use of repeated lexical items, not to mention the practice of copying and pasting, contributes negatively to their texts' quality. Repetition needs to be kept to a minimum. By contrast, it is advisable that they be motivated to study and to deploy synonymy and collocation more. The deployment of (near) synonymy will demonstrate their competence in applying lexical variation, which reveals their level of mastery of the mental lexicon of the language they are learning. The use of collocation, on the other hand, will make their language sound natural and native-like.