A Quantitative Analysis for Non-Numeric Data

ABSTRACT: This study illustrates the use of an Association Rule General Analytic System (ARGAS) for analyzing non-numeric data. Previous research by Parente, Finley, and Magalis (2021) showed how the ARGAS approach could be used to test hypotheses in conventional experimental designs. This study illustrates how ARGAS can be used in exploratory research settings such as single-case research, assessing organization in multi-trial learning experiments, analysis of social media, and case-oriented studies of individuals. This approach to analysis is appropriate in research settings where the units of measure are words, shapes, or other forms of non-numeric data.


INTRODUCTION
Parente, Finley, and Magalis (2021) described an Association Rule General Analytic System (ARGAS) that can serve as an alternative or as an adjunct to the Generalized Linear Model (GLM; Nelder & Baker, 1972). The goal of this study is to extend the ARGAS approach to data analysis by illustrating its use in a variety of exploratory research settings. We chose these studies because they demonstrate how the ARGAS approach can be used when the data are non-numeric (e.g., words, shapes). Specifically, the research paradigm involves analysis of consistencies in participants' ability to describe or to recall events, perceptions, or experiences. We begin by describing research paradigms appropriate for ARGAS analysis. We continue with research examples that use the ARGAS. We end with a discussion of specific issues for future research and development.

LITERATURE / THEORETICAL UNDERPINNINGS
Print ISSN 2056-3620 (Print), Online ISSN 2056-3639 (Online). Website: https://www.eajournals.org/ Publication of the European Centre for Research Training and Development -UK
The biggest difference between the ARGAS and the GLM procedures is the unit of measure. The GLM typically analyzes numerical data, whereas the ARGAS approach analyzes non-numeric data (e.g., words, shapes). There are three ways to generate these data: free responding, restricted responding, and transformational responding.
Free responding requires participants to generate any words or phrases that describe their unique experience in a particular context or in different conditions of an experiment. For example, using this approach in an independent-groups comparison study (e.g., control, placebo, and experimental groups) would require that participants generate words or phrases that describe their feelings or experiences during their participation in one group or another. The same methodology can be used with individual participants in a correlated-groups design in which each participant is exposed to all treatment conditions. Free responding allows the researcher to identify consistencies in word choices that reveal nuances of the participants' experience. In this way, it can provide information that is relevant to the findings but not apparent from the conventional numerical analysis.
Data collection can also involve restricted responding. For example, a participant may be given a page of adjectives and asked to circle those words that best describe their experience. The same list of descriptive words is therefore provided to each participant, although each participant is free to choose those that are relevant to their experience.
Transformational responding is appropriate when numerical data are first translated into words (e.g., "above the median," "below the median," or "high," "moderate," or "low"). Parente, Finley, and Magalis (2021) illustrate how most univariate and multivariate experiments can be designed and analyzed when the measures of interest are translations of numbers into words. We refer the reader to that article for a thorough discussion and examples of transformational responding.
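The translation of numbers into words described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `median_split`, the tie rule (scores equal to the median are labeled "below the median"), and the ratings themselves are hypothetical and are not taken from Parente et al. (2021).

```python
from statistics import median


def median_split(values, high="above the median", low="below the median"):
    """Translate numeric scores into one of two verbal categories."""
    cut = median(values)
    # Assumption: scores exactly equal to the median are labeled `low`.
    return [high if v > cut else low for v in values]


# Hypothetical daily ratings on a 1-10 scale (illustrative only).
ratings = [3, 7, 5, 9, 2, 6]
labels = median_split(ratings)
```

The resulting labels, rather than the original numbers, would then serve as the units of measure for the association rule analysis.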

Proposed Analysis
The core feature of this analysis is the "Association Rule" (Webb, 2010), which identifies relationships among words in the text. Association Rule Analysis is a pattern recognition procedure (Webb, 2003, 2010) that generates rules for predicting one set of events from another (Han, Pei, & Kamber, 2011). It is specifically designed for analyzing associative relationships in text. For example, participants might generate words that differentiate their experience of learning algebra versus statistics (Magalis, 2020). Whereas the primary goal of our earlier paper (Parente et al., 2021) was to illustrate how to use the ARGAS approach for hypothesis testing, our goal here is to illustrate how the same analysis is appropriate for exploratory research where the goal is to generate hypotheses and to make suggestions for future research (e.g., pilot studies). Parente & Finley (2018) present a detailed description of the alternative ARGAS statistics. We present an abbreviated overview below.
The goal of ARGAS is to associate antecedent and consequent events. Antecedent events are analogous to independent or predictor variables in conventional GLM statistics. Consequent events are analogous to dependent or outcome variables. The units of measure are any non-numeric data such as words, shapes, or numbers converted to words (e.g., 1 = one). The ARGAS analysis produces measures of association called rules (Balcázar & Dogbey, 2013; Parente & Finley, 2018) that define the co-occurrence between the antecedent and consequent events. The mathematics of the procedure can be quite complex and usually requires software to expedite the computations, either within larger analytic packages or platforms (e.g., SAS, https://www.sas.com/en_us/software/stat.html; SPSS Modeler, https://www.ibm.com/products/spssstatistics; BigML.com) or as standalone software (e.g., KH Coder, https://khcoder.net/en/). Parente & Finley (2018) discussed the computation of different rule statistics. Without going into the computational minutiae, the output from the ARGAS analysis is a set of probabilities, i.e., rules, which express the co-occurrence of the participant's word choices within the inquiry. Each rule can be tested for significance either with conventional methods (e.g., p < .05) or by replication with a "holdout sample." Because space does not permit an example of how to apply ARGAS in all of the areas of qualitative or exploratory research, we have selected four diverse areas to illustrate application of the ARGAS model.
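To make the idea of a rule as a co-occurrence probability concrete, the following Python sketch computes, for one antecedent-consequent pair, the conditional probability of the consequent given the antecedent and a one-sided Fisher exact p-value. It is a minimal illustration, not the procedure implemented in any of the packages named above: the function `rule_stats`, the choice of Fisher's exact test as the "conventional method," and the example records are all assumptions introduced here.

```python
from math import comb


def rule_stats(records, antecedent, consequent):
    """Return (confidence, p_value) for the rule antecedent -> consequent.

    `records` is a list of (antecedent_label, set_of_words) pairs.
    """
    n = len(records)
    n_ante = sum(1 for label, _ in records if label == antecedent)
    n_cons = sum(1 for _, words in records if consequent in words)
    n_both = sum(
        1 for label, words in records
        if label == antecedent and consequent in words
    )
    # Probability of the consequent word given the antecedent condition.
    confidence = n_both / n_ante
    # Hypergeometric upper tail: chance of at least n_both co-occurrences
    # if the word were unrelated to the condition (one-sided Fisher test).
    upper = min(n_ante, n_cons)
    p_value = sum(
        comb(n_cons, k) * comb(n - n_cons, n_ante - k)
        for k in range(n_both, upper + 1)
    ) / comb(n, n_ante)
    return confidence, p_value


# Hypothetical records: (condition, words the participant generated).
records = [
    ("pre", {"worried", "tired"}),
    ("pre", {"worried", "isolated"}),
    ("pre", {"worried"}),
    ("post", {"relieved"}),
    ("post", {"relieved", "social"}),
    ("post", {"tired"}),
]
confidence, p_value = rule_stats(records, "pre", "worried")
```

Replication with a holdout sample could be approximated by calling the same function on a second, independent set of records and retaining only rules that meet the criterion in both.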

RESULTS / FINDINGS
Example 1. Single Case Research
The ARGAS approach to single-case research (Isaac & Michael, 1997) involves a free-responding paradigm in which participants generate words or phrases that describe their experiences in different situations. For example, we collected pilot data from a single case regarding the person's experience before and after COVID vaccination. The person generated five words at the end of each week for seven weeks before and seven weeks after vaccination. Table 1 displays examples of these word choices. The table includes only the first three weeks of data for each period, followed by dots representing a continuation of the data collection for the final four weeks. These data were analyzed with the ARGAS to identify words that discriminated between the pre- and post-vaccine phases. The analysis identified ten rules (see Table 2) that described the person's emotional state during each phase. Each rule was significant (p < .05). The participant described their pre-vaccination experience generally as anxiety-provoking. Word choices that described the post-vaccination experience expressed relief, normalcy, and a return to social engagement. This single-case analysis generated several research suggestions that relate to patient demographics. For example, younger participants may choose very different descriptors of their experience relative to older patients. Those with predisposing medical conditions might also generate different word choices. Political or religious differences may also color a participant's word choices.

Example 2. Grounded Theory
The purpose of this study was to illustrate the use of ARGAS for exploratory research and theory development within the context of a multi-trial learning paradigm with words and symbols (Mitchell, 2014). The research procedure involved learning a list of 12 unrelated nouns or unfamiliar shapes on each of 12 study/test trials. This is an example of restricted responding because the participant studied and recalled the same list of words or shapes on each of the 12 study/test trials. Intrusions were not counted. Tulving (1966) proposed a theory of subjective organization (SO), which describes the human tendency to impose organization on seemingly unrelated words. Tulving noticed that participants recalled the same words together as trials progressed. The tendency to group words together illustrated the process of SO. This finding has been replicated several times, and the general finding is that the words become associated as trials progress. The assumption is that these associations are correlated with recall. Because rules generated by the ARGAS model are measures of association, it is reasonable to suggest that these rules would reflect the associative grouping of words in a multi-trial learning experiment.
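The grouping tendency described above can be illustrated with a simple count of item pairs that are recalled next to each other on consecutive trials. The Python sketch below is a simplified illustration only: it is neither Tulving's (1966) SO index nor the rule-generation procedure used in this study, and the function names and the four-word recall orders are hypothetical.

```python
from collections import Counter


def adjacent_pairs(recall_order):
    """Unordered pairs of items recalled next to each other on one trial."""
    return {frozenset(pair) for pair in zip(recall_order, recall_order[1:])}


def repeated_pairs(trials):
    """Count how often each adjacent pair recurs on consecutive trials."""
    counts = Counter()
    for earlier, later in zip(trials, trials[1:]):
        # A pair counts when it is adjacent on both trials, in either order.
        counts.update(adjacent_pairs(earlier) & adjacent_pairs(later))
    return counts


# Hypothetical recall orders for one participant over three trials.
trials = [
    ["lamp", "river", "coin", "tiger"],
    ["river", "lamp", "tiger", "coin"],
    ["tiger", "coin", "lamp", "river"],
]
counts = repeated_pairs(trials)
```

Pairs with high counts are the ones a participant consistently recalls together, which is the kind of co-occurrence that an association rule is intended to capture.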
SO theory development has traditionally involved learning lists of unrelated nouns. There is a dearth of studies that have investigated memory organization for non-verbal stimuli such as shapes or symbols. Therefore, the following research assessed whether the SO phenomenon that occurs when learning unrelated words also occurs when learning unfamiliar shapes.
The data for this study came from an unpublished master's thesis by Nickerson (2013). One hundred fifty college students learned a list of 12 unrelated nouns over 12 study/test trials. They also learned a list of 12 unfamiliar shapes over a 12-trial sequence. Nickerson (2013) presented the words and shapes randomly on each trial. She counterbalanced the stimulus type (words, shapes), with word recall preceding shape recall for half of the participants and the reverse order for the remaining half. She randomly designated half of the participants in each condition as a holdout sample. The study's goal was to document the level of SO with the words, which would serve as a baseline for comparison with the SO for the shapes.
The authors imported these data into the Magnum Opus (Webb, 2010) computer software, which generated rules that described the relationships among the 12 words or shapes. We selected those rules that were statistically significant in the training sample and validated in a holdout sample. The word data yielded 33 significant rules, 23 of which (69%) were significant in both the training and holdout analyses. The shape data yielded 41 significant rules, 23 of which (56%) were significant in both the training and holdout samples. The difference between these percentages was significant (Chi-Square = 7.22, p < .05), indicating that the participants developed proportionally fewer replicable association rules for the shapes relative to the words.
These results suggest at least one testable hypothesis. The results show that college students do subjectively organize their recall of unrelated words, which is consistent with Tulving's (1966) organizational theory. However, the data also indicate that this same organizational process occurs with non-verbal shapes, although to a lesser extent. If SO underlies recall (Tulving, 1966), then it is reasonable to suggest that the number of rules derived from an analysis of individual participants' shape recall would correlate with their recall performance (Parente & Finley, 2018). Finley & Parente (2018, 2020) tested this hypothesis by computing the number of significant association rules for individual college students and correlating these measures with recall performance from the same participants. Half of the students were athletes who had reported multiple concussions over the years, and the remaining students did not report any concussions. There were two significant findings in these data. First, Finley & Parente (2020) showed that the number of rules derived from the individual participants' shape recall correlated significantly with the number of shapes the participant recalled. Second, participants without brain injury generated significantly more association rules than did the head-injured group. This finding suggests that the number of ARGAS rules has discriminative validity for distinguishing participants with and without a history of brain injury.

Example 3. Social Media Analysis.
The purpose of this study was to explore the personalities and activity preferences of people who engage in on-line dating. The authors used two types of data from a social media website to assess the personality traits of male and female users and to identify different activity preferences apparent in four major dating groups: Never Married, Currently Separated, Widowed, and Divorced on-line users.
The first part of the study involved using the ARGAS to extract emotional themes from written self-descriptions provided by male and female users. Thematic analysis is a commonly used qualitative research method (Castleberry & Nolen, 2018). It often involves extracting the themes available in written text or interviews. The authors used a similar technique to extract themes from written self-descriptions from a public domain internet site (Kozinets, 2019). Specifically, data obtained from www.match.com included brief paragraph self-descriptions from a cohort of 48 members (24 males and 24 females). We then extracted the emotional tone from each person's self-description that reflected their emotional status. The study's goal was to describe differences in the emotional status of the male and female participants.
The IBM Watson Tone Analyzer software (https://www.ibm.com/cloud/watson-tone-analyzer) was used to identify emotional themes in the paragraphs. The software analyzed the content of each text passage for several emotional themes, for example, Extroversion, Joy, Sadness, Anger, Confidence, Fear, and Tentativeness. Although human scorers are usually used to extract thematic content, the Tone Analyzer software was used here because it is based on specific rules and the results were, therefore, easily replicable. The analysis involved associating gender (antecedent) with the emotional themes extracted by the Tone Analyzer (consequents). Table 3 presents two rules that were derived from the ARGAS analysis and that were significant in both the training and holdout samples. These association rules showed that males' written self-descriptions displayed significantly more extroversion relative to the females'. The female self-descriptions showed significantly more tentativeness relative to the males'. This finding suggests that women feel more cautious, uncertain, and less confident in their self-descriptions relative to men.
The second part of the study explored the different dating activity preferences expressed by the participants. The match site provides a number of word choices, which were collected for participants who were Never Married, Currently Separated, Widowed, or Divorced. Activity preferences included options such as coffee and conversation, camping, and dining out. The choice of words is an example of restricted responding because each user selected from the same group of words. The analysis involved associating the marital groups (antecedents) with the dating activity preferences (consequents).
This analysis yielded 10 rules that were significant in both the training and holdout samples. These rules are presented in Table 4. Although there are some consistencies in user preferences (for example, Separated and Divorced participants prefer informal engagements such as Coffee and Conversation), the various groups generally show diverse interests.

Example 4. Case-Oriented Research
Case-oriented qualitative research involves an in-depth and detailed study of individual cases. For example, Parente, Anderson, Ottentein & Haus (1981) developed a computer-assisted method of counseling that involved collecting personal data from therapy clients over several weeks. Personal data included measures of problematic behaviors for the client (called targets) and other measures that the client felt were related to the targets. Each client generated a set of measures that were unique to their lifestyle and experience. Interpreting the correlative relationships among these measures with the clients was sufficient to effect a positive change, such as reducing anxiety, lowering blood pressure, and lessening stuttering.
Parente & Herman (2010) describe a similar method of "behavioral charting" that involves self-ratings of target behaviors and other behaviors that the client felt were related to the targets. The exercise begins with a discussion of which behaviors the person feels are germane to his or her life situation. For example, the client in this case study thought that his thinking skill (target) was related to the number of cups of coffee he drank each day, his perceived levels of depression and anxiety, the severity of his headaches, memory functioning, attention span, and his overall energy level. The therapist and client created a data sheet that allowed the client to rate the severity of the target or covariate each day (mostly on a scale of 1-10). These numerical ratings were then transformed into "Above the Median" / "Below the Median" phrases and analyzed using the ARGAS, as described by Parente et al. (2021). The analysis yielded the association rules presented in Table 5. Additional cross tabulations were performed on these rules to determine whether each relationship was direct or inverse. Thereafter, several suggested treatment interventions could be derived from these results. For example, activities that increase energy (e.g., physical exercise) may also improve thinking. Activities or medications that reduce or eliminate headaches may improve thinking and memory. Activities that reduce anxiety (e.g., yoga) may lessen depression, which may also improve memory. This process may therefore be helpful to clinicians when planning treatment interventions.
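The cross tabulation step used to decide whether a rule reflects a direct or an inverse relationship can be sketched as follows. This Python example is illustrative only: the helper names `crosstab` and `direction`, the simple majority criterion, and the daily labels are assumptions made here and do not reproduce the client's data or the authors' exact procedure.

```python
from collections import Counter


def crosstab(xs, ys):
    """Count each (x label, y label) combination across paired days."""
    return Counter(zip(xs, ys))


def direction(table, high="above", low="below"):
    """Label a 2 x 2 table as 'direct', 'inverse', or 'unclear'."""
    # Days on which both measures fall on the same side of their medians.
    concordant = table[(high, high)] + table[(low, low)]
    # Days on which the two measures fall on opposite sides.
    discordant = table[(high, low)] + table[(low, high)]
    if concordant > discordant:
        return "direct"
    if discordant > concordant:
        return "inverse"
    return "unclear"


# Hypothetical daily median-split labels for two self-rated measures.
energy = ["above", "above", "below", "below", "above", "below"]
thinking = ["above", "above", "below", "below", "below", "below"]
table = crosstab(energy, thinking)
relationship = direction(table)
```

A "direct" label here means that days with above-median energy tended to be days with above-median thinking ratings; an "inverse" label would indicate the opposite pattern.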
Validation using a holdout sample was unnecessary because the goal was to develop rules that applied only to the individual and not to a larger population.However, the suggestions described above were validated with continued data collection while implementing the suggested lifestyle changes dictated by the rules.

DISCUSSION
The purpose of this paper is to illustrate the use of the ARGAS methodology for analyzing non-numeric data, which are often the unit of analysis in exploratory or qualitative research. Although these study examples show that ARGAS is appropriate in a variety of different types of research, they also show that using association rules as an analytic tool requires some degree of caution. For example, there are few generally available, user-friendly, comprehensive software packages for ARGAS computations. The analysis may produce complex rules with multiple antecedents and consequents that are difficult to explain. More instructional guidelines are needed for interpreting the rules. Journal editors may not be familiar with the concept and computation of association rules.

IMPLICATIONS FOR FUTURE RESEARCH
Measures of Association. The number of rules the ARGAS generates is perhaps the most straightforward association index. We have also relied on the lift measure to index the strength of these relationships. However, other measures may be equally useful. For example, Balcázar & Dogbey (2013) and Webb (2010) describe several other measures that may be equally or better suited for interpretive purposes. These include Confidence, Strength, Leverage, Support, and Coverage (Webb, 2010). In our experience, the Lift value is, perhaps, the easiest to interpret. As the value of lift approaches one, the rule becomes less and less useful. As the value of lift increases beyond one, so does the predictive value of the rule.
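For readers unfamiliar with these indices, the Python sketch below computes several of them from four simple counts. It assumes the definitions commonly given in the association rule literature (coverage as the proportion of cases containing the antecedent, support as the proportion containing both antecedent and consequent, confidence as support divided by coverage, lift as the ratio of observed to expected support, and leverage as their difference). Strength is omitted because its definition varies across sources. The function name and the counts are hypothetical.

```python
def rule_measures(n_total, n_antecedent, n_consequent, n_both):
    """Common association rule indices computed from four counts."""
    coverage = n_antecedent / n_total   # P(antecedent)
    support = n_both / n_total          # P(antecedent and consequent)
    confidence = n_both / n_antecedent  # P(consequent | antecedent)
    # Support expected if antecedent and consequent were independent.
    expected = coverage * (n_consequent / n_total)
    return {
        "coverage": coverage,
        "support": support,
        "confidence": confidence,
        "lift": support / expected,      # 1 means no association
        "leverage": support - expected,  # 0 means no association
    }


# Hypothetical counts: 100 cases, 40 with the antecedent,
# 50 with the consequent, and 30 with both.
measures = rule_measures(100, 40, 50, 30)
```

Under these definitions, a lift near one and a leverage near zero both indicate that the antecedent tells the researcher little about the consequent, which is consistent with the interpretation of lift given above.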
Interpretation. How does one explain an association rule? These statistics identify a significant co-occurrence of words, phrases, or symbolic content in the data set. However, beyond that description, the interpretation of the rule depends upon alternative meanings of the word choices. For example, word choices such as "cool" or other currently trending synecdoche phrases may have multiple meanings for different age groups or genders. Even common words or phrases such as "attractive" may signify different meanings for different people. It may also be necessary to compute cross tabulations to evaluate the directionality of the relationship between the antecedent and consequent words. The researcher may not simply assume a direct or inverse relationship. This interpretive technique may be difficult in cases where the rules describe a multivariate relationship.
(Parente et al., 2021). Second, the analysis provides for significance testing of results, which may be especially useful in those studies that involve developing or testing hypotheses (e.g., Framework Analysis; Srivastava & Thompson, 2009). Third, the ARGAS can be used as an alternative to the GLM when there are assumption violations. Parente et al. (2021) identified several other areas of research with ARGAS that require further investigation. These include verification of rules, significance testing of rules, issues of power and sample size, interpretation of complex rules, and a study of the relative efficiency of the various measures such as lift, confidence, and leverage. Whereas the purpose of this research was to illustrate the use of ARGAS for exploratory or qualitative research, we fully acknowledge that additional research should focus on practical applications with in-depth consideration of the potential and limitations of ARGAS. In the meantime, we assert that the ARGAS model is ready for use with research paradigms that analyze non-numeric research data.

Table 1.
Word Choice Examples Pre- and Post-Vaccination. The pre and post designations at the beginning of each line in the table indicate the weekly time intervals, and the words in each row reflect the mental and emotional state of the participant during each time phase. The table includes only the first three weeks of each phase.

Table 2.
Association Rules that Distinguish Pre- vs. Post-COVID Vaccination Descriptions

Table 3.
Personality rules for male and female match.com members

Table 4.
Word choices for marital groups that describe preferred social interactions