Review of A Guide to Doing Statistics in Second Language Research Using SPSS and R 


Review: 
Reviews Editor: Helen Aristar-Dry
SUMMARY
‘A Guide to Doing Statistics in Second Language Research Using SPSS and R’ by Jenifer Larson-Hall provides an introduction to hypothesis testing and statistics, specifically aimed at second language researchers. The book is split into two parts. The first, ‘Statistical Ideas’, introduces researchers to the general assumptions of statistics and hypothesis testing, while the second, ‘Statistical Tests’, demonstrates how to conduct specific tests in SPSS and R. All practice data comes from real published second language research experiments, so readers can find examples directly relevant to their own work. At several points in each chapter the reader is given exercises with this data to apply the skills learned in the most recent lesson.
Part I: Statistical Ideas
Before introducing the statistical theory in Part I, ‘Chapter 1: Getting Started with the Software and Using the Computer for Experimental Details’ begins by acquainting the reader with SPSS and R. The reader is shown how to read in data, create data, and deal with missing data in the SPSS and R environments. Instead of using the standard R GUI, the reader is instructed to use R Commander, a graphical interface to R, installed as a package, that lets the user rely less on the command line. Larson-Hall says that R Commander can ease the way for users who do not have previous experience with coding or a command line language.
In ‘Chapter 2: Some Preliminaries to Understanding Statistics’ the reader is introduced to basic concepts surrounding statistical analysis. These concepts include dependent and independent variables, hypothesis testing, and p-values, along with a push for robust statistics (discussed in more detail in Chapter 4). This chapter is designed to introduce new researchers to the principles necessary for clean experimental design and later data analysis. All of this is presented with exercises focusing on second language research data sets, which ask the reader to determine the appropriate variables in each design. The discussion of hypothesis testing is very clear and detailed, giving the reader a solid foundation in the assumptions of most statistics.
‘Chapter 3: Describing Data Numerically and Graphically and Assessing Assumptions for Parametric Tests’ shows the reader how to get descriptive values of their data (e.g. mean, median), as well as how to plot the data for visual inspection. How values such as mean and standard deviation are obtained is explained in detail. There is also discussion of the assumptions of parametric tests, such as the requirement that data be normally distributed. Readers also learn about ways to transform their data to meet these assumptions. Larson-Hall emphasizes the need to examine data before beginning analysis to be sure it meets the requirements of any planned tests.
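As a rough illustration of the descriptive values mentioned above, the following minimal sketch computes a mean, a median, and a sample standard deviation by hand. It is written in Python rather than in the SPSS or R used by the book, it is not taken from the book, and the `scores` values and the function names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def sample_sd(values):
    """Sample standard deviation: square root of the summed squared
    deviations from the mean, divided by n - 1."""
    m = mean(values)
    squared_devs = [(v - m) ** 2 for v in values]
    return math.sqrt(sum(squared_devs) / (len(values) - 1))


# Hypothetical test scores, invented for this illustration only.
scores = [12, 15, 11, 18, 14, 16]

print("mean:", mean(scores))
print("median:", median(scores))
print("sample SD:", sample_sd(scores))
```

Dividing by n - 1 rather than n gives the sample (not population) standard deviation, which is the usual choice when the data are a sample from a larger group of learners.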
‘Chapter 4: Changing the Way We Do Statistics’ presents a call to move away from classical statistics that focus on p-values as the ultimate decider of significance, and instead turn to looking at confidence intervals and effect sizes. This chapter corresponds with a general trend in the fields of linguistics and psychology to focus more on what it means to have an effect and how large that effect is, instead of being beholden to a specific p-value cutoff. So that the reader can fully understand the ‘new statistics’, the chapter also gives a detailed discussion of the ‘old statistics’. Topics covered in the ‘old statistics’ section include null hypothesis testing and power analyses.
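To make the contrast with a bare p-value concrete, the sketch below reports an effect size (Cohen's d with a pooled standard deviation) and an approximate 95% confidence interval for a difference between two group means. This is an illustration in Python, not code from the book; the two groups are invented data, the function names are hypothetical, and the interval uses a normal approximation, which is an assumption that is only reasonable for larger samples (a t-based interval would be wider for groups this small).

```python
import math
from statistics import NormalDist, mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def mean_diff_ci(group_a, group_b, level=0.95):
    """Approximate confidence interval for the difference in means.

    Assumption: uses a normal critical value instead of a t critical value,
    so it understates the interval width for small samples.
    """
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


# Hypothetical scores for two groups, invented for this illustration only.
treatment = [14, 16, 15, 18, 17, 16]
control = [12, 13, 15, 11, 14, 13]

low, high = mean_diff_ci(treatment, control)
print("Cohen's d:", round(cohens_d(treatment, control), 2))
print("approx. 95% CI for the mean difference:", (round(low, 2), round(high, 2)))
```

Reporting the interval and the effect size together shows both how large the difference is and how precisely it has been estimated, which a p-value cutoff alone does not convey.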
Part II: Statistical Tests
Before being shown how to run specific tests, in ‘Chapter 5: Choosing a Statistical Test’ the reader is given an overview of the tests covered in the book (correlation, regression, t-test, ANOVA, repeated-measures ANOVA). Additional tests are also made available in online materials. Each test is given a general summary and a mnemonic device for the reader. A distinction is made between ‘tests of relationship’ (correlation, regression) and ‘tests of group difference’ (t-test, ANOVA, repeated-measures ANOVA). For each test the reader is given example papers and data appropriate for use with that test.
The same format is followed in each of the subsequent chapters (‘Chapter 6: Finding Relationships Using Correlation’, ‘Chapter 7: Looking for Groups of Explanatory Variables through Multiple Regression’, ‘Chapter 8: Looking for Differences between Two Means with T-Tests’, ‘Chapter 9: Looking for Group Differences with a One-Way Analysis of Variance’, ‘Chapter 10: Looking for Group Differences with Factorial Analysis of Variance When there is More than One Independent Variable’, and ‘Chapter 11: Looking for Group Differences When the Same People are Tested More than Once’). Each chapter begins with some recent examples from the second language acquisition literature of the analysis under discussion. The reader is then walked through the steps in both SPSS and R (using R Commander) to first visually examine the data and then run the specific analysis. Assumptions for the tests are clearly stated to ensure the reader is using the tests appropriately. For tests with multiple comparisons, there are also discussions of how best to conduct post-hoc analyses. Each chapter ends by showing the reader how to report the results of their test, both numerically and in prose.
Throughout the chapters the reader learns how to make a variety of figures including scatterplots with regression lines, scatterplot matrices, boxplots, histograms, Q-Q plots, interaction plots, and parallel coordinate plots. For any R code used, all of the arguments are explained in tables so the reader can better understand each part of the code. Several R packages are also introduced and used to help with manipulating data, making figures, and running statistical tests.
EVALUATION
The book is a good introduction for new researchers to framing their research questions and collecting data for later statistical analysis. Part I is thus a good read for any new researcher unsure how to approach their experimental questions. Having such a large number of real second language research data sets is a huge benefit, as most statistics books use data sets that are less accessible (or less interesting) to second language researchers. The fact that the data is from actual published papers also goes a long way toward showing how messy real data can be.
The specific tests chosen for explanation felt a little deficient. While very good for an introductory textbook, the tests chosen for explanation in this book are unlikely to teach anything to someone with experience in statistical analysis. Furthermore, while it is important for new researchers to have a basis in the more classical tests, in the current world of more advanced statistical analyses, having knowledge of only these tests will be insufficient. For example, more and more reviewers have come to expect linear mixed effects models as a replacement for ANOVAs. Similarly, the chi-squared test, a useful nonparametric test, is missing from the book. Materials for both of these tests are available online in the new ‘A Guide to Doing Statistics in Second Language Research Using R’ which is free to download, but it is unfortunate that the material is not included in the main textbook. While more advanced topics such as these may be outside of the scope of an introductory book, this omission is worth considering if indeed the goal of this book is to take the reader from data analysis to publication.
Additionally, I would not recommend this book to someone who wants to become newly acquainted with R. The book is heavily reliant upon the R Commander GUI, which uses drop-down menus similar to SPSS to make figures and perform statistical tests. While this can be appealing for users new to R, I do not think it is a good long-term practice. For example, one of the benefits of using a command line language for statistics instead of one with a GUI is the ability to easily share data and scripts so that other researchers can reproduce your analyses with old or new data. Saving code was mentioned only in a tip which stated “Because the command can get long, you may want to paste it onto a different place, like a Word document, to add in different arguments easily and then paste the result into the R Console” (p. 199). This is not a good practice and can lead to problems, as Larson-Hall herself notes, “One thing to be careful of here, however, is your quotation marks. R will not accept Word’s ‘smart quotes’ which are curly. Copy and use the quotes that you pasted over from R which are just straight up and down”. A much better practice would be to simply save the code in an R script file, which will not have formatting errors and can be opened in the future to conduct the analysis with the same or different data. Again, while I appreciate that using a GUI like R Commander can seem attractive to someone new to R, in the long run I think it would be much better to risk initial frustrations by teaching the command line and scripts.
Overall I would recommend this book to a new researcher who wants a better understanding of how to ask research questions, and then design experiments to statistically answer those questions. The book is also a very good introduction to more traditional statistical tests, providing in-depth discussions of when and how they should be used. This level of detail, complemented by real second language research data sets, makes it a good introductory textbook for new second language researchers.


ABOUT THE REVIEWER:
Page Piccinini is a Ph.D. candidate at University of California, San Diego. Her research interests are bilingualism, psycholinguistics, and phonetics. Her dissertation focuses on the phonetics of code-switching, and its implications for theories of bilingual speech production and perception.

Versions:
Format: Hardback
ISBN-13: 9781138024564
Pages: 528
Prices: U.K. £95.00

Format: Paperback
ISBN-13: 9781138024571
Pages: 528
Prices: U.K. £49.99



