Package 'SIMPLE.REGRESSION'

Title: OLS, Moderated, Logistic, and Count Regressions Made Simple
Description: Provides SPSS- and SAS-like output for least squares multiple regression, logistic regression, and count variable regressions. Detailed output is also provided for OLS moderated regression, interaction plots, and Johnson-Neyman regions of significance. The output includes standardized coefficients, partial and semi-partial correlations, collinearity diagnostics, plots of residuals, and detailed information about simple slopes for interactions. The output for some functions includes Bayes Factors and, if requested, regression coefficients from Bayesian Markov Chain Monte Carlo analyses. There are numerous options for model plots. The REGIONS_OF_SIGNIFICANCE function also provides Johnson-Neyman regions of significance and plots of interactions for both lm and lme models.
Authors: Brian P. O'Connor [aut, cre]
Maintainer: Brian P. O'Connor <[email protected]>
License: GPL (>= 2)
Version: 0.2.1
Built: 2025-02-11 03:12:57 UTC
Source: https://github.com/cran/SIMPLE.REGRESSION

Help Index


SIMPLE.REGRESSION

Description

Provides SPSS- and SAS-like output for least squares multiple regression, logistic regression, and count variable regressions. Detailed output is also provided for OLS moderated regression, interaction plots, and Johnson-Neyman regions of significance. The output includes standardized coefficients, partial and semi-partial correlations, collinearity diagnostics, plots of residuals, and detailed information about simple slopes for interactions. The output for some functions includes Bayes Factors and, if requested, regression coefficients from Bayesian Markov Chain Monte Carlo (MCMC) analyses. There are numerous options for model plots.

The REGIONS_OF_SIGNIFICANCE function also provides Johnson-Neyman regions of significance and plots of interactions for both lm and lme models (lme models are from the nlme package).

References

Bauer, D. J., & Curran, P. J. (2005). Probing interactions in fixed and multilevel regression: Inferential and graphical techniques. Multivariate Behavioral Research, 40(3), 373-400.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Darlington, R. B., & Hayes, A. F. (2017). Regression analysis and linear models: Concepts, applications, and implementation. Guilford Press.

Dunn, P. K., & Smyth, G. K. (2018). Generalized linear models with examples in R. Springer.

Hayes, A. F. (2018a). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (2nd ed.). Guilford Press.

Huitema, B. (2011). The analysis of covariance and alternatives: Statistical methods for experiments, quasi-experiments, and single-case studies. John Wiley & Sons.

Johnson, P. O., & Fay, L. C. (1950). The Johnson-Neyman technique, its theory, and application. Psychometrika, 15, 349-367.

Lorah, J. A., & Wong, Y. J. (2018). Contemporary applications of moderation analysis in counseling psychology. Journal of Counseling Psychology, 65(5), 629-640.

Orme, J. G., & Combs-Orme, T. (2009). Multiple regression with discrete dependent variables. Oxford University Press.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction. (3rd ed.). Wadsworth Thomson Learning.


Count data regression

Description

Provides SPSS- and SAS-like output for count data regression, including Poisson, quasi-Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial models. The output includes model summaries, classification tables, omnibus tests of the model coefficients, overdispersion tests, model effect sizes, the model coefficients, the correlation matrix for the model coefficients, collinearity statistics, and casewise regression diagnostics.

Usage

COUNT_REGRESSION(data, DV, forced = NULL, hierarchical = NULL,
                 family = 'poisson',
                 offset = NULL,
                 plot_type = 'residuals',
                 CI_level = 95,
                 MCMC = FALSE,
                 Nsamples = 4000,
                 verbose = TRUE )

Arguments

data

A dataframe where the rows are cases and the columns are the variables.

DV

The name of the dependent variable.
Example: DV = 'outcomeVar'.

forced

(optional) A vector of the names of the predictor variables for a forced/simultaneous entry regression. The variables can be numeric or factors.
Example: forced = c('VarA', 'VarB', 'VarC')

hierarchical

(optional) A list with the names of the predictor variables for each step of a hierarchical regression. The variables can be numeric or factors.
Example: hierarchical = list(step1=c('VarA', 'VarB'), step2=c('VarC', 'VarD'))

family

(optional) The name of the error distribution to be used in the model. The options are:

  • "poisson" (the default),

  • "quasipoisson",

  • "negbin", for negative binomial,

  • "zinfl_poisson", for zero-inflated Poisson, or

  • "zinfl_negbin", for zero-inflated negative binomial.

Example: family = 'quasipoisson'

offset

(optional) The name of the offset variable, if there is one. This variable should be in the desired metric (e.g., log). No transformation of an offset variable is performed internally.
Example: offset = 'Varname'
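
If a raw exposure variable needs to be logged first, this should be done before the function call. A minimal sketch (the data frame and variable names below are hypothetical):

# create the log offset before passing it to COUNT_REGRESSION
mydata$log_exposure <- log(mydata$exposure)
COUNT_REGRESSION(data=mydata, DV='counts', forced=c('VarA','VarB'),
                 offset='log_exposure')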

plot_type

(optional) The kind of plots, if any. The options are:

  • 'residuals' (the default),

  • 'diagnostics', for regression diagnostics, and

  • 'none', for no plots.

Example: plot_type = 'diagnostics'

CI_level

(optional) The confidence interval for the output, in whole numbers. The default is 95.

MCMC

(logical) Should Bayesian MCMC analyses be conducted? The default is FALSE.

Nsamples

(optional) The number of samples for MCMC analyses. The default is 4000.

verbose

(optional) Should detailed results be displayed in console?
The options are: TRUE (default) or FALSE. If TRUE, plots of residuals are also produced.

Details

This function uses the glm function from the stats package and the negative.binomial function from the MASS package, and supplements the output with additional statistics in formats that resemble SPSS and SAS output. The predictor variables can be numeric or factors.

The zero-inflated Poisson and zero-inflated negative binomial analyses are conducted using the pscl package (Zeileis, Kleiber, & Jackman, 2008).

Predicted values, for selected levels of the predictor variables, can be produced and plotted using the PLOT_MODEL function in this package.

The Bayesian MCMC analyses can be time-consuming for larger datasets. The MCMC analyses are conducted using functions, and their default settings, from the rstanarm package (Goodrich, Gabry, Ali, & Brilleman, 2024). family = 'quasipoisson' analyses are currently not possible for the MCMC analyses; family = 'poisson' is therefore used instead. The Bayesian MCMC analyses are also currently not available for zero-inflated Poisson and zero-inflated negative binomial models.

The MCMC results can be verified using the model checking functions in the rstanarm package (e.g., Muth, Oravecz, & Gabry, 2018).
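
For example, a brief sketch of a call requesting the Bayesian MCMC estimates for one of the example datasets (the MCMC analyses rely on the rstanarm package and can take a while to run):

COUNT_REGRESSION(data=data_Kremelburg_2011, DV='OVRJOYED',
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'),
                 MCMC = TRUE, Nsamples = 4000)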

Good sources for interpreting count data regression residuals and diagnostics plots:

Value

An object of class "COUNT_REGRESSION". The object is a list containing the following possible components:

modelMAIN

All of the glm function output for the regression model.

modelMAINsum

All of the summary.glm function output for the regression model.

modeldata

All of the predictor and outcome raw data that were used in the model, along with regression diagnostic statistics for each case.

collin_diags

Collinearity diagnostic coefficients for models without interaction terms.

Author(s)

Brian P. O'Connor

References

Atkins, D. C., & Gallop, R. J. (2007). Rethinking how family researchers model infrequent outcomes: A tutorial on count regression and zero-inflated models. Journal of Family Psychology, 21(4), 726-735.

Beaujean, A. A., & Grant, M. B. (2019). Tutorial on using regression models with count outcomes using R. Practical Assessment, Research, and Evaluation: Vol. 21, Article 2.

Coxe, S., West, S. G., & Aiken, L. S. (2009). The analysis of count data: A gentle introduction to Poisson regression and its alternatives. Journal of Personality Assessment, 91, 121-136.

Dunn, P. K., & Smyth, G. K. (2018). Generalized linear models with examples in R. Springer.

Hardin, J. W., & Hilbe, J. M. (2007). Generalized linear models and extensions. Stata Press.

Muth, C., Oravecz, Z., & Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119.
https://doi.org/10.20982/tqmp.14.2.p099

Orme, J. G., & Combs-Orme, T. (2009). Multiple regression with discrete dependent variables. Oxford University Press.

Rindskopf, D. (2023). Generalized linear models. In H. Cooper, M. N. Coutanche, L. M. McMullen, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology: Data analysis and research publication, (2nd ed., pp. 201-218). American Psychological Association.

Zeileis, A., Kleiber, C., & Jackman, S. (2008). Regression Models for Count Data in R. Journal of Statistical Software, 27(8). https://www.jstatsoft.org/v27/i08/.

Examples

COUNT_REGRESSION(data=data_Kremelburg_2011, DV='OVRJOYED', 
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'))

COUNT_REGRESSION(data=data_Kremelburg_2011, DV='OVRJOYED', 
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'),  family = 'negbin')


# negative binomial regression
COUNT_REGRESSION(data=data_Kremelburg_2011, DV='HURTATWK', 
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'),
                 family = 'negbin',
                 plot_type = 'diagnostics')

# with an offset variable
COUNT_REGRESSION(data=data_Orme_2009_5, DV='NumberAdopted', forced=c('Married'), 
                 offset='lnYearsFostered')

# zero-inflated poisson regression
COUNT_REGRESSION(data=data_Kremelburg_2011, DV='HURTATWK', 
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'),
                 family = 'zinfl_poisson',
                 plot_type = 'diagnostics')

# zero-inflated negative binomial regression
COUNT_REGRESSION(data=data_Kremelburg_2011, DV='HURTATWK', 
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'),
                 family = 'zinfl_negbin',
                 plot_type = 'diagnostics')

data_Bauer_Curran_2005

Description

Multilevel moderated regression data from Bauer and Curran (2005).

Usage

data(data_Bauer_Curran_2005)

Source

Bauer, D. J., & Curran, P. J. (2005). Probing interactions in fixed and multilevel regression: Inferential and graphical techniques. Multivariate Behavioral Research, 40(3), 373-400.

Examples

head(data_Bauer_Curran_2005)

HSBmod <- nlme::lme(MathAch ~ Sector + CSES + CSES:Sector,
                   data = data_Bauer_Curran_2005, 
                   random = ~1 + CSES|School, method = "ML") 
summary(HSBmod)

REGIONS_OF_SIGNIFICANCE(model=HSBmod,  
                        plot_title='Johnson-Neyman Regions of Significance', 
                        Xaxis_label='Child SES',
                        Yaxis_label='Slopes of School Sector on Math achievement')

data_Bodner_2016

Description

Moderated regression data used by Bodner (2016) to illustrate the tumble graphs method of plotting interactions. The data were also used by Bauer and Curran (2005).

Usage

data(data_Bodner_2016)

Source

Bodner, T. E. (2016). Tumble Graphs: Avoiding misleading end point extrapolation when graphing interactions from a moderated multiple regression analysis. Journal of Educational and Behavioral Statistics, 41(6), 593-604.

Bauer, D. J., & Curran, P. J. (2005). Probing interactions in fixed and multilevel regression: Inferential and graphical techniques. Multivariate Behavioral Research, 40(3), 373-400.

Examples

head(data_Bodner_2016)

# replicates p 599 of Bodner (2016)
MODERATED_REGRESSION(data=data_Bodner_2016, DV='math90',
                     IV='Anti90', IV_range='tumble',
                     MOD='Hyper90', MOD_levels='quantiles', 
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     COVARS=c('age90month','female','grade90','minority'),
                     center = FALSE, 
                     plot_type = 'interaction')

data_Chapman_Little_2016

Description

Moderated regression data from Chapman and Little (2016).

Usage

data(data_Chapman_Little_2016)

Source

Chapman, D. A., & Little, B. (2016). Climate change and disasters: How framing affects justifications for giving or withholding aid to disaster victims. Social Psychological and Personality Science, 7, 13-20.

Examples

head(data_Chapman_Little_2016)
 
# the data used by Hayes (2018, Introduction to Mediation, Moderation, and 
# Conditional Process Analysis: A Regression-Based Approach), replicating p. 239
MODERATED_REGRESSION(data=data_Chapman_Little_2016, DV='justify',
                     IV='frame', IV_range='tumble',
                     MOD='skeptic', MOD_levels='AikenWest', 
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = FALSE, 
                     plot_type = 'regions')

data_Cohen_Aiken_West_2003_7

Description

Moderated regression data for a continuous predictor and a continuous moderator from Cohen, Cohen, West, & Aiken (2003, Chapter 7).

Usage

data(data_Cohen_Aiken_West_2003_7)

Source

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Examples

head(data_Cohen_Aiken_West_2003_7)

# replicates p 276 of Chapter 7 of Cohen, Cohen, West, & Aiken (2003)
MODERATED_REGRESSION(data=data_Cohen_Aiken_West_2003_7, DV='yendu',
                     IV='xage', IV_range='tumble',
                     MOD='zexer', MOD_levels='AikenWest', 
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = TRUE, 
                     plot_type = 'regions')

data_Cohen_Aiken_West_2003_9

Description

Moderated regression data for a continuous predictor and a categorical moderator from Cohen, Cohen, West, & Aiken (2003, Chapter 9).

Usage

data(data_Cohen_Aiken_West_2003_9)

Source

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Examples

head(data_Cohen_Aiken_West_2003_9)

# replicates p 376 of Chapter 9 of Cohen, Cohen, West, & Aiken (2003)
MODERATED_REGRESSION(data=data_Cohen_Aiken_West_2003_9, DV='SALARY',
                     IV='PUB', IV_range='tumble',
                     MOD='DEPART_f', MOD_type = 'factor', MOD_levels='AikenWest', 
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = TRUE,  
                     plot_type = 'regions')

data_Green_Salkind_2014

Description

Multiple regression data from Green and Salkind (2014).

Usage

data(data_Green_Salkind_2014)

Source

Green, S. B., & Salkind, N. J. (2014). Lesson 34: Multiple linear regression (pp. 257-269). In Using SPSS for Windows and Macintosh: Analyzing and understanding data. New York, NY: Pearson.

Examples

head(data_Green_Salkind_2014)

# forced (simultaneous) entry; replicating the output on p. 263	
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               forced=c('quads','gluts','abdoms','arms','grip')) 

# hierarchical entry; replicating the output on p. 265-266	
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               hierarchical = list( step1=c('quads','gluts','abdoms'), 
                                    step2=c('arms','grip')) )

data_Halvorson_2022_log

Description

Logistic regression data from Halvorson et al. (2022, p. 291).

Usage

data(data_Halvorson_2022_log)

Source

Halvorson, M. A., McCabe, C. J., Kim, D. S., Cao, X., & King, K. M. (2022). Making sense of some odd ratios: A tutorial and improvements to present practices in reporting and visualizing quantities of interest for binary and count outcome models. Psychology of Addictive Behaviors, 36(3), 284-295.

Examples

head(data_Halvorson_2022_log)

log_Halvorson <-
  LOGISTIC_REGRESSION(data=data_Halvorson_2022_log, DV='Y', forced=c('x1','x2'), 
                      plot_type = 'diagnostics')

# high & low values for x2
x2_high <- mean(data_Halvorson_2022_log$x2) + sd(data_Halvorson_2022_log$x2)
x2_low  <- mean(data_Halvorson_2022_log$x2) - sd(data_Halvorson_2022_log$x2)

PLOT_MODEL(model = log_Halvorson, 
           IV_focal_1 = 'x1',   
           IV_focal_2 = 'x2',  IV_focal_2_values = c(x2_low, x2_high),
           bootstrap=FALSE, N_sims=1000, CI_level=95, 
           ylim = c(0, 1), 
           xlab = 'x1',
           ylab = 'Expected Probability', 
           title = 'Probability of Y by x1 and x2 for Simulated Data Example')

data_Halvorson_2022_pois

Description

Poisson regression data from Halvorson et al. (2022, p. 293).

Usage

data(data_Halvorson_2022_pois)

Source

Halvorson, M. A., McCabe, C. J., Kim, D. S., Cao, X., & King, K. M. (2022). Making sense of some odd ratios: A tutorial and improvements to present practices in reporting and visualizing quantities of interest for binary and count outcome models. Psychology of Addictive Behaviors, 36(3), 284-295.

Examples

head(data_Halvorson_2022_pois)

# replicating Table 3, p 293
pois_Halvorson <-
  COUNT_REGRESSION(data=data_Halvorson_2022_pois, DV='Neg_OH_conseqs', 
          forced=c('Gender_factor','Positive_Urgency_new','Planning','Sensation_seeking'), 
        plot_type = 'diagnostics')

# replicating Figure 4, p 294
PLOT_MODEL(model = pois_Halvorson, 
           IV_focal_1 = 'Positive_Urgency_new',   
           IV_focal_2 = 'Gender_factor',
           bootstrap=FALSE, N_sims=1000, CI_level=95, 
           ylim = c(0, 20), 
           xlab = 'Positive Urgency',
           ylab = 'Expected Count of Alcohol Consequences', 
         title = 'Expected Count of Alcohol Consequences by Positive Urgency and Gender')

data_Huitema_2011

Description

Moderated regression data for a continuous predictor and a dichotomous moderator from Huitema (2011, p. 253).

Usage

data(data_Huitema_2011)

Source

Huitema, B. (2011). The analysis of covariance and alternatives: Statistical methods for experiments, quasi-experiments, and single-case studies. Hoboken, NJ: Wiley.

Examples

head(data_Huitema_2011)

# replicating results on p. 255 for the Johnson-Neyman technique for a categorical moderator
MODERATED_REGRESSION(data=data_Huitema_2011, DV='Y', 
                     IV='X', IV_range='tumble',
                     MOD='D', MOD_type = 'factor',  
                     center = FALSE,  
                     plot_type = 'interaction',
                     JN_type = 'Huitema')

data_Kremelburg_2011

Description

Logistic and Poisson regression data from Kremelburg (2011).

Usage

data(data_Kremelburg_2011)

Source

Kremelburg, D. (2011). Chapter 6: Logistic, ordered, multinomial, negative binomial, and Poisson regression. Practical statistics: A quick and easy guide to IBM SPSS Statistics, STATA, and other statistical software. Sage.

Examples

head(data_Kremelburg_2011)

LOGISTIC_REGRESSION(data = data_Kremelburg_2011, DV='OCCTRAIN',
                    hierarchical=list( step1=c('AGE'), step2=c('EDUC','REALRINC')) )
         
COUNT_REGRESSION(data=data_Kremelburg_2011, DV='OVRJOYED', 
                 forced=c('AGE','EDUC','REALRINC','SEX_factor'))

data_Lorah_Wong_2018

Description

Moderated regression data from Lorah and Wong (2018).

Usage

data(data_Lorah_Wong_2018)

Source

Lorah, J. A., & Wong, Y. J. (2018). Contemporary applications of moderation analysis in counseling psychology. Journal of Counseling Psychology, 65(5), 629-640.

Examples

head(data_Lorah_Wong_2018)

model_Lorah <- 
MODERATED_REGRESSION(data=data_Lorah_Wong_2018, DV='suicidal',
                     IV='burden', IV_range='tumble',
                     MOD='belong_thwarted', MOD_levels='quantiles',
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     COVARS='depression', center = TRUE, 
                     plot_type = 'regions') 
       
REGIONS_OF_SIGNIFICANCE(model=model_Lorah,  
                        plot_title='Johnson-Neyman Regions of Significance', 
                        Xaxis_label='Thwarted Belongingness', 
                        Yaxis_label='Slopes of Burdensomeness on Suicidal Ideation', 
                        legend_label=NULL)

data_Meyers_2013

Description

Logistic regression data from Meyers et al. (2013).

Usage

data(data_Meyers_2013)

Source

Meyers, L. S., Gamst, G. C., & Guarino, A. J. (2013). Chapter 30: Binary logistic regression. Performing data analysis using IBM SPSS. Hoboken, NJ: Wiley.

Examples

head(data_Meyers_2013)

LOGISTIC_REGRESSION(data= data_Meyers_2013, DV='graduated', forced= c('sex','family_encouragement'))

data_OConnor_Dvorak_2001

Description

Moderated regression data from O'Connor and Dvorak (2001).

Details

A data frame with scores for 131 male adolescents on resiliency, maternal harshness, and aggressive behavior. The data are from O'Connor and Dvorak (2001, p. 17) and are provided as trial moderated regression data for the MODERATED_REGRESSION and REGIONS_OF_SIGNIFICANCE functions.

References

O'Connor, B. P., & Dvorak, T. (2001). Conditional associations between parental behavior and adolescent problems: A search for personality-environment interactions. Journal of Research in Personality, 35, 1-26.

Examples

head(data_OConnor_Dvorak_2001)

mharsh_agg <- 
  MODERATED_REGRESSION(data=data_OConnor_Dvorak_2001, DV='Aggressive_Behavior',
                       IV='Maternal_Harshness', IV_range=c(1,7.7), 
                       MOD='Resiliency',MOD_levels='AikenWest', 
                       quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                       center = FALSE,  
                       plot_type = 'interaction', 
                       DV_range = c(1,6), 
                       Xaxis_label='Maternal Harshness', 
                       Yaxis_label='Adolescent Aggressive Behavior', 
                       legend_label='Resiliency') 

REGIONS_OF_SIGNIFICANCE(model=mharsh_agg,  
           plot_title='Slopes of Maternal Harshness on Aggression by Resiliency', 
           Xaxis_label='Resiliency', 
           Yaxis_label='Slopes of Maternal Harshness on Aggressive Behavior')

data_Orme_2009_2

Description

Logistic regression data from Orme and Combs-Orme (2009), Chapter 2.

Usage

data(data_Orme_2009_2)

Source

Orme, J. G., & Combs-Orme, T. (2009). Multiple Regression With Discrete Dependent Variables. Oxford University Press.

Examples

LOGISTIC_REGRESSION(data = data_Orme_2009_2, DV='ContinueFostering', 
                    forced= c('zResources', 'Married'))

data_Orme_2009_5

Description

Data for count regression from Orme and Combs-Orme (2009), Chapter 5.

Usage

data(data_Orme_2009_5)

Source

Orme, J. G., & Combs-Orme, T. (2009). Multiple Regression With Discrete Dependent Variables. Oxford University Press.

Examples

COUNT_REGRESSION(data=data_Orme_2009_5, DV='NumberAdopted', forced=c('Married','zParentRole'))

data_Pedhazur_1997

Description

Moderated regression data for a continuous predictor and a dichotomous moderator from Pedhazur (1997, p. 588).

Usage

data(data_Pedhazur_1997)

Source

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction. (3rd ed.). Fort Worth, Texas: Wadsworth Thomson Learning.

Examples

head(data_Pedhazur_1997)

# replicating results on p. 594 for the Johnson-Neyman technique for a categorical moderator	
MODERATED_REGRESSION(data=data_Pedhazur_1997, DV='Y', 
                     IV='X', IV_range='tumble',
                     MOD='Directive', MOD_type = 'factor', MOD_levels='quantiles', 
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = FALSE, 
                     plot_type = 'interaction', 
                     JN_type = 'Pedhazur')

data_Pituch_Stevens_2016

Description

Logistic regression data from Pituch and Stevens (2016), Chapter 11.

Usage

data(data_Pituch_Stevens_2016)

Source

Pituch, K. A., & Stevens, J. P. (2016). Applied multivariate statistics for the social sciences: Analyses with SAS and IBM's SPSS, (6th ed.). Routledge.

Examples

LOGISTIC_REGRESSION(data = data_Pituch_Stevens_2016, DV='Health', 
                    forced= c('Treatment','Motivation'))

Logistic regression

Description

Logistic regression analyses with SPSS- and SAS-like output. The output includes model summaries, classification tables, omnibus tests of model coefficients, the model coefficients, likelihood ratio tests for the predictors, overdispersion tests, model effect sizes, the correlation matrix for the model coefficients, collinearity statistics, and casewise regression diagnostics.

Usage

LOGISTIC_REGRESSION(data, DV, forced = NULL, hierarchical = NULL,
                    ref_category = NULL,
                    family = 'binomial',
                    plot_type = 'residuals',
                    CI_level = 95,
                    MCMC = FALSE,
                    Nsamples = 4000,
                    verbose = TRUE)

Arguments

data

A dataframe where the rows are cases and the columns are the variables.

DV

The name of the dependent variable.
Example: DV = 'outcomeVar'.

forced

(optional) A vector of the names of the predictor variables for a forced/simultaneous entry regression. The variables can be numeric or factors.
Example: forced = c('VarA', 'VarB', 'VarC')

hierarchical

(optional) A list with the names of the predictor variables for each step of a hierarchical regression. The variables can be numeric or factors.
Example: hierarchical = list(step1=c('VarA', 'VarB'), step2=c('VarC', 'VarD'))

ref_category

(optional) The reference category for DV.
Example: ref_category = 'alive'
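
A brief sketch of a call that sets the reference category (the data frame and variable names below are hypothetical):

LOGISTIC_REGRESSION(data=mydata, DV='status', forced=c('VarA','VarB'),
                    ref_category = 'alive')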

family

(optional) The name of the error distribution to be used in the model. The options are:

  • "binomial" (the default), or

  • "quasibinomial", which should be used when there is overdispersion.

Example: family = 'quasibinomial'

plot_type

(optional) The kind of plots, if any. The options are:

  • 'residuals' (the default),

  • 'diagnostics', for regression diagnostics, and

  • 'none', for no plots.

Example: plot_type = 'diagnostics'

CI_level

(optional) The confidence interval for the output, in whole numbers. The default is 95.

MCMC

(logical) Should Bayesian MCMC analyses be conducted? The default is FALSE.

Nsamples

(optional) The number of samples for MCMC analyses. The default is 4000.

verbose

(optional) Should detailed results be displayed in console?
The options are: TRUE (default) or FALSE. If TRUE, plots of residuals are also produced.

Details

This function uses the glm function from the stats package and supplements the output with additional statistics, in formats that resemble SPSS and SAS output. The predictor variables can be numeric or factors.

Predicted values for this model, for selected levels of the predictor variables, can be produced and plotted using the PLOT_MODEL function in this package.

The Bayesian MCMC analyses can be time-consuming for larger datasets. The MCMC analyses are conducted using functions, and their default settings, from the rstanarm package (Goodrich, Gabry, Ali, & Brilleman, 2024). The MCMC results can be verified using the model checking functions in the rstanarm package (e.g., Muth, Oravecz, & Gabry, 2018).
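
For example, a brief sketch of a call requesting the Bayesian MCMC estimates for one of the example datasets (the MCMC analyses rely on the rstanarm package and can take a while to run):

LOGISTIC_REGRESSION(data=data_Meyers_2013, DV='graduated',
                    forced=c('sex','family_encouragement'),
                    MCMC = TRUE, Nsamples = 4000)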

Good sources for interpreting logistic regression residuals and diagnostics plots:

Value

An object of class "LOGISTIC_REGRESSION". The object is a list containing the following possible components:

modelMAIN

All of the glm function output for the regression model.

modelMAINsum

All of the summary.glm function output for the regression model.

modeldata

All of the predictor and outcome raw data that were used in the model, along with regression diagnostic statistics for each case.

collin_diags

Collinearity diagnostic coefficients for models without interaction terms.

Author(s)

Brian P. O'Connor

References

Dunn, P. K., & Smyth, G. K. (2018). Generalized linear models with examples in R. Springer.

Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Los Angeles, CA: Sage.

Goodrich, B., Gabry, J., Ali, I., & Brilleman, S. (2024). rstanarm: Bayesian applied regression modeling via Stan. R package version 2.32.1, https://mc-stan.org/rstanarm/.

Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2014). Multivariate data analysis, (8th ed.). Lawrence Erlbaum Associates.

Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). John Wiley & Sons.

Muth, C., Oravecz, Z., & Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119.
https://doi.org/10.20982/tqmp.14.2.p099

Orme, J. G., & Combs-Orme, T. (2009). Multiple regression with discrete dependent variables. Oxford University Press.

Pituch, K. A., & Stevens, J. P. (2016). Applied multivariate statistics for the social sciences: Analyses with SAS and IBM's SPSS, (6th ed.). Routledge.

Rindskopf, D. (2023). Generalized linear models. In H. Cooper, M. N. Coutanche, L. M. McMullen, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology: Data analysis and research publication, (2nd ed., pp. 201-218). American Psychological Association.

Examples

# forced (simultaneous) entry
LOGISTIC_REGRESSION(data = data_Meyers_2013, DV='graduated', 
                    forced=c('sex','family_encouragement'),
                    plot_type = 'diagnostics')
	
# hierarchical entry, and using family = "quasibinomial"
LOGISTIC_REGRESSION(data = data_Kremelburg_2011, DV='OCCTRAIN',
                    hierarchical=list( step1=c('AGE'), step2=c('EDUC','REALRINC')),
                    family = "quasibinomial")

Moderated multiple regression

Description

Conducts moderated regression analyses for two-way interactions with extensive options for interaction plots, including Johnson-Neyman regions of significance. The output includes the Anova Table (Type III tests), standardized coefficients, partial and semi-partial correlations, collinearity statistics, casewise regression diagnostics, plots of residuals and regression diagnostics, and detailed information about simple slopes. The output includes Bayes Factors and, if requested, regression coefficients from Bayesian Markov Chain Monte Carlo (MCMC) analyses.

Usage

MODERATED_REGRESSION(data, DV, IV, MOD,
                     IV_type = 'numeric', IV_range = 'tumble',
                     MOD_type='numeric', MOD_levels='quantiles', MOD_range=NULL,
                     quantiles_IV = c(.1, .9), quantiles_MOD = c(.25, .5, .75),
                     COVARS = NULL,
                     center = TRUE, 
                     CI_level = 95,
                     MCMC = FALSE,
                     Nsamples = 10000,
                     plot_type = 'residuals', plot_title = NULL, DV_range = NULL,
                     Xaxis_label = NULL, Yaxis_label = NULL, legend_label = NULL,
                     JN_type = 'Huitema', 
                     verbose = TRUE )

Arguments

data

A dataframe where the rows are cases and the columns are the variables.

DV

The name of the dependent variable.
Example: DV = 'outcomeVar'

IV

The name of the independent variable.
Example: IV = 'varA'

MOD

The name of the moderator variable.
Example: MOD = 'varB'

IV_type

(optional) The type of independent variable. The options are 'numeric' (the default) or 'factor'.
Example: IV_type = 'factor'

IV_range

(optional) The independent variable range for a moderated regression plot. The options are:

  • 'tumble' (the default), for tumble graphs following Bodner (2016)

  • 'quantiles', in which case the 10th and 90th quantiles of the IV will be used (alternative values can be specified using the quantiles_IV argument);

  • 'AikenWest', in which case the IV mean - one SD, and the IV mean + one SD, will be used;

  • a vector of two user-provided values (e.g., c(1, 10)); and

  • NULL, in which case the minimum and maximum IV values will be used.

Example: IV_range = 'AikenWest'

MOD_type

(optional) The type of moderator variable. The options are 'numeric' (the default) or 'factor'.
Example: MOD_type = 'factor'

MOD_levels

(optional) The levels of the moderator variable to be used if MOD is continuous. The options are:

  • 'quantiles', in which case the .25, .5, and .75 quantiles of the MOD variable will be used (alternative values can be specified using the quantiles_MOD argument);

  • 'AikenWest', in which case the mean of MOD, the mean of MOD - one SD, and the mean of MOD + one SD, will be used; and

  • a vector of two user-provided values.

Example: MOD_levels = c(1, 10)

MOD_range

(optional) The range of the MOD values to be used in the Johnson-Neyman regions of significance analyses. The options are: NULL (the default), in which case the minimum and maximum MOD values will be used; and a vector of two user-provided values.
Example: MOD_range = c(1, 10)

quantiles_IV

(optional) The quantiles of the independent variable to be used as the IV range for a moderated regression plot.
Example: quantiles_IV = c(.10, .90)

quantiles_MOD

(optional) The quantiles of the moderator variable to be used as the MOD simple slope values in the moderated regression analyses.
Example: quantiles_MOD = c(.25, .5, .75)

COVARS

(optional) The name(s) of possible covariates.
Example: COVARS = c('CovarA', 'CovarB', 'CovarC')

center

(optional) Logical, indicating whether the IV and MOD variables should be centered (default = TRUE).
Example: center = FALSE

CI_level

(optional) The confidence interval for the output, in whole numbers. The default is 95.

MCMC

(logical) Should Bayesian MCMC analyses be conducted? The default is FALSE.

Nsamples

(optional) The number of samples for MCMC analyses. The default is 10000.

plot_type

(optional) The kind of plot, if any. The options are:

  • 'residuals' (the default)

  • 'diagnostics' (for regression diagnostics)

  • 'interaction' (for a traditional moderated regression interaction plot)

  • 'regions' (for a moderated regression Johnson-Neyman regions of significance plot), and

  • 'none' (for no plots).

Example: plot_type = 'diagnostics'

plot_title

(optional) The plot title.
Example: plot_title = 'Interaction Plot'

DV_range

(optional) The range of Y-axis values for the plot.
Example: DV_range = c(1,10)

Xaxis_label

(optional) A label for the X axis to be used in the requested plot.
Example: Xaxis_label = 'IV name'

Yaxis_label

(optional) A label for the Y axis to be used in the requested plot.
Example: Yaxis_label = 'DV name'

legend_label

(optional) A legend label for the plot.
Example: legend_label = 'MOD name'

JN_type

(optional) The formula to be used in computing the critical F value for the Johnson-Neyman regions of significance analyses. The options are 'Huitema' (the default), or 'Pedhazur'.
Example: JN_type = 'Pedhazur'

verbose

Should detailed results be displayed in console? The options are: TRUE (default) or FALSE. If TRUE, plots of residuals are also produced.

Details

The Bayesian MCMC analyses can be time-consuming for larger datasets. The MCMC analyses are conducted using functions, and their default settings, from the BayesFactor package (Morey & Rouder, 2024). The MCMC results can be verified using the model checking functions in the rstanarm package (e.g., Muth, Oravecz, & Gabry, 2018).
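
For example, a brief sketch of a call requesting the Bayesian MCMC coefficients for one of the example datasets (the MCMC analyses rely on the BayesFactor package and can take a while to run):

MODERATED_REGRESSION(data=data_Lorah_Wong_2018, DV='suicidal',
                     IV='burden', MOD='belong_thwarted',
                     COVARS='depression', center = TRUE,
                     MCMC = TRUE, Nsamples = 10000)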

Value

An object of class "MODERATED_REGRESSION". The object is a list containing the following possible components:

modelMAINsum

All of the summary.lm function output for the regression model without interaction terms.

anova_table

Anova Table (Type III tests).

mainRcoefs

Predictor coefficients for the model without interaction terms.

modeldata

All of the predictor and outcome raw data that were used in the model, along with regression diagnostic statistics for each case.

collin_diags

Collinearity diagnostic coefficients for models without interaction terms.

modelXNsum

Regression model statistics with interaction terms.

RsqchXn

Rsquared change for the interaction.

fsquaredXN

fsquared change for the interaction.

xnRcoefs

Predictor coefficients for the model with interaction terms.

simslop

The simple slopes.

simslopZ

The standardized simple slopes.

plotdon

The plot data for a moderated regression.

JN.data

The Johnson-Neyman results for a moderated regression.

ros

The Johnson-Neyman regions of significance for a moderated regression.

Author(s)

Brian P. O'Connor

References

Bodner, T. E. (2016). Tumble graphs: Avoiding misleading end point extrapolation when graphing interactions from a moderated multiple regression analysis. Journal of Educational and Behavioral Statistics, 41, 593-604.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Darlington, R. B., & Hayes, A. F. (2017). Regression analysis and linear models: Concepts, applications, and implementation. Guilford Press.

Hayes, A. F. (2018a). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (2nd ed.). Guilford Press.

Hayes, A. F., & Montoya, A. K. (2016). A tutorial on testing, visualizing, and probing an interaction involving a multicategorical variable in linear regression analysis. Communication Methods and Measures, 11, 1-30.

Lee, M. D., & Wagenmakers, E. J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.

Morey, R., & Rouder, J. (2024). BayesFactor: Computation of Bayes Factors for Common Designs. R package version 0.9.12-4.7, https://github.com/richarddmorey/bayesfactor.

Muth, C., Oravecz, Z., & Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119.
https://doi.org/10.20982/tqmp.14.2.p099

O'Connor, B. P. (1998). All-in-one programs for exploring interactions in moderated multiple regression. Educational and Psychological Measurement, 58, 833-837.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction. (3rd ed.). Wadsworth Thomson Learning.

Examples

# moderated regression	-- with IV_range = 'AikenWest'
MODERATED_REGRESSION(data=data_Lorah_Wong_2018, DV='suicidal', IV='burden',  MOD='belong_thwarted', 
                     IV_range='AikenWest',
                     MOD_levels='quantiles',
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = TRUE, COVARS='depression', 
                     plot_type = 'interaction', plot_title=NULL, DV_range = c(1,1.25))

# moderated regression	-- with  IV_range = 'tumble'
MODERATED_REGRESSION(data=data_Lorah_Wong_2018, DV='suicidal', IV='burden', MOD='belong_thwarted', 
                     IV_range='tumble',
                     MOD_levels='quantiles',
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = TRUE, COVARS='depression', 
                     plot_type = 'interaction', plot_title=NULL, DV_range = c(1,1.25)) 

# moderated regression	-- with numeric values for IV_range & MOD_levels='AikenWest'       
MODERATED_REGRESSION(data=data_OConnor_Dvorak_2001, DV='Aggressive_Behavior', 
                     IV='Maternal_Harshness', MOD='Resiliency', 
                     IV_range=c(1,7.7), 
                     MOD_levels='AikenWest', MOD_range=NULL,
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     center = FALSE, 
                     plot_type = 'interaction', 
                     DV_range = c(1,6), 
                     Xaxis_label='Maternal Harshness', 
                     Yaxis_label='Adolescent Aggressive Behavior', 
                     legend_label='Resiliency')

Ordinary least squares regression

Description

Provides SPSS- and SAS-like output for ordinary least squares simultaneous entry regression and hierarchical entry regression. The output includes the Anova Table (Type III tests), standardized coefficients, partial and semi-partial correlations, collinearity statistics, casewise regression diagnostics, plots of residuals and regression diagnostics. The output includes Bayes Factors and, if requested, regression coefficients from Bayesian Markov Chain Monte Carlo (MCMC) analyses.

Usage

OLS_REGRESSION(data, DV, forced=NULL, hierarchical=NULL, 
               COVARS=NULL,
               plot_type = 'residuals', 
               CI_level = 95,
               MCMC = FALSE,
               Nsamples = 10000,
               verbose=TRUE, ...)

Arguments

data

A dataframe where the rows are cases and the columns are the variables.

DV

The name of the dependent variable.
Example: DV = 'outcomeVar'

forced

(optional) A vector of the names of the predictor variables for a forced/simultaneous entry regression. The variables can be numeric or factors.
Example: forced = c('VarA', 'VarB', 'VarC')

hierarchical

(optional) A list with the names of the predictor variables for each step of a hierarchical regression. The variables can be numeric or factors.
Example: hierarchical = list(step1=c('VarA', 'VarB'), step2=c('VarC', 'VarD'))

COVARS

(optional) The name(s) of possible covariate variables for a moderated regression analysis.
Example: COVARS = c('CovarA', 'CovarB', 'CovarC')

plot_type

(optional) The kind of plots, if any. The options are:

  • 'residuals' (the default)

  • 'diagnostics' (for regression diagnostics), or

  • 'none' (for no plots).

Example: plot_type = 'diagnostics'

CI_level

(optional) The confidence interval for the output, in whole numbers. The default is 95.

MCMC

(logical) Should Bayesian MCMC analyses be conducted? The default is FALSE.

Nsamples

(optional) The number of samples for MCMC analyses. The default is 10000.

verbose

Should detailed results be displayed in console? The options are: TRUE (default) or FALSE. If TRUE, plots of residuals are also produced.

...

(dots, for internal purposes only at this time.)

Details

This function uses the lm function from the stats package, supplements the output with additional statistics, and formats the output so that it resembles SPSS and SAS regression output. The predictor variables can be numeric or factors.

The Bayesian MCMC analyses can be time-consuming for larger datasets. The MCMC analyses are conducted using functions, and their default settings, from the BayesFactor package (Morey & Rouder, 2024). The MCMC results can be verified using the model checking functions in the rstanarm package (e.g., Muth, Oravecz, & Gabry, 2018).
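
For example, a brief sketch of a call requesting the Bayesian MCMC coefficients for one of the example datasets (the MCMC analyses rely on the BayesFactor package and can take a while to run):

OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury',
               forced=c('quads','gluts','abdoms','arms','grip'),
               MCMC = TRUE, Nsamples = 10000)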

Good sources for interpreting residuals and diagnostics plots:

Value

An object of class "OLS_REGRESSION". The object is a list containing the following possible components:

modelMAIN

All of the lm function output for the regression model without interaction terms.

modelMAINsum

All of the summary.lm function output for the regression model without interaction terms.

anova_table

Anova Table (Type III tests).

mainRcoefs

Predictor coefficients for the model without interaction terms.

modeldata

All of the predictor and outcome raw data that were used in the model, along with regression diagnostic statistics for each case.

collin_diags

Collinearity diagnostic coefficients for models without interaction terms.

Author(s)

Brian P. O'Connor

References

Bodner, T. E. (2016). Tumble graphs: Avoiding misleading end point extrapolation when graphing interactions from a moderated multiple regression analysis. Journal of Educational and Behavioral Statistics, 41, 593-604.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Darlington, R. B., & Hayes, A. F. (2017). Regression analysis and linear models: Concepts, applications, and implementation. Guilford Press.

Hayes, A. F. (2018a). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (2nd ed.). Guilford Press.

Hayes, A. F., & Montoya, A. K. (2016). A tutorial on testing, visualizing, and probing an interaction involving a multicategorical variable in linear regression analysis. Communication Methods and Measures, 11, 1-30.

Lee, M. D., & Wagenmakers, E. J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.

Morey, R., & Rouder, J. (2024). BayesFactor: Computation of Bayes Factors for Common Designs. R package version 0.9.12-4.7, https://github.com/richarddmorey/bayesfactor.

Muth, C., Oravecz, Z., & Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119.
https://doi.org/10.20982/tqmp.14.2.p099

O'Connor, B. P. (1998). All-in-one programs for exploring interactions in moderated multiple regression. Educational and Psychological Measurement, 58, 833-837.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction. (3rd ed.). Wadsworth Thomson Learning.

Examples

# forced (simultaneous) entry
head(data_Green_Salkind_2014)
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               forced = c('quads','gluts','abdoms','arms','grip'))

# hierarchical entry
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               hierarchical = list( step1=c('quads','gluts','abdoms'), 
                                    step2=c('arms','grip')) )

Standardized coefficients and partial correlations for multiple regression

Description

Produces standardized regression coefficients, partial correlations, and semi-partial correlations for a correlation matrix in which one variable is a dependent or outcome variable and the other variables are independent or predictor variables.

Usage

PARTIAL_COEFS(cormat, modelRsq=NULL, verbose=TRUE)

Arguments

cormat

A correlation matrix. The DV (the dependent or outcome variable) must be in the first row/column of cormat.
Example: cormat = correls

modelRsq

(optional) The model Rsquared, which makes the computations slightly faster when it is available.
Example: modelRsq = .22
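
For instance, the R-squared from a previously fitted model can be supplied. A brief sketch using the data_Green_Salkind_2014 example data (the DV, injury, is placed in the first column of the correlation matrix, as required):

fit <- lm(injury ~ quads + gluts + abdoms + arms + grip,
          data = data_Green_Salkind_2014)
PARTIAL_COEFS(cormat = cor(data_Green_Salkind_2014[c('injury','quads','gluts',
                                                     'abdoms','arms','grip')]),
              modelRsq = summary(fit)$r.squared)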

verbose

Should detailed results be displayed in console?
The options are: TRUE (default) or FALSE.

Value

A data.frame containing the standardized regression coefficients (betas), the Pearson correlations, the partial correlations, and the semi-partial correlations for each variable with the DV.

Author(s)

Brian P. O'Connor

References

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.

Examples

PARTIAL_COEFS(cormat = cor(data_Green_Salkind_2014))

Plots predicted values for a regression model

Description

Plots predicted values of the outcome variable for specified levels of predictor variables for OLS_REGRESSION, MODERATED_REGRESSION, LOGISTIC_REGRESSION, and COUNT_REGRESSION models from this package.

Usage

PLOT_MODEL(model, 
           IV_focal_1, IV_focal_1_values=NULL, 
           IV_focal_2=NULL, IV_focal_2_values=NULL, 
           IVs_nonfocal_values = NULL,
           bootstrap=FALSE, N_sims=100, CI_level=95, 
           xlim=NULL, xlab=NULL,
           ylim=NULL, ylab=NULL,
           title = NULL,
           verbose=TRUE)

Arguments

model

The returned output from the OLS_REGRESSION, MODERATED_REGRESSION, LOGISTIC_REGRESSION, or COUNT_REGRESSION functions in this package.

IV_focal_1

The name of the focal, varying predictor variable.
Example: IV_focal_1 = 'age'

IV_focal_1_values

(optional) Values for IV_focal_1, for which predictions of the outcome will be produced and plotted. IV_focal_1_values will appear on the x-axis in the plot. If IV_focal_1 is numeric and IV_focal_1_values is not provided, then a sequence based on the range of the model data values for IV_focal_1 will be used. If IV_focal_1 is a factor & IV_focal_1_values is not provided, then the factor levels from the model data values for IV_focal_1 will be used.
Example: IV_focal_1_values = seq(20, 80, 1)
Example: IV_focal_1_values = c(20, 40, 60)

IV_focal_2

(optional) If desired, the name of a second focal predictor variable for the plot.
Example: IV_focal_2 = 'height'

IV_focal_2_values

(optional) Values for IV_focal_2 for which predictions of the outcome will be produced and plotted. If IV_focal_2 is numeric and IV_focal_2_values is not provided, then the following three values for IV_focal_2_values, derived from the model data, will be used for plotting: the mean, one SD below the mean, and one SD above the mean. If IV_focal_2 is a factor & IV_focal_2_values is not provided, then the factor levels from the model data values for IV_focal_2 will be used.
Example: IV_focal_2_values = c(20, 40, 60)

IVs_nonfocal_values

(optional) A list with the desired constant values for the non-focal predictors, if any. If IVs_nonfocal_values is not provided, then the means of numeric non-focal predictors and the baseline levels of factor non-focal predictors will be used as the defaults. It is also possible to specify values for only some of the non-focal variables in this argument.
Example: IVs_nonfocal_values = list(AGE = 25, EDUC = 12)

bootstrap

(optional) Should bootstrapping be used for the confidence intervals? The options are TRUE or FALSE (the default).

N_sims

(optional) The number of bootstrap simulations.
Example: N_sims = 1000

CI_level

(optional) The desired confidence interval, in whole numbers.
Example: CI_level = 95

xlim

(optional) The x-axis limits for the plot.
Example: xlim = c(1, 9)

xlab

(optional) A x-axis label for the plot.
Example: xlab = 'IVname'

ylim

(optional) The y-axis limits for the plot.
Example: ylim = c(0, 80)

ylab

(optional) A y-axis label for the plot.
Example: ylab = 'DVname'

title

(optional) A title for the plot.
Example: title = 'OLS prediction of DV'

verbose

Should detailed results be displayed in console?
The options are: TRUE (default) or FALSE

Details

Plots predicted values of the outcome variable for specified levels of predictor variables for OLS_REGRESSION, MODERATED_REGRESSION, LOGISTIC_REGRESSION, and COUNT_REGRESSION models from this package.

A plot that involves both IV_focal_1 and IV_focal_2 predictor variables will look like an interaction plot, but it is a true interaction plot only if the required product term(s) were entered as predictors when the model was created.

Value

A matrix with the levels of the variables that were used for the plot along with the predicted values, confidence intervals, and se.fit values.

Author(s)

Brian P. O'Connor

Examples

ols_GS <- 
OLS_REGRESSION(data=data_Green_Salkind_2014, DV='injury', 
               hierarchical = list( step1=c('age','quads','gluts','abdoms'), 
                                    step2=c('arms','grip')) )

PLOT_MODEL(model = ols_GS, 
           IV_focal_1 = 'gluts', IV_focal_1_values=NULL,
           IV_focal_2='age', IV_focal_2_values=NULL, 
           IVs_nonfocal_values = NULL,
           bootstrap=TRUE, N_sims=100, CI_level=95, 
           ylim=NULL, ylab=NULL, title=NULL,
           verbose=TRUE) 
	
ols_LW <- 
MODERATED_REGRESSION(data=data_Lorah_Wong_2018, DV='suicidal', IV='burden', MOD='belong_thwarted', 
                     IV_range='tumble',
                     MOD_levels='quantiles',
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     COVARS='depression', 
                     plot_type = 'interaction', DV_range = c(1,1.25)) 
                     
PLOT_MODEL(model = ols_LW, 
           IV_focal_1 = 'burden', IV_focal_1_values=NULL,
           IV_focal_2='belong_thwarted', IV_focal_2_values=NULL, 
           bootstrap=TRUE, N_sims=100, CI_level=95) 
                     
logmod_Meyers <- 
  LOGISTIC_REGRESSION(data= data_Meyers_2013, DV='graduated', 
                      forced= c('sex','family_encouragement') ) 

PLOT_MODEL(model = logmod_Meyers, 
           IV_focal_1 = 'family_encouragement', IV_focal_1_values=NULL,
           IV_focal_2=NULL, IV_focal_2_values=NULL, 
           bootstrap=FALSE, N_sims=100, CI_level=95) 
           
pois_Krem <-
  COUNT_REGRESSION(data=data_Kremelburg_2011, DV='OVRJOYED', forced=NULL, 
                   hierarchical= list( step1=c('AGE','SEX_factor'), 
                                       step2=c('EDUC','REALRINC','DEGREE')) )

PLOT_MODEL(model = pois_Krem, 
           IV_focal_1 = 'AGE', 
           IV_focal_2='DEGREE',
           IVs_nonfocal_values = list( EDUC = 5, SEX_factor = '2'),
           bootstrap=FALSE, N_sims=100, CI_level=95)

Plots of Johnson-Neyman regions of significance for interactions

Description

Plots of Johnson-Neyman regions of significance for interactions in moderated multiple regression, for both MODERATED_REGRESSION models (which are produced by this package) and for lme models (from the nlme package).

Usage

REGIONS_OF_SIGNIFICANCE(model,
                        IV_range=NULL, MOD_range=NULL,
                        plot_title=NULL, Xaxis_label=NULL,
                        Yaxis_label=NULL, legend_label=NULL,
                        names_IV_MOD=NULL)

Arguments

model

The name of a MODERATED_REGRESSION model, or of an lme model from the nlme package.

IV_range

(optional) The range of the IV to be used in the plot.
Example: IV_range = c(1, 10)

MOD_range

(optional) The range of the MOD values to be used in the plot.
Example: MOD_range = c(2, 4, 6)

plot_title

(optional) The plot title.
Example: plot_title = 'Regions of Significance Plot'

Xaxis_label

(optional) A label for the X axis to be used in the plot.
Example: Xaxis_label = 'IV name'

Yaxis_label

(optional) A label for the Y axis to be used in the plot.
Example: Yaxis_label = 'DV name'

legend_label

(optional) The legend label.
Example: legend_label = 'Simple Slopes'

names_IV_MOD

(optional, for lme/nlme models only) Use this argument to ensure that the IV and MOD variables are correctly identified for the plot. Three scenarios in particular may require specification of this argument:

  • when there are covariates in addition to IV & MOD as predictors,

  • if the order of the variables in model is not IV then MOD, or,

  • if the IV is a two-level factor (because lme alters the variable names in this case).

Example: names_IV_MOD = c('IV name', 'MOD name')
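
For instance, for the HSBmod lme model shown in the Examples below, the IV and MOD could be identified explicitly. A brief sketch (assuming Sector is treated as the focal predictor and CSES as the moderator):

REGIONS_OF_SIGNIFICANCE(model=HSBmod,
                        names_IV_MOD = c('Sector', 'CSES'))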

Value

A list with the following possible components:

JN.data

The Johnson-Neyman results for a moderated regression.

ros

The Johnson-Neyman regions of significance for a moderated regression.

Author(s)

Brian P. O'Connor

References

Bauer, D. J., & Curran, P. J. (2005). Probing interactions in fixed and multilevel regression: Inferential and graphical techniques. Multivariate Behavioral Research, 40(3), 373-400.

Huitema, B. (2011). The analysis of covariance and alternatives: Statistical methods for experiments, quasi-experiments, and single-case studies. John Wiley & Sons.

Johnson, P. O., & Neyman, J. (1936). Tests of certain linear hypotheses and their application to some educational problems. Statistical Research Memoirs, 1, 57-93.

Johnson, P. O., & Fay, L. C. (1950). The Johnson-Neyman technique, its theory, and application. Psychometrika, 15, 349-367.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction. (3rd ed.). Wadsworth Thomson Learning.

Rast, P., Rush, J., Piccinin, A. M., & Hofer, S. M. (2014). The identification of regions of significance in the effect of multimorbidity on depressive symptoms using longitudinal data: An application of the Johnson-Neyman technique. Gerontology, 60, 274-281.

Examples

head(data_Cohen_Aiken_West_2003_7)

CAW_7 <- 
MODERATED_REGRESSION(data=data_Cohen_Aiken_West_2003_7, DV='yendu',
                     IV='xage',IV_range='tumble',
                     MOD='zexer', MOD_levels='quantiles', 
                     quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                     plot_type = 'interaction') 

REGIONS_OF_SIGNIFICANCE(model=CAW_7) 

head(data_Bauer_Curran_2005)

HSBmod <- nlme::lme(MathAch ~ Sector + CSES + CSES:Sector,
                   data = data_Bauer_Curran_2005, 
                   random = ~1 + CSES|School, method = "ML") 
summary(HSBmod)

REGIONS_OF_SIGNIFICANCE(model=HSBmod,  
                        plot_title='Johnson-Neyman Regions of Significance', 
                        Xaxis_label='Child SES',
                        Yaxis_label='Slopes of School Sector on Math achievement')  
                        

# moderated regression	-- with numeric values for IV_range & MOD_levels='AikenWest'       
mharsh_agg <- 
  MODERATED_REGRESSION(data=data_OConnor_Dvorak_2001, DV='Aggressive_Behavior',
                       IV='Maternal_Harshness', IV_range=c(1,7.7), 
                       MOD='Resiliency', MOD_levels='AikenWest', 
                       quantiles_IV=c(.1, .9), quantiles_MOD=c(.25, .5, .75),
                       center = FALSE, 
                       plot_type = 'interaction', 
                       DV_range = c(1,6), 
                       Xaxis_label='Maternal Harshness', 
                       Yaxis_label='Adolescent Aggressive Behavior', 
                       legend_label='Resiliency') 

REGIONS_OF_SIGNIFICANCE(model=mharsh_agg,  
                        plot_title='Johnson-Neyman Regions of Significance', 
                        Xaxis_label='Resiliency', 
                        Yaxis_label='Slopes of Maternal Harshness on Aggressive Behavior')