Residual standard deviation: the standard deviation of the residuals (residuals = differences between
observed and predicted values). It is calculated as follows:
$$s_{res} = \sqrt{\frac{\sum (Y - Y_{est})^2}{n - k - 1}}$$
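As an illustration, the calculation can be sketched in a few lines of Python. This is only a minimal sketch and not MedCalc's own code: the function name residual_sd and the sample numbers are made up for the example, and it assumes that n is the number of cases and k the number of independent variables in the model.

```python
import math

def residual_sd(observed, predicted, k):
    """Residual standard deviation for a model with k independent variables.

    observed  : observed values of the dependent variable Y
    predicted : values of Y predicted by the regression equation
    k         : number of independent variables (assumed meaning of k)
    """
    n = len(observed)
    if n != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if n - k - 1 <= 0:
        raise ValueError("need more cases than estimated coefficients")
    # Sum of squared residuals (observed minus predicted)
    ss_res = sum((y - y_est) ** 2 for y, y_est in zip(observed, predicted))
    return math.sqrt(ss_res / (n - k - 1))

# Hypothetical data: five cases, one independent variable (k = 1)
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
s_res = residual_sd(observed, predicted, k=1)
print(round(s_res, 3))  # approximately 0.894
```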
The regression equation: the different regression coefficients bi with standard error sbi, t value and P
value. The P value is the probability of obtaining a result at least as extreme as the current one if the
coefficient were equal to 0 (null hypothesis). If the P value for one or more coefficients is less than the
conventional 0.05, then these coefficients can be called statistically significant, and the corresponding
independent variables exert independent effects on the dependent variable Y.
Analysis of variance: the analysis of variance table divides the total variation in the dependent variable
into two components, one which can be attributed to the regression model (labeled Regression) and one
which cannot (labeled Residual). If the significance level for the F test is small (less than 0.05), then the
hypothesis that there is no (linear) relationship can be rejected, and the multiple correlation coefficient can
be called statistically significant.
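The way the analysis of variance table splits the total variation can be sketched as follows. This is an illustrative example, not MedCalc's implementation: the function name anova_table and the data are hypothetical, and the sketch assumes a least-squares fit that includes a constant, so that the regression sum of squares equals the total minus the residual sum of squares. The significance level itself would be obtained from the F distribution with k and n - k - 1 degrees of freedom, which is not computed here.

```python
def anova_table(observed, predicted, k):
    """Split the total variation in Y into Regression and Residual parts.

    Assumes the predicted values come from a least-squares model with a
    constant term and k independent variables.
    """
    n = len(observed)
    mean_y = sum(observed) / n
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_residual = sum((y - y_est) ** 2 for y, y_est in zip(observed, predicted))
    ss_regression = ss_total - ss_residual
    df_regression = k
    df_residual = n - k - 1
    # F ratio = mean square for Regression / mean square for Residual
    f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)
    return {
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "ss_total": ss_total,
        "df_regression": df_regression,
        "df_residual": df_residual,
        "f_ratio": f_ratio,
    }

# Hypothetical data: five cases, one independent variable (k = 1)
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
table = anova_table(observed, predicted, k=1)
print(round(table["f_ratio"], 2))  # approximately 4.5
```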
Zero order correlation coefficients: these are the simple correlation coefficients for the dependent
variable Y and all independent variables Xi separately.
Logistic regression
Logistic regression is a technique for analyzing problems in which there are one or more independent
variables that determine an outcome. The outcome is measured with a dichotomous variable (in which
there are only two possible outcomes).
In logistic regression, the dependent variable is binary or dichotomous,
i.e. it only contains data coded as 1 (TRUE, success, pregnant, etc.) or 0
(FALSE, failure, non-pregnant, etc.).
The goal of logistic regression is to find the best fitting (yet biologically reasonable) model to describe the
relationship between the dichotomous characteristic of interest (dependent variable = response or outcome
variable) and a set of independent (predictor or explanatory) variables. Logistic regression generates the
coefficients (and their standard errors and significance levels) of a formula to predict a logit transformation of
the probability of presence of the characteristic of interest:
$$\text{logit}(p) = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k$$
where p is the probability of presence of the characteristic of interest. The logit transformation is defined as
the logged odds:
$$\text{odds} = \frac{p}{1 - p} = \frac{\text{probability of presence of characteristic}}{\text{probability of absence of characteristic}}$$
and
$$\text{logit}(p) = \ln\left(\frac{p}{1 - p}\right)$$
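The relationship between probability, odds and logit can be illustrated with a short Python sketch. The function names below (odds, logit, inverse_logit, predicted_probability) are chosen for this example only and do not correspond to anything in MedCalc. The inverse transformation, which converts a logit back into a probability, follows from solving ln(p / (1 - p)) = z for p.

```python
import math

def odds(p):
    """Odds of presence of the characteristic for a probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)

def logit(p):
    """Logit transformation: the natural logarithm of the odds."""
    return math.log(odds(p))

def inverse_logit(z):
    """Convert a logit value z back into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_probability(coefficients, x):
    """Probability predicted by logit(p) = b0 + b1*X1 + ... + bk*Xk.

    coefficients : [b0, b1, ..., bk]
    x            : [X1, ..., Xk]
    """
    z = coefficients[0] + sum(b * xi for b, xi in zip(coefficients[1:], x))
    return inverse_logit(z)

print(logit(0.5))                 # 0.0: a probability of 0.5 means even odds
print(round(odds(0.8), 6))        # 4.0: presence is four times as likely as absence
```

A positive coefficient therefore increases the predicted probability as the corresponding variable increases, and a negative coefficient decreases it.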
Rather than choosing parameters that minimize the sum of squared errors (as in ordinary regression),
estimation in logistic regression chooses parameters that maximize the likelihood of observing the sample
values.
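To make the idea of maximum likelihood concrete, the sketch below fits a model with a single independent variable by simple gradient ascent on the log-likelihood. This is an assumption-laden illustration only: statistical programs typically use faster iterative methods, the manual does not describe MedCalc's own algorithm here, and the function names, learning rate, iteration count and data are all invented for the example.

```python
import math

def probability(b0, b1, x):
    """Predicted probability for logit(p) = b0 + b1 * x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def log_likelihood(b0, b1, xs, ys):
    """Log of the likelihood of observing the 0/1 outcomes ys."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = probability(b0, b1, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(xs, ys, learning_rate=0.01, iterations=20000):
    """Estimate b0 and b1 by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of the log-likelihood with respect to b0 and b1
        g0 = sum(y - probability(b0, b1, x) for x, y in zip(xs, ys))
        g1 = sum((y - probability(b0, b1, x)) * x for x, y in zip(xs, ys))
        b0 += learning_rate * g0
        b1 += learning_rate * g1
    return b0, b1

# Hypothetical data: outcome tends to be 1 more often for larger x
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print(log_likelihood(0.0, 0.0, xs, ys))  # log-likelihood before fitting
print(log_likelihood(b0, b1, xs, ys))    # higher (less negative) after fitting
```

The fitted coefficients are the values for which the observed pattern of 0 and 1 outcomes is most probable under the model.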
The MedCalc dialog box for logistic regression is similar to the one for multiple regression (see p. 61). In
the dialog box you first identify the dependent variable. Remember that the dependent variable must be
binary or dichotomous, and it should only contain data coded as 0 or 1. Cases with values other than 0 or
1 for the dependent variable will be excluded from the analysis!
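When preparing data outside the program, it may help to check beforehand which cases would be affected by this rule. The following sketch mimics the rule as stated above. It is not MedCalc code, and the function name split_valid_cases and the variable name "outcome" are hypothetical.

```python
def split_valid_cases(cases, dependent="outcome"):
    """Separate cases whose dependent variable is coded 0 or 1 from the rest.

    cases     : list of dictionaries, one per case
    dependent : name of the dependent variable (hypothetical name)
    """
    included, excluded = [], []
    for case in cases:
        # Missing values and codes other than 0 or 1 are set aside
        if case.get(dependent) in (0, 1):
            included.append(case)
        else:
            excluded.append(case)
    return included, excluded

cases = [
    {"id": 1, "outcome": 1},
    {"id": 2, "outcome": 0},
    {"id": 3, "outcome": 2},     # invalid code
    {"id": 4, "outcome": None},  # missing value
]
included, excluded = split_valid_cases(cases)
print(len(included), len(excluded))  # 2 2
```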