Page 467 - Encyclopedia of Nursing Research
P. 467
R
is examined through test–retest procedures;
Reliability equivalence is assessed through alternative
forms and internal consistency techniques.
For observational measurement, intrarater
Reliability refers to the consistency of and interrater techniques assess the two
responses on self-report, norm- referenced forms of reliability, respectively.
measures of attitudes and behavior. Stability reliability is considered by
Reliability arises from classical measurement some to be the only true way to measure the
theory, which holds that any score obtained consistency of responses on an instrument.
from an instrument will be a composite of In fact, stability was the primary manner
the individual’s true pattern and error vari- in which early instruments were examined
ability. The error is made up of random and for reliability. Stability is measured primar-
systematic components. Maximizing the ily through test–retest procedures in which
instrument’s reliability helps to reduce the the same instrument is given to the same
random error associated with the scores, subjects at two different points in time,
although the validity of the instrument helps commonly 2 weeks apart. The scores are
to minimize systematic error (see Validity). then compared for consistency, using some
The “true” score or variance in measurement form of agreement testing that depends on
relies on the consistency of the instrument as the level of measurement. Typically, data are
reflected by form and content, the stability continuous; thus, interclass or bivariate cor-
of the responses over time, and the freedom relation coefficients and difference between
from response bias or differences that could mean scores are usually assessed. An inter-
contribute to error. Error related to content class correlation (ICC) is different than a
results from the way questions are asked bivariate correlation as it is computing the
and the mode of instrument administration. relationship among multiple observations
Time can contribute to error by the frequency of the same variable. Specifically, the ICC as
of measurement and the time frame imposed an assessment of stability is determining the
by the questions asked. Error due to response consistency of measurements made at differ-
differences results from the state or mood of ent times by the same group of individuals.
the respondent, wording of questions that The ICC is calculated based on mean squares
may lead to a response bias, and the testing obtained from analysis of variance (ANOVA)
or conceptual experience of the subject. models. The ICC examines the individual’s
There are generally two forms of reliabil- “error” (consistency) over time as it relates
ity assessment designed to deal with random to “error” inherent in the questionnaire and
error: stability and equivalence. Stability is results in a ratio. The values obtained can
the reproducibility of responses over time. range from 0 to 1, with 1 indicating per-
Equivalence is the consistency of responses fect consistency and no measurement error.
across a set of items so that there is evidence There are no absolute cutoffs for what level
of a systematic pattern. Both of these forms the ICC should be, but a good general rule
apply to self-report and to observations made is that a score below .50 should be carefully
by a rater. For self-report measures, stability scrutinized. An ICC is considered superior to

