Prophecy with Numbers: Prospective Punishment for Predictable Human Behaviour?
Brad Johnson*
Before considering the provisions of the Dangerous Prisoners (Sexual
Offenders) Act 2003, it is useful to discuss the way in which people reason
about events in the physical world. Such a discussion is necessary in order
to identify and deconstruct the process by which psychiatrists evaluate
dangerousness. In the physical world it is both possible and useful to
identify relationships or connections between events where one event, or
constellation of events, immediately precedes the occurrence of another. In
such circumstances two possibilities may arise:
(i) it may be observed that the presence of one event or
constellation of events always precedes the occurrence of another;
or
(ii) it may be observed that the presence of one event or
constellation of events sometimes precedes the occurrence of another.
The difference between the two possibilities, which lies in the use of the
words “always” and “sometimes”[1], can be illustrated by considering
contrasting examples. If the element potassium is combined with water a
violent reaction will result, producing potassium hydroxide and hydrogen gas,
the latter igniting spontaneously at room temperature. A chemist would not
expect the reaction to admit any exceptions, but would expect it to obey a
well observed rule that the combination of potassium and water is always
followed by the production of potassium hydroxide. Such an expectation
reveals an underlying and unprovable belief that future observations
will
conform to past observations so that certainty in the past will endure in the
future. By contrast some associations between
events admit random exceptions
with the consequence that the past cannot serve as a perfect guide to the
future. Habitual smoking
is an event, repetitive in nature, which sometimes
precedes the development of lung cancer. Some, rather than all, habitual smokers
develop lung cancer and as a result for a given population of smokers it is
impossible to distinguish between those who will develop
the disease and those
who will not.
Expectations or Uncertainty
Our attitude towards, and beliefs about, future events in the physical world
depend upon our memory of past experiences. Without any
past experiences or
memory of those experiences it would be difficult to develop any expectations
about events in the physical world.
For example, in the absence of any past
experiences it would be difficult to predict that the setting sun will return;
that the incoming
high tide will recede or that the snow of winter will melt in
the summer. For the observer without any past experiences, the physical
world
offers many objects never before seen in the context of a dynamic environment of
change. Amongst the many things that can be
observed, the bright object of the
sun changes its position with the passage of time and continues to do so until
it sets behind
the horizon. For someone who has never before seen a sunrise
there is no past experience or memory of a past experience that would
suggest
that the setting sun will ever return. If it returns, then the observer will
recognise the object based on memory of the experiences from the previous day
and notice that it behaves in a similar manner as it sets again. After
observing two
sunsets a pattern may
not be obvious. How many sunsets and sunrises however
would a person need to observe before developing the expectation that the sun
will always rise and set? Although only an arbitrary answer can be offered to
such a question, most people believe that the sun
will rise and set tomorrow
given their experiences and the experiences of countless generations of past
observers who have watched
the sun behave without exception. The expectations
of people with respect to the rising and setting sun may be contrasted with
the
uncertainty of the weather. Not many people would expect rain every day or every
second day or third day since the number of
days between rain and sunshine
according to our experiences is not periodic, but variable and therefore
uncertain. Given the presence
of something that always occurs, it is common for
people to develop expectations. The presence of something that sometimes occurs
will usually give rise to uncertainty.
Which category does human
behaviour belong to? Unlike chemical elements and compounds such as potassium
and water, human beings display
comparatively complex behaviour and interactions
between each other and with their environment. Chemical compounds don’t
parent
children, enter into romantic relationships, go to war, struggle to
improve their standard of living, laugh politely at an attempted
joke or commit crimes against other chemical compounds. Humans, who appear to
possess some freedom of will or volition, are complex, as are their
relationships and behaviour with respect to others. Whether or not human
behaviour can be
accurately predicted depends
upon the existence of patterns that always occur as
opposed to patterns that sometimes occur. In the context of the Dangerous
Prisoners Act it is necessary to recognise that although people exercising
their free will can behave in many ways, some behaviour has been labelled
criminal and as a result it would be useful to know if it is possible to predict
whether a particular individual will commit a crime.
If individual offenders can be identified before they have opportunities to
commit their crimes, then actual victims can be replaced with potential
victims who need never learn to live with the trauma of violation.
For reasons that will become apparent, the
main obstacle to achieving accurate predictions with respect to human behaviour
is the
presence of uncertainty. The response to this uncertainty has been
systematic since the gradual development and advent of science
through which we
have sought to replace “sometimes” with “always” in a
quest to eliminate uncertainty and
establish order with respect to the behaviour
of objects and forces in the universe. Some investigations have witnessed
spectacular
success, whilst others have left our desires unfulfilled. Where
uncertainty has not been eliminated, the methods of statistics and
probability
theory have emerged as tools for understanding and minimising the risks often
associated with it. Essential to these
methods—as they are applied to most
types of human behaviour—is the process of comparing relative frequencies
and the
identification of sample populations each based on a specific group of
attributes. Here the term “population” can refer
to any collection
of objects or events, not just people. Either way it is a simple concept that
can be understood easily without
a background in mathematics.
The goal of such analysis is to predict what outcomes will follow given the
presence of certain attributes or behaviours in a population of individuals
who all possess the same attributes or display the same behaviour. Smoking and
lung cancer offer
an illustration. In
this case the sample population being studied comprises
individuals who display the same behaviour: habitual smoking. The outcome
observed is whether or not the individual develops lung cancer. The sample
population is divided into two groups, those who develop
lung cancer and those
who don’t. Two relative frequencies can then be calculated:
(i) the ratio of the number of individuals who develop lung cancer to the
total number of individuals in the sample population.
(ii) the ratio of the number of individuals who do not develop lung cancer to
the total number of individuals in the sample population.
Example: a given population of habitual smokers might include 1,000
individuals. If 800 develop lung cancer and 200 do not then the following
relative frequencies will be obtained:
(i) lung cancer present: 800/1,000 or 80%
(ii) lung cancer absent: 200/1,000 or 20%
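The calculation in this example can be sketched in a few lines of Python. This is a minimal illustration only: the function name `relative_frequencies` and the outcome labels are assumptions introduced here, and the figures are the hypothetical ones used in the example rather than real data.

```python
from collections import Counter
from fractions import Fraction


def relative_frequencies(outcomes):
    """Map each distinct outcome to its share of the sample population."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


# Hypothetical sample from the example: 1,000 habitual smokers, 800 of whom
# develop lung cancer and 200 of whom do not.
smokers = ["lung cancer"] * 800 + ["no lung cancer"] * 200
freqs = relative_frequencies(smokers)

for outcome, share in freqs.items():
    print(f"{outcome}: {share} or {float(share):.0%}")
```

Because every individual falls into exactly one of the two groups, the two relative frequencies necessarily sum to 1.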
Relative frequencies for outcomes that are certain, where the future conforms
to the past, present quite a different picture. For example, the sample
population might consist of an experiment repeated 1,000 times in which
potassium is combined with water. The outcome of interest is whether or not
potassium hydroxide is produced. Again, the sample population can be divided
into experiments that produce potassium hydroxide and experiments that do not.
We should expect to see the following relative frequencies:
Potassium hydroxide present: 1,000/1,000 or 100%
Potassium hydroxide absent: 0/1,000 or 0%.
It can be seen that the data from the sample populations
reveal that habitual smoking is only sometimes associated with lung cancer
rather than always, whilst the presence of potassium hydroxide is always
associated with an experiment in which water and potassium
are combined. The
distinction between sometimes and always in this case means the difference
between accurate predictions and inaccurate
predictions. Before an experiment is
conducted we should expect to be correct if we predict that the outcome will
produce potassium
hydroxide. If we predict that a given habitual smoker will
develop lung cancer however, then it is possible, given the relative frequencies
of the sample population, that we will be surprised by contradiction. In this
case the relative frequencies of the sample population
also specify the margin
of error for any prediction inferred from them. There is an 80% chance that our
prediction will be correct
and a 20% chance that our prediction will be
incorrect for any given habitual smoker predicted to develop lung cancer.
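The link between relative frequency and margin of error can be checked by scoring a blanket prediction against the observed outcomes. The sketch below assumes the hypothetical figures from the two examples; the function name `prediction_accuracy` is introduced here for illustration.

```python
def prediction_accuracy(predicted, observed):
    """Fraction of cases in which the predicted outcome matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


# Observed outcomes from the two examples (True means the outcome occurred).
smokers_observed = [True] * 800 + [False] * 200   # lung cancer developed
potassium_observed = [True] * 1000                # potassium hydroxide produced

# Blanket prediction: the outcome will occur in every single case.
always = [True] * 1000

smoking_accuracy = prediction_accuracy(always, smokers_observed)
potassium_accuracy = prediction_accuracy(always, potassium_observed)
print(smoking_accuracy, potassium_accuracy)
```

The same prediction rule is right in every potassium experiment but wrong for one smoker in five, which is the difference between "always" and "sometimes" expressed as an error rate.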
The relative frequencies for a given sample population with respect to a
particular outcome can be further analysed by comparing them to the relative
frequencies, for the same outcome, of a second population defined according
to different attributes. For instance, in analysing the relationship between
lung cancer and smoking, the relative frequencies for a sample population of
smokers are typically compared to the relative frequencies for a sample
population of non-smokers.
Example: for a sample population of 1,000 smokers where 800 develop lung
cancer and a sample population of 1,000 non-smokers where 100 develop lung
cancer the following relative frequencies can be determined and compared:
(i) smokers with lung cancer: 800/1,000 or 80%
(ii) non-smokers with lung cancer: 100/1,000 or 10%.
The comparison of the sample populations reveals that the chance of
developing lung cancer is greater for smokers than non-smokers.
In neither case can accurate predictions be made from the relative frequencies
of either sample population; however, it is possible to infer that there is
an increased risk of developing lung cancer for smokers. As a result
predictions that a smoker will develop lung cancer should be more accurate
than predictions that a non-smoker will develop lung cancer.
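The comparison between the two sample populations can be made explicit by computing the difference and the ratio of their relative frequencies. This is a sketch under stated assumptions: the function `compare_groups` and its keys are hypothetical names, and the "difference" and "ratio" summaries are a common way of expressing such a comparison rather than something the example itself prescribes.

```python
from fractions import Fraction


def compare_groups(cases_a, total_a, cases_b, total_b):
    """Compare the relative frequency of an outcome in two sample populations."""
    freq_a = Fraction(cases_a, total_a)
    freq_b = Fraction(cases_b, total_b)
    return {
        "frequency_a": freq_a,
        "frequency_b": freq_b,
        "difference": freq_a - freq_b,
        # The ratio is undefined when the outcome never occurs in group b.
        "ratio": freq_a / freq_b if freq_b else None,
    }


# Hypothetical figures from the example: smokers (a) and non-smokers (b).
comparison = compare_groups(800, 1000, 100, 1000)
print(comparison)
```

On these figures the outcome is eight times as frequent among smokers, yet neither frequency is 0 or 1, so no individual prediction is certain for either group.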
In all of the above examples the relevant data
satisfy some of the basic postulates of the probability calculus and demonstrate
that
there is an intimate relationship between relative frequencies and
probability theory. The basic postulates include the following:
(i) the probability of an event must be a rational number that is greater
than or equal to 0 and less than or equal to 1,
(ii) the complement probability of an event, that is the probability that it
will not occur, can be represented as P(!E) = 1 - P(E), where P(!E)
represents the probability that the event will not occur and P(E) represents
the probability that it will occur,
(iii) the probability of an event that is certain to occur is equal to 1
whilst the probability of an event that is certain not to occur is equal to 0.
For all other events it follows that the likelihood of the event
occurring increases as the value approaches 1, and decreases as it
approaches 0.
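The three postulates can be checked against the earlier hypothetical figures with a short sketch. The function names `is_valid_probability` and `complement` are assumptions introduced for illustration.

```python
from fractions import Fraction


def is_valid_probability(p):
    """Postulate (i): a probability lies between 0 and 1 inclusive."""
    return 0 <= p <= 1


def complement(p):
    """Postulate (ii): P(!E) = 1 - P(E)."""
    if not is_valid_probability(p):
        raise ValueError(f"{p} is not a valid probability")
    return 1 - p


# Relative frequency from the hypothetical smoking example, read as P(E).
p_cancer = Fraction(800, 1000)
p_no_cancer = complement(p_cancer)

# Postulate (iii): certainty corresponds to 1 and impossibility to 0.
p_hydroxide = Fraction(1000, 1000)
p_no_hydroxide = complement(p_hydroxide)

print(p_cancer, p_no_cancer, p_hydroxide, p_no_hydroxide)
```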
Having considered some of the basic principles employed in analysing uncertain
events in the physical world, it is
necessary to examine their application to
the specific example of predicting the occurrence of future sexual offences for
convicted
sexual offenders as required by the Dangerous Prisoners (Sexual
Offenders) Act 2003.
Cogent Evidence to a High Degree of Probability
A court may make an order to continue the detention of a convicted sexual
offender only if it is satisfied, by acceptable cogent evidence and to a high
degree of probability, that the evidence is of sufficient weight to support
the decision[2], and that the submitted evidence demonstrates that the
offender represents a serious danger to the community. In this case the nature
of the evidence and its reliability require
careful consideration. The evidence that is of primary importance is that
offered by
the court-appointed psychiatrists who may be commissioned to prepare
a report for the purposes of executing a risk assessment
order.[3] The risk assessment order
requires the preparation of reports by two psychiatrists with each report
indicating the level of risk
that the prisoner will commit another serious
sexual offence if released and the reasons for the psychiatrist’s
assessment.[4]
What is risk,
and how do psychiatrists determine a convicted sexual offender’s level of
risk? It is important for the court
to consider these questions and evaluate the
experts’ psychiatric evidence critically in order to assess the
reliability of
their research methods and conclusions. In reaching a conclusion about an
offender’s chance of re-offending, the court should be supplied with the
facts and assumptions upon which the expert opinion is based, and with the
process of inference which reveals the relationship between the facts or
assumptions and the conclusions reached by the expert, so that the court can
scrutinise the conclusions and their reliability.[5]
Risk cannot be measured in the same manner as physical dimensions such as
length, mass and time. For each physical dimension there exists an
experimentally defined standard unit to measure it. For length it is the
metre, for mass it is the kilogram and for time it is the second. For risk,
however, no standard unit has been experimentally defined. Rather than being
measured
experimentally with standard units, risk is typically associated with the
concept of probability, so that the risk of something occurring
increases as its
relative frequency approaches 1 and decreases as it approaches 0. Risk, however,
need not be defined exclusively
in terms of a rational number between 1 and 0,
but may also be defined qualitatively in terms of risk levels such as low,
medium
or high. For either method the data or facts that influence the risk
determination are critical, as are the limitations that affect
the process of
making inferences from the data.
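The relationship between a numerical and a qualitative statement of risk can be illustrated with a simple banding function. The cut-off values below are purely hypothetical assumptions chosen for the sketch; nothing in the legislation or the literature discussed here fixes them, and a different choice of thresholds would label the same relative frequency differently.

```python
def risk_level(relative_frequency, medium_from=1 / 3, high_from=2 / 3):
    """Translate a relative frequency into a qualitative risk level.

    The thresholds are illustrative assumptions, not established standards.
    """
    if not 0 <= relative_frequency <= 1:
        raise ValueError("relative frequency must lie between 0 and 1")
    if relative_frequency >= high_from:
        return "high"
    if relative_frequency >= medium_from:
        return "medium"
    return "low"


print(risk_level(0.8))   # relative frequency from the smokers example
print(risk_level(0.1))   # relative frequency from the non-smokers example
```

The arbitrariness of the thresholds is itself one of the limitations on inference: the qualitative label adds no information beyond the data and the chosen cut-offs.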
Psychiatrists and psychologists have
employed a number of methods for determining risk with respect to human
behaviour, which include
the following: clinical assessment, actuarial risk
assessment and actuarially informed clinical assessment which combines elements
from each. The difference between clinical and actuarial assessment is reflected
in the type of data relied on in order to determine
the level of
risk—clinical assessment relying primarily on data about the person being
assessed and actuarial assessment relying
on data from a population of
individuals who share a number of attributes in common with the person being
assessed, thus allowing
statistical comparative judgements.
Clinical Assessment
In a clinical assessment, information about the client or patient is
collected through an interview process during the course of which
the client
will be confronted with a series of questions, his or her responses and
demeanour being carefully scrutinised by the psychiatrists.
In addition to an
interview, the client may be required to complete a self report questionnaire
and the psychiatrist may also interview
individuals who may offer observations
about the client that can be compared to the responses of the client. The
psychiatrist must
carefully consider the information from the interviews and
questionnaires before forming a clinical judgement about the risk of the
client
re-offending, the client’s state of mind or a disorder diagnosis.
Some general reservations that should be identified with respect to clinical
research methodology in psychiatry and psychology include the following:
(i) presence of hypothetical constructs
(ii) unreliable diagnostic methods
(iii) absence of experimentally defined standard units of measurement
Hypothetical Constructs
Scientific research involves the collection of data that correspond to
observations or measurements of perceivable entities in the
physical world, and
the forces that may influence their
behaviour.[6] A hypothetical construct
however is an idea rather than an object or entity that can be perceived as
sensory information, for which
data that correspond to an observation or
measurement can not be collected. Human emotions offer a useful example of a
hypothetical
construct. Many people believe that they experience emotional
states despite the absence of empirical evidence to support their existence.
For
example, my wife is a perceivable object in the physical world, and I believe
that I love my wife. I can see my wife, a physical
object, and my wife is the
object of my love, however, I can not see the love that I believe I feel for
her. Love can not be seen,
touched, measured or weighed as it does not possess
any of the properties of matter. Research methods in psychology have embraced
emotional states such as love, depression, anxiety, happiness and many others.
Disorders such as those defined in the DSM
IV[7] are themselves hypothetical
constructs. In an attempt to avoid the absence of perceivable data about such
constructs, some research
methodologists have assumed the existence of a
connection between human behaviour and human emotions. It is believed that
certain
emotional states to some degree correspond to a constellation of
observable behaviours. Anger, an emotional state, might be thought
to correspond
to the following behaviours: raised voice or shouting, physical assault,
threatening gestures etc. Researchers have
also attempted to identify
associations between chemicals and emotional states or disorders. In addition
to observable behaviour that is thought to correspond to particular emotional
states, the psychiatrist may rely on the client’s own introspective opinion
about his or her emotions, offered during the interview process. This however
requires the co-operation of the client being interviewed, as well as honest
responses and an ability to describe emotional states accurately, since the
person being interviewed may lie about emotional states or be mistaken.
The psychiatrist or psychologist is dependent
upon the honest and accurate testimony of the client being interviewed and the
honest
and accurate testimony of other individuals who have observed and
interacted with the client. Dishonest or inaccurate information
will ultimately
affect the reliability of a diagnosis or assessment of risk.
Diagnostic Methods
Information collected from a clinical assessment may be used to diagnose the
interviewed client with a recognised disorder as well
as for determining the
risk of self harm or harm to others. If disorders, however, are hypothetical
constructs defined in terms
of certain behaviours and emotional states then a
reliable diagnosis is dependent upon a number of factors.
Before
considering diagnostic methods in psychiatry it will be helpful to briefly
discuss an example of a well defined condition in
medicine that corresponds to
an observable or measurable physiological condition in the human body, in order
to make comparative
judgements. To diagnose whether or not a patient has a
simple bone fracture, a radiologist can perform an x-ray on the suspect bone,
in
which case a visual inspection of the x-ray will reveal whether the bone is
fractured or not. In this case there is a direct relationship
between an
observable physiological state in the body and the condition that is diagnosed.
Different doctors who examine the same
x-ray of a simple bone fracture are
unlikely to reach different diagnoses.
By contrast, disorders in
psychiatry and psychology are operationally defined. Operational definitions
attempt to define disorders
which are unobservable hypothetical constructs in
terms of observable behaviours and the self report testimony of the client. As
a
result, psychiatrists do not directly observe the disorder but rather the
behaviour and subjective mental states of the client
that are believed to
indicate the presence of the disorder. This is somewhat like diagnosing a simple
bone fracture based on certain
patient symptoms without taking an x-ray of the
bone. Consider the diagnostic criteria for the
DSM-IV-TR[8] defined disorder,
“paranoid personality”:
A. A pervasive distrust and suspiciousness of others such that their motives
are interpreted as malevolent, beginning by early adulthood and present in a
variety of contexts, as indicated by four or more of the following:
(1) suspects, without sufficient basis, that others are exploiting, harming,
or deceiving him or her
(2) is preoccupied with unjustified doubts about the loyalty or
trustworthiness of friends or associates
(3) is reluctant to confide in others because of unwarranted fear that the
information will be used maliciously against him or her
(4) reads hidden or demeaning or threatening meanings into benign remarks or
events
(5) persistently bears grudges, i.e. is unforgiving of insults, injuries or
slights
(6) perceives attacks on his or her character or reputation that are not
apparent to others and is quick to react angrily or to counter-attack
(7) has recurrent suspicions, without justification, regarding fidelity of
spouse or sexual partner.
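The "four or more of the following" rule in the quoted criteria is a simple counting threshold, which can be sketched as follows. The names `CRITERION_A_ITEMS`, `THRESHOLD` and `meets_criterion_a` are hypothetical, and the sketch assumes that someone has already judged which of the seven items are present; it says nothing about how that judgement is made.

```python
# Items (1) to (7) of the quoted criteria, identified by number only.
CRITERION_A_ITEMS = frozenset(range(1, 8))
THRESHOLD = 4  # "four or more of the following"


def meets_criterion_a(items_judged_present):
    """Return True when at least four of the seven items are judged present."""
    present = set(items_judged_present)
    unknown = present - CRITERION_A_ITEMS
    if unknown:
        raise ValueError(f"not items of the criteria: {sorted(unknown)}")
    return len(present) >= THRESHOLD


print(meets_criterion_a({1, 2, 4, 6}))  # four items judged present
print(meets_criterion_a({1, 5, 7}))     # only three items judged present
```

The counting step is mechanical. The input to it is not, since each item must be inferred from behaviour and self report before it can be counted.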
The criteria serve as an operational definition for the disorder “paranoid
personality” and guide the interview process by classifying a constellation
of behaviours and mental states into a specific disorder which can hopefully
be distinguished from other disorders that are also operationally defined. As
the interview process proceeds, the psychiatrist attempts to find a
correspondence between the defined criteria and the information collected
about the client, so that a diagnosis can be reached. The criteria that
operationally define paranoid personality disorder, however, cannot be
directly observed. For example, suspiciousness is an attribute or trait of a
person that can only be inferred from someone’s behaviour and communication
over a period of time. Information about the client must be collected and
interpreted in order to determine which disorder the client’s behaviour and
mental state most closely correspond to. This process however is imperfect as
the different disorders do not enjoy discrete boundaries but rather overlap in
terms of the criteria which define them. As a result it is possible that two
psychiatrists who examine the same patient will arrive at different diagnoses
despite the application of the same diagnostic criteria. In addition, mental
states such as suspiciousness are not well defined, so different psychiatrists
may examine the same information about a client and arrive at different
conclusions with respect to whether the client’s behaviour reveals an overly
suspicious state of mind. Finally, incomplete information about the client
will ultimately affect the reliability of the diagnosis. The psychiatrist must
have enough information to identify the presence or absence of a disorder.
Since the failure of the psychiatrist to obtain relevant information or to
consider relevant information can affect the outcome of the interview, there
is a much greater emphasis on the interviewing skills of the psychiatrist or
psychologist and their experience with clinical assessments. Despite these
limitations in clinical research methods, psychiatrists and psychologists
attempt to diagnose mental disorders in a climate that lends itself to
conflicting conclusions. Given that the presence or absence of a diagnosed
mental disorder can affect the outcome of a clinical risk assessment,
unreliable methods of diagnosis are unacceptable.
Standard Units of Measurement
There are no standard units that have been experimentally defined in
psychiatry or psychology. In order to understand the significance
of standard
units consider the following question: how long is a metre? Many people would
find such a question confusing, offering
the answer “100cm”. This
however does not answer the original question but rather a second question:
“How many
centimetres are there in a metre?” In physics the
question is resolved by identifying an experimentally defined standard unit
of
measurement. The metre is a standard unit of measurement and the experiment that
defines its length is the distance travelled
by light in a vacuum during
1/299,792,458 of a second. Other units include the second, kilogram and coulomb
each of which, including
the metre, attempts to measure one of the four
fundamental dimensions in physics: length, mass, time and electrical charge. By
contrast, experimentally defined standard units of measurement are noticeably
absent from both psychology and psychiatry, a state of affairs that can best
be explained by the presence of hypothetical constructs.
Given that hypothetical constructs are
concepts rather than perceivable objects
in the physical world, it follows that it is not possible to measure them
directly, and
without a system of measurement it is not possible to make
comparative judgements. For example, it is not possible to measure
someone’s
level of clinical depression and compare it to another
person’s level of depression, but only to make a judgement about its
presence or absence. In contrast, a person’s height is a dimension that
can be measured in centimetres. It is possible to measure
the heights of two
different people in order to determine whether one is taller than the other and
by how much. Of the standard units that currently exist, as maintained by the
International Bureau of Weights and Measures, it is difficult to see what
assistance any would be in a clinical setting. For example, a psychiatrist
could measure a
client’s height in centimetres, but it seems unlikely
that there would be
any relationship between a client’s height and risk of re-offending or
mental health.
Clinical risk assessment suffers from the same defects that affect clinical
diagnosis of mental disorders, and the two are often intimately related. The
question that needs to be considered is whether or not the presence
of a diagnosed mental disorder will increase the
risk of a person committing an
offence. Alternatively, how is risk to be assessed if no disorder is diagnosed?
Although it is beyond
the scope of the present article to address these issues,
statistical studies and their use as criteria in actuarial instruments
suggest
that certain disorders, which are not well defined and are difficult to diagnose
with reliability, are associated with an
increase in incidence of certain
offences. Where the person being assessed is not diagnosed with any disorder,
then clinical assessment
may rely on methods that don’t require expert
training. The main issue, however, with respect to clinical assessment
methods is reliability. Reliability, which should not be confused with
accuracy, concerns consistency with respect to clinical diagnosis or risk
assessment as conducted by independent psychiatrists. Reliable diagnostic
criteria or assessment procedures should produce similar or identical
assessments for independent observers. However, different psychiatrists
frequently arrive at different conclusions with respect to diagnosis or risk.
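Consistency between independent assessors can be quantified. The sketch below computes simple percentage agreement and Cohen's kappa, a standard statistic that discounts the agreement two raters would reach by chance. Kappa is introduced here as an illustration and is not a measure the preceding discussion itself relies on; the ratings are invented for the example.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of cases in which two raters give the same assessment."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single, identical category throughout
    return (observed - expected) / (1 - expected)


# Invented risk ratings of the same ten clients by two psychiatrists.
psychiatrist_1 = ["high", "high", "low", "low", "high",
                  "low", "high", "low", "low", "high"]
psychiatrist_2 = ["high", "low", "low", "low", "high",
                  "high", "high", "low", "high", "high"]

agreement = percent_agreement(psychiatrist_1, psychiatrist_2)
kappa = cohens_kappa(psychiatrist_1, psychiatrist_2)
print(agreement, round(kappa, 2))
```

In this invented data the two raters agree on seven clients out of ten, but because half of that agreement would be expected by chance alone, the chance-corrected figure is considerably lower.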
Actuarial Risk Assessment
Actuarial risk assessment departs from clinical assessment methods by
examining populations of released offenders in order to identify
attributes that
are associated with an increased risk of recidivism. The data with respect to
recidivism rates collected from multiple
sample populations of released
offenders can be used to make some simple inferences. The relative frequency of
recidivism for a particular
sample may be used to make a probability statement
about the chance of an individual, who shares the attributes that define the
population,
committing a future offence. Alternatively the relative frequencies for
various populations may be compared to determine which samples display a
higher level of recidivism, which in turn is believed to indicate a greater
risk of recidivism. The process of establishing
relative frequencies with respect to
recidivism begins by examining an initial population of released offenders for a
specific period
of time which yields relative frequencies for those who
re-offend and those who do not. The sample population being investigated
also
allows researchers to look for attributes that are associated with recidivism.
The initial population can then be analysed by
specifying further attributes
that break the population down into more clearly defined demographic groups in
the hope of identifying
greater recidivism rates for specific populations.
Example: for a given population of 1,000 released offenders where 600
re-offend within a seven year period, the following relative frequencies would
be obtained:
Offence committed: 600/1,000 or 60%
No offence committed: 400/1,000 or 40%
By specifying further attributes it is possible to identify a subsection of
the initial population in order to determine whether it displays an increased
or decreased relative frequency. For example 500 of the released offenders
might be male and aged between 30 and 45. If 350 of the 500 re-offend within a
seven year period, the following relative frequencies will reveal that the
incidence of recidivism is greater where the attributes of gender and age are
equal to male and 30 to 45 respectively:
Offence committed: 350/500 or 70%
No offence committed: 150/500 or 30%
If the relative frequencies had remained unchanged it would be assumed that
the attributes have a neutral effect and therefore no association with an
increase or decrease in risk.
When the data have been collected, a number of sample
populations—each of which corresponds to a unique set of
attributes—can
be used to specify a collection or table of relative
frequencies which can then be used as a basis for making probability statements
and judgements about levels of risk for a specific individual, by looking for
the presence or absence of certain attributes. For
example given the above
relative frequency of recidivism for released offenders who are male and aged
between 30 and 45, researchers
might infer that there is a 70 per cent chance
that any released offender who is male and aged between 30 and 45 will re-offend
within
a seven year period or that males aged between 30–45 display a
higher risk of re-offending when compared to released offenders
generally. The
purpose of this simplified example is to identify the process of reasoning upon
which inferences are made from relative
frequencies.
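The idea of a table of relative frequencies indexed by attributes can be sketched as a lookup structure. The table below contains only the two rates used in the simplified example (60 per cent for released offenders generally and 70 per cent for males aged between 30 and 45); the names `RECIDIVISM_TABLE` and `lookup_rate`, and the rule of preferring the most specific matching population, are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical table: each key is the set of attributes defining a sample
# population, each value the relative frequency of re-offending within it.
RECIDIVISM_TABLE = {
    frozenset(): Fraction(60, 100),  # released offenders generally
    frozenset({("gender", "male"), ("age_band", "30-45")}): Fraction(70, 100),
}


def lookup_rate(attributes):
    """Return the rate for the most specific population the individual belongs to."""
    individual = frozenset(attributes.items())
    # A population matches when the individual has all of its defining attributes.
    matches = [key for key in RECIDIVISM_TABLE if key <= individual]
    most_specific = max(matches, key=len)
    return RECIDIVISM_TABLE[most_specific]


print(lookup_rate({"gender": "male", "age_band": "30-45"}))
print(lookup_rate({"gender": "female", "age_band": "30-45"}))
```

The lookup returns a property of the population the individual is matched to, and the probability statement about the individual is an inference drawn from that match.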
In practice, however, although the reasoning process is the same, actuarial
studies will specify many more attributes besides gender and age in order to
analyse recidivism rates from many different dimensions. Attributes which are
used to define sample populations may be divided into those that are static
and those that are dynamic. Static attributes are those that do not change
with time, such as an individual’s gender, date of birth and criminal
history, whilst dynamic attributes are susceptible to change, such as an
individual’s marital status, employment status or substance addiction. The
Violence Risk Appraisal Guide (1993)[9] offers the following example of static
and dynamic attributes used to assess an individual’s chance of re-offending
with respect to a violent offence:
(1) PCL-SV score. Indicates the presence or absence of psychopathy.
(2) Maladjustment at elementary school age.
(3) Diagnosis of personality disorder under DSM IV.
(4) Age at index offence.
(5) Lived with both parents to age of 16.
(6) Failure on prior conditional release.
(7) Non-violent offence score.
(8) Marital status.
(9) Diagnosis of schizophrenia under DSM IV.
(10) Victim injury.
(11) History of alcohol misuse.
(12) Female victim.
For any individual it is possible to determine the presence or absence of
each variable and/or assign it a value. The risk of an individual
re-offending by
committing a violent offence depends upon the number of variables to which there
is a positive response with the
risk increasing as more variables are found to
be applicable to an individual being assessed. The psychiatrist assessing an
individual
will record a response to each of the 12 attributes and compare the
result to the relative frequency for a sample population that
has the same
attributes as the assessed individual. For example, person A might have the
following results: high PCL-SV score, displayed maladjustment at elementary
school, diagnosed with a DSM-IV personality disorder, aged 28, lived with both
parents to age of 16, failed on a prior conditional release, low non-violent
offence score, single, absence of a diagnosis of schizophrenia, a history of
alcohol misuse and female victims. Person A might be at a higher risk of
re-offending than an individual, B, with the following results: low PCL-SV
score, no incidence of maladjustment at elementary school, absence of a DSM-IV
personality disorder, aged 62, no failure on prior conditional release, high
non-violent offence score, married, no history of alcohol misuse and no female
victim.
Data
collected by the researchers would presumably reveal that the relative frequency
or incidence of recidivism is lower for the
population of individuals who
display the attributes that correspond to individual B than for the population
of individuals who display
the attributes that correspond to individual A.
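The simplified rule described above, that risk rises with the number of variables answered positively, can be sketched as a simple tally. The sketch below is an assumption-laden illustration and not the scoring method of the Violence Risk Appraisal Guide: it uses only the six yes/no items on which the text states a result for both A and B, it gives every item equal weight, and the names `ITEMS`, `positive_responses`, `person_a` and `person_b` are hypothetical.

```python
# Six yes/no items taken from the list above. The remaining items (age, marital
# status, offence scores and so on) are left out because coding them would
# require further assumptions about direction and weighting.
ITEMS = (
    "high_pcl_sv_score",
    "maladjustment_at_elementary_school",
    "personality_disorder",
    "failure_on_prior_conditional_release",
    "history_of_alcohol_misuse",
    "female_victim",
)

def positive_responses(profile):
    """Count the items answered positively (equal weighting is an assumption)."""
    return sum(1 for item in ITEMS if profile.get(item, False))

# Responses for A and B as described in the text.
person_a = {item: True for item in ITEMS}
person_b = {item: False for item in ITEMS}

print("A:", positive_responses(person_a))
print("B:", positive_responses(person_b))
```

A tally of this kind only ranks A above B. Turning it into a level of risk still requires the relative frequency observed for a sample population with the same responses.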
Limits of Statistical Inference
Categorising individuals, according to either static or dynamic attributes
which correspond to sample populations that define relative
frequencies,
presents a number of issues. It is necessary to recognise that the data
collected from research that corresponds to
sample populations ultimately
supports inferences about groups of individuals rather than single individuals.
This distinction can
be illustrated by considering the following
propositions:
(i) Smoking causes lung cancer.
(ii) John will contract lung cancer.
The first proposition is based on a population of individuals, rather
than a single individual, from which it is possible to make
a generalisation
that is subject to exceptions. Of the lung cancer patients surveyed in numerous
statistical studies many, though not all, will be habitual smokers, since lung
cancer can occur in either the presence or absence of habitual smoking. A
relative frequency of, say, 80% with respect to the incidence of lung cancer
amongst smokers indicates that of 1,000 smokers who participated in a study, 800
developed lung cancer whilst 200 did not. It is possible to infer that any group
of smokers not surveyed can be divided into
two further groups, those who will
develop lung cancer and those who will not, in proportions that are consistent
with the surveyed
sample population. Thus if the future conforms to the past
then for a population of 100 non-surveyed smokers it is possible to predict,
based on the statistical study, that 80 or approximately 80 smokers will develop
lung cancer and 20 will not. What of inferences, however, with respect to a
single individual rather than a group? Clearly it is not possible to divide a
single
individual into two
categories, those with and those without, in the proportions
consistent with the surveyed sample. Although for a single individual
there are
two possible outcomes only one will actually occur. When applied to an
individual the relative frequency from a sample
population serves as a
probability statement that attempts to indicate which outcome is more likely to
actually occur. Where each possible outcome has a rational number between 0 and
1 assigned to it, the outcome whose number is closest to one is more likely to
occur, which in this case is lung cancer, given that its value is 0.80 whilst
the value for not contracting lung cancer is 0.20. Clearly
any prediction, such
as proposition (ii), about an individual smoker contracting lung cancer based on
the statistical studies could
be wrong. This is the impact of uncertainty.
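The contrast between the group inference and the individual prediction can be made concrete with a short calculation. The sketch assumes the illustrative 80 per cent relative frequency used above; `expected_split` and `chance_prediction_is_wrong` are hypothetical helper names introduced only for this example.

```python
from fractions import Fraction

def expected_split(rate, group_size):
    """Expected numbers with and without the outcome in a non-surveyed group."""
    with_outcome = round(rate * group_size)
    return with_outcome, group_size - with_outcome

def chance_prediction_is_wrong(rate):
    """Chance of error when the likelier outcome is predicted for one person.

    Assumes the individual is exchangeable with members of the surveyed sample.
    """
    return min(rate, 1 - rate)

rate = Fraction(80, 100)  # relative frequency from the surveyed sample

print(expected_split(rate, 100))         # group-level inference
print(chance_prediction_is_wrong(rate))  # individual-level uncertainty
```

The group can be divided in the proportions of the sample, whereas the individual cannot. The second figure is the residual chance that proposition (ii) turns out to be false even when the likelier outcome is predicted.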
The distinction between statements about groups and individuals with
respect to relative frequencies, defined according to statistical
studies, leads
to a fundamental problem. For any given individual it is not possible to
determine the category to which he or she
will belong. It is not possible to
predict whether one will belong to the group of 80 with lung cancer or the group
of 20 without.
The same problem clearly applies to recidivism predictions based
on relative frequencies from statistical studies where it is not
possible to
determine for example whether a released offender with a specific collection of
attributes will be one of the 70 per
cent who re-offend or the 30 per cent who
do not. The prediction of recidivism for a released offender, like the predicted
condition for a given smoker, must ultimately be compared to the observed
condition or behaviour in order to affirm or refute it. When a prediction
is
compared to the observed outcome it will fall into one of four categories: (i)
True Positive, (ii) True Negative, (iii) False
Positive, (iv) False Negative. In
the case of recidivism a true positive is an accurate prediction that an
offender will re-offend
and a true negative is an accurate prediction that an
offender will not re-offend. A false positive is an inaccurate prediction that
an offender will re-offend and a false negative is an inaccurate prediction that
an offender will not re-offend. The presence of either false positives or false
negatives indicates that the particular phenomenon cannot be predicted
accurately, which is the case for recidivism whether clinical or actuarial
methods are employed.
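The four categories can be expressed as a small classification routine. This is a minimal sketch: the function name `classify` and the list of (prediction, outcome) pairs are hypothetical and are not data from any study.

```python
from collections import Counter

def classify(predicted_reoffend, observed_reoffend):
    """Place one prediction into one of the four categories described above."""
    if predicted_reoffend and observed_reoffend:
        return "true positive"
    if not predicted_reoffend and not observed_reoffend:
        return "true negative"
    if predicted_reoffend:
        return "false positive"   # predicted re-offending, none observed
    return "false negative"       # predicted no re-offending, offence observed

# Hypothetical (prediction, observed outcome) pairs for five released offenders.
cases = [(True, True), (True, False), (False, False), (False, True), (True, True)]

tally = Counter(classify(predicted, observed) for predicted, observed in cases)
for category, count in tally.items():
    print(category, count)
```

In the sentencing context the two kinds of error carry different costs: a false positive corresponds to an offender detained who would not have re-offended, and a false negative to an offender released who does.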
In an attempt to avoid some of the problems
associated with quantitative statistical statements about risk based on relative
frequencies,
some researchers have adopted a qualitative approach to assessing
risk. Whilst a rational number between 0 and 1 is assigned to risk
using a
quantitative approach, a qualitative approach relies on words rather than
numbers by defining categories of risk such as
low, medium and high. Typically
such judgements feature in clinical methods; however they may also appear in
actuarial methods where
different ranges of rational numbers correspond to
different risk categories. Neither approach can guarantee accurate predictions;
however, a qualitative approach results in a loss of precision and
expresses less information. For example, if an assessment that indicates
there is
a high likelihood that the offender will re-offend is compared to an assessment
that indicates there is a 0.75 likelihood
that the offender will re-offend it
can be seen that the quantitative statement attempts to define the risk with
more precision just
as 0.756 likelihood is more precise than 0.75 likelihood.
Whether risk is expressed quantitatively or qualitatively, the question to be
considered by the court is: what level of risk must the offender be assessed at
before the sentence should be extended? Is a risk level of 0.75, indicating that
the likelihood of not re-offending is 0.25, a sufficient basis for extending a
sentence, or is a higher risk level required, such as 0.95? Alternatively,
should a sentence be extended where the risk of recidivism is judged to be high?
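The loss of information that occurs when ranges of numbers are mapped to categories can be illustrated as follows. The cut-off values 0.33 and 0.66 and the function name `risk_category` are assumptions made purely for illustration; the instruments discussed here are not being quoted.

```python
def risk_category(probability, medium_from=0.33, high_from=0.66):
    """Map a numeric risk estimate to 'low', 'medium' or 'high'.

    The two cut-off values are hypothetical.
    """
    if not 0 <= probability <= 1:
        raise ValueError("probability must lie between 0 and 1")
    if probability >= high_from:
        return "high"
    if probability >= medium_from:
        return "medium"
    return "low"

# Several distinct numeric estimates collapse into the same category.
for p in (0.10, 0.50, 0.75, 0.756, 0.95):
    print(p, risk_category(p))
```

Under these assumed cut-offs, assessments of 0.75 and 0.95 are both reported as high, so a court told only that the risk is high cannot tell which of the two levels the assessor had in mind.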
A second issue arises where historical information about a sample
population of individuals is used to make judgements about an individual
who
shares a relatively small number of common attributes at the expense of ignoring
distinguishing attributes that are not shared.
This reveals an assumption that
future individuals will behave in a manner that is consistent with the behaviour
displayed by a sample
population of past individuals and that the relative
frequency drawn from the past sample population will remain stable over time.
Inferences in statistics where individuals are characterised according to
attributes that are consistent with the attributes of a
sample population in
statistical studies are based upon arguments of analogy. In such arguments the
behaviour of something that is
known may be used as a basis to make inferences
about the unknown behaviour of something that is similar in certain respects.
Consider
the following example offered by Thomas Reid:
We may observe a very great similitude between this earth which we inhabit, and
the other planets, Saturn, Jupiter, Mars, Venus and
Mercury. They all revolve
around the sun, as the earth does….They borrow all their light from the
sun, as the earth does.
Several of them are known to revolve around their axis
like the earth, and by that means must have a like succession of day and night.
Some of them have moons that serve to give light in the absence of the sun, as
our moon does to us. They are all in their motions,
subject to the same law of
gravitations, as the earth is. From all this similitude, it is not unreasonable
to think, that these planets
may, like our earth, be the habitation of various
orders of living creatures. [10]
Reid’s analogical argument attempts to draw a conclusion about
something that is unknown, the presence of life on other planets, based on the
presence of life on earth, which bears some resemblance to the other solar
system planets. In the same manner psychiatrists
attempt to make predictions about the
unknown future behaviour of released offenders based on the known behaviour of
offenders released
in the past and their comparative similarities. In these types of analogies,
however, the differences may be as important as the similarities.
Although there were obvious similarities between the earth and
the other observable solar system planets during Reid’s lifetime,
many
differences have since been discovered which would suggest that the presence of
carbon based life forms, such as those found
on earth, is unlikely. In the same
manner the relative frequencies for risk of recidivism are based on sample
populations that exemplify
a discrete number of common attributes. An individual
with the same attributes will be assumed to represent the level of risk defined
by the relative frequency of the sample population despite the fact that the
individual being assessed will also possess a number
of attributes that are
different. These differences, however, which may mitigate the level of risk,
will be overlooked in a purely actuarial approach to risk assessment.
To
compensate for the lack of emphasis on unique individual characteristics in a
purely actuarial approach some psychiatrists and
psychologists have adopted an
actuarially informed clinical assessment which combines the methods of both.
This allows the psychiatrist
or psychologist to accommodate protective or
mitigating factors that might be seen to decrease the likelihood of risk as
determined
by an actuarial assessment. Some recently developed actuarial
instruments such as the HCR-20[11]
have been modified to include dynamic information that requires clinical
investigation. Whilst this allows important information
about the individual to
be considered it also introduces the weaknesses of clinical assessment methods
which have already been discussed.
This has an impact on the reliability of the assessment, that is, the extent to
which independent psychologists or psychiatrists reach the same diagnosis or
assessment of risk. Independent psychologists and psychiatrists are more likely
to reach the same conclusion with respect to risk when applying actuarial
methods than when applying clinical methods. As a result reliability decreases
in the presence of clinical assessment methods.
Conclusion
From the limitations outlined above it should be apparent that statistical
inferences are ultimately based on generalisations about
populations that do not eliminate uncertainty but rather, in the presence of
uncertainty, attempt to indicate which of two options might be seen as more
likely, though not certain, to occur. In addition such generalisations obscure
the identity
of the individual
with the effect that differences may be overlooked. Clinical
approaches on the other hand also suffer from defects that render their
diagnostic and risk assessment methods unreliable, an issue which also features
in actuarially informed clinical assessment. Despite
the differences, all three
approaches share a common goal of attempting to determine the risk that an
offender poses with respect
to recidivism upon release rather than attempting to
predict what the offender will actually do. Given the limitations of actuarial
and clinical methods of assessment, a number of issues require judicial
consideration. Central to this consideration is the following
question:
Can clinical or actuarial methods of risk assessment yield
cogent evidence, as required by the legislation, which identifies the level
of
risk for an offender with a high degree of probability?
Furthermore, since the concept of risk implicitly accommodates uncertainty,
which in this case reflects the inability to make accurate predictions about the
future behaviour of released offenders, are inaccurate predictions of future
behaviour an acceptable basis for extending the sentence of an offender who is
otherwise entitled[12] to release?
Given the limitations outlined above it is arguable that none of
the assessment methods can yield evidence which satisfies the standard
of proof,
a high degree of probability,[13]
required by the legislation. It does not follow that such risk assessment
evidence is inadmissible, for the legislation authorises the court to make a
risk assessment order, with the assessment to be performed by court appointed
psychiatrists.[14] However, the risk
assessment evidence on its own is not sufficient to support a finding of
dangerousness which features as only one
of many factors to be taken into
consideration by the court under section 13(4) of the
legislation.[15] As a result the
court must decide what probative value or weight should be given to the risk
assessment reports conducted by the
court appointed psychiatrists. In
considering this question the court must critically examine the nature of the
research methodology
relied upon to assess risk and/or make predictions about
future human behaviour with respect to serious sexual offences. In doing
so it
should look closely at the relationship between the inferences drawn by
psychiatrists and the facts or assumptions which support
them. Given the
problems that affect each of the methods for assessing risk and the serious
consequences with respect to the outcome
of an assessment it is arguable that
any such evidence should be assigned a low probative value or weight.
* Part time lecturer in legal philosophy, Southern Cross
University.
[1] See Rudolf Carnap: An Introduction to the
Philosophy of Science (1995). Carnap makes a similar distinction between
universal laws and statistical laws.
[2] See Dangerous Prisoners (Sexual Offenders) Act 2003, ss 13(1), (3).
[3] Ibid ss 8(2)(a), 9.
[4] Ibid s 11(2).
[5] Makita v Sprowles, Heydon JA, 81.
[6] Research may also accommodate theoretical explanations as well as experimental
observations. The observation that the combination
of zinc and copper in the
presence of an electrolytic solution, such as salt water, will produce a
measurable voltage or potential
difference, may be distinguished from the
theoretical explanation based upon the unequal distribution of electrons between
the zinc
and copper electrodes. One is observable through measurement, the other
currently is not.
[7] The DSM IV
(Diagnostic and Statistical Manual of Mental Disorders Version IV) lists
the currently accepted disorders and the criteria by which they are diagnosed.
[8] The DSM-IV-TR is the text
revision (TR) of the fourth edition of the Diagnostic and Statistical Manual
of Mental Disorders of the American Psychiatric Association, published in
2000.
[9] Harris et al, Violence
Risk Appraisal Guide (1993). The methodology for designing actuarial
instruments is the same whether they attempt to predict violent or serious
sexual
offences. The attributes selected however may be different depending on
the type of criminal behaviour being
predicted.
[10] Sir William Hamilton (ed), The Works of Thomas Reid, D.D. (first published
1846, 1983 ed).
[11] HCR-20 (Webster, Douglas, Eaves and Hart, 1997, Version 2).
[12] It follows that the
requirement of risk assessment is unnecessary where a serious sexual offender is
not entitled to release.
[13] This phrase is not defined in the legislation; however, it is unlikely that
courts would accept a definition based on mathematical terms which defines
probability as a rational number between 0 and 1. As such, courts are unlikely
to identify a rational number between 0 and 1 that corresponds to a high degree
of probability. The standard would necessarily be greater than that required for
civil trials, beyond the balance of probabilities, and more closely resemble the
criminal standard of proof.
[14] See Dangerous Prisoners (Sexual Offenders) Act 2003, ss 8(2), 9, 11.
[15] Section 13(4) identifies
ten such factors which must be taken into consideration by the court.
URL: http://www.austlii.edu.au/au/journals/UTSLawRw/2005/6.html