One other key point must be made. Maths, though it centrally concerns number, space, measure, etc., is not fixed and unchanging. During periodic renegotiations of what counts as school maths the cognitive demands made on children change (Cooper, 1983, 1985a, 1985b, 1994a). These demands have typically been differentiated by measured ‘ability’ and/or social class in England, as the case of SMP well illustrates (Cooper, 1985b; Dowling, 1998). In England in recent years, such renegotiation has led to an apparent weakening of the boundary between ‘everyday’ knowledge and ‘esoteric’ mathematical knowledge both in the curriculum and in its assessment, perhaps especially so for children deemed ‘less able’ (e.g. Dowling, 1998). While in the 1960s and early 1970s the preferred version of school maths tended to favour ‘abstract’ algebraic approaches #(3) , the dominant orthodoxy since the time of the Cockcroft Report of 1982 has favoured the teaching and learning of maths within ‘realistic’ settings (Cockcroft, 1982; Dowling, 1991; Boaler, 1993a, 1993b). This preference within the world of maths educators has been reflected in the national tests (Cooper, 1992, 1994b). It has been argued, drawing on the work of Bernstein (1990, 1996) and Bourdieu (1986), that test items contextualising mathematical operations within ‘realistic’ settings might be expected to cause problems of interpretation for certain students. Working class children may experience more difficulty than others in choosing ‘appropriately’ between using ‘everyday’ knowledge and ‘esoteric’ mathematical knowledge when responding to items (Cooper, 1992, 1994b). This may lead to underestimation of their mathematical capacities in cases where a rational ‘everyday’ response is ruled out as ‘inappropriate’ by the marking scheme but is ‘chosen’ by the child in place of an alternative ‘esoteric’ response (Cooper, 1996, 1998a&b). 
Similar arguments have been advanced in respect of gender, with girls seen as likely to be disadvantaged by ‘realistic’ assessment items (Boaler, 1994). In summary, performance on ‘realistic’ items may not reflect underlying competence #(4) . It is upon this possible threat to valid and fair assessment that our research has focused.
While the assessment
literature has many useful discussions of item bias and differential validity
(e.g. Wood, 1991, p.177; Gipps & Murphy, 1994) these tend not to draw
on relevant sociological insights concerning the relation between culture
and cognition (e.g. Bernstein, 1996; Bourdieu, 1986, 1990a,b&c). Discussions
of bias are frequently technical if not empiricist in tone (e.g. Camilli
& Shepard, 1994). While purely quantitative methods can identify items,
or classes of items, which some groups of test-takers find more or less
difficult than other groups, they are less good at increasing our understanding
of why such items ‘behave’ in the way they do. To advance our understanding
in this area, a more qualitative concern with children’s cognitive strategies
and processes is needed, coupled with the use of relevant theoretical insights
from outside the area of assessment itself #(5). It is this more explanatory
problem to which our research has been addressed, in the belief that a
better understanding of the ways culture, cognition and test performance
interact should inform test design (e.g. Cooper, 1998b; Cooper & Dunne,
1998). It would then be possible to avoid more easily those items
which cause unnecessary and construct-irrelevant difficulty to some
test-takers (Messick, 1989, 1994). However, here we intend to show how differently
contextualised NC item types are associated with different relative performances
by certain social groups. Our focus will therefore be on some of our quantitative
data.
We have employed both
quantitative and qualitative methods. The basic strategy has been to begin
with statistical analysis of children’s performance on items in test
situations to generate insights concerning broad classes of test items
(e.g. items which embed mathematical operations in ‘everyday’ and ‘esoteric’
contexts respectively). This has involved coding test items on a number
of dimensions #(6).
Analyses of the relationships between social class, gender, measured ability,
item type and performance have been carried out. Some of these use the
child as the case for analysis, others use the item itself (see below and
Cooper, Dunne & Rodgers, 1997). Alongside this we have used more qualitative
analyses of children’s responses to particular items in both the tests
and subsequent clinical interviews to generate understanding of why,
for example, ‘realistic’ and ‘esoteric’ items seem to be differentially
difficult for children from different sociocultural backgrounds (e.g.
Cooper & Dunne, 1998). This has involved the coding of children’s responses
on various dimensions, especially the child’s use, whether ‘appropriate’
or not, of ‘everyday’ knowledge in responding to items. In parallel, informing
and being informed by this work, a model of the way culture, cognition
and performance on ‘realistic’ test items interact has been developed (Cooper,
1996, 1998b).
In each of three primary
and three secondary schools, Year 6 and Year 9 children respectively took three group tests
in maths. Two of these were the actual May 1996 Key Stage national tests.
The third, taken some four months earlier, comprised a test put together
by us, drawing on previous NC items. Our tests were designed to cover a
variety of item types and four Attainment Targets (ATs) #(7). Our secondary test,
like the May 1996 test, was tiered by NC level. Our tests were marked according
to the NC marking schemes. Between the administration of the first test
and May 1996 we interviewed all of the Year 6 children and a 25% sample
of the Year 9 children while they worked individually through a selection
of items from the first test. This allowed access to children’s interpretations
of the items and their methods. Furthermore, and this has been a crucial
part of our approach, it was possible to allow children to reconsider
their approach and answer in cases where they had initially chosen an ‘inappropriate’
‘everyday’ reading of the meaning and requirement of the item. This has
allowed us to explore the ways in which the use of a certain class of ‘realistic’
item can lead to the underestimation of children’s actually existing knowledge
and understanding (Cooper, Dunne & Rodgers, 1997; Cooper & Dunne,
1998). In order to allow an examination of social class effects we have
also collected information on parental occupations. The issue of parental
occupations was a sensitive one, especially in the secondary schools. Two
of the three schools required parental permission before children were
allowed to supply this information. The third required that the question
go directly to the home with the result that we gained this information
for only 43% of the sample in this school #(8). We also have children’s
scores on the three Nelson Cognitive Ability tests. We have also interviewed
teachers, concentrating on the school’s approach to maths, and on teachers’
perspectives on NC assessment and the pupils in their schools. The nature
of the samples and the project’s activities are set out in Table 1.
School  Children Tested (n)  Children Interviewed (n)  Teachers Interviewed (n)  Lessons Observed (n)
Key Stage 2
School A  63  63  4  4
School B  44  44  3  4
School C  29  29  6  5
Total KS2
Key Stage 3
School D  254  50  6  10
School E  102  37  5  5
School F  117  36  4  5
Total KS3

The marking scheme
(Band 1-4, Paper 1) gives as "approximate evidence" of achievement:
"Gives the answer to the division of 269 by 14 as 20, indicating that they
have interpreted the calculator display to select the most appropriate
whole number in this context. Do not accept 19 or 19.2".
The item in Figure 1 is one of a type much discussed in mathematical education circles (e.g. Verschaffel, De Corte, & Lasure, 1994). The key point is that the child’s answer must not be fractional. The lift cannot go up (and down) 19.2 times. The child is therefore required to introduce a ‘realistic’ consideration into his or her response. In fact the child must manage much more than this. S/he must introduce only a small dose of realism: ‘just about enough’. S/he must not reflect that the lift might not always be full; or that some people might get impatient and use the stairs; or that some people require more than the average space, e.g. for a wheelchair. Such considerations (‘too much realism’) will lead to a problem without a single answer, and no mark will be gained #(10). There is a certain irony here. Many reformers have argued for the use of ‘ill-structured’ items in maths teaching, learning and assessment contexts (e.g. Pandey, 1990). This item, however, is unintentionally ill-structured. Children’s and schools’ interests now hinge on managing the resulting ambiguities in a legitimate manner.
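The ‘just about enough’ realism that the marking scheme rewards can be sketched as a small calculation. This is an illustrative sketch only: the figures 269 and 14 are those quoted from the marking scheme, while the function name `lift_trips` is a hypothetical label introduced here. It shows that the credited answer is the quotient rounded up to the next whole number, rather than the calculator display or its truncation.

```python
import math


def lift_trips(people: int, capacity: int) -> int:
    """Smallest whole number of trips that carries everyone (hypothetical helper)."""
    return math.ceil(people / capacity)


quotient = 269 / 14           # what the calculator displays, roughly 19.2
print(round(quotient, 1))     # 19.2, not accepted by the marking scheme
print(math.floor(quotient))   # 19, not accepted by the marking scheme
print(lift_trips(269, 14))    # 20, the credited answer
```

Nothing in this calculation models a partly empty lift, impatient passengers or wheelchair users; those are exactly the considerations the child is expected to leave out.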
The child is asked to exercise some ‘realistic’ judgement and, in doing so, might be presumed to be undertaking a ‘realistic’ application of some mathematical (or at least arithmetical) knowledge. But on whose account of ‘applying’? The lift item essentially concerns queuing behaviour. A mathematics of queuing exists. We might turn for some insight to an elite disciplinary source. Let’s try Newer Uses of Mathematics #(11) , edited in 1978 by Sir James Lighthill, FRS, then Lucasian Professor of Applied Maths at Cambridge #(12). This edited collection includes a paper on methods of operational analysis by Hollingdale (former Head of Maths Dept at the Royal Aircraft Establishment) which discusses queuing. An edited extract follows:
Everyone, nowadays, is only too familiar with queues - at the supermarket, the post office, the doctor's waiting room, the airport, or on the factory floor. Queues occur when the service required by customers is not immediately available. Customers do not arrive regularly and some take longer to serve than others, so queues are likely to fluctuate in length - even to disappear for a time if there is a lull in demand…. The shopper leaving the supermarket, for example, desires service; the store manager wants to see his cashiers busy most of the time. If customers have to wait too long, some will decide to shop elsewhere; … The essential feature of a queuing situation, then, is that the number of customers (or units) that can be served at a time is limited so there may be congestion. …. Queuing problems lend themselves to mathematical treatment and the theory has been extensively developed during the last seventy years. …The raw materials of queuing theory are mathematical models of queue-generating systems of various kinds. The objective is to predict how the system would respond to changes in the demands made on it; in the resources provided to meet those demands; and in the rules of the game, or queue discipline as it is usually called. Examples of such rules are: 'first come, first served'; 'last come, first served', as with papers in an office 'in-tray'; service in an arbitrary order; or priority for VIPs or disabled persons. To analyse queuing problems, we need information about the input (the rate and pattern of arrival of customers), the service (the rate at which customers are dealt with either singly or in multiple channels), and the queue discipline…. (Hollingdale, in Lighthill, pp. 244-245)
The question which then arises is whether any of these models would deliver the correct answer according to the producers of marking schemes for National Curriculum tests #(13). If not, why not, and what approach does? Can the ‘required’ approach be specified via teachable ‘rules of engagement’ for such items? If not, why not? Should they be?
Various writers have employed the notion of educational ground rules to capture what is demanded of children in cases like the lift item (Mercer & Edwards, 1987). There is clearly some affinity between this concept and those of recognition and realisation rules as employed by Bernstein (1996). However, it can be seen that it would be quite difficult, if not impossible, to write a set of rules which would enable the child to respond as required to the lift question. Certainly, the rule (in the sense of a mandated instruction) to employ ‘realistic’ considerations would not do, since ‘how much’ realism is required remains a discretionary issue. It is this problem that has led to a range of attacks on the use of rules to model human activities (e.g. Taylor, 1993) and, in particular, has led Bourdieu to reject a rule-based account of cultural competence (see Bourdieu, 1990a). His concept of habitus aims to capture the idea of a durable socialised predisposition without reducing behaviour to strict rule-following (Bourdieu, 1990c). Bourdieu sometimes describes what habitus captures as ‘a feel for the game’ and we can see that this describes fairly well what is required by the lift problem and others like it #(14). Both Bernstein and Bourdieu have shown that members of the working class are more likely to respond to test-like situations by drawing on ‘local’ and/or ‘functional’ rather than ‘esoteric’ and/or ‘formal’ perspectives #(15). We have shown elsewhere that this can lead to the relative underestimation of these children’s mathematical capacities when test items are superficially ‘realistic’ but actually demand an ‘esoteric’ response (Cooper, 1996, 1998b; Cooper & Dunne, 1998). Because of lack of space, we will not present any findings concerning the lift item here, nor will we be able to present the explanatory perspective. We move instead to present a statistical overview of children’s relative performance on ‘realistic’ and ‘esoteric’ items at KS2.
We have already described our simple coding of ‘realistic’ and ‘esoteric’ items. The lift item can serve as an exemplar of the former #(16) . The following is an example of the latter:
Class  Female Mean  Female Count  Male Mean  Male Count  Total Mean  Total Count
Service class  57.74  26  60.33  34  59.21  60
Intermediate class  55.68  13  55.04  17  55.32  30
Working class  47.34  13  51.07  20  49.60  33
Total  54.62  52  56.46  71  55.68  123

Class  Female Mean  Female Count  Male Mean  Male Count  Total Mean  Total Count
Service class  71.07  26  70.10  34  70.52  60
Intermediate class  70.35  13  69.98  17  70.14  30
Working class  65.71  13  64.69  20  65.09  33
Total  69.55  52  68.54  71  68.97  123

Class  Female Mean  Female Count  Male Mean  Male Count  Total Mean  Total Count
Service class  .81  26  .88  34  .85  60
Intermediate class  .79  13  .79  17  .79  30
Working class  .71  13  .79  20  .76  33
Total  .78  52  .83  71  .81  123

Ratios such as these have properties that can make them difficult to interpret. In particular, a ratio of percentages will have an upper bound set by the size of its denominator. If, for example, a child scores 50% as their ‘esoteric’ subtotal then their highest possible r/e ratio will be 100/50 or 2. If another child, on the other hand, scores 40% as their ‘esoteric’ subtotal their highest possible ratio will be 100/40 or 2.5. Since service class children, on average, do better than others on the ‘esoteric’ subsection of the tests their potential maximum r/e ratio is lower than that for the working class children who score lower on the ‘esoteric’ subsection. Notwithstanding this, Table 4 shows that the service class children have the highest ratios of any group.
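The bound described in this paragraph can be made concrete with a short sketch. The function names `re_ratio` and `max_re_ratio` are hypothetical labels, not taken from the project’s own analysis; the only assumption is that both subtotals are expressed as percentages, so that the ‘realistic’ subtotal cannot exceed 100.

```python
def re_ratio(realistic_pct: float, esoteric_pct: float) -> float:
    """Ratio of a child's 'realistic' percentage subtotal to their 'esoteric' one."""
    if esoteric_pct <= 0:
        raise ValueError("the ratio is undefined for an 'esoteric' subtotal of zero")
    return realistic_pct / esoteric_pct


def max_re_ratio(esoteric_pct: float) -> float:
    """Upper bound on the ratio, reached when the 'realistic' subtotal is 100%."""
    return re_ratio(100.0, esoteric_pct)


print(max_re_ratio(50))  # 2.0
print(max_re_ratio(40))  # 2.5
```

The lower the ‘esoteric’ subtotal, the higher the ceiling on the ratio, which is why the bound works against, rather than for, the pattern reported for the service class.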
There is a clear relation of this ratio to social class background, with its value ranging from 0.85 for the service class, through 0.79 for the intermediate grouping, to 0.76 for the working class for boys and girls taken together #(22). Service class children as a whole have a better performance on ‘realistic’ items in relation to ‘esoteric’ items than do working class children. The relation of the ratio to class is particularly clear in the case of girls. Looking at sex, the r/e ratio is higher for boys in both the service and working class groups, though it is identical for girls and boys in the intermediate grouping #(23). The class effect is illustrated in Figure 3, where two linear regression lines have been fitted to capture the ‘realistic’-‘esoteric’ relation for these two class groupings. What this finding suggests is that, all other things being equal, the higher the proportion of ‘realistic’ items in a test, the greater will be the difference in outcome between service and working class children.
It is important to
stress that these class differences are not ones of kind. There is much
overlap in the three distributions of these ratios by social class. The
differences in Table 4 are differences ‘on average’ not of kind. The charts
in Cooper et al (1997) demonstrate this clearly. However, it is also worth
noting that, given the many other dimensions on which these test items
differ within the categories ‘realistic’ and ‘esoteric’, it is also
possible that these results underestimate the importance of the effect
of ‘realistic’ versus ‘esoteric’ contextualisation. It is perhaps surprising
that the effect appears at all amidst all this ‘noise’ #(24).
Social class may, of
course, only appear to be a causal factor here. It might be the
case, for example, that ‘ability’, some concomitant of school attended
such as curriculum coverage, and/or systematic differences in the easiness
of the ‘realistic’ versus ‘esoteric’ items are the real underlying causes
of the results in Table 4. We have tried to approach these problems from
two directions. First, we have used logistic regression to examine the
associations between school, ‘ability’, sex, class and the ratio. Secondly,
concerning curriculum topic/area we have looked at how the ratio varies
within Attainment Targets. The regression analysis (Cooper, Dunne &
Rodgers, 1997) with our ‘realistic’/‘esoteric’ ratio as dependent variable
and social class, sex, school and non-verbal ‘ability’ as independent variables,
suggests that class and sex are statistically significant here and that
school and non-verbal ‘ability’ are not #(26). Details of the analysis
by Attainment Target are set out in the following section of the paper.
Class  Female Mean  Female Count  Male Mean  Male Count  Total Mean  Total Count
Service class  .79  26  .83  34  .81  60
Intermediate class  .82  13  .81  17  .81  30
Working class  .78  13  .79  20  .78  33
Total  .79  52  .81  71  .80  123

Class  Female Mean  Female Count  Male Mean  Male Count  Total Mean  Total Count
Service class  .69  26  .88  34  .80  60
Intermediate class  .66  13  .71  17  .69  30
Working class  .57  13  .56  20  .56  33
Total  .66  52  .75  71  .71  123

Class  Female Mean  Female Count  Male Mean  Male Count  Total Mean  Total Count
Service class  1.17  26  1.17  34  1.17  60
Intermediate class  1.04  13  1.13  17  1.09  30
Working class  1.04  13  1.19  20  1.13  33
Total  1.10  52  1.17  71  1.14  123

The patterns are less clear than they were in Table 4 but are nevertheless there. In each case an overall service/working class comparison of the r/e ratio favours the service class against the working class. In parallel with this, an overall male/female comparison of the r/e ratio consistently favours the boys. These differences are particularly marked in the case of algebra. It is also interesting to note that, in the case of ‘shape and space’, the children found the ‘realistic’ items generally easier than the ‘esoteric’ ones. Nevertheless, the r/e ratio remains highest in the case of the service class taken as a whole, and boys have a higher ratio than girls. We are not able to present a table for the case of data handling since all of the items under this heading have been coded as ‘realistic’. However, some idea can be gained of the ‘behaviour’ of the latter items in relation to class by examining their position in Table 8. Here we show how children from each class group performed on each of the seven attainment target/context coding combinations. Table 9 shows comparable calculations for boys and girls. Comparing the service class with the working class, and boys with girls, there appear to be similar class and gender effects across attainment targets, suggesting that the differences in the r/e ratio in Table 4 are not ‘spurious’ topic effects.
Attainment target and item context  Service class  Intermediate class  Working class  Total  Ratio (service/working)  Number of separately coded items & subitems
Number, ‘esoteric’  78.81  77.94  75.61  77.74  1.04  21
Number, ‘realistic’  64.05  63.57  59.96  62.83  1.07  22
Algebra, ‘esoteric’  68.46  66.92  61.54  66.23  1.11  10
Algebra, ‘realistic’  50.60  44.05  30.74  43.67  1.65  11
Shape & space, ‘esoteric’  66.79  66.19  57.58  64.17  1.16  11
Shape & space, ‘realistic’  72.50  66.33  60.00  67.64  1.21  9
Handling data, ‘esoteric’  n/a  n/a  n/a  n/a  n/a  0
Handling data, ‘realistic’  62.86  55.42  49.34  57.42  1.27  26
n (children)  60  30  33  123
Total items  110
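The ratio column of Table 8 is consistent with dividing the service class mean by the working class mean for each row. The sketch below checks this reading against the published figures; the dictionary layout and the name `class_ratio` are hypothetical, and the interpretation of the column is an inference from the arithmetic rather than something stated in the text.

```python
# Mean percentage scores copied from Table 8: (service class, working class).
table8 = {
    "Number, esoteric": (78.81, 75.61),
    "Number, realistic": (64.05, 59.96),
    "Algebra, esoteric": (68.46, 61.54),
    "Algebra, realistic": (50.60, 30.74),
    "Shape & space, esoteric": (66.79, 57.58),
    "Shape & space, realistic": (72.50, 60.00),
    "Handling data, realistic": (62.86, 49.34),
}


def class_ratio(service_mean: float, working_mean: float) -> float:
    """Service class mean divided by working class mean (assumed column definition)."""
    return service_mean / working_mean


ratios = {label: round(class_ratio(s, w), 2) for label, (s, w) in table8.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}")
```

Within each of number, algebra and shape & space, the ratio is larger for the ‘realistic’ row than for the ‘esoteric’ row, which is the pattern the surrounding discussion relies on.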

Attainment target and item context  Girls  Boys  Total  Ratio (boys/girls)  Number of separately coded items & subitems
Number, ‘esoteric’  76.63  78.27  77.56  1.02  21
Number, ‘realistic’  60.71  63.88  62.51  1.05  22
Algebra, ‘esoteric’  68.23  64.36  66.03  0.94  10
Algebra, ‘realistic’  42.06  44.37  43.37  1.05  11
Shape & space, ‘esoteric’  63.49  64.08  63.89  1.01  11
Shape & space, ‘realistic’  65.19  68.87  67.28  1.06  9
Handling data, ‘esoteric’  n/a  n/a  n/a  n/a  0
Handling data, ‘realistic’  55.67  58.19  57.10  1.05  26
n (children)  54  71  125
Total items  110

Another possibility which needs to be addressed is that it is because the ‘esoteric’ items are, in general, found easier in this data set, coupled with class related differences in typical educational achievement, that the r/e ratio patterns by class are as they are. Perhaps working class children just perform less well on harder items? In fact, however, statistical analyses employing items rather than the child as the case have shown that broad social class differences in a relative of this ratio remain (though are reduced in importance #(28) ) when examined within four categories of items ordered by average difficulty levels #(29) . The means in Table 10 derive from a variable constructed by dividing, for each item, the service class mean score by the working class mean score #(30) . It can be seen that, within each category of items, from the most easy to the most difficult, the service class children perform relatively better than working class children on ‘realistic’ items as compared to ‘esoteric’ items #(31) .
Item difficulty levels  Realistic Items Mean  Realistic Items Count  Esoteric Items Mean  Esoteric Items Count  Total Items Mean  Total Items Count
1. Most difficult quartile  1.62  21  1.37  6  1.56  27 
2. Second quartile  1.42  18  1.20  9  1.35  27 
3. Third quartile  1.21  14  1.10  12  1.16  26 
4. Most easy quartile  1.06  15  1.03  15  1.04  30 
Totals  1.35  68  1.14  42  1.27  110 
These effects may appear
small. However, in the world of educational practice, where decisions are
often taken on the basis of thresholds being achieved or not by children,
differences of this size can have large effects. To illustrate this, we
have developed a simulation of what would happen to children from different
social class backgrounds if a selection process were to occur on the basis
of three differently composed tests: one comprising items which behave
like our ‘esoteric’ items, one of items which behave like our ‘realistic’
items, and one comprising an equal mixture of the two #(32). This process might be
realised as a selection exam for secondary school or for set placement
within the first year of secondary school. A summary of the results is
shown in Table 11 and Figure 4. It can be seen that, using our results
as the basis for predicting outcomes, the proportion of working class children
in this sample who would be selected by an ‘esoteric’ test is double that
which would be selected by a ‘realistic’ test. The two tests lead to quite
different outcomes, mainly for intermediate and working class children
#(33).
Percentage selected  Esoteric Test (26% selected in total)  Mixed Test (½ & ½) (26.8% selected in total)  Realistic Test (27.6% selected in total)
Service Class  30.0  33.3  33.3
Intermediate Class  20.0  23.3  33.3
Working Class  24.2  18.2  12.1
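The percentages in Table 11 can be converted back into whole numbers of children, which makes the size of the effect easier to see. The sketch below is a re-calculation from the published figures only, not the simulation itself; the class counts (60, 30 and 33) are those reported in the earlier tables, and the names `selected_counts` and `overall_pct` are hypothetical.

```python
# Number of KS2 children with class data in each group (from the earlier tables).
n = {"service": 60, "intermediate": 30, "working": 33}

# Percentage of each class selected under each test, copied from Table 11.
pct_selected = {
    "esoteric": {"service": 30.0, "intermediate": 20.0, "working": 24.2},
    "mixed": {"service": 33.3, "intermediate": 23.3, "working": 18.2},
    "realistic": {"service": 33.3, "intermediate": 33.3, "working": 12.1},
}


def selected_counts(test: str) -> dict:
    """Convert the class percentages for one test back to whole numbers of children."""
    return {c: round(pct_selected[test][c] * n[c] / 100) for c in n}


def overall_pct(test: str) -> float:
    """Percentage of all children selected under one test, to one decimal place."""
    counts = selected_counts(test)
    return round(100 * sum(counts.values()) / sum(n.values()), 1)


for test in pct_selected:
    print(test, selected_counts(test), overall_pct(test))
```

On this reading, eight working class children would be selected by the ‘esoteric’ test and four by the ‘realistic’ test, while the overall proportion selected changes only from 26.0% to 27.6%.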
A similar simulation for sex does not show such large effects, reflecting the smaller differences in the realistic/esoteric ratio in Table 4. While in the case of class, a move from ‘realistic’ through mixed to ‘esoteric’ composition linearly increases the proportion of working class children selected, any pattern for sex is less clear (see Table 12 and Figure 5).
Percentage selected  Esoteric Test (26% selected in total)  Mixed Test (½ & ½) (26.8% selected in total)  Realistic Test (27.6% selected in total)
Girls  22.2  18.5  22.2
Boys  28.2  32.4  31.0
Given the small cell
sizes which would result, we will not present a simulation for the six
sex/class groups.
We have collapsed 1 & 2
into a service class, 3-8 into an intermediate class, and
9-11 into a working class.
References
Ball, S.J. (1994) Education Reform, Open University, Buckingham.
Bernstein, B. (1990) The Structuring of Pedagogic Discourse, London, Routledge.
Bernstein, B. (1996) Pedagogy, Symbolic Control and Identity: Theory, Research, Critique, Taylor & Francis, London.
Bhaskar, R. (1979) The Possibility of Naturalism, Sussex, Harvester.
Boaler, J. (1993a) "The role of contexts in the mathematics classroom: do they make mathematics more 'real'?", For the Learning of Mathematics, 13, 2, 12-17.
Boaler, J. (1993b) "Encouraging the transfer of 'school' mathematics to the 'real world' through the integration of process and content, context and culture", Educational Studies in Mathematics, 25, 341-373.
Boaler, J. (1994) ‘When do girls prefer football to fashion? An analysis of female underachievement in relation to "realistic" mathematics contexts’, British Educational Research Journal, 20, 5, 551-564.
Bourdieu, P. (1984) Homo Academicus, Paris, Éditions de Minuit.
Bourdieu, P. (1986) Distinction: A social critique of the judgement of taste, RKP, London
Bourdieu, P. (1990a) "From rules to strategies", in his In Other Words, Cambridge, Polity.
Bourdieu, P. (1990b) In Other Words, Cambridge, Polity Press.
Bourdieu, P. (1990c) The Logic of Practice, Oxford, Blackwell.
Bourdieu, P. (1994) Raisons Pratiques: sur la théorie de l’action, Paris, Seuil.
Brown, M. (1992) "Elaborate nonsense? The muddled tale of Standard Assessment Tasks at Key Stage 3", in Gipps, C. (Ed) Developing Assessment for the National Curriculum, Kogan Page & London University Institute of Education, pp. 6-19.
Brown, M. (1993) Clashing Epistemologies: the Battle for Control of the National Curriculum and its Assessment, Professorial Inaugural Lecture, King's College, London.
Camilli, G. & Shepard, L. (1994) Methods for Identifying Biased Test Items, London, Sage.
Cockcroft, W.H. (1982) Mathematics Counts, London, HMSO.
Cooper, B. (1983) "On explaining change in school subjects", British Journal of Sociology of Education, 4(3), pp. 207-22.
Cooper, B. (1985a) Renegotiating Secondary School Mathematics: a Study of Curriculum Change and Stability, Basingstoke, Falmer Press.
Cooper, B. (1985b) "Secondary school mathematics since 1950: reconstructing differentiation", in Goodson, I.F. (Ed) Social Histories of the Secondary Curriculum, Barcombe, Falmer, pp. 89-119.
Cooper, B. (1992) "Testing National Curriculum Mathematics: Some critical comments on the treatment of 'real' contexts for mathematics", in The Curriculum Journal, pp. 231-243.
Cooper, B. (1994a) "Secondary mathematics education in England: recent changes and their historical context", in Selinger, M. (Ed) Teaching Mathematics, London, Routledge, pp. 5-26.
Cooper, B. (1994b) "Authentic testing in mathematics? The boundary between everyday and mathematical knowledge in National Curriculum testing in English schools", in Assessment in Education: Principles, Policy and Practice, 1, 2, pp. 143-166.
Cooper, B. (1996) "Using Data From Clinical Interviews To Explore Students’ Understanding of Mathematics Test Items: Relating Bernstein and Bourdieu on Culture to Questions of Fairness in Testing", Paper presented to the Symposium: Investigating Relationships Between Student Learning and Assessment in Primary Schools, American Educational Research Association Conference, New York, April 1996.
Cooper, B. (1998a) "Assessing National Curriculum Mathematics in England: Exploring children’s interpretation of Key Stage 2 tests in clinical interviews", Educational Studies in Mathematics, 35, 1, 19-49.
Cooper, B. (1998b, forthcoming) "Using Bernstein and Bourdieu to understand children’s difficulties with ‘realistic’ mathematics testing: an exploratory study", in International Journal of Qualitative Studies in Education, 11, 4.
Cooper, B. & Dunne, M. (1998) "Anyone for tennis? Social class differences in children’s responses to national curriculum mathematics testing", The Sociological Review, 46, 1.
Cooper, B., Dunne, M., & Rodgers, N. (1997) "Social class, gender, item type and performance in national tests of primary school mathematics: some research evidence from England", paper presented at the Annual Meeting of the American Educational Research Association, Chicago, March 1997.
Darling-Hammond, L. (1994) "Performance-based assessment and educational equity", Harvard Educational Review, 64, 1, pp. 5-30.
Dearing, R. (1993) The National Curriculum and its Assessment: Final Report, London, SCAA.
Department of Education and Science/Welsh Office (1988) National Curriculum: Task Group on Assessment and Testing: A Report, DES/WO.
Dowling, P. (1991) "A touch of class: ability, social class and intertext in SMP 11-16", in Pimm, D. & Love, E. (Eds) Teaching and Learning School Mathematics, London, Hodder & Stoughton.
Dowling, P. (1998) The Sociology of Mathematics Education: Mathematical Myths/Pedagogic Texts, Falmer Press.
Erikson, R. & Goldthorpe, J.H. (1993) The Constant Flux: A Study of Class Mobility in Industrial Societies, Oxford, Clarendon.
Gilbert, N. (1993) Analysing Tabular Data: Loglinear And Logistic Models For Social Researchers, University College London Press.
Gipps, C. & Murphy, P. (1994) A Fair Test? Assessment, Achievement and Equity, Open University Press.
Goldthorpe, J. & Heath, A. (1992) "Revised class schema 1992", Working Paper 13, Nuffield College Oxford.
Hoel, P.G. (1971) Introduction to Mathematical Statistics, 4th Edition, New York, Wiley.
Holland, J. (1981) "Social class and changes in orientation to meaning", in Sociology, 15, 1, 1-18.
Lighthill, J. (1978) (Ed) Newer Uses of Mathematics, Penguin Books.
Mehan, H. (1973) "Assessing children’s school performance", in Dreitzel, H.P. (Ed) Childhood and Socialisation, Canada, Collier-Macmillan.
Mercer, N. & Edwards, D. (1987) Common Knowledge: The Development of Understanding in the Classroom, London, Methuen.
Messick, S. (1989) "Validity", in Linn, R. (Ed) Educational Measurement, 3rd Edition, London, Collier Macmillan.
Messick, S. (1994) "The interplay of evidence and consequences in the validation of performance assessments", Educational Researcher, 23, 2, 13-23.
Morais, A., Fontinhas, F. & Neves, I. (1992) "Recognition and realisation rules in acquiring school science: the contribution of pedagogy and social background of students", British Journal of Sociology of Education, 13, 2, 247-270.
Murphy (1996) "Assessment practices and gender in science", in Parker, L.H. et al (Eds) Gender, Science and Mathematics, Kluwer.
Pandey, T. (1990) "Power items and the alignment of curriculum and assessment", in Kulm, G. (Ed) Assessing Higher Order Thinking in Mathematics, Washington, AAAS.
SCAA  Schools Curriculum and Assessment Authority (1995a) Mathematics Tests Key Stage 2 1995, London, Dept. for Education.
SCAA  Schools Curriculum and Assessment Authority (1995b) Mathematics Tests Key Stage 3 1995, London, Dept. for Education.
SCAA  Schools Curriculum and Assessment Authority (1996) Key Stage 2 Tests 1996, London, Dept. for Education and Employment.
SEAC  Schools Examinations and Assessment Council (1992) Mathematics Tests, 1992, Key Stage 3, SEAC/University of London.
SEAC  Schools Examinations and Assessment Council (1993a) 1993 Key Stage 3 Mathematics Tests, DES/WO.
SEAC  Schools Examinations and Assessment Council (1993b) Pilot Standard Tests: Key Stage 2: Mathematics, SEAC/University Of Leeds.
Taylor, C. (1993) "To follow a rule …" in Calhoun, C., Lipuma, E. & Postone, M. (Eds) Bourdieu: Critical Perspectives, Cambridge, Polity
Verschaffel, L., De Corte, E. & Lasure, S. (1994) "Realistic considerations in mathematical modelling of school arithmetic word problems", Learning and Instruction, 4, 273-294.
Wood, R. & Power, C. (1987) "Aspects of the competence-performance distinction: educational, psychological and measurement issues", in Journal of Curriculum Studies, 19, 5, 409-424.
Wood, R. (1991) Assessment and Testing, Cambridge, Cambridge University Press.