

Inbreeding and risk of late onset complex disease
  1. I Rudan1,
  2. D Rudan3,
  3. H Campbell2,
  4. A Carothers4,
  5. A Wright4,
  6. N Smolej-Narancic3,
  7. B Janicijevic3,
  8. L Jin5,
  9. R Chakraborty5,
  10. R Deka5,
  11. P Rudan3
  1. 1School of Public Health “Andrija Stampar”, University Medical School, Zagreb, Croatia
  2. 2Public Health Sciences, University of Edinburgh, Edinburgh, UK
  3. 3Institute for Anthropological Research, Zagreb, Croatia
  4. 4Medical Research Council, Human Genetics Unit, Western General Hospital, Edinburgh, UK
  5. 5Department of Environmental Health, University of Cincinnati, Cincinnati, Ohio, USA
Correspondence to: Dr Harry Campbell, Public Health Sciences, University of Edinburgh, Edinburgh EH8 9AG, UK


Inbreeding has been shown in almost all species to be associated with impairment of function because of homozygosity of recessive alleles. This occurs across a wide range of traits and suggests a large number of deleterious alleles in the human genome. This has been predicted from the reduced early survival of offspring in first cousin marriages and from similar results in other organisms.1–3 As most identified genetic variants causing complex disease in humans are partially recessive,1 we predict that inbreeding in humans might influence a wide range of complex diseases.

Numerous reports on the health effects of inbreeding have focused mainly on its impact on reproduction, childhood mortality, and rare Mendelian disorders.2,3 For example, a 4–5% increase in childhood mortality has been found in the offspring of first cousin marriages, and similar results have been reported in other species.2,4,5 However, the effects of inbreeding on late onset disorders are largely unknown, despite the fact that deleterious effects of inbreeding in other species are known to increase with age, as predicted by selection theory.6,7 The reported finding of greater inbreeding effects for traits such as blood pressure and serum cholesterol in middle age compared with early adult life is consistent with this.8

In order to investigate the hypothesis that the heritable component of late onset diseases includes a major class of deleterious recessive alleles,9 we recently studied the effects of inbreeding on blood pressure among 2760 adult individuals from 25 villages in a Dalmatian island isolate. The study showed a large effect of inbreeding on blood pressure equivalent to a rise in systolic blood pressure of ∼20 mm Hg and diastolic blood pressure of ∼12 mm Hg in offspring of first cousin marriages. The effect appeared to be mediated by several hundred recessive alleles as a result of increased homozygosity.10 In support of this finding, several studies of small inbred communities worldwide have reported an increased prevalence of hypertension.8,11–15

In the present study, we extend this observation by investigating the relation between inbreeding and the prevalence of 10 late onset complex diseases of public health importance. The study was carried out in 14 of the original 25 isolate villages on three neighbouring islands in middle Dalmatia, Croatia. These island populations present a wide range of levels of inbreeding and endogamy, reduced genetic variation at both individual and (sub)population levels, and relative uniformity of environment,10,16–18 and thus provide a good setting for investigating inbreeding effects.


Study cohort

The village populations of three neighbouring islands in the eastern Adriatic, part of Middle Dalmatia, Croatia (Brac, Hvar, and Korcula, see fig 1), represent well characterised genetic isolates. The tendency towards inbreeding in each village has been influenced by geographic isolation, political (“Pastrovic”) privileges given to residents of certain communities, and sociocultural factors.16–18 A survey of 1339 adult individuals, selected randomly from voting lists to form approximately 20–30% of the total population of these 14 villages, was undertaken in the late 1970s and early 1980s by the Institute for Anthropological Research in Zagreb in collaboration with the Smithsonian Institution, Washington, USA. The mean adult ages in individual villages varied from 41 to 65 years (detailed age/sex profiles for each village are given in appendix 1). For each individual, information was collected on the highest level of education, occupation, diet, smoking habits, and body mass index (table 1).

Key points

  • From arguments derived from population genetics, we propose that the genetic basis of common late onset diseases includes a major class of deleterious recessive alleles. Inbreeding is therefore predicted to increase the incidence of such diseases.

  • Among 10 late onset conditions studied in a genetic isolate population, inbreeding was found to be a significant (positive) predictor for coronary heart disease, stroke, cancer, uni/bipolar depression, asthma, gout, and peptic ulcer, but not for type 2 diabetes.

  • It appears that inbreeding causes an increase in homozygosity at many genetic loci with small deleterious effects on homeostatic pathways, resulting in increased disease risk.

  • The results indicate that between 23% and 48% of the incidence of these disorders in this population sample (other than type 2 diabetes) could be attributed to recent inbreeding. The global impact of inbreeding could thus be substantial, as an estimated one billion people globally show rates of consanguineous marriages greater than 20%.

Computation of individual inbreeding coefficients

A single researcher (IR) computed individual inbreeding coefficients for each study participant, based on pedigree information on four to five ancestral generations, recorded during the initial field work and supplemented by a study of parish registries. The individual inbreeding coefficients (F) were then computed according to Wright’s path method19:

$$F = \sum_{i=1}^{c} \left(\frac{1}{2}\right)^{m_i + n_i + 1}$$

where m_i and n_i refer to the number of generational steps in the paths from the i-th common ancestor to each of the individual's parents, and c refers to the number of common ancestors. The genealogical inbreeding coefficient for each village was then computed as the average of all individual F values. To further support these estimates, F was calculated from isonymy as suggested by Tay and Yip,20 and mean values were derived for each of the 14 villages. Estimates based on isonymy are generally thought to be positively biased, and so to provide an upper bound for F (table 1).
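As a minimal illustration of the path-counting calculation, the Python sketch below sums (1/2)^(m + n + 1) over the common ancestors of an individual's parents. It assumes that m and n count the generations separating each parent from a common ancestor and that the common ancestors are not themselves inbred; the function name and the two example pedigrees are hypothetical and are not taken from the study data.

```python
from fractions import Fraction

def inbreeding_coefficient(paths):
    """Sum (1/2)**(m + n + 1) over (m, n) pairs, one pair per common ancestor."""
    return sum(Fraction(1, 2) ** (m + n + 1) for m, n in paths)

# Offspring of first cousins: two shared grandparents, each two
# generations above each parent.
first_cousins = inbreeding_coefficient([(2, 2), (2, 2)])

# Offspring of second cousins: two shared great-grandparents, each three
# generations above each parent.
second_cousins = inbreeding_coefficient([(3, 3), (3, 3)])

print(float(first_cousins))   # 0.0625
print(float(second_cousins))  # 0.015625
```

Exact fractions are used so that small coefficients are not blurred by rounding before they are averaged across a village.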

Follow up data on disease status of cohort individuals

Population census data in 1981, 1991, and 2001 from study villages show significant depopulation with minimal immigration over the last two decades. Thus only 480 individuals who were still resident in the study villages were available for follow up in the year 2000. Specific diagnostic criteria were established for each of 10 commonly occurring disorders in this population (coronary heart disease, stroke, cancer, schizophrenia, epilepsy, uni/bipolar depression, asthma, adult type diabetes, gout, and peptic ulcer) following those presented in Merck’s Manual. In collaboration with local general practitioners, two medical doctors, who were blind to the inbreeding status of each individual, inspected the medical records between March and October 2000 and recorded whenever appropriate diagnostic criteria were met. Diagnoses were supported wherever possible by medical records from the local general hospital in Split.

Statistical analysis and modelling

Disease prevalence was first investigated by comparisons between villages grouped by the level of inbreeding as high, moderate, or low (table 2). Disease prevalence rates were standardised by sex and age to the total population of all 14 villages included in the study, using 10 year age intervals and direct standardisation.
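Direct standardisation of this kind can be sketched as a weighted average of stratum-specific prevalences, with weights taken from the standard population. The sketch below is illustrative only: the strata, counts, and prevalences are hypothetical placeholders, not values from the study.

```python
def directly_standardised_rate(stratum_rates, standard_counts):
    """Weight each stratum-specific rate by the standard population's share of that stratum."""
    total = sum(standard_counts.values())
    return sum(stratum_rates[s] * standard_counts[s] / total for s in standard_counts)

# Hypothetical strata keyed by (sex, 10 year age band); all numbers are made up.
standard = {
    ("F", "40-49"): 100,
    ("F", "50-59"): 300,
    ("M", "40-49"): 200,
    ("M", "50-59"): 400,
}
village_group_rates = {
    ("F", "40-49"): 0.02,
    ("F", "50-59"): 0.05,
    ("M", "40-49"): 0.04,
    ("M", "50-59"): 0.10,
}

rate = directly_standardised_rate(village_group_rates, standard)
print(round(rate, 4))
```

Because every village group is weighted by the same standard population, differences between the standardised rates cannot be caused by differing age or sex structure.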

In an attempt to overcome the possible confounding effects of unmeasured environmental exposures or population stratification, the relation between individual inbreeding coefficients and disease outcomes was investigated among the 480 individuals. Data on age, sex, education, occupation, diet, smoking status, village of residence, height, weight, and individual inbreeding coefficient (F) were analysed in a logistic regression with disease status as the outcome. Age and sex were forced into the prediction model irrespective of whether they were formally significant. Other main effects (inbreeding, smoking, height, weight and village) were entered with p = 0.05, and all higher order effects and interactions with p = 0.01.

Population attributable risk

Population attributable risk (PAR) estimates for inbreeding were calculated by logistic regression, allowing for individual differences in the variables village, sex, age, height, weight, and smoking. The appropriate regression was determined as a function of all associated variables (including F), then the probability for each individual of having the disease outcome was calculated assuming an F value set at 0. The sum of all such probabilities, Psum, is an estimate of the number affected in the absence of inbreeding, but with other variables remaining unaltered. Then PAR = 1−Psum/Naff, where Naff is the actual number of affected individuals. In deriving the PAR values, the effects estimated from the subset of 14 villages were applied to the full dataset from all villages.
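The PAR calculation described above can be sketched as follows, assuming a logistic model has already been fitted. The coefficient values, the covariate set (only F and age here, whereas the study also allowed for village, sex, height, weight, and smoking), and the function names are hypothetical and serve only to show the arithmetic.

```python
import math

def predicted_probability(intercept, coefs, covariates):
    """Logistic model: P = 1 / (1 + exp(-(intercept + sum of beta_k * x_k)))."""
    linear = intercept + sum(coefs[name] * value for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-linear))

def population_attributable_risk(intercept, coefs, individuals, n_affected):
    """PAR = 1 - Psum / Naff, where Psum sums predicted risks with F set to 0."""
    p_sum = sum(
        predicted_probability(intercept, coefs, {**person, "F": 0.0})
        for person in individuals
    )
    return 1.0 - p_sum / n_affected

# Hypothetical fitted coefficients and a tiny made-up cohort.
example_coefs = {"F": 20.0, "age": 0.03}
cohort = [
    {"F": 0.0625, "age": 70.0},
    {"F": 0.0, "age": 55.0},
    {"F": 0.0156, "age": 62.0},
]
# n_affected is the observed number of cases; 1 is an invented value.
print(population_attributable_risk(-3.0, example_coefs, cohort, n_affected=1))
```

Setting F to 0 while leaving every other covariate unchanged is what isolates the share of cases attributable to inbreeding.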

The original survey was carried out with the informed consent of the participants and ethical approval for the recent field work and analyses was granted by the appropriate research ethics committees in Scotland and Croatia.


Table 1 presents selected characteristics of the study villages: number and name of village, island, average inbreeding coefficients computed from genealogical data and from isonymy, education level, occupation, diet, smoking, and body mass index. The table presents village data in three groups defined by the average level of inbreeding. The allocation of villages to these three groups according to the estimates of F based on genealogy (Fgen) is broadly consistent with those based on isonymy (Fiso). On a log–log scale, the correlation between the two measures of inbreeding across villages was 0.92, with Fiso exceeding Fgen on average by a factor of 1.32.

Table 2 presents age and sex standardised disease prevalence data for each of the 10 diseases under investigation. A stepwise increase in disease prevalence across villages stratified by (increasing) average inbreeding coefficient was found for gout, depression, peptic ulcer, schizophrenia, cancer, epilepsy, coronary heart disease, and asthma (the last two not statistically significant).

Table 3 includes data from 480 individuals from 14 villages, with age and sex forced into the multiple logistic regression model. Other main effects (inbreeding, smoking, log_weight, log_height, and village) were entered with p = 0.05, and all higher order effects and interactions with p = 0.01. Schizophrenia and epilepsy were excluded from this analysis because of the small number of cases (four each) and thus low study power to investigate predictors for these conditions. Inbreeding remained a significant (positive) predictor for every condition except type 2 diabetes. The forced inclusion of age and sex in the model acted to reduce slightly the significance of the effect of inbreeding, because of a small positive correlation between inbreeding and age. Village of residence was found to be a significant predictor only for coronary heart disease. Weight was a significant positive predictor for type 2 diabetes and gout and a significant negative predictor for cancer.

In terms of health impact, the results on population attributable risk (table 3) show that 23–48% of the incidence of these disorders (other than type 2 diabetes) in this population can be attributed to recent inbreeding.


The impact of inbreeding on reproduction, childhood mortality, and Mendelian disorders is well documented.2,3 In contrast, very little has been published on the effects of inbreeding on late onset diseases. This is despite the fact that inbreeding may have a greater influence on late onset traits than on traits that are subject to early selection.6,7 This study shows an important effect of inbreeding on several genetically complex late onset diseases which are of major public health importance. This is consistent with our proposal that an important genetic influence on these disorders is mediated by numerous deleterious recessive alleles, suggesting that inbreeding increases disease risk as a result of increased homozygosity.9

Validity of findings

It is important to consider whether these results can be explained by chance, bias, or confounding.


The fact that this was our major a priori hypothesis, taken together with the levels of statistical significance reported, argues strongly against chance as an explanation for these findings.


With respect to selection bias, the 480 individuals on whom we were able to obtain disease outcome data were a subset of the original cohort from 1979–81. However, census data revealed that the major reason for loss to follow up, other than deaths of cohort members, was emigration from the villages over the 20 year period, which should not result in substantial bias.

With respect to measurement bias, we do not believe that disease outcomes were ascertained or recorded differently among individuals who differed by inbreeding status. Standard and explicit clinical criteria were adopted by the two study doctors, who were blind to the inbreeding status of individuals. Furthermore, the results cannot be explained by different diagnostic practices in different villages, as the village term was not found to be statistically significant in the multiple logistic regression analysis (except for coronary heart disease).


Various potential confounding factors (age, sex, smoking status, education level, general diet, occupational group, height, weight) were measured and their effects adjusted for in the multivariate analysis. Although a degree of imprecision is inevitable in measuring some of these factors, we do not believe that confounding could have accounted for the large and consistent effects demonstrated.

Factors supporting the validity of the data

Several factors support the validity of the data. First, the findings support our prior hypothesis and are consistent with similar findings on hypertension in the same population10 and with other reports of inbreeding effects on blood pressure.8,11–15 Second, the overall strength of the effect argues against bias or confounding. Third, we have presented detailed arguments that biologically plausible mechanisms underpinning this effect are consistent with population genetic theory and observations in a wide range of organisms.9 Finally, the data are consistent with the few other published reports of inbreeding effects on late onset diseases in which inbreeding was measured rather than self reported (table 4).

Size of inbreeding effect in late onset diseases

The magnitude of the inbreeding effect on disease prevalence was large. However, the effect on prevalence of stroke and coronary heart disease, for example, is consistent with our previous report of a rise in diastolic blood pressure by 2 mm Hg for an increase in F of 0.01,10 and both cohort studies and randomised trials show that an increase of 5 mm Hg diastolic blood pressure is associated with a 33% increase in stroke risk and a 20% increase in risk of ischaemic heart disease.21 The effect size is supported by the only two previously published estimates of inbreeding on blood pressure that we could identify in other isolate populations.8,11
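As a rough consistency check, and assuming purely for illustration that the blood pressure effect scales linearly with F, the expected diastolic rise in the offspring of first cousins (F = 1/16 = 0.0625) would be:

```latex
\Delta\mathrm{DBP} \approx 2\ \text{mm Hg} \times \frac{0.0625}{0.01} = 12.5\ \text{mm Hg}
```

This agrees with the rise of ∼12 mm Hg in diastolic blood pressure quoted earlier for offspring of first cousin marriages.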

The large effect may reflect the greater influence of inbreeding on late onset traits than on traits that are subject to early selection.6,7 It is also possible that low environmental variation, or underestimation of F because of individuals being related through multiple lines of descent, contribute to the size of inbreeding effect in these isolate populations.8,11,15,22 Thus the magnitude of the inbreeding effect relative to the overall variation may be smaller in more environmentally diverse populations.

The ecological analysis at the village level (table 2) suggests an inbreeding effect on the prevalence of epilepsy and schizophrenia, but there were insufficient outcome events to permit an analysis at the individual level. The effect of inbreeding was shown in seven of the other eight diseases studied. The lack of observed effect of inbreeding on type 2 diabetes may reflect the lower heritability and stronger environmental influences involved in the aetiology of this condition23 or heritable components that are mainly additive or dominant rather than recessive.

Mechanism of inbreeding effects in late onset diseases

We have argued that the genetic component of late onset diseases may be due principally to large numbers of rare variants in numerous genes—the common disease/rare variant (CD/RV) hypothesis.9 The possibility that a significant fraction of the genetic variation in complex traits is caused by rare alleles maintained by mutation–selection balance is consistent with extensive research into the genetics of quantitative traits in simpler organisms.7 Recent estimates24 suggest that each person carries, on average, 500–1200 slightly deleterious mutations, most of which are rare and present in heterozygous form. Many of these variants will become homozygous in inbred individuals, who are expected to show correspondingly large effects. We have previously reported an estimate of several hundred recessive genes underlying human hypertension,10 consistent with a complex and genetically highly variable system of blood pressure control and with published work from spontaneous and engineered animal models of hypertension.25,26 The late onset disorders under investigation in this study are likely to be similarly complex at a physiological level so that significant effects of inbreeding are again expected.

The observed effect of inbreeding on the prevalence of several different late onset diseases is consistent with the presence of many deleterious recessive alleles located throughout the genome. It is also consistent with a more general effect of inbreeding, with increased homozygosity at these loci leading to an accumulation of small deleterious effects on homeostatic pathways, which cumulatively increase disease risk. This suggests a greater sensitivity of homeostatic mechanisms to inbreeding in later life, as predicted by findings in animals.5,6 Decay of homeostatic capacity would also be expected to lead to reduced capacity to respond appropriately to diverse stimuli. This is supported by the recently reported observation that the reduced survival found in inbred animals is greater in the natural habitat than in a controlled laboratory environment.27

The inbreeding data do not allow an easy distinction to be made between the relative contributions of common versus rare variants but do inform two somewhat neglected aspects of the genetic architecture underlying complex diseases.9 First, the results provide indirect evidence in support of a major polygenic component to disease susceptibility. The inbreeding coefficient is shown to be a significant predictor of coronary heart disease, stroke, cancer, depression, asthma, gout, and peptic ulcer, with population attributable risks varying between 23% and 48% (table 3). Second, the recessive or partially recessive nature of complex disease susceptibility has received little emphasis. Both factors have implications for the identification of individual disease susceptibility alleles. If disease susceptibility is indeed highly polygenic then it implies the need to reduce the phenotypic complexity of “disease” by means of genetically simpler but contributory quantitative traits (QT) or disease subgroups. Those with extreme values of QT distributions or early disease age of onset will be those most likely to harbour susceptibility alleles of large effect and hence to provide a realistic possibility of gene identification. A significant component of genetic susceptibility appears to result from variants that are recessive or partially recessive. This implies that the study of inbred populations would be advantageous, as the increased gene dosage of such variants in inbred individuals will tend to amplify their phenotypic effects compared with outbred populations, where most alleles are present in heterozygotes.

Public health implications

The population attributable risk estimates from this study suggest that 23–48% of the incidence of the disorders showing an inbreeding effect in this population can be attributed to inbreeding. We have previously reported that 36% of hypertension incidence in this population can be attributed to inbreeding.10 These estimates are specific to this population and may be absent or considerably smaller in other populations. Nevertheless, inbreeding is highly prevalent globally and inbreeding effects may explain some of the observed differences in disease prevalence among different populations. Consanguineous marriages, defined as a union between individuals related as second cousins or closer (equivalent to F ⩾0.0156 in their progeny), have been conservatively estimated to occur at 1–10% prevalence among 2811 million people globally and at 20–50% prevalence among 911 million.28,29 In addition, the extent of inbreeding even in outbred populations may have been underestimated.30 Further details, including updated tables of global consanguinity studies, are available online. The global impact of inbreeding on late onset disorders of public health importance could therefore be significant in health economic terms. This effect may be mediated by the observed inbreeding effect on important physiological traits such as blood pressure10 and cholesterol,8 recently shown to be two of the most important determinants of the global burden of disease.31 As inbreeding declines owing to increased population movement and admixture, the prevalence of late onset disorders may also decline, and as common late onset diseases are correlated with longevity,32 this may influence life expectancy.


Age/sex profiles of the 14 study villages


Criteria adopted in this study for establishing diagnoses of 10 selected late onset diseases

1. Coronary heart disease (includes angina pectoris and myocardial infarction)

Angina pectoris

  • May be diagnosed by GP.

  • Clinical syndrome characterised by repeated episodes of precordial discomfort or pressure, typically precipitated by exertion and relieved by rest or sublingual glyceryl trinitrate, with or without reversible ischaemic ECG changes.

Myocardial infarction

  • Must be diagnosed by a consultant in local general hospital.

  • Based on presenting symptoms (deep substernal radiating pain) and supported by combination of ECG findings (deep Q waves, elevated or depressed ST segments, deeply inverted T waves), and raised myocardial component of creatine kinase and lactic dehydrogenase, with or without myocardial imaging.

2. Cerebral stroke

  • Must be diagnosed by a consultant in local general hospital.

  • Based on presenting symptoms including variable neurological defects that increase over 24–48 hours.

  • Diagnosis may be supported by CT/MRI scan or arteriography.

3. Cancer

  • Must be diagnosed by a consultant in local general hospital.

  • Diagnosis requires abnormal cellular growth of any site to be histologically confirmed as malignant.

4. Diabetes type II

  • May be diagnosed by GP.

  • Symptomatic hyperglycaemia (polyuria, polydipsia, polyphagia, weight loss) or diabetic ketoacidosis or non-ketotic hyperglycaemic-hyperosmolar coma; or plasma (or serum) glucose level greater than 140 mg/dl after an overnight fast on two occasions; or development of any of the late complications (retinopathy, nephropathy, atherosclerotic changes on peripheral or coronary arteries, neuropathy).

5. Schizophrenia

  • Must be diagnosed by a consultant in local general hospital.

  • Chronic mental disorder (continuous signs of illness for at least six months) characterised by psychotic symptoms involving disturbances of thought, perception, feeling, and behaviour.

  • Psychotic symptoms such as delusions, hallucinations, and formal thought disorder; deterioration from previous level of functioning; a tendency toward onset before the age of 45.

  • Diagnosis should exclude mood (affective) disorder, organic mental disorder or mental retardation.

6. Manic depression

  • Must be diagnosed by a consultant in local general hospital.

  • Combination of symptomatic picture of depression, chronic course, family history, and response to treatment.

  • Diagnosis may be supported by TRH stimulation test or dexamethasone suppression test.

7. Epilepsy

  • Must be diagnosed by a consultant in local general hospital.

  • Recurrent paroxysmal disorder characterised by sudden brief attacks of altered consciousness, motor activity, sensory phenomena, or inappropriate behaviour caused by abnormal excessive discharge of cerebral neurones.

  • Diagnosis may be supported by abnormalities in EEG, CT, or MRI.

8. Asthma

  • May be diagnosed by GP.

  • Airways obstruction that is usually reversible, presenting with attacks of tachypnoea, tachycardia, and audible wheezes, airway inflammation, and increased airways responsiveness to a variety of stimuli.

  • Diagnosis may be supported by a family history of allergy or asthma.

9. Gout

  • May be diagnosed by GP.

  • Acute gouty arthritis (recurrent acute mono/polyarticular pain in peripheral joints, often nocturnal, progressively more severe, with swelling, warmth, redness, and tenderness).

  • Diagnosis may be supported by any of the following: raised serum urate (greater than 7 mg/dl), demonstration of urate crystals in tissue or synovial fluid, or dramatic response (within 24 hours) to colchicine.

10. Peptic ulcer

  • Must be diagnosed by a consultant in local general hospital.

  • Circumscribed ulceration of the gastric or duodenal mucous membrane causing a chronic and recurrent burning, gnawing, aching, soreness, or empty feeling in the epigastrium.

  • Diagnosis must be supported by endoscopic findings and/or x ray studies with barium.


List of variables entered into multiple logistic regression

Table 1

Genetic and environmental characteristics of 14 village populations ranked according to average inbreeding coefficient computed from genealogical data

Table 2

Age and sex standardised prevalences of 10 complex diseases in groups of villages with relatively “high,” “moderate,” and “low” average inbreeding coefficient values

Table 3

Summary of results of multiple logistic regression

Table 4

Review of the studies investigating the effect of inbreeding on complex late onset diseases

Figure 1

Map of the Dalmatian island genetic isolate showing study islands and villages (numbered 1 to 14).


The initial field work in the 14 villages was funded by grants from the Smithsonian Institution, Washington DC, USA. The reconstruction of genealogies was supported partly by the Ministry of Science and Technology of the Republic of Croatia. The classification of disease status was sponsored partly by the University of Cincinnati, USA (from grant ES06096 from the National Institutes of Health, USA) and the Ministry of Science and Technology of the Republic of Croatia. Subsequent analysis was supported by the Wellcome Trust (IRDA) grant to HC and IR, and by the Croatian Ministry of Science and Technology (CMST) grants to PR, NS-N, and IR. IR was supported by funds from the UK Medical Research Council, the University of Edinburgh, and the Overseas Research Scheme.

