Analyses of pig genomes provide insight into porcine demography and evolution

Pig domestication started over 10,000 years ago and has had important consequences for human life, changing our agricultural and medical practices. There has been much debate over whether the pig was domesticated independently in multiple locations or was domesticated only once and then transported elsewhere. The pig (Sus scrofa) originally emerged in South East Asia during the early Pliocene (~5.3–3.5 Myr ago) and then spread across most of the Eurasian continent. Yet unraveling the true story of pig domestication has become possible only recently, with the publication of a near-complete pig genome by Groenen et al., which featured on the front cover of Nature in the November 2012 issue (7424).

Genome assembly

The research team (RT hereafter) made an impressive effort in genome sequencing and assembly. The genome was sequenced with both BAC and NGS technologies. For NGS, the RT used a 44 bp paired-end Illumina library, which was likely Illumina's headline technology when the porcine genome project began. In total, the RT obtained 2.60 Gb of sequencing data and, thanks to the BACs, could assign scaffolds (2.5 Gb) quite precisely to 20 chromosomes (18 autosomes plus X and Y), leaving only 212 Mb of unplaced scaffolds.

The RT used a very clever strategy for the genome annotation. After screening for repeats, they ran a three-stage procedure to ensure that the annotation was of high quality. First, they ran a so-called targeted stage, for which they downloaded all pig protein sequences from UniProt SwissProt/TrEMBL and matched them against the assembled genome with Exonerate/Genewise to predict coding sequence models. Second, they applied a similarity stage, in which they generated additional coding models using proteins from related species. Third, they obtained RNA-seq data and used them to search for expressed regions in the genome. Finally, they derived a consensus prediction for the transcripts based on these three stages (see Fig. 1).

Figure 1. Composition of pig transcripts. The final gene set consists of 21,640 protein-coding genes, including mitochondrial genes; these contain 23,118 transcripts. A total of 380 pseudogenes and 2,965 non-coding RNAs (ncRNAs) were identified. Of the protein-coding transcripts, 3,605 were made from RNA-seq only, 15,072 came from proteins from other species, 3,959 were pig specific and 37 were mitochondrial.

Evolution of porcine genome compared to other mammalian genomes
As a first analysis, the RT compared the pig genome with five other mammalian genomes: human, mouse, dog, horse and cow.

First, they extracted only 1:1 orthologs from the total gene set, used a branch-site model to calculate the dN/dS ratio, and tested proteins showing accelerated evolution for enrichment in specific biological processes. In general, dS (0.160) and the dN/dS ratio (0.144) in the pig lineage were similar to those of other mammals (0.138–0.201) except for the mouse (0.458), suggesting similar evolutionary rates and an intermediate level of purifying selection. Pathways significantly (P < 0.05) enriched within the pig group are shown in Fig. 2. However, it was not entirely clear whether the pathway analysis was corrected for multiple testing.
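To make the multiple-testing concern concrete, here is a minimal sketch of the Benjamini-Hochberg correction, one common way to adjust enrichment p-values. It is not taken from the paper: the choice of method and the five raw p-values are assumptions made purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for five pathways (not from the paper).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)

n_raw = sum(p < 0.05 for p in raw)  # pathways passing P < 0.05 before correction
n_adj = sum(q < 0.05 for q in adj)  # pathways passing after correction
print(n_raw, n_adj)
```

In this toy example four pathways pass the raw threshold but only two survive the correction, which is why it matters whether the reported enrichments were adjusted.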

Figure 2. KEGG pathways with genes that show accelerated evolution for each of the six mammals used in the dN/dS analysis. The bar charts show the individual dN/dS and dS values for each of the six mammals. The dN/dS and dS values refer to the time period of each of the six individual lineages. The number of proteins that show significantly accelerated dN/dS ratios in each lineage varies from 84 in the mouse to 311 in the pig lineage. Pathways significantly (P < 0.05) enriched within this group of genes are also shown with the number of genes shown in brackets. HPI, Helicobacter pylori infection.

Second, the RT analyzed a subset of immune-related genes to find out which proteins evolve especially fast and which gene families have expanded. The RT thoroughly annotated genes related to immunity and compiled an impressive dataset on gene duplications. However, the analysis itself could have been greatly improved if it had been run with a specific hypothesis in mind, or at least if there had been a more detailed explanation of how exactly gene family evolution was investigated. Did they use a birth-death model? Was it a simple Student's t-test? Unfortunately, the Supplementary Information, just like the article itself, leaves the reader wondering. The results of the “unknown” analysis nevertheless suggest that the porcine genome contains expansions of the IFN, IL1B, CD36, CD68, CD163, CRP and IFIT1 genes. The interested reader may want to check out another paper from the same RT that deals exclusively with the evolution of the porcine immunome.

I am going to set aside the whole part on genome rearrangements, conservation of synteny and evolutionary breakpoints so as not to expand the blog post to an unreadable size.

Analysis of population divergence and domestication
The second part of the article deals primarily with the history of the pig population and highlights some interesting insights into pig domestication.

To run this part of the analysis, the RT additionally sequenced 4 Asian and 6 European wild boars and compiled a dataset of single nucleotide polymorphisms for each individual (the dataset included both wild individuals and the domesticated breeds from the previous part). As expected, nucleotide heterozygosity is higher among Asian wild boars than among European ones (see Fig. 3a). Furthermore, the high peaks of low heterozygosity among European wild boars may be evidence of a past bottleneck that the population experienced when migrating out of Asia. In contrast, the nucleotide heterozygosity of pig breeds (i.e. domesticated pigs) shows quite a surprising pattern (Fig. 3b): (1) heterozygosity is relatively similar between European and Asian breeds; and (2) not only is diversity only mildly reduced in Asian breeds compared to Asian wild individuals, but the diversity of European breeds is also higher than that of European wild boars. What might explain such a pattern?

Figure 3. The distribution of the heterozygosity as the log2(SNPs) per 10k bin. a, Wild Sus scrofa: blue, south China; green, north China; orange, Italian; red, Dutch. b, Breeds: blue, Chinese breeds (Jiangquhai, Meishan, Xiang); red–yellow, European breeds (Hampshire, large white, landrace). Note that the Hampshire breed is a North American breed of European origin.
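The quantity plotted in Figure 3 can be sketched in a few lines: count heterozygous SNPs in fixed 10 kb windows along a chromosome and take log2 of each count. The function name, the 0-based coordinates and the toy positions below are my assumptions for illustration and not the RT's actual pipeline.

```python
import math
from collections import Counter


def log2_snps_per_bin(positions, bin_size=10_000):
    """Map bin index -> log2(number of SNPs in that bin).

    `positions` are 0-based SNP coordinates on one chromosome (an assumption).
    Bins without any SNP are omitted, since log2(0) is undefined.
    """
    counts = Counter(pos // bin_size for pos in positions)
    return {b: math.log2(n) for b, n in sorted(counts.items())}


# Hypothetical heterozygous SNP positions for a single individual.
positions = [120, 4_500, 9_999, 10_000, 25_000, 25_100, 25_200, 25_300]
bins = log2_snps_per_bin(positions)
print(bins)
```

A histogram of these per-bin values across the genome gives a distribution of the kind shown in the figure; long runs of empty or low-count bins are what would show up as the peaks of low heterozygosity.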

Sadly, the story behind this pattern is greatly simplified in the article itself, and lots of interesting details and discussion are left to the Supplementary Materials. In brief, the RT found a clear signal of admixture between wild Asian and European boars and between European and Asian breeds. For wild individuals, they suggest that multiple migrations of Asian wild boars across Eurasia during the later stages of the Pleistocene may have produced the admixture signal.

For domesticated breeds the story is rather complex and includes multiple causes of admixture, such as the trading of Asian breeds and their mixing with European ones in the late 18th and 19th centuries, multiple domestication origins, husbandry practices and incomplete lineage sorting. The only clear conclusions are that (1) domestication of the wild boar happened at least twice, independently in Asia and Europe; and (2) there were numerous admixture events: between wild boars from Asia and Europe (late Pleistocene), between domesticated breeds and wild boars in both Europe and Asia (over 10,000 years), and finally between domesticated breeds of Europe and Asia (18th–20th centuries).

The second part of the story concerns the demographic history of the Asian and European populations. For this analysis the RT used a pairwise sequentially Markovian coalescent (PSMC) model. They find that the wild boar population increased in size after arriving in Europe, but that both the European and Asian populations then declined, with a much more pronounced decline in the European population (Fig. 4). The RT suggests that population size declined because of climatic oscillations, with a trough around the Last Glacial Maximum (20,000 years ago). Personally, however, I would not disregard hunting as a potential cause of the wild boar population decline, because humans had already populated Eurasia at that time and probably knew how to hunt boars. Another detail that would have been worth adding is confidence intervals on the demographic history; for now the reader is left to guess how accurate the observed lines are.

Figure 4. Demographic history was inferred using a hidden Markov model (HMM) approach as implemented in pairwise sequentially Markovian coalescence (PSMC).
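As a rough illustration of the confidence intervals I was missing, the sketch below turns a set of bootstrap replicate curves into a 95% percentile band at each time point. It assumes the replicate curves already exist (for PSMC-style analyses these typically come from resampling chunks of the genome and re-running the inference) and that all of them share one time grid. The function name and the toy numbers are hypothetical.

```python
import statistics


def percentile_band(replicate_curves):
    """Return a (lower, upper) 95% percentile band for each time point.

    `replicate_curves` is a list of curves, each a list of effective
    population size estimates evaluated on the same time grid.
    """
    band = []
    for values in zip(*replicate_curves):
        # n=40 gives cut points every 2.5%, so the first and last
        # cut points are the 2.5th and 97.5th percentiles.
        cuts = statistics.quantiles(values, n=40, method="inclusive")
        band.append((cuts[0], cuts[-1]))
    return band


# Hypothetical bootstrap output: 41 replicate curves on a two-point time grid.
curves = [[float(k), 2.0 * k] for k in range(41)]
band = percentile_band(curves)
print(band)
```

Plotting such a band around each line in Figure 4 would show directly whether the European and Asian declines are distinguishable from estimation noise.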

Lastly, the RT looked at selective sweeps in the pig genome, following the strategy of Green et al. 2010. In general, they find that regions with putative selective sweeps are enriched for genes involved in RNA splicing and RNA processing. However, they completely exclude non-protein-coding regions from the analysis, although these would clearly be interesting to look at given that quite a few sweeps are associated with genes involved in RNA splicing and processing.

In sum, the paper is incredibly dense, includes various kinds of analyses and results, and to me looks like an overview of 10 years of great work by dozens of people. I believe we are going to see (and are already seeing) more detailed papers on each section of this article, and each of these future publications will unravel fascinating details of the evolution of the pig genome from different biological perspectives.

Groenen MA, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, Rogel-Gaillard C, Park C, Milan D, Megens HJ, Li S, Larkin DM, Kim H, Frantz LA, Caccamo M, Ahn H, Aken BL, Anselmo A, Anthon C, Auvil L, Badaoui B, Beattie CW, Bendixen C, Berman D, Blecha F, Blomberg J, Bolund L, Bosse M, Botti S, Bujie Z, Bystrom M, Capitanu B, Carvalho-Silva D, Chardon P, Chen C, Cheng R, Choi SH, Chow W, Clark RC, Clee C, Crooijmans RP, Dawson HD, Dehais P, De Sapio F, Dibbits B, Drou N, Du ZQ, Eversole K, Fadista J, Fairley S, Faraut T, Faulkner GJ, Fowler KE, Fredholm M, Fritz E, Gilbert JG, Giuffra E, Gorodkin J, Griffin DK, Harrow JL, Hayward A, Howe K, Hu ZL, Humphray SJ, Hunt T, Hornshøj H, Jeon JT, Jern P, Jones M, Jurka J, Kanamori H, Kapetanovic R, Kim J, Kim JH, Kim KW, Kim TH, Larson G, Lee K, Lee KT, Leggett R, Lewin HA, Li Y, Liu W, Loveland JE, Lu Y, Lunney JK, Ma J, Madsen O, Mann K, Matthews L, McLaren S, Morozumi T, Murtaugh MP, Narayan J, Nguyen DT, Ni P, Oh SJ, Onteru S, Panitz F, Park EW, Park HS, Pascal G, Paudel Y, Perez-Enciso M, Ramirez-Gonzalez R, Reecy JM, Rodriguez-Zas S, Rohrer GA, Rund L, Sang Y, Schachtschneider K, Schraiber JG, Schwartz J, Scobie L, Scott C, Searle S, Servin B, Southey BR, Sperber G, Stadler P, Sweedler JV, Tafer H, Thomsen B, Wali R, Wang J, Wang J, White S, Xu X, Yerle M, Zhang G, Zhang J, Zhang J, Zhao S, Rogers J, Churcher C, & Schook LB (2012). Analyses of pig genomes provide insight into porcine demography and evolution. Nature, 491 (7424), 393-8 PMID: 23151582

Genomic analysis of a key innovation in an experimental Escherichia coli population


In this paper, Richard E. Lenski and colleagues show an example of how efficient adaptation by natural selection can be. For 20 years they have been growing twelve populations of Escherichia coli in a glucose medium that also contains abundant citrate. This famous long-term evolution experiment (LTEE) has allowed evolution to run for 40,000 generations under controlled laboratory conditions. Surprisingly, after 31,000 generations some mutants appeared that were able to use citrate as a source of carbon instead of glucose. The inability to use citrate is one of the traits that define E. coli as a species. Zachary D. Blount dissected the underlying mechanism using whole-genome resequencing of 29 clones sampled along the population's history (Fig. 1). Three steps that were necessary for the evolution of the Cit+ trait are analyzed in detail: potentiation, actualization and refinement.

Figure 1. Symbols at branch tips mark 29 sequenced clones; labels are shown for clones mentioned in main text and figures. Shaded areas and coloured symbols identify major clades. Fractions above the tree show the number of clones belonging to the clade that yielded Cit+ mutants during replay experiments (numerator) and the corresponding total used in those experiments (denominator). Inset shows number of mutations relative to the ancestor. The solid line is the least-squares linear regression of mutations in non-mutator genomes; the dashed line is the corresponding regression for mutator genomes.

The phylogeny (Fig. 1) highlights two clades that coexisted and were maintained alongside the clade that gave rise to Cit+ from around generation 20,000. Moreover, after the evolution of the citrate-using clade, around generation 36,000, Cit+ bacteria carry a SNP that causes a defect in methyl-directed mismatch DNA repair, so that these mutants accumulate SNPs much faster than any other clade (Fig. 1 inset).
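The inset regressions can be reproduced in principle with an ordinary least-squares fit of mutation count against generation, done separately for non-mutator and mutator genomes. The sketch below uses made-up numbers only to show the calculation; the real counts are in the paper's figure.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (generation, mutations relative to ancestor) data.
non_mutator = ([10_000, 20_000, 30_000, 40_000], [20, 40, 60, 80])
mutator = ([30_000, 35_000, 40_000], [100, 200, 300])

slope_nm, intercept_nm = least_squares_line(*non_mutator)
slope_m, intercept_m = least_squares_line(*mutator)

# Ratio of slopes: how much faster mutator genomes accumulate mutations.
rate_ratio = slope_m / slope_nm
print(slope_nm, slope_m, rate_ratio)
```

The slope is the number of mutations accumulated per generation, so the ratio of the two slopes is a direct measure of how strongly the mismatch repair defect raises the mutation rate.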

Bacteria are ideal organisms for analyzing the three steps leading to the evolution of this new trait, as they can be frozen and revived. Thanks to this capacity, replay experiments were run, and new Cit+ mutants were re-discovered in a proportion of clones from all three clades (Fig. 1), with a significantly higher proportion for the third clade.
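One simple way to ask whether a clade yields Cit+ mutants in a significantly higher proportion of replayed clones is a one-sided Fisher's exact test on a 2x2 table. This is only a sketch of that idea: I am not claiming it is the test the authors used, and the counts below are hypothetical, not the fractions from Fig. 1.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of seeing a top-left
    count of at least `a` (i.e. enrichment in the first row).
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


# Hypothetical replay results: clones that did / did not yield Cit+ mutants.
# Row 1: clade 3 (8 of 10 clones); row 2: the other clades (1 of 10 clones).
p_value = fisher_exact_greater(8, 2, 1, 9)
print(p_value)
```

With these invented counts the p-value is well below 0.05, which is the kind of result that would justify calling the third clade more strongly potentiated.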

1) The first step, potentiation, is represented by clones of the three clades at generation 20,000, which show different levels of potentiation, implying multiple potentiating mutations. Potentiation is the evolution of a genetic background that makes a function accessible by mutation. The phylogeny tells us that at least two such mutations occurred, one before the three clades diverged and another in Clade 3. The replay experiments showed that potentiation plays an important role in the rise of the new function. Potentiation can be explained in two ways: first, by epistasis between the background and the final mutation, an interaction needed for the phenotype to be expressed; and second, by the background physically promoting the final mutation. The tests did not provide conclusive evidence for either hypothesis. A few clues tip the balance in favor of the epistasis hypothesis, but the arguments are not convincing enough, so the authors focused on identifying the mutational basis of actualization.

2) Actualization is the step in which weak mutants appear that are phenotypically able to use citrate; it is also the most tractable step. The inability to transport citrate is the reason why E. coli cannot use it as a carbon source in an oxic environment. In the Cit+ clones, and only in these clones, the authors found at least two tandem copies of a citrate fermentation operon (cit) that contains a citrate/succinate antiporter gene, citT (Fig. 2).

Figure 2. a, Ancestral arrangement of citG, citT, rna and rnk genes. b, Altered spatial and regulatory relationships generated by the amplification.

The explanation proposed by the authors for the origin of this new trait is that the second citT copy in Figure 2b uses an rnk promoter. This promoter lies downstream of the original citT but, because of the amplification, is now found upstream of the new citT copy, which can thus be transcribed as part of an Rnk-CitG-CitT fusion.

To test this hypothesis, they first tested the ability of the rnk-citT module to support citT expression in an oxic environment. They constructed three low-copy plasmids carrying rnk, citT and rnk-citT. After transforming each plasmid into three strains, one ancestral (REL606), one potentiated clone of clade 3 (ZDB30) and one weak mutant (ZDB172), they found expression of the rnk-citT module (in red in Fig. 3) in both the potentiated clone and the weak mutant. This indicates that the rnk-citT module allows citT expression under oxic conditions in potentiated clones.

Figure 3. Average time course of expression for the ancestral strain REL606 (left) and evolved clones ZDB30 (centre) and ZDB172 (right), each transformed with lux reporter plasmids pCDcitTlux (blue), pCDrnklux (black) and pCDrnk-citTlux (red). Expression was measured as counts per second (c.p.s.). Each curve shows the average of four replicates.

This leads to another question: does introducing this module into clones give a Cit+ phenotype? To test this, they simply inserted an rnk promoter upstream of the natural copy of citT in the genome (Fig. 4). The resulting construct (ZDB595, in blue), made from the potentiated clone ZDB30 by inserting the rnk promoter, shows a weak Cit+ phenotype in Figure 4: its growth is increased compared to the red curve (potentiated clone ZDB30), while the other curves are Cit+ clones that grow better on citrate.

Figure 4. a, Engineered construct containing the cit amplification junction with the rnk promoter, and the arrangement following its insertion into the chromosome. b, Average growth trajectories in DM25 of potentiated clone ZDB30 (red), its isogenic construct with chromosomal integration of the rnk-cit module ZDB595 (blue), 31,500-generation Cit+ clone ZDB564 (purple) and 32,000-generation Cit+ clone ZDB172 (green). c, Trajectories of ZDB595 compared to its parent. Light blue trajectories show heterogeneity among replicate cultures of ZDB595; dark blue and red are averages for ZDB595 and its parent, ZDB30, as in panel b (except different scale).

3) Finally, refinement is the increase in efficiency of the new function, probably through additional mutations. Here the main change in the early Cit+ mutants is an increase in the copy number of the cit operon (Fig. 5a), which stabilizes at three copies in later generations. To test whether copy number influences the Cit+ phenotype, they cloned the module containing the rnk promoter and citT into a multi-copy plasmid and transformed it into the potentiated strain ZDB30 (in red, Fig. 5c). The newly built strain ZDB612 (in green, Fig. 5c) is strongly Cit+ and grows on citrate like a Cit+ clone (CZB152, in black), which is a good clue in favor of refinement of the Cit+ phenotype through multiplication of the cit operon.

Figure 5. a, Change in rnk-citT module copy number in sequenced Cit+ clones over time. b, Structure of rnk-citT module cloned into high-copy plasmid pUC19 to produce pZBrnk-citT. c, Growth trajectories in DM25 of potentiated Cit– clone ZDB30 (red), 31,500-generation Cit+ clone ZDB564 (purple), 33,000-generation Cit+ clone CZB152 (black), and ZDB612, a pZBrnk-citT transformant of ZDB30 (green). Each trajectory is the average of six replicates.

Neo-functionalization through gene duplication had already been observed in comparative studies, but this long-term experiment is a nice opportunity to study the history of the citrate transporter gene duplication and promoter capture that gave rise to a new adaptive function. This example also shows the importance of the evolution of a potentiated genetic background that allows a final mutation to give rise to the function.

Blount, Z., Barrick, J., Davidson, C., & Lenski, R. (2012). Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature, 489 (7417), 513-518 DOI: 10.1038/nature11514

Genome Patterns of Selection and Introgression of Haplotypes in Natural Populations of the House Mouse (Mus musculus)


1. How do genomes evolve in natural populations? This is a long-standing question for geneticists that recent molecular genomic approaches may help to answer. Genome evolution in natural populations may be shaped by forces arising from different processes, such as mutation, neutral evolution, negative or positive selection and demographic changes (Nosil & Feder 2012). In addition, the study of population divergence and species formation has an important temporal context: the patterns we observe depend on the point along the evolutionary trajectory at which the populations or related species are studied. For instance, during the initial stages the level of genome differentiation may be low. Then, as divergence proceeds (through intrinsic or extrinsic forces), this level is expected to increase across the genome, in either a clustered or an interspersed way. Further, hybridization between differentiated populations may lead to the secondary infiltration of genomic regions from one population into the other. Studying this genomic complexity in natural populations may shed light on population dynamics and on the processes that matter during species evolution.

2. Staubach et al. (2012) used natural populations of two house mouse (Mus musculus) subspecies, M. m. domesticus (a name that does not refer to a domesticated subspecies) and M. m. musculus, to detect patterns of selection and introgression at a genomic scale. The past few thousand years of house mouse evolution have been characterized by a wide spread across the world into varied habitats. Two natural populations of each subspecies were used. The proposed model of demographic history is depicted in figure 1 of the article [ ! ]. Mice were collected from four populations, in Germany, France, the Czech Republic and Kazakhstan. Eleven unrelated wild individuals per population were genotyped using the Affymetrix Mouse Diversity Array, giving a total of approx. 470,000 high-quality SNPs. SNP calling was less reliable for M. m. musculus, and the lower number of polymorphic sites may reflect ascertainment bias towards M. m. domesticus [ + for more about ascertainment bias see http://www.ncbi.nlm.nih.gov/books/NBK9792/]. Neighbor-joining trees based on the SNP set showed that individuals cluster by population, (Ger,Fra),(Cze,Kaz). Some genomic regions, e.g. a 6 Mb region on chr6, revealed a pattern of introgression of M. m. domesticus haplotypes into M. m. musculus Kaz (Figure 1).

[ ! ] The demographic model assumes equal population sizes for the ancestral subspecies (N0) and for the present-day populations (N1), which may not be the case. Bottlenecks, range expansions and differences in effective population size (Ne) between populations might distort the patterns discussed.

Figure 1. Neighbor joining trees of population samples, based on Manhattan distances. The scale bars refer to the number of substitutions per SNP genotyped. (A) For all autosomal SNPs in the study, grouped by sample and rooted by outgroup species. (B) For all haplotypes of a 6MB region on chromosome 6. Some haplotypes of Kaz group with M. m. domesticus due to introgression.
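For readers unfamiliar with the distance measure named in the caption, here is a minimal sketch of a per-SNP Manhattan distance between two genotype vectors, the kind of pairwise distance a neighbor-joining tree is built from. The 0/1/2 genotype coding, the normalization by the number of SNPs and the toy genotypes are my assumptions and not details taken from the paper.

```python
def manhattan_per_snp(g1, g2):
    """Manhattan distance between two genotype vectors, divided by SNP count.

    Genotypes are assumed to be coded as 0, 1 or 2 copies of one allele.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(g1, g2)) / len(g1)


# Hypothetical genotypes of three individuals at four SNPs.
genotypes = {
    "Ger": [0, 1, 2, 0],
    "Fra": [0, 1, 1, 0],
    "Kaz": [2, 0, 0, 2],
}

# Pairwise distance matrix, the input a tree-building method would take.
names = sorted(genotypes)
distances = {
    (a, b): manhattan_per_snp(genotypes[a], genotypes[b])
    for a in names
    for b in names
}
print(distances[("Fra", "Ger")], distances[("Ger", "Kaz")])
```

In this toy example the Ger and Fra individuals are much closer to each other than either is to Kaz, mirroring the (Ger,Fra),(Cze,Kaz) grouping described above.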

3. The estimates of linkage disequilibrium (LD) provide an indicator of the size of physically linked haplotypes, which, when selected and diverged loci are identified, may represent the expected size of genomic islands of differentiation. As we move away from a selected locus we expect LD to decay [ + ]. Here, linkage of SNPs beyond 100 kb is rare (Figure 2). The Kaz and Cze populations showed the fastest and the slowest decay, respectively. The first could be related to the high genetic diversity found in this population, and the latter is probably due to the introgressed regions. However, removing the introgressed regions does not completely eliminate the difference between Cze and the other populations [ ! ].

[ + ] The LD plot presented here uses r2, defined as the squared correlation coefficient between the alleles at two SNPs. A value of 1 is called perfect LD. r2 is affected by recombination rate and Ne.
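As a small worked example of r2, the sketch below computes it for one pair of biallelic SNPs as D squared divided by pA(1 - pA)pB(1 - pB), where D = pAB - pA pB. It assumes phased haplotypes coded 0/1, which is a simplification: array genotypes like those in the study would first need phasing or a genotype-based estimator. The haplotype lists are invented.

```python
def r_squared(haplotypes):
    """r2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of (allele_at_snp_a, allele_at_snp_b) pairs,
    each allele coded 0 or 1. Both SNPs must be polymorphic, otherwise
    the denominator is zero.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n
    p_b = sum(h[1] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b  # the classical disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Perfect LD: the two SNPs always carry matching alleles.
perfect = r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
# No LD: all four allele combinations are equally frequent.
independent = r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
print(perfect, independent)
```

Computing this for every SNP pair and averaging by physical distance gives a decay curve of the kind plotted in Figure 2.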

[ ! ] The slow r2 decay in Cze after removal of introgression may indicate further effects of small population size, of the level of genetic diversity in this population, or of both. This could have been discussed in the study by including details from previously reported microsatellite screens and by considering additional demographic models.

Figure 2. Pairwise LD (r2) between markers over distance for each of the populations in the study.

4. Selective sweep scans were based on the XP-CLR statistic [ + ], Rsb, delta-DAF and Fst. Thresholds were defined (through simulations and constraints) to visualize the distribution of potential selective sweep signatures across the autosomes. The detected peaks seem to be scattered across chromosomes, with a few of them clustered and particular to specific populations (e.g. chr 6 showed three clusters of peaks for the Kaz population, Figure 3). Summary statistics showed striking differences in the number of significant regions depending on the method used. Several loci were identified as being under potentially recent positive selection, with no significant GO category enrichment.

[ + ] In essence, the XP-CLR statistic, or “cross-population composite likelihood ratio test”, uses population differentiation as a means to detect regions under selection and hence is particularly robust against ascertainment bias. It works with biallelic multilocus data, searching for regions of the genome where the change in allele frequency at a locus occurred too quickly (as assessed by the size of the affected region) to be due to random drift. Details of this method are presented in Chen et al. (2010).
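XP-CLR itself is too involved for a short example, but two of the simpler scan statistics listed above can be sketched for a single SNP. The code below uses Wright's heterozygosity-based Fst for two equally weighted populations, and delta-DAF taken as the plain difference in derived allele frequency between a focal and a comparison population. Both are textbook-style definitions that I am assuming here; the paper's exact estimators may differ, and the frequencies are invented.

```python
def wright_fst(p1, p2):
    """Fst at one biallelic SNP for two equally weighted populations.

    Uses Fst = (H_T - H_S) / H_T with expected heterozygosities computed
    from the allele frequencies p1 and p2.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # SNP is fixed for the same allele in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def delta_daf(daf_focal, daf_other):
    """Difference in derived allele frequency, focal minus comparison."""
    return daf_focal - daf_other


# Hypothetical allele frequencies at two SNPs in two populations.
fst_diverged = wright_fst(0.9, 0.1)  # strongly differentiated SNP
fst_same = wright_fst(0.5, 0.5)      # identical frequencies
ddaf = delta_daf(0.8, 0.3)
print(fst_diverged, fst_same, ddaf)
```

Sweep scans look for genomic windows where such per-SNP values are unusually high, which is why different statistics can flag quite different sets of regions.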

Figure 3. Distribution of selective sweep signatures as determined by the XP-CLR statistic for the two population comparisons across all autosomes.

5. Introgression was evaluated by comparing each focal population against two potential sources: the other population of the same subspecies as one possible source, and the two populations of the other subspecies as the alternative source. Large introgressed haplotypes from M. m. domesticus were detected in Cze, distributed across all chromosomes. Staubach et al. compared the introgression patterns with a neutral migration-introgression model by simulating datasets with different population sizes and migration rates. The results suggested that the observed frequency of introgressed haplotypes would require higher migration rates than in the neutral model (4Nm=1). However, with such migration rates the expected fraction of introgressed genome would be higher than that observed for the Ger, Fra and Kaz populations. The authors suggested and emphasized that the large fraction of introgressed haplotypes is subject to forces that lead to a faster increase in frequency than expected by chance (i.e. a deterministic or selective process) [ ! ]. Incomplete lineage sorting was ruled out by simulations without migration. Some of the introgressed regions were visualized as colored blocks using the UCSC browser.

[ ! ] Although the authors refrained from a deeper exploration of the demographic parameter space, we discussed that the oversimplified demographic model may neglect relevant differences in population sizes or asymmetries in migration rates. The mosaic nature of introgression has been reviewed in different systems by Arnold & Noland (2009). Here, the role of positive selection in shaping the genomes of house mouse populations via introgressed haplotypes is not completely understood.

Finally, the differentiation between house mouse subspecies showed a stage of genomic divergence similar to that of closely related species pairs (see the previously discussed flycatcher genome paper). These two articles show that genomic divergence between two sister species (completely separated lineages) and between subspecies (closely related populations) does not seem to involve only a few genomic regions or single loci, but rather differentiation across all chromosomes.

References

Arnold, M. & Noland, M. (2009). Adaptation by introgression. Journal of Biology 2009, 8:82 (http://jbiol.com/content/8/9/82)

Nosil, P. & Feder, J. (2012). Genomic divergence during speciation: causes and consequences. Phil. Trans. R. Soc. B. vol. 367 (1587), 332-342.

Chen, H., Patterson, N. & Reich, D. (2010). Population differentiation as a test for selective sweeps. Genome Res. 20 (3): 393-402.

Staubach F, Lorenc A, Messer PW, Tang K, Petrov DA, & Tautz D (2012). Genome patterns of selection and introgression of haplotypes in natural populations of the house mouse (Mus musculus). PLoS genetics, 8 (8) PMID: 22956910