Helper function to compute all-vs-all repeat distances based on repeatR::RepeatAlignment.
ListPairwiseDistance(
  x,
  dist.type = "hamming",
  dna.model = "K80",
  wmut = 1,
  windel = 3.5,
  wslippage = 1.75,
  exclude.pos = NULL,
  post.include = FALSE,
  output.dist = "distance",
  threads = 1
)
x: list of repeats as DNAStringSet
dist.type: distance calculation method if no gendist matrix is supplied [default: hamming]
dna.model: a character string specifying the evolutionary model to be used (see ape::dist.dna) [default: K80]
wmut: weight for nucleotide point mutations [default: 1]
windel: weight for unit insertions/deletions [default: 3.5]
wslippage: weight for single unit duplications [default: 1.75]
exclude.pos: positions of the repeat that will be excluded before distance calculation [default: NULL]
post.include: logical; whether excluded positions should be added back for the sequence alignment after distance calculation [default: FALSE]
output.dist: whether the distance or the sequence.dist should be returned (see repeatR::RepeatAlignment) [default: distance]
threads: number of parallel threads [default: 1]
Vara C et al. (2019) PRDM9 Diversity at Fine Geographical Scale Reveals Contrasting Evolutionary Patterns and Functional Constraints in Natural Populations of House Mice. Molecular Biology and Evolution, 36(8), 1686-1700.
##load example sequence data
data("mousePRDM9", package="repeatR")
myRepPattern<-"PY"
myRepLength<-84
mousePRDM9.random<-sample(mousePRDM9, 20)
mousePRDM9.random.split<-repeatR::splitRepByPattern(mousePRDM9.random,
    myRepPattern, myRepLength)
##get distance for all-vs-all comparison excluding highly variable sites
dist.mat.hamming.exclude.pos<-repeatR::ListPairwiseDistance(
    x=mousePRDM9.random.split$cds,
    dist.type="hamming",
    wmut=1,
    windel=3.5,
    wslippage=1.75,
    exclude.pos=c(37:39,46:48,55:57),
    post.include=FALSE,
    output.dist="distance")
##calculate BIONJ tree from the resulting distances and write the tree in
##Newick format
mousePRDM9.random.bionj<-ape::bionj(as.dist(dist.mat.hamming.exclude.pos))
ape::write.tree(mousePRDM9.random.bionj)
#> [1] "((((((((((KF462468.1_cds_AHA80564.1_1:5.625,KF462472.1_cds_AHA80568.1_1:9.125):0.5215839744,KF462405.1_cds_AHA80501.1_1:1.116127968):0.1595143378,KF462462.1_cds_AHA80558.1_1:0.7435913086):1.524322033,((AB844002.1_Mmd_C3H.Ttf/t12+:0,AB844007.1_Mmd_tw2/tw2:0):8.212776184,MK848115.1_cds_QCI31693.1_1:1.787223339):0.3978980184):0.481610626,((KF462446.1_cds_AHA80542.1_1:0.8201847076,KF462433.1_cds_AHA80529.1_1:0.1798152924):0.9525851011,KF462411.1_cds_AHA80507.1_1:1.502451539):0.6432133913):2.269659758,KF462503.1_cds_AHA80599.1_1:5.861231804):0.5911666155,AB843897.1_Mmd_SJL/J:0.5102273226):0.9869502783,AB843977.1_Mmm_MG3104:0.4883626699):1.249399662,(AB843981.1_Mmm_MG2127:0.0478041172,(KF462482.1_cds_AHA80578.1_1:1.01809442,MK848142.1_cds_QCI31720.1_1:4.73190546):0.9405090809):0.6914787889):0.263219893,AB843858.1_Mmmol_MG201:0,(KF462479.1_cds_AHA80575.1_1:0,(AB843861.1_Mmmol_MG495:0,AB843990.1_Mmm_BLG2/Ms:0):0):0);"
plot(mousePRDM9.random.bionj)
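The effect of exclude.pos on a Hamming-type distance can be illustrated with a small base R sketch. The helper hamming_excluding and the toy sequences rep1 and rep2 are hypothetical names introduced only for this illustration; this is not the repeatR implementation, which additionally applies the wmut, windel and wslippage weights during repeat alignment.

```r
## Toy illustration only (NOT the repeatR code): plain Hamming distance
## between two equal-length strings, optionally ignoring some positions.
hamming_excluding <- function(a, b, exclude.pos = NULL) {
  ## split both strings into single characters
  a.chars <- strsplit(a, "")[[1]]
  b.chars <- strsplit(b, "")[[1]]
  stopifnot(length(a.chars) == length(b.chars))
  ## keep every position that is not listed in exclude.pos
  keep <- setdiff(seq_along(a.chars), exclude.pos)
  ## count mismatches over the retained positions
  sum(a.chars[keep] != b.chars[keep])
}

## hypothetical toy repeats that differ at positions 4 and 9
rep1 <- "ACGTACGTAC"
rep2 <- "ACGAACGTTC"

hamming_excluding(rep1, rep2)                   # all positions compared
hamming_excluding(rep1, rep2, exclude.pos = 4)  # position 4 ignored
```

Excluding a position that differs between the two strings lowers the count by one. That is the intuition behind masking highly variable sites, as done with exclude.pos=c(37:39,46:48,55:57) in the example above.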