Helper function to get all-vs-all repeat distances based on RepeatAlignment.

ListPairwiseDistance(
  x,
  dist.type = "hamming",
  dna.model = "K80",
  wmut = 1,
  windel = 3.5,
  wslippage = 1.75,
  exclude.pos = NULL,
  post.include = FALSE,
  output.dist = "distance",
  threads = 1
)

Arguments

x

list of repeats as DNAStringSet

dist.type

distance calculation method if no gendist matrix is supplied [default: hamming]

dna.model

a character string specifying the evolutionary model to be used (see ape::dist.dna) [default: K80]

wmut

weight for nucleotide point mutations [default: 1]

windel

weight for unit insertions/deletions [default: 3.5]

wslippage

weight for single unit duplications [default: 1.75]

exclude.pos

positions of the repeat that will be excluded before distance calculation [default: NULL]

post.include

logical; whether excluded positions should be added back to the sequence alignment after distance calculation [default: FALSE]

output.dist

whether the distance or the sequence.dist should be returned (see repeatR::RepeatAlignment) [default: distance]

threads

number of parallel threads [default: 1]
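
As a rough illustration of how dist.type = "hamming", exclude.pos and the three weights could fit together, the base R sketch below compares two toy repeat units. The helpers drop_positions, hamming_sketch and event_cost are hypothetical and not part of repeatR, and the weighted sum is an assumed reading of the weights as per-event costs, not the package's alignment algorithm.

```r
# Hypothetical helpers for illustration only; none of these are repeatR functions.

# Remove the given 1-based positions from a sequence string.
drop_positions <- function(s, pos) {
  if (length(pos) == 0) return(s)
  chars <- strsplit(s, "")[[1]]
  paste(chars[-pos], collapse = "")
}

# Count mismatching positions between two equal-length sequences.
hamming_sketch <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Assumed per-event weighting: each event type contributes count * weight.
event_cost <- function(n.mut, n.indel, n.slippage,
                       wmut = 1, windel = 3.5, wslippage = 1.75) {
  wmut * n.mut + windel * n.indel + wslippage * n.slippage
}

unit.a <- "ACGTACGTAC"
unit.b <- "ACCTACGAAC"
# Full-length comparison: mismatches at positions 3 and 8.
hamming.full <- hamming_sketch(unit.a, unit.b)
# Excluding position 8 leaves a single mismatch.
hamming.excl <- hamming_sketch(drop_positions(unit.a, 8),
                               drop_positions(unit.b, 8))
# Two point mutations, one unit indel and one slippage event.
total.cost <- event_cost(n.mut = 2, n.indel = 1, n.slippage = 1)
```

With the default weights, a unit insertion or deletion costs as much as 3.5 point mutations in this reading, so excluding highly variable positions mainly lowers the point-mutation part of the total.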

References

Vara C et al. (2019) PRDM9 Diversity at Fine Geographical Scale Reveals Contrasting Evolutionary Patterns and Functional Constraints in Natural Populations of House Mice. Molecular Biology and Evolution, 36(8), 1686-1700.

Author

Kristian K Ullrich

Examples

##load example sequence data
data("mousePRDM9", package="repeatR")
myRepPattern<-"PY"
myRepLength<-84
##draw a random subset of 20 sequences (results vary between runs)
mousePRDM9.random<-sample(mousePRDM9, 20)
mousePRDM9.random.split<-repeatR::splitRepByPattern(mousePRDM9.random,
    myRepPattern, myRepLength)
##get distance for all-vs-all comparison excluding highly variable sites
dist.mat.hamming.exclude.pos<-repeatR::ListPairwiseDistance(
    x=mousePRDM9.random.split$cds,
    dist.type="hamming",
    wmut=1,
    windel=3.5,
    wslippage=1.75,
    exclude.pos=c(37:39,46:48,55:57),
    post.include=FALSE,
    output.dist="distance")
##calculate bionj tree from resulting distances and write tree in Newick format
mousePRDM9.random.bionj<-ape::bionj(as.dist(dist.mat.hamming.exclude.pos))
ape::write.tree(mousePRDM9.random.bionj)
#> [1] "((((((((((KF462468.1_cds_AHA80564.1_1:5.625,KF462472.1_cds_AHA80568.1_1:9.125):0.5215839744,KF462405.1_cds_AHA80501.1_1:1.116127968):0.1595143378,KF462462.1_cds_AHA80558.1_1:0.7435913086):1.524322033,((AB844002.1_Mmd_C3H.Ttf/t12+:0,AB844007.1_Mmd_tw2/tw2:0):8.212776184,MK848115.1_cds_QCI31693.1_1:1.787223339):0.3978980184):0.481610626,((KF462446.1_cds_AHA80542.1_1:0.8201847076,KF462433.1_cds_AHA80529.1_1:0.1798152924):0.9525851011,KF462411.1_cds_AHA80507.1_1:1.502451539):0.6432133913):2.269659758,KF462503.1_cds_AHA80599.1_1:5.861231804):0.5911666155,AB843897.1_Mmd_SJL/J:0.5102273226):0.9869502783,AB843977.1_Mmm_MG3104:0.4883626699):1.249399662,(AB843981.1_Mmm_MG2127:0.0478041172,(KF462482.1_cds_AHA80578.1_1:1.01809442,MK848142.1_cds_QCI31720.1_1:4.73190546):0.9405090809):0.6914787889):0.263219893,AB843858.1_Mmmol_MG201:0,(KF462479.1_cds_AHA80575.1_1:0,(AB843861.1_Mmmol_MG495:0,AB843990.1_Mmm_BLG2/Ms:0):0):0);"
plot(mousePRDM9.random.bionj)
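
The as.dist call in the example presumably receives a square, symmetric matrix whose dimnames carry the sequence names. A toy matrix with made-up names and values (not repeatR output) sketches what that conversion keeps, using only the stats package that ships with R.

```r
# Toy symmetric distance matrix; names and values are made up for illustration.
toy <- matrix(c(0,   2,   5.5,
                2,   0,   3.5,
                5.5, 3.5, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("seqA", "seqB", "seqC"),
                              c("seqA", "seqB", "seqC")))
# as.dist() keeps the lower triangle and carries the row names over as labels.
toy.dist <- stats::as.dist(toy)
labels(toy.dist)
# Round trip back to a full matrix for inspection.
toy.back <- as.matrix(toy.dist)
```

Because the labels survive the conversion, tip names in a tree built from such an object can be traced back to the input sequences.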