Helper function that computes the repeat alignment of every repeat in a list against one target repeat.
ListRepeatAlignment(
  x,
  y,
  dist.type = "hamming",
  dna.model = "K80",
  wmut = 1,
  windel = 3.5,
  wslippage = 1.75,
  exclude.pos = NULL,
  post.include = FALSE,
  threads = 1
)
x: list of repeats as DNAStringSet
y: repeat as DNAStringSet
dist.type: distance calculation method if no gendist matrix is supplied [default: hamming]
dna.model: a character string specifying the evolutionary model to be used (see ape::dist.dna) [default: K80]
wmut: weight for nucleotide point mutations [default: 1]
windel: weight for unit insertions/deletions [default: 3.5]
wslippage: weight for single unit duplications [default: 1.75]
exclude.pos: positions of the repeat that will be excluded before distance calculation [default: NULL]
post.include: boolean indicating whether excluded positions should be added back for the sequence alignment after distance calculation [default: FALSE]
threads: number of parallel threads [default: 1]
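To make the default dist.type = "hamming" and the exclude.pos argument more concrete, the following base R sketch counts mismatches between two equal-length repeat units while optionally skipping some positions. The helper hamming_units and the toy sequences are hypothetical and are not part of repeatR; the package's own distance calculation may differ in detail.

```r
## Hypothetical helper, NOT part of repeatR: Hamming distance between two
## equal-length repeat units, optionally ignoring the positions in exclude.pos.
hamming_units <- function(a, b, exclude.pos = NULL) {
  a <- strsplit(a, "")[[1]]
  b <- strsplit(b, "")[[1]]
  stopifnot(length(a) == length(b))
  if (!is.null(exclude.pos)) {
    ## drop the excluded positions from both units before comparing
    keep <- setdiff(seq_along(a), exclude.pos)
    a <- a[keep]
    b <- b[keep]
  }
  sum(a != b)
}

## toy units (assumed for illustration) that differ at positions 6 and 12
unit1 <- "CCCTATGTTTGC"
unit2 <- "CCCTACGTTTGA"
hamming_units(unit1, unit2)                   # 2
hamming_units(unit1, unit2, exclude.pos = 6)  # 1
```

Excluding a position removes its contribution to the distance only; whether it reappears in the final alignment is what post.include controls in the function itself.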
Vara C et al. (2019) PRDM9 Diversity at Fine Geographical Scale Reveals Contrasting Evolutionary Patterns and Functional Constraints in Natural Populations of House Mice. Molecular Biology and Evolution, 36(8), 1686-1700.
##load example sequence data
data("mousePRDM9", package = "repeatR")
myRepPattern <- "PY"
myRepLength <- 84
mousePRDM9.random <- sample(mousePRDM9, 20)
mousePRDM9.random.split <- repeatR::splitRepByPattern(
  mousePRDM9.random, myRepPattern, myRepLength)
##count the repeat units of each repeat and find the repeats with the maximum count
mousePRDM9.random.split.repeat.units <- unlist(lapply(
  mousePRDM9.random.split$cds, length))
max.pos <- which(mousePRDM9.random.split.repeat.units ==
  max(mousePRDM9.random.split.repeat.units))
##select one of the longest repeats as the target sequence
target.repeat <- mousePRDM9.random.split$cds[[sample(max.pos, 1)]]
##align all repeats in the list to the target repeat
mousePRDM9.random.alg <- repeatR::ListRepeatAlignment(
  mousePRDM9.random.split$cds, target.repeat)
mousePRDM9.random.alg
#> DNAStringSet object of length 20:
#> width seq names
#> [1] 1260 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG KF462449.1_cds_AH...
#> [2] 1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843958.1_Mmm_MG...
#> [3] 1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG KF462479.1_cds_AH...
#> [4] 1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843967.1_Mmm_MG...
#> [5] 1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843921.1_Mmc_HI306
#> ... ... ...
#> [16] 840 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG MK848111.1_cds_QC...
#> [17] 756 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG MK848092.1_cds_QC...
#> [18] 840 -----------------------...GAGGACACATACAAGAGAGAAG KF462499.1_cds_AH...
#> [19] 756 CCCTATGTTTGCAGGGAGTGTGG...---------------------- KF462496.1_cds_AH...
#> [20] 672 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843953.1_Mmm_MG686
##change settings as for the RepeatAlignment function
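A minimal sketch of such a call with changed settings follows. It reuses mousePRDM9.random.split and target.repeat from the example above, so it needs repeatR and those objects to be available. The weight values are arbitrary illustrations, not recommendations, and the result name mousePRDM9.random.alg.custom is made up here.

```r
## Sketch only: assumes the objects created in the example above exist.
## The weights below are arbitrary and chosen only to differ from the defaults.
mousePRDM9.random.alg.custom <- repeatR::ListRepeatAlignment(
  mousePRDM9.random.split$cds, target.repeat,
  wmut = 1, windel = 5, wslippage = 2.5)
mousePRDM9.random.alg.custom
```

Raising windel and wslippage relative to wmut makes unit insertions/deletions and duplications more costly compared with point mutations, so comparing this result with mousePRDM9.random.alg shows how sensitive the alignment is to the weighting.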