Helper function that aligns every repeat in a list of repeats against a single target repeat and returns the resulting alignments.

ListRepeatAlignment(
  x,
  y,
  dist.type = "hamming",
  dna.model = "K80",
  wmut = 1,
  windel = 3.5,
  wslippage = 1.75,
  exclude.pos = NULL,
  post.include = FALSE,
  threads = 1
)

Arguments

x

list of repeats as DNAStringSet

y

repeat as DNAStringSet

dist.type

distance calculation method if no gendist matrix is supplied [default: hamming]

dna.model

a character string specifying the evolutionary model to be used (see ape::dist.dna) [default: K80]

wmut

weight for nucleotide point mutations [default: 1]

windel

weight for unit insertions/deletions [default: 3.5]

wslippage

weight for single unit duplications [default: 1.75]

exclude.pos

positions of the repeat that will be excluded before distance calculation [default: NULL]

post.include

logical; whether excluded positions should be added back to the sequence alignment after distance calculation [default: FALSE]

threads

number of parallel threads [default: 1]
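
To make the dist.type and wmut arguments more concrete, the following base-R sketch computes a Hamming distance between two equal-length repeat units and scales it by a point-mutation weight. The helper names `hamming_dist` and `weighted_mut_cost` and the toy sequences are hypothetical and not part of repeatR. Treating the point-mutation cost as the Hamming distance multiplied by wmut is an assumption made for illustration; how the package combines wmut, windel, and wslippage internally is not shown here.

```r
# Hypothetical helper (not part of repeatR): count mismatching positions
# between two equal-length character strings.
hamming_dist <- function(a, b) {
    stopifnot(nchar(a) == nchar(b))
    a.chars <- strsplit(a, "", fixed = TRUE)[[1]]
    b.chars <- strsplit(b, "", fixed = TRUE)[[1]]
    sum(a.chars != b.chars)
}

# Assumption for illustration: the point-mutation cost between two units
# is their Hamming distance scaled by the weight wmut.
weighted_mut_cost <- function(a, b, wmut = 1) {
    wmut * hamming_dist(a, b)
}

# Toy repeat units that differ at positions 6 and 12
unit1 <- "CCCTATGTTTGC"
unit2 <- "CCCTACGTTTGA"
hamming_dist(unit1, unit2)
#> [1] 2
weighted_mut_cost(unit1, unit2, wmut = 1)
#> [1] 2
```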

References

Vara C et al. (2019) PRDM9 Diversity at Fine Geographical Scale Reveals Contrasting Evolutionary Patterns and Functional Constraints in Natural Populations of House Mice. Molecular Biology and Evolution, 36(8), 1686-1700.

Author

Kristian K Ullrich

Examples

##load example sequence data
data("mousePRDM9", package="repeatR")
myRepPattern<-"PY"
myRepLength<-84
mousePRDM9.random<-sample(mousePRDM9, 20)
mousePRDM9.random.split<-repeatR::splitRepByPattern(mousePRDM9.random,
    myRepPattern, myRepLength)
##get the repeat with max number of repeat units
mousePRDM9.random.split.repeat.units<-unlist(lapply(
    mousePRDM9.random.split$cds, length))
max.pos<-which(mousePRDM9.random.split.repeat.units==
    max(mousePRDM9.random.split.repeat.units))
##select one of the longest repeats as the target sequence
target.repeat<-mousePRDM9.random.split$cds[[sample(max.pos,1)]]
##align all repeats in the list to the target repeat
mousePRDM9.random.alg<-repeatR::ListRepeatAlignment(
    mousePRDM9.random.split$cds, target.repeat)
mousePRDM9.random.alg
#> DNAStringSet object of length 20:
#>      width seq                                              names               
#>  [1]  1260 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG KF462449.1_cds_AH...
#>  [2]  1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843958.1_Mmm_MG...
#>  [3]  1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG KF462479.1_cds_AH...
#>  [4]  1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843967.1_Mmm_MG...
#>  [5]  1092 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843921.1_Mmc_HI306
#>  ...   ... ...
#> [16]   840 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG MK848111.1_cds_QC...
#> [17]   756 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG MK848092.1_cds_QC...
#> [18]   840 -----------------------...GAGGACACATACAAGAGAGAAG KF462499.1_cds_AH...
#> [19]   756 CCCTATGTTTGCAGGGAGTGTGG...---------------------- KF462496.1_cds_AH...
#> [20]   672 CCCTATGTTTGCAGGGAGTGTGG...GAGGACACATACAAGAGAGAAG AB843953.1_Mmm_MG686
##change settings as for the RepeatAlignment function
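
As a sketch of what changed settings could look like, the call below repeats the alignment with non-default weights. The argument names are taken from the usage section above. The specific values (windel = 4, wslippage = 2), the choice of the first 20 sequences, and the object names `split.reps`, `n.units`, `target`, and `alg.custom` are illustrative assumptions, not recommended settings.

```r
## load example sequence data and split into repeat units
## (same pattern and repeat length as in the example above)
data("mousePRDM9", package = "repeatR")
split.reps <- repeatR::splitRepByPattern(mousePRDM9[1:20], "PY", 84)
## pick the first repeat with the maximum number of repeat units as target
n.units <- sapply(split.reps$cds, length)
target <- split.reps$cds[[which.max(n.units)]]
## illustrative (assumed) weights: make indels and slippage more costly
alg.custom <- repeatR::ListRepeatAlignment(
    split.reps$cds, target,
    wmut = 1, windel = 4, wslippage = 2)
alg.custom
```

Comparing `alg.custom` with an alignment produced under the default weights is one way to see how sensitive the result is to these settings.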