Takes one or more STRUCTURE, TESS, BAPS, BASIC (numeric delimited runs) or CLUMPP format files and converts them to a qlist (list of dataframes).

readQ(files = NULL, filetype = "auto", indlabfromfile = FALSE, readci = FALSE)

Arguments

files

A character or character vector of one or more files.

filetype

A character indicating the input filetype. Options are 'auto', 'structure', 'tess2', 'baps', 'basic' or 'clumpp'. See details.

indlabfromfile

A logical indicating whether individual labels are read from the input file and used as row names for the resulting dataframe. Spaces in labels may be replaced with _. Currently only applicable to STRUCTURE runs.

readci

A logical indicating whether confidence intervals from the STRUCTURE run file (if available) should be read. Set to FALSE by default as it takes up extra space. This argument is only applicable to STRUCTURE run files.

Value

A list of lists with dataframes is returned. List items are named by input filenames. File extensions such as '.txt','.csv','.tsv' and '.meanQ' are removed from filename. In case filenames are missing or not available, lists are named sample1, sample2 etc. For STRUCTURE runs, if individual labels are present in the run file and indlabfromfile=TRUE, they are added to the dataframe as row names. Structure metadata including loci, burnin, reps, elpd, mvll, and vll is added as attributes to each dataframe. When readci=TRUE and if CI data is available in STRUCTURE run files, it is read in and attached as attribute named ci. For CLUMPP files, multiple runs within one file are suffixed by -1, -2 etc.
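The naming and attribute conventions described above can be sketched in base R. This is only an illustration and not readQ's own code: the file names, the regular expression, and the attribute values are assumptions made up for the example.

```r
# Hypothetical file names; these are not files shipped with pophelper
files <- c("runs/structure_01.txt", "runs/admix_K2.meanQ", "runs/basic.csv")

# Drop the directory, then the extensions mentioned in the text above
run_names <- sub("\\.(txt|csv|tsv|meanQ)$", "", basename(files))
run_names

# A stand-in for one run, with made-up metadata attached as attributes
df <- data.frame(Cluster1 = c(0.2, 0.4), Cluster2 = c(0.8, 0.6))
attr(df, "burnin") <- 10000L  # illustrative value only
attr(df, "reps") <- 20000L    # illustrative value only

# A one-element list named after the first file
qlist <- setNames(list(df), run_names[1])

# Metadata is retrieved per dataframe with attr() or attributes()
attr(qlist[[1]], "burnin")
```

With a real qlist, the same `attr()` and `attributes()` calls apply to each list element.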

Details

STRUCTURE, TESS2 and BAPS run files each have their own layout and format (see the vignette). BASIC files can be Admixture run files, fastStructure meanQ files or any tab-delimited, space-delimited or comma-delimited tabular data without a header. CLUMPP files can be COMBINED, ALIGNED or MERGED files. COMBINED files are generated from clumppExport. ALIGNED and MERGED files are generated by CLUMPP.
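To make the BASIC layout concrete, the sketch below writes a small headerless, space-delimited table to a temporary file and reads it back with base R. It does not call readQ; the values and the Cluster column names are assumptions chosen for illustration.

```r
# Write a headerless, space-delimited table to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("0.2 0.8", "0.4 0.6", "0.6 0.4"), tmp)

# Read it with base R (readQ is not used here)
df <- read.table(tmp, header = FALSE)

# Name the columns in the Cluster1, Cluster2, ... style (an assumption)
colnames(df) <- paste0("Cluster", seq_len(ncol(df)))

# Each row is one individual; assignment values typically sum to 1
rowSums(df)

# Clean up the temporary file
unlink(tmp)
```

A file of this shape, with one row per individual and one numeric column per cluster, is the kind of input the 'basic' filetype describes.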

To convert TESS3 R objects to pophelper qlist, see readQTess3.

See the vignette for more details.

Examples

# STRUCTURE files
sfiles <- list.files(path=system.file("files/structure",package="pophelper"),
                     full.names=TRUE)
# create a qlist of all runs
slist <- readQ(sfiles)
slist <- readQ(sfiles,filetype="structure")
# use ind names from file
slist <- readQ(sfiles[1],indlabfromfile=TRUE)
# access the first run (this reassigns slist to a single dataframe)
slist <- readQ(sfiles)[[1]]
# slist is now one run, so names() returns its cluster columns
names(slist)
#> [1] "Cluster1" "Cluster2"
# get attributes of the first column (a plain numeric vector, hence NULL)
attributes(slist[[1]])
#> NULL
# get attributes of every column
lapply(slist,attributes)
#> $Cluster1
#> NULL
#>
#> $Cluster2
#> NULL
#>
# TESS files
tfiles <- list.files(path=system.file("files/tess",package="pophelper"),
                     full.names=TRUE)
# create a qlist
tlist <- readQ(tfiles)

# BASIC files
afiles <- list.files(path=system.file("files/admixture",package="pophelper"),
                     full.names=TRUE)
# create a qlist
alist <- readQ(afiles)

# CLUMPP files
cfiles1 <- system.file("files/STRUCTUREpop_K4-combined.txt",
                       package="pophelper")
cfiles2 <- system.file("files/STRUCTUREpop_K4-combined-aligned.txt",
                       package="pophelper")
cfiles3 <- system.file("files/STRUCTUREpop_K4-combined-merged.txt",
                       package="pophelper")
# create a qlist
clist1 <- readQ(cfiles1)
clist2 <- readQ(cfiles2)
clist3 <- readQ(cfiles3)

# manually create qlist
df1 <- data.frame(Cluster1=c(0.2,0.4,0.6,0.2),Cluster2=c(0.8,0.6,0.4,0.8))
df2 <- data.frame(Cluster1=c(0.3,0.1,0.5,0.6),Cluster2=c(0.7,0.9,0.5,0.4))
# one-element qlist
q1 <- list("sample1"=df1)
str(q1)
#> List of 1
#>  $ sample1:'data.frame': 4 obs. of 2 variables:
#>   ..$ Cluster1: num [1:4] 0.2 0.4 0.6 0.2
#>   ..$ Cluster2: num [1:4] 0.8 0.6 0.4 0.8
# two-element qlist
q2 <- list("sample1"=df1,"sample2"=df2)
str(q2)
#> List of 2
#>  $ sample1:'data.frame': 4 obs. of 2 variables:
#>   ..$ Cluster1: num [1:4] 0.2 0.4 0.6 0.2
#>   ..$ Cluster2: num [1:4] 0.8 0.6 0.4 0.8
#>  $ sample2:'data.frame': 4 obs. of 2 variables:
#>   ..$ Cluster1: num [1:4] 0.3 0.1 0.5 0.6
#>   ..$ Cluster2: num [1:4] 0.7 0.9 0.5 0.4
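When a qlist is built by hand as above, a quick sanity check can catch malformed input before it is passed on. The helper below is a hypothetical sketch in base R and is not part of pophelper; the name check_qlist and the tolerance are assumptions.

```r
# Rebuild the two-element qlist from the example above
df1 <- data.frame(Cluster1=c(0.2,0.4,0.6,0.2),Cluster2=c(0.8,0.6,0.4,0.8))
df2 <- data.frame(Cluster1=c(0.3,0.1,0.5,0.6),Cluster2=c(0.7,0.9,0.5,0.4))
q2 <- list("sample1"=df1,"sample2"=df2)

# Hypothetical helper (not a pophelper function): for each run, check that
# it is an all-numeric dataframe whose rows sum to 1 within a tolerance
check_qlist <- function(q, tol = 1e-6) {
  stopifnot(is.list(q), !is.null(names(q)))
  vapply(q, function(d) {
    is.data.frame(d) &&
      all(vapply(d, is.numeric, logical(1))) &&
      all(abs(rowSums(d) - 1) < tol)
  }, logical(1))
}

# Returns one logical per run, named after the list items
check_qlist(q2)
```

The check returns a named logical vector, so a failing run can be identified by name rather than by position.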