R/read.R, R/read_feats.R, R/read_seqs.R
read_tracks.Rd
Convenience functions to read sequences, features or links from various
bioinformatics file formats, such as FASTA, GFF3, Genbank, BLAST tabular
output, etc. See def_formats() for the full list. File formats and the
corresponding read-functions are automatically determined based on file
extensions. All these functions can read multiple files in the same format at
once, and combine them into a single table - useful, for example, to read a
folder of gff-files with each file containing genes of a different genome.
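The idea of reading several same-format files into one table with an identifying column can be sketched in base R. This is only an illustration of the concept, not the gggenomes implementation: `read_all()` is a hypothetical helper, the tab-separated toy files stand in for real annotation files, and using the file name without its extension as the ID is an assumption made for this example.

```r
# Two tiny tab-separated files standing in for per-genome annotation files
dir <- tempfile("feats_")
dir.create(dir)
writeLines(c("seq_id\tstart\tend", "contig_a\t1\t300"),
           file.path(dir, "genome_a.tsv"))
writeLines(c("seq_id\tstart\tend", "contig_b\t50\t500", "contig_b\t600\t900"),
           file.path(dir, "genome_b.tsv"))

files <- list.files(dir, pattern = "\\.tsv$", full.names = TRUE)

# Hypothetical helper: read each file, tag its rows with an ID derived from
# the file name, and stack everything into a single table
read_all <- function(files, .id = "file_id") {
  tables <- lapply(files, function(f) {
    tab <- read.delim(f, stringsAsFactors = FALSE)
    tab[[.id]] <- tools::file_path_sans_ext(basename(f))
    # Move the ID column to the front
    tab[c(.id, setdiff(names(tab), .id))]
  })
  do.call(rbind, tables)
}

feats <- read_all(files)
print(feats)
```

With gggenomes installed, the analogous call on a folder of GFF files would be `read_feats(files)`, which also picks the parser from the file extension.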
read_feats(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_subfeats(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_links(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_sublinks(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_seqs(
  files,
  .id = "file_id",
  format = NULL,
  parser = NULL,
  parse_desc = TRUE,
  ...
)
files: files to read. All files should be of the same format. In many cases,
compressed files (.gz, .bz2, .xz, or .zip) are supported.
Similarly, automatic download of remote files starting with http(s):// or
ftp(s):// works in most cases.
.id: name of the column that stores the file each record was read from. Defaults to "file_id". Set to "bin_id" if every file represents a different bin.
format: specify a format known to gggenomes, such as gff3, gbk, ...
to override automatic determination based on the file extension (see
def_formats() for the full list).
parser: specify the name of an R function to override automatic
determination based on format, e.g. parser="read_tsv".
...: additional arguments passed on to the format-specific read function called down the line.
parse_desc: turn key=some value pairs from seq_desc into key-named
columns and remove them from seq_desc.
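To make the effect of parse_desc more concrete, here is a rough base-R sketch of the transformation it describes. It is not the gggenomes code: `parse_desc_sketch()` is a hypothetical function, the example descriptions are made up, and the rule that a value runs until the next `key=` token (so values may contain spaces) is an assumption about how such pairs are delimited.

```r
# Toy sequence table with free-text descriptions containing key=value pairs
seqs <- data.frame(
  seq_id = c("s1", "s2"),
  seq_desc = c("Escherichia phage strain=T4 host=E. coli K12", "strain=lambda"),
  stringsAsFactors = FALSE
)

# Hypothetical sketch: move key=value pairs out of seq_desc into columns
parse_desc_sketch <- function(seqs) {
  # A pair starts at a "key=" token and runs until the next "key=" token
  # or the end of the description
  pattern <- "([^\\s=]+)=(.*?)(?=\\s+[^\\s=]+=|$)"
  for (i in seq_len(nrow(seqs))) {
    desc <- seqs$seq_desc[i]
    pairs <- regmatches(desc, gregexpr(pattern, desc, perl = TRUE))[[1]]
    for (pair in pairs) {
      key <- sub("=.*$", "", pair)
      value <- sub("^[^=]+=", "", pair)
      # Create the key-named column on first use, filled with NA
      if (!key %in% names(seqs)) seqs[[key]] <- NA_character_
      seqs[[key]][i] <- value
    }
    # Drop the parsed pairs from the description and trim leftover whitespace
    seqs$seq_desc[i] <- trimws(gsub(pattern, "", desc, perl = TRUE))
  }
  seqs
}

parsed <- parse_desc_sketch(seqs)
print(parsed)
```

The sketch ignores edge cases such as a key that collides with an existing column name like seq_id; how read_seqs() handles those is not covered here.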
A gggenomes-compatible sequence, feature or link tibble
read_feats(): read files as features mapping onto sequences.
read_subfeats(): read files as subfeatures mapping onto other features.
read_links(): read files as links connecting sequences.
read_sublinks(): read files as sublinks connecting features.
read_seqs(): read sequence ID, description and length.