Convenience functions to read sequences, features or links from various bioinformatics file formats, such as FASTA, GFF3, GenBank, and BLAST tabular output. See def_formats() for the full list. File formats and the corresponding read functions are determined automatically from file extensions. All these functions can read multiple files of the same format at once and combine them into a single table. This is useful, for example, for reading a folder of GFF files in which each file contains the genes of a different genome.

read_feats(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_subfeats(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_links(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_sublinks(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_seqs(
  files,
  .id = "file_id",
  format = NULL,
  parser = NULL,
  parse_desc = TRUE,
  ...
)

Arguments

files

files to read. All files should be of the same format. In many cases, compressed files (.gz, .bz2, .xz, or .zip) are supported. Similarly, automatic download of remote files starting with http(s):// or ftp(s):// works in most cases.

.id

the column with the name of the file a record was read from. Defaults to "file_id". Set to "bin_id" if every file represents a different bin.
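To illustrate what the .id column holds, here is a minimal base-R sketch of reading several files into one table and tagging each row with its source file. It is not how gggenomes implements this: read_with_id() is a hypothetical helper, the toy files are plain tab-separated tables, and deriving the label from the file name without its extension is an assumption.

```r
# Conceptual sketch only: this is NOT gggenomes' implementation.
# read_with_id() is a hypothetical helper that reads several tab-separated
# files and records each row's source file in an id column.
read_with_id <- function(files, .id = "file_id") {
  tables <- lapply(files, function(f) {
    x <- read.delim(f, header = TRUE, stringsAsFactors = FALSE)
    # Assumption: label = file name without directory and extension
    label <- tools::file_path_sans_ext(basename(f))
    x[[.id]] <- rep(label, nrow(x))
    # Put the id column first
    x[c(.id, setdiff(names(x), .id))]
  })
  do.call(rbind, tables)
}

# Two toy "genomes", one file each
dir <- tempfile("genes_")
dir.create(dir)
writeLines(c("seq_id\tstart\tend", "chrA\t1\t90", "chrA\t120\t300"),
           file.path(dir, "genome_a.tsv"))
writeLines(c("seq_id\tstart\tend", "chrB\t5\t60"),
           file.path(dir, "genome_b.tsv"))

files <- sort(list.files(dir, pattern = "\\.tsv$", full.names = TRUE))

# Every file is its own bin, so name the id column "bin_id"
genes <- read_with_id(files, .id = "bin_id")
print(genes)
```

With the read functions documented here, the equivalent choice would be made by passing .id = "bin_id".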

format

specify a format known to gggenomes, such as gff3 or gbk, to override the automatic determination based on the file extension (see def_formats() for the full list).
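The idea of extension-based format detection can be sketched in base R as follows. This is only an illustration under stated assumptions: guess_format() and the ext_to_format lookup table are hypothetical and incomplete, and the real mappings are whatever def_formats() reports.

```r
# Conceptual sketch only; the lookup table below is hypothetical and
# incomplete. In gggenomes, def_formats() lists the real mappings.
ext_to_format <- c(gff = "gff3", gff3 = "gff3", gbk = "gbk")

guess_format <- function(file) {
  # Drop a trailing compression suffix first, e.g. genes.gff.gz -> genes.gff
  stem <- sub("\\.(gz|bz2|xz|zip)$", "", basename(file))
  ext <- tolower(tools::file_ext(stem))
  # Unknown or missing extensions yield NA
  unname(ext_to_format[ext])
}

guess_format(c("data/genes.gff.gz", "plasmid.GBK", "notes.txt"))
```

When a file carries an extension that does not match its content, setting format explicitly sidesteps this kind of guessing.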

parser

specify the name of an R function to override the automatic determination based on format, e.g. parser="read_tsv".

...

additional arguments passed on to the format-specific read function called down the line.

parse_desc

turn key=some value pairs from seq_desc into key-named columns and remove them from seq_desc.
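The effect of parse_desc can be pictured with a small base-R sketch. It is not the gggenomes parser: parse_desc_sketch() is a hypothetical helper, the toy table is made up, and the sketch simplifies by assuming that values contain no whitespace.

```r
# Conceptual sketch only, not gggenomes' parser.
# Simplification: a value ends at the first whitespace character.
parse_desc_sketch <- function(seqs) {
  pattern <- "([[:alnum:]_]+)=([^[:space:]]+)"
  for (i in seq_len(nrow(seqs))) {
    desc <- seqs$seq_desc[i]
    pairs <- regmatches(desc, gregexpr(pattern, desc))[[1]]
    for (p in pairs) {
      key <- sub("=.*$", "", p)       # text before the first "="
      value <- sub("^[^=]*=", "", p)  # text after the first "="
      # Create the key-named column on first use
      if (!key %in% names(seqs)) seqs[[key]] <- NA_character_
      seqs[[key]][i] <- value
    }
    # Remove the parsed pairs from the description and tidy whitespace
    rest <- gsub(pattern, "", desc)
    seqs$seq_desc[i] <- trimws(gsub("[[:space:]]+", " ", rest))
  }
  seqs
}

seqs <- data.frame(
  seq_id = c("s1", "s2"),
  seq_desc = c("Escherichia coli plasmid=pA topology=circular",
               "unannotated contig"),
  stringsAsFactors = FALSE
)

parsed <- parse_desc_sketch(seqs)
print(parsed)
```

Rows without any key=value pairs keep their description unchanged and get NA in the new columns.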

Value

A gggenomes-compatible sequence, feature or link tibble

Functions

  • read_feats(): read files as features mapping onto sequences.

  • read_subfeats(): read files as subfeatures mapping onto other features.

  • read_links(): read files as links connecting sequences.

  • read_sublinks(): read files as sublinks connecting features.

  • read_seqs(): read sequence ID, description and length.

Examples