Convenience functions to read sequences, features or links from various bioinformatics file formats, such as FASTA, GFF3, GenBank and BLAST tabular output. See def_formats() for the full list. The file format and the corresponding read function are determined automatically from the file extension. All of these functions can read multiple files of the same format at once and combine them into a single table. This is useful, for example, to read a folder of GFF files in which each file contains the genes of a different genome.

read_feats(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_subfeats(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_links(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_sublinks(files, .id = "file_id", format = NULL, parser = NULL, ...)

read_seqs(files, .id = "file_id", format = NULL, parser = NULL, parse_desc = TRUE, ...)



files to read. All files should be of the same format. In many cases, compressed files (.gz, .bz2, .xz, or .zip) are supported. Similarly, automatic download of remote files starting with http(s):// or ftp(s):// works in most cases.


the column with the name of the file a record was read from. Defaults to "file_id". Set to "bin_id" if every file represents a different bin.
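
The effect of the ID column can be illustrated with a small base R sketch. This is not how gggenomes implements it; `read_many()` and the two example files are hypothetical and only show how several files of one format end up in a single table, with each row tagged by the file it came from.

```r
# Hypothetical helper, not part of gggenomes: read several tab-separated
# files and stack them, recording the source file in an ID column.
read_many <- function(files, .id = "file_id") {
  tables <- lapply(files, function(f) {
    tab <- read.delim(f, header = TRUE, stringsAsFactors = FALSE)
    # Use the file name without directory and extension as the ID.
    id <- tools::file_path_sans_ext(basename(f))
    tab[[.id]] <- rep(id, nrow(tab))
    # Move the ID column to the front.
    tab[c(.id, setdiff(names(tab), .id))]
  })
  do.call(rbind, tables)
}

# Two small example files, one per genome (made-up data).
dir <- tempfile("genomes")
dir.create(dir)
writeLines(c("seq_id\tstart\tend", "contigA\t1\t100", "contigA\t150\t300"),
           file.path(dir, "genome1.tsv"))
writeLines(c("seq_id\tstart\tend", "contigB\t20\t80"),
           file.path(dir, "genome2.tsv"))

files <- list.files(dir, pattern = "\\.tsv$", full.names = TRUE)
feats <- read_many(files, .id = "bin_id")
print(feats)
```

Passing .id = "bin_id" here mirrors the case where every file represents a different bin.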


specify a format known to gggenomes, such as gff3, gbk, ... to override the automatic determination based on the file extension (see def_formats() for the full list).
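
Extension-based detection can be pictured roughly as follows. This is a minimal sketch under stated assumptions: `guess_format()` and the `ext_to_format` table are hypothetical and illustrative only; the mapping gggenomes actually uses is the one reported by def_formats().

```r
# Hypothetical extension-to-format table; gggenomes keeps its own list,
# see def_formats(). The entries below are illustrative only.
ext_to_format <- c(gff = "gff3", gff3 = "gff3", gbk = "gbk",
                   fa = "fasta", fasta = "fasta")

guess_format <- function(files) {
  # Strip a trailing compression suffix first, e.g. "genes.gff.gz" -> "genes.gff".
  bare <- sub("\\.(gz|bz2|xz|zip)$", "", basename(files))
  ext <- tolower(tools::file_ext(bare))
  fmt <- unname(ext_to_format[ext])
  if (anyNA(fmt)) {
    stop("unknown extension: ", paste(unique(ext[is.na(fmt)]), collapse = ", "))
  }
  # All files in one call are expected to share a format.
  if (length(unique(fmt)) > 1) stop("files must all be of the same format")
  fmt[1]
}

fmt <- guess_format(c("a/genome1.gff", "b/genome2.gff3.gz"))
print(fmt)
```

When the extension is missing or misleading, setting the format explicitly sidesteps this kind of lookup altogether.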


specify the name of an R function to override the automatic determination based on the format, e.g. parser="read_tsv".


additional arguments passed on to the format-specific read function called down the line.


turn key=some value pairs from seq_desc into key-named columns and remove them from seq_desc.
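
What this kind of description parsing does can be sketched in base R. The function `parse_desc_sketch()` and the example records are hypothetical, and the exact rules gggenomes applies (for instance which characters are allowed in keys) may differ; the sketch only assumes that a value runs until the next key= or the end of the description.

```r
# Hypothetical sketch, not the gggenomes implementation.
parse_desc_sketch <- function(seqs) {
  # A key, "=", then a value that lazily extends to the next " key=" or the end.
  pattern <- "\\b[A-Za-z_][A-Za-z0-9_]*=.*?(?=\\s+[A-Za-z_][A-Za-z0-9_]*=|$)"
  hits <- regmatches(seqs$seq_desc, gregexpr(pattern, seqs$seq_desc, perl = TRUE))
  for (i in seq_along(hits)) {
    for (pair in hits[[i]]) {
      key <- sub("=.*$", "", pair)
      value <- sub("^[^=]*=", "", pair)
      # Create the key-named column on first use, filled with NA.
      if (is.null(seqs[[key]])) seqs[[key]] <- NA_character_
      seqs[[key]][i] <- value
    }
  }
  # Remove the parsed pairs from the description and tidy up whitespace.
  seqs$seq_desc <- trimws(gsub(pattern, "", seqs$seq_desc, perl = TRUE))
  seqs
}

# Made-up example records.
seqs <- data.frame(
  seq_id = c("s1", "s2"),
  seq_desc = c("Escherichia phage strain=K 12 topology=linear", "no pairs here"),
  stringsAsFactors = FALSE
)
parsed <- parse_desc_sketch(seqs)
print(parsed)
```

Records without any key=value pairs keep their description unchanged and get NA in the new columns.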


A gggenomes-compatible sequence, feature or link tibble


  • read_feats(): read files as features mapping onto sequences.

  • read_subfeats(): read files as subfeatures mapping onto other features.

  • read_links(): read files as links connecting sequences.

  • read_sublinks(): read files as sublinks connecting features.

  • read_seqs(): read sequence ID, description and length.
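
A typical call might look like the following sketch. The directory path is a placeholder, and the calls are guarded so that the snippet does nothing when gggenomes is not installed or no files are found. Only arguments documented above are used.

```r
# Hypothetical directory with one GFF file per genome; adjust to your data.
gff_dir <- "path/to/gffs"
gff_files <- list.files(gff_dir, pattern = "\\.gff$", full.names = TRUE)

if (requireNamespace("gggenomes", quietly = TRUE) && length(gff_files) > 0) {
  # The format is inferred from the .gff extension; the source file of each
  # record is stored in a column named bin_id instead of the default file_id.
  genes <- gggenomes::read_feats(gff_files, .id = "bin_id")

  # Set the format explicitly if the extension is unusual or ambiguous.
  genes <- gggenomes::read_feats(gff_files, format = "gff3")
}
```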
