GenBank flat files (.gb/.gbk/.gbff) and their ENA and DDBJ equivalents have a
particularly gruesome format. That's why read_gbk() is just a wrapper around a
Perl-based gb2gff converter and read_gff3().
read_gbk(file, sources = NULL, types = NULL, infer_cds_parents = TRUE)
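A minimal sketch of a typical call, based only on the usage line above. The path is hypothetical, and the assumption that types accepts a character vector is inferred from its description rather than confirmed. The call is guarded so the snippet does nothing when gggenomes or the file is missing.

```r
# Hypothetical path; replace with a real GenBank flat file.
gbk_path <- "my_genome.gbk"

# Guarded so the sketch is a no-op when gggenomes or the file is absent.
if (requireNamespace("gggenomes", quietly = TRUE) && file.exists(gbk_path)) {
  # Keep only gene and CDS features (assumes `types` takes a character vector).
  feats <- gggenomes::read_gbk(gbk_path, types = c("gene", "CDS"))
  print(head(feats))
}
```

Because the function wraps a Perl-based converter, a working Perl installation is presumably needed for the call to succeed.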
file: Either a path to a file, a connection, or literal data (either a single string or a raw vector).
Files ending in .gz, .bz2, .xz, or .zip will be automatically uncompressed.
Files starting with http://, https://, ftp://, or ftps:// will be automatically
downloaded. Remote gz files can also be automatically downloaded and decompressed.
Literal data is most useful for examples and tests. To be recognised as
literal data, the input must be either wrapped with I(), be a string
containing at least one new line, or be a vector containing at least one
string with a new line.
Using a value of clipboard() will read from the system clipboard.
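The literal-data rule described above can be sketched in base R. The helper looks_like_literal() is hypothetical and written only for illustration; it is not part of any package, and it ignores raw vectors and connections for brevity.

```r
# Hypothetical helper mirroring the rule in the text; not a package function.
looks_like_literal <- function(x) {
  inherits(x, "AsIs") ||                                    # wrapped with I()
    (is.character(x) && any(grepl("\n", x, fixed = TRUE)))  # has a newline
}

looks_like_literal("genome.gbk")             # plain path
looks_like_literal(I("genome.gbk"))          # wrapped with I()
looks_like_literal("LOCUS  X\nFEATURES\n")   # single string with newlines
looks_like_literal(c("a.gbk", "x\ny"))       # vector, one element has a newline
```

Under this rule a plain path is treated as a file name, while the other three inputs count as literal data.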
sources: only return features from these sources
types: only return features of these types, e.g. gene, CDS, ...
infer_cds_parents: infer the mRNA parent for CDS features based on overlapping coordinates. Default TRUE for gff2/gtf, FALSE for gff3. Most GFF files set the parent properly, but sometimes this information is missing. That is usually not a problem; however, geom_gene calls parse the parent information to determine which CDS and mRNAs are part of the same gene model. Without the parent info, mRNA and CDS are plotted as individual features.
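To make the idea of inferring parents from overlapping coordinates concrete, here is a small base-R sketch on a toy table. The column names and values are made up, and the simple "first overlapping mRNA on the same sequence" rule is an assumption for illustration; the package's actual inference may differ, for example in how it handles strand or ambiguous overlaps.

```r
# Toy feature table; columns and values are made up for illustration.
feats <- data.frame(
  feat_id = c("mrna1", "mrna2", "cds1", "cds2"),
  type    = c("mRNA", "mRNA", "CDS", "CDS"),
  seq_id  = c("chr1", "chr1", "chr1", "chr1"),
  start   = c(100, 1000, 150, 1100),
  end     = c(500, 1500, 450, 1400),
  stringsAsFactors = FALSE
)

mrna <- feats[feats$type == "mRNA", ]
cds  <- feats[feats$type == "CDS", ]

# For each CDS, pick the first mRNA on the same sequence that overlaps it.
find_parent <- function(i) {
  hit <- mrna$seq_id == cds$seq_id[i] &
    mrna$start <= cds$end[i] &
    mrna$end >= cds$start[i]
  if (any(hit)) mrna$feat_id[which(hit)[1]] else NA_character_
}
cds$parent <- vapply(seq_len(nrow(cds)), find_parent, character(1))

cds[, c("feat_id", "parent")]
```

With a parent assigned to every CDS, features can be grouped by parent into gene models rather than drawn as unrelated pieces.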