This function wraps the UniProt ID Mapping service, which maps the
identifiers used in one database to the identifiers of another. By default
it maps UniProtKB accessions/IDs to UniProtKB and returns a data.frame with
metadata about the mapped protein accessions. You can also map IDs from/to
other databases, e.g. from = "Ensembl", to = "UniProtKB".
Usage
uniprot_map(
  ids,
  from = "UniProtKB_AC-ID",
  to = "UniProtKB",
  format = "tsv",
  path = NULL,
  fields = NULL,
  isoform = NULL,
  method = "paged",
  page_size = 500,
  compressed = NULL,
  verbosity = NULL,
  dry_run = FALSE
)
Arguments
- ids
  character, vector of identifiers to map from. Should not contain duplicates. Maximum length = 100,000 ids.
- from
  string, database to map from. Default is "UniProtKB_AC-ID". See from_to_dbs for the possible databases whose identifiers you can map from.
- to
  string, database to map to. Default is "UniProtKB". See from_to_rules for the possible databases you can map to, depending on the from database.
- format
  string, data format to fetch. Default is "tsv". Can be one of "tsv" or "fasta".
- path
  string (optional), file path to save the results, e.g. "path/to/results.tsv".
- fields
  character (optional), fields (i.e. columns) of data to get. Only used if to is a UniProtKB, UniRef, or UniParc database. See return_fields for all available fields.
- isoform
  logical (optional), should protein isoforms be included in the results? Not necessarily relevant for all formats and databases.
- method
  string, download method to use. Either "paged" (default) or "stream". Paged is more robust to connection issues and takes less memory. Stream may be faster, but uses more memory and is more sensitive to connection issues.
- page_size
  integer (optional), how many entries per page to request? Only relevant if method = "paged". It's best to leave this at 500.
- compressed
  logical (optional), should gzipped data be requested? Only relevant if method = "stream" and path is specified.
- verbosity
  integer (optional), how much information to print?
  0: no output
  NULL (default): minimal output
  1: show request headers
  2: show request headers and bodies
  3: show request headers, bodies, and curl status messages
- dry_run
  logical, perform request with httr2::req_dry_run()? Requires the httpuv package to be installed. Default is FALSE.
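Because ids should contain no duplicates and at most 100,000 entries, it can help to tidy the vector before calling uniprot_map(). The following is a minimal sketch of one way to do that; prepare_ids() is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): de-duplicate identifiers,
# drop missing or empty values, and split the rest into batches that respect
# the documented 100,000-id limit.
prepare_ids <- function(ids, max_n = 100000L) {
  ids <- unique(as.character(ids))
  ids <- ids[!is.na(ids) & nzchar(ids)]
  if (length(ids) == 0L) {
    return(list())
  }
  # unname() so that batches are addressed by position only
  unname(split(ids, ceiling(seq_along(ids) / max_n)))
}

# Small demonstration with an artificially low limit of 2 ids per batch
batches <- prepare_ids(c("P99999", "P12345", "P99999", "Q9Y6K9"), max_n = 2L)

if (FALSE) {
  # Assumption: each batch is mapped separately with the default "tsv" format,
  # so every call returns a data.frame with the same columns.
  results <- lapply(batches, uniprot_map)
  mapped <- do.call(rbind, results)
}
```

Splitting into batches is only needed for very large inputs; for typical use, passing unique(ids) directly is likely enough.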
Value
By default, returns an object whose type depends on format:
- tsv: data.frame
- fasta: Biostrings::AAStringSet (or named character if Biostrings is not installed)
If path is specified, saves the results to the file path indicated and
returns NULL invisibly. If dry_run = TRUE, returns a list containing
information about the request, including the request method, path, and
headers.
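The return behaviour described above can be sketched as follows. This is an illustration only: it assumes uniprot_map() is provided by the uniprotREST package, that a file saved with format = "tsv" is tab-separated with a header row, and it uses a temporary file name (out_file) chosen for the example. The calls are wrapped in if (FALSE) because they need network access.

```r
# Illustrative output location; any writable path would do
out_file <- file.path(tempdir(), "mapping.tsv")

if (FALSE) {
  # Assumption: uniprot_map() comes from the uniprotREST package
  library(uniprotREST)

  # With `path` set, results are written to disk and NULL is returned invisibly
  uniprot_map("P99999", format = "tsv", path = out_file)
  mapped <- read.delim(out_file)  # assumes a TSV file with a header row

  # With `dry_run = TRUE`, a list describing the request is returned instead
  # (requires the httpuv package)
  req <- uniprot_map("P99999", dry_run = TRUE)
  req$method
  names(req$headers)
}
```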
See also
Other API wrapper functions: uniprot_search(), uniprot_single()
Examples
if (FALSE) {
  # Default, get info about UniProt IDs
  uniprot_map(
    "P99999",
    format = "tsv",
    fields = c("accession", "gene_primary", "feature_count")
  )

  # Other common use, mapping other IDs to UniProt
  # (or vice-versa)
  uniprot_map(
    c("ENSG00000088247", "ENSG00000162613"),
    from = "Ensembl",
    to = "UniProtKB"
  )
}