This package provides an R wrapper for the UniProt website REST API.

Installation

Install the latest version from R-universe:

install.packages("uniprotREST", repos = "https://csdaw.r-universe.dev")

Or install the latest development version from GitHub:

remotes::install_github("csdaw/uniprotREST")

Documentation

Read the full documentation here.

Quick start

ID mapping with uniprot_map()

Map to/from UniProt IDs. This function wraps the ID mapping API endpoint.

# Proteins of interest (from 3 different taxa)
ids <- c("P99999", "P12345", "P23456")

# Get accessions, gene names and sequence lengths
result <- uniprot_map(
  ids = ids, 
  from = "UniProtKB_AC-ID",
  to = "UniProtKB",
  format = "tsv",
  fields = c("accession", "gene_primary", "length")
)
## Running job: 90889074746482f892eea3ec45ebac4779dbc1c6 
## Checking job status...
## Job complete!
##  Downloading: page 1 of 1
result
##     From  Entry Gene.Names..primary. Length
## 1 P99999 P99999                 CYCS    105
## 2 P12345 P12345                 GOT2    430
## 3 P23456 P23456                    L   2151
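
Because the result is an ordinary data frame whose From column holds the submitted IDs, it can be joined back onto your own data. The sketch below is illustrative only: it rebuilds the table printed above by hand so it runs without a network call, and the abundance table and the shortened column names are hypothetical.

```r
# Hand-made copy of the mapping result shown above (no API call is made here)
result <- data.frame(
  From = c("P99999", "P12345", "P23456"),
  Entry = c("P99999", "P12345", "P23456"),
  Gene.Names..primary. = c("CYCS", "GOT2", "L"),
  Length = c(105L, 430L, 2151L),
  stringsAsFactors = FALSE
)

# Shorter column names (our own choice, not something the package sets)
names(result) <- c("query_id", "accession", "gene", "length")

# Hypothetical measurements keyed by the IDs that were submitted
abundance <- data.frame(
  query_id = c("P12345", "P99999", "P23456"),
  intensity = c(12.1, 15.3, 9.8),
  stringsAsFactors = FALSE
)

# Look up each measurement's row in the mapping result and copy annotations over
idx <- match(abundance$query_id, result$query_id)
abundance$gene <- result$gene[idx]
abundance$length <- result$length[idx]

abundance
```

match() keeps the row order of the abundance table, which merge() does not guarantee.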

Searching with uniprot_search()

Perform text searches against UniProt databases. This function wraps the Query API endpoint.

# Get human glycoproteins less than 100 amino acids long
result <- uniprot_search(
  query = "(proteome:UP000005640) AND (keyword:KW-0325) AND (length<100)",
  database = "uniprotkb",
  format = "tsv",
  fields = c("accession", "gene_primary")
)
##  Downloading: page 1 of 1
head(result)
##    Entry Gene.Names..primary.
## 1 P06028                 GYPB
## 2 P80098                 CCL7
## 3 Q16627                CCL14
## 4 P0DMC3                APELA
## 5 P25063                 CD24
## 6 P31358                 CD52
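
The query string in the example above is a set of parenthesised clauses joined with AND. When the clauses come from variables, it may be easier to assemble the string in code. The helper below, build_query(), is a hypothetical convenience function and not part of uniprotREST. It only builds the string and does not check that each clause is valid UniProt query syntax.

```r
# Hypothetical helper: wrap each clause in parentheses and join with AND
build_query <- function(...) {
  clauses <- c(...)
  paste0("(", clauses, ")", collapse = " AND ")
}

# Rebuild the query used in the search example above
query <- build_query(
  "proteome:UP000005640",
  "keyword:KW-0325",
  "length<100"
)

query
```

The resulting string could then be passed as the query argument of uniprot_search().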

Retrieving an entry with uniprot_single()

Download the full entry for a single protein. This function wraps the Retrieve API endpoint.

# Human cytochrome c
result <- uniprot_single(
  id = "P99999",
  database = "uniprotkb",
  format = "json",
  verbosity = 0
)

str(result, max.level = 1)
## List of 17
##  $ entryType               : chr "UniProtKB reviewed (Swiss-Prot)"
##  $ primaryAccession        : chr "P99999"
##  $ secondaryAccessions     :List of 6
##  $ uniProtkbId             : chr "CYC_HUMAN"
##  $ entryAudit              :List of 5
##  $ annotationScore         : num 5
##  $ organism                :List of 4
##  $ proteinExistence        : chr "1: Evidence at protein level"
##  $ proteinDescription      :List of 1
##  $ genes                   :List of 1
##  $ comments                :List of 10
##  $ features                :List of 36
##  $ keywords                :List of 14
##  $ references              :List of 19
##  $ uniProtKBCrossReferences:List of 179
##  $ sequence                :List of 5
##  $ extraAttributes         :List of 3
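
With format = "json" the entry comes back as a nested list, so individual values are reached with $ and [[ ]]. The sketch below uses a small hand-written stand-in for the entry so it runs offline. The top-level names come from the str() output above, but the nested names (scientificName, geneName$value, sequence$length) are assumptions about the UniProt JSON layout and are worth checking against a real result with str().

```r
# Hand-written stand-in for a small part of the list returned above.
# Nested field names are assumed, not taken from the output shown.
entry <- list(
  primaryAccession = "P99999",
  uniProtkbId = "CYC_HUMAN",
  organism = list(scientificName = "Homo sapiens", taxonId = 9606),
  genes = list(list(geneName = list(value = "CYCS"))),
  sequence = list(length = 105)
)

# Flatten a few fields of interest into a one-row data frame
summary_row <- data.frame(
  accession = entry$primaryAccession,
  entry_name = entry$uniProtkbId,
  organism = entry$organism$scientificName,
  gene = entry$genes[[1]]$geneName$value,  # first listed gene only
  length = entry$sequence$length,
  stringsAsFactors = FALSE
)

summary_row
```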

Metadata