GOSTAR Updates in 2020

GOSTAR is the largest manually annotated structure-activity relationship (SAR) database of small molecules published in leading medicinal chemistry journals and patents. It covers compounds from both discovery and development stages across all target families. Along with SAR, key properties such as ADME and toxicity are captured. This relational database lets users navigate and analyze a large body of small-molecule data to make informed decisions in the design and discovery of novel compounds.

Content coverage

The GOSTAR database content is drawn from a range of sources, including:

  • MedChem Journals
  • Patents
  • FDA/EMEA/PMDA Reports
  • Clinical Trial Registries
  • Scientific Reviews
  • Company Websites
  • Books
  • Conferences
  • Public Sources

Patents covered in 2020


Patent coverage in the GOSTAR database is comprehensive: content was indexed from more than 2,900 patents in 2020. To avoid redundancy, GOSTAR does not capture equivalent patents, i.e. the same patent published in multiple patent offices.
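
The brochure does not describe how equivalent patents are detected. Purely as an illustration of the idea, the sketch below keeps one record per patent family. The `family_id` field, the `dedupe_by_family` helper, and the sample publication labels are all hypothetical and are not part of GOSTAR.

```python
# Illustrative only: "family_id" is an assumed field that groups equivalent
# filings of the same invention across patent offices. It is not a GOSTAR field.

def dedupe_by_family(records):
    """Keep the first record seen for each patent family, preserving order."""
    seen = set()
    unique = []
    for rec in records:
        if rec["family_id"] not in seen:
            seen.add(rec["family_id"])
            unique.append(rec)
    return unique

# Made-up sample records; the publication labels are placeholders.
patents = [
    {"publication": "EXAMPLE-US-1", "family_id": "F1"},
    {"publication": "EXAMPLE-EP-1", "family_id": "F1"},  # same family as above
    {"publication": "EXAMPLE-WO-2", "family_id": "F2"},
]

unique_patents = dedupe_by_family(patents)
for rec in unique_patents:
    print(rec["publication"])
```

Keeping the first record per family is an arbitrary choice made for this sketch; a real curation workflow might prefer a particular office or the earliest filing.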

Preclinical candidates covered in 2020


In 2020, the GOSTAR database was enriched with 1,500+ preclinical compounds targeting indications such as COVID-19, non-alcoholic steatohepatitis (NASH), hepatitis virus infections, HIV infections, cardiovascular diseases, and various cancers.


A few of the significant drug candidates added in 2020 were:

  • EPV-COV19
  • FT-8225
  • VNRX-9945
  • CARG-201
  • S-540956
  • BMS-818251
  • BRII-732
  • CR-13626
  • NAB815
  • CV730
  • GLPG-4124
  • IDG-16177

Target space covered in 2020 updates


Content was updated for more than 2,500 protein targets in 2020. EGFR content was updated from 200+ references, the adenosine A2A receptor from 86 references, and KRAS from 54 references, while NOTCH made it into the top 20 with around 4.7K compounds covered from a single reference (Table 2).

Distribution of SAR content

Of the 1.2 million SAR rows added to GOSTAR, functional in vitro and in vivo data contribute 41.25%, binding data constitute 32.28%, and ADME properties make up 6.69%.

Approximately 2% of the content concerns toxicity properties of the compounds covered in 2020, and the remaining 17% or so represents other property types, including physicochemical properties.
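
To give a feel for what these shares mean in absolute terms, the short sketch below converts the quoted percentages into approximate row counts. It assumes the rounded total of 1.2 million rows, so every count is an estimate rather than a reported figure. The category labels and the `estimate_rows` helper are illustrative names.

```python
# Estimated SAR row counts per category, derived from the quoted percentages.
# TOTAL_ROWS is the rounded "1.2 million" figure, so results are approximate.
TOTAL_ROWS = 1_200_000

shares_percent = {
    "functional (in vitro and in vivo)": 41.25,
    "binding": 32.28,
    "ADME": 6.69,
    "toxicity": 2.0,                          # quoted as "approximately 2%"
    "other (incl. physicochemical)": 17.0,    # quoted as the remaining ~17%
}

def estimate_rows(total, shares):
    """Return an estimated row count for each category."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}

estimated = estimate_rows(TOTAL_ROWS, shares_percent)
covered_percent = round(sum(shares_percent.values()), 2)

for name, rows in estimated.items():
    print(f"{name}: ~{rows:,} rows")
print(f"quoted shares sum to {covered_percent}%")
```

The quoted shares sum to slightly under 100% because the toxicity and "other" figures are rounded in the text.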

