GOSTAR Updates: The Latest Feature Enhancements

Modern-day medicinal and computational chemists thrive in a data-rich environment, constantly seeking to uncover untapped potentials from the literature. This journey involves navigating the complex process of sifting through extensive datasets. In the ever-evolving realm of drug discovery, data-driven advancements continuously reshape the landscape, deepening our understanding of the unique challenges that emerge. One such challenge is identifying the right datasets from a vast sea of information.

We are delighted to share the latest updates and improvements in GOSTAR. We’ve been working hard behind the scenes, listening to your feedback, and adding new features to enhance your experience. In this article, we’ll take a closer look at these updates that we believe will elevate your experience with GOSTAR.

1) Improved user interface of GOSTAR leaflet

Our main objective was to make the interface simpler, allowing you to easily navigate through different sections and effortlessly explore the GOSTAR leaflet. We have also taken great care in removing any unnecessary data, ensuring that the content you receive is concise and relevant. In addition, this update brings improved performance with faster loading times and smoother interactions, enhancing your overall browsing experience.

2) Secured & Centralized Data Requests within GOSTAR UI

Bid farewell to the tedious task of sending manual emails and enduring long waiting periods. Our most recent update simplifies the entire process, enabling you to effortlessly request and download data directly within the GOSTAR UI. You can monitor your progress, receive immediate email notifications, and retain complete control over your user preferences.

3) Enhanced Data Filtering in the Result section

This new feature lets you refine your search results by applying activity parameters such as Activity Prefix, Biological Source, Cells Cell-line Organ, Unit of Measurement, Activity Type, Activity Value, and Micromolar value. Streamlined searches and precise customization make it easier to surface valuable insights. With these fixed filters, you can spend less time on irrelevant data and obtain meaningful results that drive your research forward.
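To make the idea concrete, here is a minimal sketch of how filtering on activity parameters might look once results have been exported. It is an illustration only, not GOSTAR's implementation: the `ActivityRecord` fields, the `filter_records` function, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ActivityRecord:
    """One exported activity row; the field names are hypothetical."""
    compound_id: str
    activity_type: str          # e.g. "IC50", "Ki"
    activity_prefix: str        # e.g. "=", ">", "<"
    micromolar_value: Optional[float]
    biological_source: str


def filter_records(
    records: Iterable[ActivityRecord],
    activity_type: Optional[str] = None,
    activity_prefix: Optional[str] = None,
    biological_source: Optional[str] = None,
    max_micromolar: Optional[float] = None,
) -> List[ActivityRecord]:
    """Keep only the records that satisfy every filter that was supplied."""
    selected = []
    for rec in records:
        if activity_type is not None and rec.activity_type != activity_type:
            continue
        if activity_prefix is not None and rec.activity_prefix != activity_prefix:
            continue
        if biological_source is not None and rec.biological_source != biological_source:
            continue
        if max_micromolar is not None:
            # Records without a micromolar value cannot pass a potency cut-off.
            if rec.micromolar_value is None or rec.micromolar_value > max_micromolar:
                continue
        selected.append(rec)
    return selected


# Made-up rows, for illustration only.
RECORDS = [
    ActivityRecord("CPD-1", "IC50", "=", 0.05, "Human"),
    ActivityRecord("CPD-2", "IC50", ">", 10.0, "Human"),
    ActivityRecord("CPD-3", "Ki", "=", 0.2, "Rat"),
    ActivityRecord("CPD-4", "IC50", "=", 5.0, "Human"),
]

hits = filter_records(RECORDS, activity_type="IC50", activity_prefix="=", max_micromolar=1.0)
print([rec.compound_id for rec in hits])
```

Each filter is optional, so the same function covers a single-parameter refinement as well as a combined one.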

4) Faster Patent Search Functionality

Discover the enhanced efficiency of the patent search feature in GOSTAR UI, which offers a streamlined, time-saving search process. We have eliminated the need for a dropdown selection of patent numbers, resulting in a quicker search. With the “Save button” now enabled, you can search patents directly without any additional steps.

5) Explore Company Structure for Better Analysis

Maximize your Competitive Analysis capabilities by utilizing the advanced company structure investigation. Acquire in-depth knowledge about parent and group companies, equipping you with superior competitive intelligence. Unveil untapped prospects by uncovering associations and links within the industry landscape.

6) Introducing SMARTS Filters for Enhanced Drug Discovery

Streamline your research process and harness the power of PAINS SMARTS Filters. PAINS (pan-assay interference compounds) are molecules with multiple interfering behaviors that can disrupt assay readouts; flagging them early reduces the time and costs associated with misleading hits and potential drug withdrawals. Elevate your research by gaining invaluable insights into compound-specific toxicological endpoints using these specialized filters. With PAINS SMARTS Filters, you’ll make well-informed decisions right from the start, ensuring a more efficient and successful journey in drug discovery.

In conclusion, the latest updates in GOSTAR UI provide a better user interface for leaflets, secure data requests, improved filtering, quicker patent searches, enhanced analysis capabilities, and SMARTS filters for drug discovery. These updates make the GOSTAR platform more user-friendly and efficient while equipping you with valuable tools and insights.

Log in now to explore these exciting new features. Additionally, you can request a training session with GOSTAR Support to gain a deeper understanding of how to utilize these features to your advantage. Alternatively, if you are new to GOSTAR and want to power your drug discovery with gold-standard data, request a demo now.


The Drug-Target Interaction Heatmap

A heatmap is a two-dimensional data visualization approach that displays the magnitude of a phenomenon as color. The color shift might be via hue or intensity, giving the reader clear visual indications about how the occurrence is clustered or evolves over space. Heatmaps are classified into two types: cluster heatmaps and spatial heatmaps. The sorting of rows and columns is intentional and somewhat arbitrary in a clustered heatmap, and the magnitudes are laid out into a matrix of fixed cell size whose rows and columns are discrete phenomena and categories, to suggest clusters or portray them as discovered via statistical analysis. The cell size is arbitrary, but it must be large enough to be seen. The position of a magnitude on a spatial heatmap, on the other hand, is determined by its location in that space, and there is no concept of cells; the phenomena are assumed to change continuously.

When working with datasets of any size, data scientists and analysts look for the essential links among data points and for the characteristics of those points. Heatmaps depict these data points and their interactions in a high-dimensional context without becoming excessively compressed or visually cluttered. In data analysis, heatmaps allow specific row and/or column variables to be plotted on the axes.

The drug-target interaction heatmap supports decisions about the potential off-target activity of drug candidates early in new drug development and drug repurposing workflows. It provides a clear picture of the interactions between drugs and their targets, so the most important interactions can be identified quickly.
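As a rough sketch of what sits behind such a view, the snippet below pivots a flat list of measurements into a drug-by-target grid and maps each cell to a shade. The drug and target names, the potency values, and the 5 to 9 scale are made-up assumptions for illustration; they do not come from GOSTAR.

```python
def build_matrix(measurements):
    """Pivot (drug, target, potency) triples into a drug x target grid.

    Cells with no measurement are left as None.
    """
    drugs = sorted({drug for drug, _, _ in measurements})
    targets = sorted({target for _, target, _ in measurements})
    lookup = {(drug, target): value for drug, target, value in measurements}
    matrix = [[lookup.get((drug, target)) for target in targets] for drug in drugs]
    return drugs, targets, matrix


def shade(value, low=5.0, high=9.0):
    """Map a potency value onto five text shades; blank means not measured."""
    if value is None:
        return "  "
    palette = ".:+*#"
    clipped = min(max(value, low), high)
    index = round((clipped - low) / (high - low) * (len(palette) - 1))
    return palette[index] * 2


# Made-up potency values (pIC50-like scale), for illustration only.
MEASUREMENTS = [
    ("drug_A", "target_1", 8.7),
    ("drug_A", "target_2", 6.1),
    ("drug_B", "target_1", 5.4),
    ("drug_B", "target_3", 7.9),
]

drugs, targets, matrix = build_matrix(MEASUREMENTS)
for drug, row in zip(drugs, matrix):
    print(drug, " ".join(shade(value) for value in row))
```

Darker cells in one row but on unintended targets are the kind of pattern that hints at off-target activity; a plotting library would replace the text shades with a color scale.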

GOSTAR presents these findings in a visually intuitive manner, allowing end-users to easily interpret the data and draw conclusions.



Matched Molecular Pair Analysis

The challenge in molecular design lies in deciding what to do next based on existing data, medicinal chemistry knowledge, experience, and intuition. In small compound sets, a skilled chemist can discern trends and correlations by eye. As the number of molecules increases, more methodical procedures are required.

The Matched Molecular Pair (MMP) analysis, which compares closely related chemical structures pairwise across a big dataset, is one method in the medicinal chemist’s toolbox for accomplishing this. Since the structures of the two molecules in question differ very slightly, any change in a physical or biological feature between the matched molecular pair can be more easily interpreted.

In 2004, Kenny and Sadowski coined the term Matched Molecular Pair (MMP) for a subset of QSAR; it is now a widely used concept in drug design [1]. Matched molecular pairs differ only by small single-point alterations, which are referred to as chemical transformations. Because the structural difference between the two molecules is minimal, any difference in physical properties or observed biological effects can be attributed to that transformation. In 2010, Hussain and Rea published an algorithm that finds matched molecular pairs and relates each transformation to the distribution of its property-value differences [2]; it has since become a popular tool for analyzing large chemistry datasets.

An MMP is typically defined as a pair of compounds that differ structurally at a single site through a well-defined transformation, accompanied by a change in a property value. The relationship between structural change and property change is used to rationalize observed structure-property relationships (SPR) and to guide compound optimization. Aside from assisting in hypothesis creation and testing, MMP analysis can also be used to find outliers, such as a pair of compounds that show a sudden change in a property, known as an activity cliff. These pairs are typically the most intriguing to investigate when optimizing the property that exhibits this change.
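The bookkeeping behind this analysis can be sketched in a few lines. The example below assumes each compound has already been split into a shared core and the substituent at its single variable site (the fragmentation step itself is not shown), groups compounds by core, and collects the property change for each transformation. The compound records, the pIC50 values, and the 2.0 log-unit cliff threshold are all illustrative assumptions, not GOSTAR output.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Toy records: (compound id, shared core, substituent at the variable site, pIC50).
COMPOUNDS = [
    ("c1", "core_1", "H", 6.0),
    ("c2", "core_1", "F", 6.4),
    ("c3", "core_1", "OMe", 8.3),
    ("c4", "core_2", "H", 5.5),
    ("c5", "core_2", "F", 5.8),
]

CLIFF_THRESHOLD = 2.0  # assumed cut-off, in log units, for calling a pair a cliff


def matched_pairs(compounds):
    """Pair up compounds that share a core but differ in one substituent.

    Returns (id_a, id_b, transformation, delta) tuples, where delta is the
    property of compound b minus that of compound a.
    """
    by_core = defaultdict(list)
    for cid, core, substituent, value in compounds:
        by_core[core].append((substituent, cid, value))

    pairs = []
    for members in by_core.values():
        # Sorting by substituent gives every transformation one fixed direction.
        for first, second in combinations(sorted(members), 2):
            (sub_a, id_a, val_a), (sub_b, id_b, val_b) = first, second
            if sub_a == sub_b:
                continue
            pairs.append((id_a, id_b, f"{sub_a}>>{sub_b}", val_b - val_a))
    return pairs


pairs = matched_pairs(COMPOUNDS)

# Distribution of property changes per transformation, summarized by the mean.
deltas = defaultdict(list)
for _, _, transformation, delta in pairs:
    deltas[transformation].append(delta)
summary = {transformation: mean(values) for transformation, values in deltas.items()}

# Pairs whose property jumps by at least the threshold are flagged as cliffs.
cliffs = [(a, b) for a, b, _, delta in pairs if abs(delta) >= CLIFF_THRESHOLD]

print(summary)
print(cliffs)
```

Indexing by core first means only compounds that can actually form a pair are ever compared, which is what keeps this style of analysis tractable on large datasets.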

GOSTAR provides tools for determining the matched molecular pairs and analyzing activity landscapes across compound datasets.


  1. Kenny P.W., Sadowski J. Structure modification in chemical databases. In: Oprea T., editor. Cheminformatics in drug discovery. Wiley-VCH Weinheim; Germany, 2004, 271.
  2. Hussain J, Rea C. Computationally efficient algorithm to identify matched molecular pairs (MMPs) in large data sets. J Chem Inf Model. 2010, 50(3), 339-348.

Interactive Property Space Exploration

Lipophilicity plays a significant role in small molecule drug design and discovery. A partition coefficient, logP, can be used to describe the lipophilicity of an organic compound. It is expressed as the logarithm of the ratio of the unionized compound’s concentrations in the organic and aqueous phases at equilibrium. In compounds containing ionizable groups, the distribution of species depends on pH, and the lipophilicity of a molecule is affected by its ionization state. The distribution coefficient (logD) is therefore defined to account for the dissociation of weak acids and bases. In aqueous conditions, highly lipophilic substances are often less soluble. Lipophilic compounds, on the other hand, may have good solubility in oils and lipids, making them good candidates for lipid-based formulations.
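The relationship between logP and logD can be illustrated with a short calculation. The sketch below uses the common textbook approximation for a compound with a single ionizable group, under the assumption that only the neutral species partitions into the organic phase; the function name and the example values are hypothetical.

```python
import math


def log_d(log_p, pka, ph, kind):
    """Estimate logD at a given pH for a monoprotic acid or base.

    Assumes that only the neutral form enters the organic phase, so the
    ionized fraction simply lowers the apparent lipophilicity.
    """
    if kind == "acid":
        return log_p - math.log10(1 + 10 ** (ph - pka))
    if kind == "base":
        return log_p - math.log10(1 + 10 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical weak acid: logP 3.0, pKa 4.0, evaluated at two pH values.
for ph in (2.0, 7.4):
    print(ph, round(log_d(3.0, 4.0, ph, "acid"), 2))
```

Well below its pKa the acid is mostly neutral and logD stays close to logP; at physiological pH it is largely ionized and logD drops by several log units.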

Lipophilicity influences potency, selectivity, and permeability, as well as absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. High lipophilicity, with logP greater than five, is associated with limited solubility, increased clearance, and poor oral absorption. Furthermore, highly lipophilic drugs have a predisposition for interacting with hydrophobic targets other than the primary target, thereby enhancing promiscuity and toxicity. Low lipophilicity can reduce permeability and potency, resulting in lower bioavailability and overall efficacy. Compounds with logP between one and four are thought to have better physicochemical and ADME properties for oral drugs.
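These rules of thumb are easy to apply as a first-pass triage. The sketch below encodes only the two thresholds stated above (logP above five, and logP between one and four); the function name and the labels are hypothetical, and real triage would weigh many more properties.

```python
def lipophilicity_flag(log_p):
    """Label a compound using the logP ranges quoted in the text."""
    if log_p > 5:
        return "high"                      # solubility, clearance, absorption risk
    if 1 < log_p < 4:
        return "preferred"                 # window favoured for oral drugs
    return "outside preferred window"      # includes low and borderline values


for value in (0.2, 2.5, 4.5, 5.6):
    print(value, lipophilicity_flag(value))
```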

Lipophilicity is often regarded as a key indicator of potential promiscuity, with many property–promiscuity studies indicating that drug promiscuity rises with the increase in lipophilicity. This tendency is concerning since increasing a molecule’s lipophilicity can improve its efficacy at the primary target; however, this can be counterbalanced by an increase in off-target promiscuity [2]. Lipophilicity is a key element in determining a drug’s affinity for protein targets and in modulating ADMET characteristics. As a result, the combination of high target potency and high lipophilicity may increase the likelihood of ADMET-related attrition. 

Therefore, medicinal chemistry optimization needs to be balanced and multidimensional. GOSTAR empowers medicinal chemists to efficiently explore the property space against a variety of bioactivity endpoints.


  1. Gao Y, Gesenberg C, Zheng W. Oral Formulations for Preclinical Studies: Principle, Design, and Development Considerations, Developing Solid Oral Dosage Forms (Second Edition), Academic Press. 2017, 455-495.
  2. Armstrong D, Li S, Frieauff W, Martus H.J, Reilly J, Mikhailov D, Whitebread S, Urban L. Predictive Toxicology: Latest Scientific Developments and Their Application in Safety Assessment, Comprehensive Medicinal Chemistry III, Elsevier. 2017, 94-115.

Drug Discovery Databases

Types of Drug Discovery Databases

GOSTAR is an integrated platform of various standalone databases with cross-indexing of chemical compounds of interest with SAR, ADME, Toxicity, Preclinical/Clinical, Biological Targets, Structural information, Developmental pipeline, etc., along with extensive cross-references within the database as well as with external public databases and open-access repositories.

Medicinal Chemistry Database

The largest reference-centric database, with data annotated from the most-cited mainstream medicinal chemistry journals. It is enriched with pharmacodynamic, pharmacokinetic, efficacy, safety, metabolite, and toxicity data for millions of small molecules in early drug discovery, tested in various in vitro, in vivo, and ex vivo assays.

The GOSTAR Med Chem knowledge base enables researchers to quickly and confidently identify the most promising compounds to take forward in the drug discovery process by covering the chemical, biological, and pharmacological space. Physicochemical properties of small molecules, such as experimental LogP, LogD, and solubility, are captured for discovery compounds, enabling scientists to generate new drug ideas by exploring the chemical space of new compounds.

For discovery compounds, the GOSTAR Med Chem Database provides robust SAR activity against a target protein for the compound(s) of interest, along with their relative sphere of influence across other targets, so that compound-target selectivity can be measured.

Targets Database

A large biological-target-centric database of chemical molecules excerpted from pharmacological patents and highly cited medicinal chemistry journals, mapped with SAR data against various biological targets. This database can be further customized to suit specific requirements or additional data. The following databases are categorized by target superfamily.

  • Kinase Database
  • Protease Database
  • GPCR Database
  • Nuclear Hormone Receptor Database
  • Ion-Channel Database
  • Transporters Database
  • Transferases Database
  • Phosphatase Database

Clinical Candidate Database

A database of chemical compounds in different phases of clinical development globally, mapped with pharmacological data, ADME, phase details, biological activities, chemical space, developmental pipeline, Approved/Suspended/Discontinued status, etc. Its features include:

  • Compounds in the database range from Preclinical and IND-filed to NDA-filed (Pre-registration) compounds
    The database also includes compounds whose development is suspended or discontinued
  • Developmental pipeline data span companies from top-tier to small-scale pharma
  • Other drug-related information includes approval dates, reasons for suspension or discontinuation, therapeutic indications, and adverse events (mapped with standard nomenclature like ICD10, MedDRA, etc.)
  • Biological activities are annotated from subscribed and freely accessible journals, news releases, conferences, abstracts, and meetings, among others

Mechanism-Based Toxicity Database

A toxicity database detailing the toxic effects induced by chemical compounds (discovery, developmental, or drug-like compounds) or their metabolites, mapped with their mechanism of action. Features of the Mechanism-Based Toxicity Database include:

  • Detailed information on toxic effects induced by compounds and/or their metabolites
  • Toxicity information includes carcinogenicity, tumorigenicity, mutagenicity, teratogenicity, neurotoxicity, cytotoxicity, etc.
  • Organ-specific toxicity information by species and organ, with the detailed mechanism behind the toxic effect of the compound and/or its metabolites
  • Diagrammatic representation/schema of metabolism
  • Biological activities are annotated from subscribed and freely accessible journals, Conferences, Abstracts, and Meetings among others.

Drug Approvals 2021

Published on 13-Sept-2021

In 2021, the FDA approved many novel products that serve previously unmet medical needs or significantly help to advance patient quality of life. The broad indication-wise distribution of CDER’s 2021 drug approvals indicates notable advances in drug discovery [1,2].

New Drug Approvals & Drugs in Pipeline (FDA) for 2021*

Table 1. Approved Drug List
Table 2. Drugs in Pipeline

*This information is updated as of July 31, 2021; listed alphabetically by trade name.

Significant drug launches of 2021

  • Verquvo (Vericiguat, MERCK SHARP DOHME, 01/19/2021)
    Mitigates the risk of cardiovascular death and hospitalization for chronic heart failure 
  • Cabenuva (Cabotegravir and Rilpivirine (Co-Packaged), VIIV HLTHCARE, 01/21/2021)
    Treats HIV 
  • Lupkynis (Voclosporin, AURINIA, 01/22/2021)
    Treats lupus nephritis 
  • Tepmetko (Tepotinib, EMD SERONO INC, 02/03/2021)
    Treats non-small cell lung cancer 
  • Ukoniq (Umbralisib Tosylate, TG THERAPS, 02/05/2021) 
    Treats marginal zone lymphoma and follicular lymphoma 
  • Evkeeza (Evinacumab-Dgnb, REGENERON PHARMACEUTICALS, 02/11/2021) 
    Treats homozygous familial hypercholesterolemia
  • Cosela (Trilaciclib Dihydrochloride, G1 THERAP, 02/12/2021)
    Mitigates chemotherapy-induced myelosuppression in small cell lung cancer
  • Amondys 45 (Casimersen, SAREPTA THERAPS INC, 02/25/2021)
    Treats Duchenne muscular dystrophy
  • Nulibry (Fosdenopterin Hydrobromide, ORIGIN, 02/26/2021) 
    Reduces the risk of mortality in molybdenum cofactor deficiency Type A 
  • Pepaxto (Melphalan Flufenamide Hydrochloride, ONCOPEPTIDES AB, 02/26/2021)
    Treats relapsed or refractory multiple myeloma 
  • Azstarys (Serdexmethylphenidate Hydrochloride; Dexmethylphenidate Chloride, COMMAVE THERAP, 03/02/2021)
    Treats attention deficit hyperactivity disorder 
  • Fotivda (Tivozanib Hydrochloride, AVEO PHARMS, 03/10/2021)
    Treats renal cell carcinoma 
  • Ponvory (Ponesimod, JANSSEN PHARMS, 03/18/2021)
    Treats relapsing forms of multiple sclerosis
  • Zegalogue (Dasiglucagon Hydrochloride, ZEALAND PHARMA, 03/22/2021)
    Treats severe hypoglycemia 
  • Qelbree (Viloxazine Hydrochloride, SUPERNUS PHARMS, 04/02/2021)
    Treats attention deficit hyperactivity disorder 
  • Nextstellis (Drospirenone; Estetrol, MAYNE PHARMA, 04/15/2021)
    Prevents pregnancy 
  • Jemperli (Dostarlimab-Gxly, GLAXOSMITHKLINE, 04/22/2021)
    Treats endometrial cancer 
  • Zynlonta (Loncastuximab Tesirine-Lpyl, ADC Therapeutics SA, 04/23/2021)
    Treats certain types of relapsed or refractory large B-cell lymphoma 
  • Empaveli (Pegcetacoplan, APELLIS PHARMS, 05/14/2021)
    Treats paroxysmal nocturnal hemoglobinuria 
  • Rybrevant (Amivantamab-Vmjw, JANSSEN BIOTECH, 05/21/2021)
    Treats a subset of non-small cell lung cancer 
  • Pylarify (Piflufolastat F-18, PROGENICS PHARMS INC, 05/26/2021)
    Identifies prostate-specific membrane antigen-positive lesions in prostate cancer 
  • Lumakras (Sotorasib, AMGEN INC, 05/28/2021)
    Treats types of non-small cell lung cancer 
  • Truseltiq (Infigratinib Phosphate, QED THERAP, 05/28/2021)
    Treats cholangiocarcinoma in patients whose disease meets certain criteria
  • Lybalvi (Olanzapine; Samidorphan L-Malate, ALKERMES INC, 05/28/2021)
    Treats schizophrenia and certain aspects of bipolar I disorder
  • Brexafemme (Ibrexafungerp Citrate, SCYNEXIS, 06/01/2021)
    Treats vulvovaginal candidiasis
  • Aduhelm (Aducanumab-Avwa, BIOGEN INC, 06/07/2021)
    Treats Alzheimer’s disease 
  • Rylaze (Asparaginase Erwinia Chrysanthemi (Recombinant)-Rywn, JAZZ PHARMS, 06/30/2021)
    Treats acute lymphoblastic leukemia and lymphoblastic lymphoma in patients who are allergic to E. coli-derived asparaginase products, as a component of a chemotherapy regimen 
  • Kerendia (Finerenone, BAYER HEALTHCARE PHARMACEUTICALS INC, 07/09/2021)
    Reduces the risk of kidney and heart complications in chronic kidney disease associated with type 2 diabetes 
  • Fexinidazole (Fexinidazole, DNDI, 07/16/2021)
    Treats human African trypanosomiasis caused by the parasite Trypanosoma brucei gambiense 
  • Rezurock (Belumosudil, KADMON PHARMS LLC, 07/16/2021)
    Treats chronic graft-versus-host disease after failure of at least two prior lines of systemic therapy
  • Bylvay (Odevixibat, ALBIREO PHARMA INC, 07/20/2021)
    Treats pruritus 
  • Twyneo (Tretinoin and benzoyl peroxide, SOL-GEL TECHNOLOGIES LTD, 07/26/2021)
    It is a topical retinoid and antibacterial fixed-dose combination for the treatment of acne vulgaris in adults and children 9 years of age and older
  • Saphnelo (Anifrolumab, AstraZeneca, 07/30/2021)
    It is a type I interferon (IFN) receptor antagonist indicated for the treatment of adult patients with moderate to severe systemic lupus erythematosus (SLE), who are receiving standard therapy 

Significant Drug launches in Pipeline for 2021

  • Oteseconazole (VT-1161, MYCOVIA PHARMACEUTICALS INC)
    It is an investigational oral antifungal in development for the treatment of recurrent vulvovaginal candidiasis (RVVC)
    It is an orally bioavailable, broad-spectrum penem β-lactam antibiotic in development for the treatment of infections caused by multi-drug resistant bacteria
  • Brixadi (Buprenorphine, BRAEBURN INC)
    It is a long-acting partial opioid agonist injection formulation in development for the treatment of opioid use disorder
  • Tenapanor (ARDELYX INC)
    It is a sodium/hydrogen exchanger 3 (NHE3) inhibitor in development for the control of serum phosphorus in adult patients with chronic kidney disease (CKD) on dialysis or Hyperphosphatemia of Renal Failure
  • Libervant (Diazepam, AQUESTIVE THERAPEUTICS INC)
    It is a buccal film formulation of the approved benzodiazepine diazepam in development for the management of seizure clusters
  • Roxadustat (FG-4592, FIBROGEN INC)
    It is a first-in-class, orally administered small molecule hypoxia-inducible factor prolyl hydroxylase (HIF-PH) inhibitor in development for the treatment of anaemia of chronic kidney disease (CKD)
    It is an investigational, potential first-in-class anti-thymic stromal lymphopoietin (TSLP) monoclonal antibody in development for the treatment of severe asthma
  • LV-101 (Carbetocin intranasal, LEVO THERAPEUTICS INC)
    It is an oxytocin analog in development as a treatment for hyperphagia and behavioral distress associated with Prader-Willi syndrome (PWS)
  • Teplizumab (PROVENTION BIO INC)
    It is an investigational anti-CD3 monoclonal antibody (mAb) in development for the delay or prevention of clinical type 1 diabetes (T1D) in at-risk individuals
    It is a novel, oral angio-immuno kinase inhibitor in development for the treatment of pancreatic and non-pancreatic neuroendocrine tumors (“NET”)
  • Lenacapavir (GILEAD SCIENCES INC)
    It is an investigational, long-acting HIV-1 capsid inhibitor in development for the treatment of HIV-1 infection in heavily treatment-experienced (HTE) people with multi-drug resistant (MDR) HIV-1 infection
    It is an investigational RNAi therapeutic in development for the treatment of the polyneuropathy of hereditary transthyretin-mediated (hATTR) amyloidosis in adults
  • Pedmark (Sodium thiosulfate, FENNEC PHARMACEUTICALS INC)
    It is a cisplatin neutralizing agent in development for the protection against hearing loss in pediatric patients receiving cisplatin chemotherapy
    It is a protein kinase-R (PKR) activator in development for the treatment of adults with pyruvate kinase (PK) deficiency
  • Arimoclomol (ORPHAZYME A/S)
    It is an investigational Heat-Shock Protein amplifier in development for the treatment of Niemann-Pick disease Type C (NPC)
  • Ruxolitinib (INCYTE DERMATOLOGY)
    It is a JAK1/JAK2 inhibitor formulated for topical application in development for the treatment of atopic dermatitis and vitiligo
  • Zimhi (Naloxone hydrochloride, ADAMIS PHARMACEUTICALS CORPORATION)
    It is a high-dose formulation of the approved opioid antagonist naloxone in development for the treatment of opioid overdose
    It is a topical aryl hydrocarbon receptor (AhR) modulating agent in development for the treatment of plaque psoriasis and atopic dermatitis
  • Plinabulin (BEYONDSPRING INC)
    It is a selective immunomodulating microtubule-binding agent (SIMBA) in development for use in combination with granulocyte colony-stimulating factor (G-CSF) for the prevention of chemotherapy-induced neutropenia (CIN)


  1. US FDA


GOSTAR Content Updates - 2021

Published on 13-Sept-2021

GOSTAR is the largest manually annotated structure-activity relationships (SAR) database of small molecules published in mainstream medicinal chemistry journals and patents. Compounds from both discovery and development stages, across all target families, are covered. Along with SAR, key properties such as ADME and toxicity are captured. This relational database enables users to navigate and analyze the massive content of small molecules to derive insightful decisions in the design and discovery of novel compounds.

Content Coverage

The GOSTAR database is composed of many different types of content, from the scientific literature to publicly available material.

  • MedChem Journals
  • Patents
  • FDA/EMEA/PMDA Reports
  • Clinical Trial Registries
  • Scientific Reviews
  • Company websites
  • Books
  • Conferences
  • Public Sources
Fig 1. A quick view of content covered and sources of the content

Preclinical Candidates Covered in 2021 (until July 2021)

In 2021, the GOSTAR database was enriched with preclinical compounds acting against indications such as COVID-19, non-alcoholic steatohepatitis (NASH), hepatitis virus infections, HIV infections, cardiovascular diseases, and various cancers.

A few significant drug inclusions until July 31, 2021:

  • Synflorix
  • AZD1222
  • Benaglutide
  • GSK-1557484A
  • MRNA-1273

Target Space Covered in 2021 Updates


Until July 31, 2021, new content was added to the GOSTAR database for more than 2400 protein targets.

Table 2: List of top 20 targets covered

Type of Content

A deeper analysis of the content covered in 2021 is shown in figure 2. Of the 1.2 million SAR rows added to GOSTAR, functional in-vitro and in-vivo assays contribute 41% of the data, binding assays constitute 33%, and ADME properties make up 5%. Toxicity properties account for 2%, and the remaining 19% represents other property types, including physicochemical properties.
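For a sense of scale, the percentages can be turned into approximate row counts. The short calculation below uses only the figures quoted above; because "1.2 million" is itself a rounded total, the resulting counts are estimates rather than exact database statistics.

```python
TOTAL_ROWS = 1_200_000  # "1.2 million SAR rows", a rounded figure

SHARES_PERCENT = {
    "functional (in vitro and in vivo)": 41,
    "binding": 33,
    "ADME": 5,
    "toxicity": 2,
    "other (incl. physicochemical)": 19,
}

# The quoted shares should cover the whole dataset.
assert sum(SHARES_PERCENT.values()) == 100

# Integer arithmetic keeps the estimated counts exact for this total.
rows = {name: TOTAL_ROWS * pct // 100 for name, pct in SHARES_PERCENT.items()}

for name, count in rows.items():
    print(f"{name:<36}{count:>10,}")
```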

Fig 2. Assay wise distribution of SAR content
Try GOSTAR today. To schedule a free demo, click here.

Biologics – The Biotech Drugs Transforming Medicine

Published on 13-Sept-2021

Biologics, also known as biological products, are medicines derived from living organisms such as humans, animals, or microorganisms via highly complex manufacturing processes and administered under closely monitored conditions. This is in contrast to traditional non-biologic pharmaceutical drugs, which are synthesized in a laboratory through chemical processes without the use of components of living matter. Cancer, infectious diseases, and autoimmune diseases are among the ailments that biologics are used to prevent, treat, or cure (Fig. 1).

Figure 1. Biologic medicines in development by therapeutic category [1].

Note: Some medicines are being explored in more than one therapeutic category.

Biologics include a wide variety of products such as monoclonal antibodies, vaccines, gene and cell therapies, and recombinant proteins (Fig. 2).

Figure 2. Biologic medicines in development by product category [1].

Monoclonal antibodies are by far the most researched category of biologics with at least 338 therapeutic mAbs currently being developed by pharmaceutical companies [1].

Monoclonal antibody (mAb) – the bestselling category of biological products.

Antibody engineering has advanced significantly since the approval of the first monoclonal antibody by the United States Food and Drug Administration (US FDA) in 1986 [2]. Therapeutic antibodies currently available in the market are safe, with fewer adverse effects owing to their high specificity. Consequently, antibody drugs have become the leading class of newly developed drugs in recent years. Eight of the top ten bestselling drugs worldwide in 2018 were biologics. In 2018, the global therapeutic monoclonal antibody market was worth roughly US$115.2 billion, with revenues expected to reach $300 billion by 2025 (Fig. 3) [3].
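The two market figures imply an annual growth rate, which can be worked out directly. The calculation below is a derived illustration based only on the 2018 and 2025 values quoted above, assuming steady compound growth between them; it is not a figure taken from the cited source.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# US$115.2 billion in 2018 growing to US$300 billion by 2025.
rate = implied_cagr(115.2, 300.0, 2025 - 2018)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption, the forecast corresponds to growth of just under 15% per year.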

Figure 3. Timeline from 1975 showing the successful development of therapeutic antibodies and their applications [3].

As of December 2019, US FDA had approved 79 therapeutic mAbs, including 30 for cancer treatment [4].

Best-selling biotech drugs worldwide

AbbVie’s Humira and Merck’s Keytruda are among the top-selling biotechnology drugs in the world, generating 19.6 billion and 11.1 billion U.S. dollars, respectively, in 2019 (Fig. 4) [5]. Oncology, autoimmune/immunology, hematology, ophthalmology, and dermatology were the top five therapy areas in 2019. Oncologic treatments accounted for six of the top-selling drugs in 2019, making oncology the most targeted field [6].

Figure 4. Top selling biotech drugs worldwide in 2019 [6].

Bristol Myers Squibb, AbbVie, Pfizer, and Roche are four pharmaceutical companies with more than one best-selling drug of 2019. Bristol Myers Squibb had the most top-selling drugs (Eliquis, Opdivo, and Revlimid) in 2019, which together accounted for 63% of the company’s total revenue. AbbVie’s revenues in 2019, by contrast, were heavily reliant on its main products (Humira and Imbruvica), which accounted for 72% of the company’s total revenues [6]. The United States spent approximately 45 billion U.S. dollars on biotechnology research and development. In addition, the United States had approximately 34% of the world’s share of biotechnology patents filed in 2014, while Germany filed 8% of global biotech patents [5].

The Rise of Biosimilars

A biosimilar is a biologic that is similar to another biologic medicine (known as a reference product) that has already been approved by the FDA in the United States. In terms of safety, purity, and potency, biosimilars are very similar to the reference product, but there may be minor differences in clinically inactive components. The biologics and biosimilars industry in the United States is fast expanding, and as new medications are introduced, the benefits for patient access and cost management will continue to grow. There are 18 biosimilars on the market in the United States as of November 2020, competing against seven reference biologics, with ten more FDA-approved biosimilars expected to hit the market in the coming years [7].

Biosimilars save money in the long run, with higher savings coming from newer launches competing against more expensive drugs. As of July 2020, the mean Average Sales Price (ASP) of biosimilars was 8.1 percent to 45.1 percent lower than that of their originator products (including insulins) [7]. Biosimilar savings reached an annualized 6.5 billion U.S. dollars in the second quarter of 2020, and savings are expected to exceed 100 billion U.S. dollars over the next five years [8].

A biopharmaceutical product knowledge base is the need of the hour

Antibodies are the most successful class of biotherapeutics because of their binding versatility [9]. With the rapid growth of therapeutic antibody research, the chances of a specific antibody being the only one against a certain antigen are decreasing. Understanding the methods used to produce competing antibodies, as well as their pros and cons, can be extremely helpful in moving therapeutic antibodies forward. Data from clinical trials dominate the scientific literature on therapeutic antibodies, rather than the details of pre-clinical development that is underway for nearly two-thirds of all therapeutic antibodies. The information on the latter could only be obtained from patents. Many researchers are put off by patents’ opaque and archaic language but hidden in the text of these files are details about antibody sequences, assay techniques, epitopes, and much more. Patent applications are usually the first public disclosure of novel antibodies, often months or even years before conference papers or clinical trials. Researchers can identify novel antibodies in early stages of development months or years before they are formally announced by mining the patent literature.

Very few databases harvest this information. The IMGT Monoclonal Antibody Database and WHOINNIG are two non-commercial resources for antibody research. Other databases that are not specific to antibodies, such as ChEMBL, DrugBank, and KEGG DRUG, also capture WHO data. Most databases deliver additional metadata for their therapeutic entries, such as clinical trial status, companies involved in the development, target specificity, and alternative names. While these archives include sequence information, it is currently not possible to query them by sequence or to bulk-download relevant collections of therapeutic sequences for direct bioinformatic analysis.

Excelra is strongly positioned to deliver tailor-made curation on chemically defined antibodies (i.e., antibodies with a known primary amino-acid sequence) connected with their antigenic target, which can be either a protein or a chemical entity.

For more information and to connect with our scientific teams, write to us on:


  1. PhRMA [Pharmaceutical Research and Manufacturers of America] (2013). Medicines in Development: Biologics. 2013 Report. Accessed 10 Jul 2021.
  2. Ecker, D. M., Jones, S. D., Levine, H. L. The therapeutic monoclonal antibody market. MAbs 2015, 7, 9–14.
  3. Lu, R. M., Hwang, Y. C., Liu, I. J. et al. Development of therapeutic antibodies for the treatment of diseases. J Biomed Sci 2020, 27 (1), 1-30.
  4. The Antibody Society (2019). In: Approved antibodies. Accessed 10 Jul 2021.
  5. Statista (2020). Select top selling biotech drugs worldwide in 2019. Accessed 10 Jul 2021.
  6. PharmaIntelligence (2020). Top 10 Best-Selling Drugs of 2019. Accessed 10 Jul 2021.
  7. IQVIA Institute Report (2020). Biosimilars in the United States 2020 – 2024. Accessed 10 Jul 2021.
  8. IQVIA Institute Report (2020). Biosimilars in the United States 2020 – 2024. Accessed 10 Jul 2021.
  9. Kaplon, H., Muralidharan, M., Schneider, Z., Reichert, J. M. Antibodies to watch in 2020. MAbs 2020, 12 (1), 1703531.

GOSTAR: The largest online medicinal chemistry intelligence database

In medicinal chemistry, the relationship between the molecular structure of a compound and its biological activity is referred to as the Structure Activity Relationship (SAR). Medicinal chemists modify biologically active molecules by introducing new chemical groups into a compound and testing how those modifications change its biological effects. Determining and identifying SARs is key to many aspects of the drug discovery process, ranging from hit identification to lead optimization.

Although information on millions of compounds and their bioactivities (e.g., reactivity, solubility and target activity) is freely available to the public, it is very challenging to infer a meaningful and novel SAR from that information. The underlying problem is the unstructured and heterogeneous nature of these datasets, which the scientific and research community contributes through journals, scientific articles, patents, regulatory documents and various secondary sources. Owing to the increasing structural diversity among hit compounds and their potency distribution, analyzing SAR information is becoming a challenge. If these relationships are properly extracted, associated and analyzed, they provide valuable information that supports drug discovery and development. To this end, there has been an increasing need and interest in mining and structuring SAR information from bioactivity data available in the public domain.
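To make the heterogeneity problem concrete, the short Python sketch below shows one small step in structuring SAR data: bringing potency values reported in different units onto a common scale so that analogs can be compared. It is an illustration only. The compound names, substituents and IC50 values are invented for the example, and the field names are assumptions; none of it reflects GOSTAR's actual schema or curation process.

```python
import math

# Conversion factors from common concentration units to molar (M).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_micromolar(value, unit):
    """Convert a concentration in the given unit to micromolar."""
    return value * UNIT_TO_MOLAR[unit] / UNIT_TO_MOLAR["uM"]


def pic50(value, unit):
    """Return pIC50, i.e. -log10 of the IC50 expressed in molar."""
    return -math.log10(value * UNIT_TO_MOLAR[unit])


# Hypothetical records for one analog series, reported in mixed units
# as they might appear across different papers and patents.
records = [
    {"compound": "analog-1", "substituent": "H", "ic50": 2.0, "unit": "uM"},
    {"compound": "analog-2", "substituent": "F", "ic50": 150.0, "unit": "nM"},
    {"compound": "analog-3", "substituent": "OCH3", "ic50": 0.01, "unit": "mM"},
]


def normalize(records):
    """Put every record on a common scale and rank by potency (highest pIC50 first)."""
    rows = [
        {
            "compound": r["compound"],
            "substituent": r["substituent"],
            "ic50_uM": to_micromolar(r["ic50"], r["unit"]),
            "pIC50": pic50(r["ic50"], r["unit"]),
        }
        for r in records
    ]
    return sorted(rows, key=lambda row: row["pIC50"], reverse=True)


for row in normalize(records):
    print(f'{row["compound"]:<10} R={row["substituent"]:<5} '
          f'IC50={row["ic50_uM"]:.3g} uM  pIC50={row["pIC50"]:.2f}')
```

Once the values share a unit, the effect of each substituent on potency can be read directly from the ranked table, which is the simplest form of an SAR.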

Global Online Structure Activity Relationship Database (GOSTAR)

Excelra, a leading global biopharma data and analytics company, has responded to this pertinent need by developing a knowledge repository, Global Online Structure Activity Relationship Database (GOSTAR), which provides a 360-degree view of millions of compounds linking their chemical structure to the biological, pharmacological and therapeutic information. GOSTAR contains high-quality, manually annotated and very well-structured SAR data captured from various primary sources (patents and top journals of medicinal chemistry) and secondary sources (conference meetings & abstracts, company drug development pipelines, company annual reports, clinical registries and drug approval reports).

Who can use GOSTAR and how?

GOSTAR was created to help medicinal chemists, computational chemists and cheminformaticians identify small molecules that show meaningful biological activity and could serve a specific therapeutic use. It enables users to quickly visualize, explore, analyze and evaluate SAR data according to their project requirements. Users can explore SAR associations by searching on identifiers such as drug names, chemical structures, bibliography, compound development stage and activity endpoints.

What are the applications of GOSTAR?

A better understanding of SAR data enables users to make sound decisions when exploring chemical space during drug design.

The following are the applications of GOSTAR:

  • Target profiling – GOSTAR enables a holistic exploration of the chemical space around a target of interest and helps users understand the pathways and indications in which a given target is implicated
  • Structure based drug design – GOSTAR can be used as a compound library to perform virtual screening and hit identification in traditional structure-based drug design methodologies
  • Lead optimization – GOSTAR supports lead optimization by surfacing structure activity relationships associated with improved potency, reduced off-target activity, and better physicochemical/metabolic properties
  • Assay validation – GOSTAR helps identify suitable functional assays for secondary validation of the chemical modifications made while tuning a hit molecule
  • Drug repurposing and Translational science – GOSTAR data can be mined to interrogate diverse targets with a compound of interest to understand the feasibility and viability of drug rescue or label expansion
  • Competitive intelligence and Novelty analysis – GOSTAR captures drug lifecycle information such as indication, phase of development, sponsor and recruitment/approval status, including suspended trials and the reason for discontinuation, which can be used to build the competitive landscape around a drug, target or indication.


Currently, there are hundreds and thousands of chemical classes, and identifying potential candidates for therapeutic use often becomes a daunting task. In such cases, knowledge repositories like GOSTAR make it possible to rapidly characterize the data points that help to efficiently capture and encode a specific SAR. Below are the key features that showcase why GOSTAR is an ideal and simple solution for the complex task of gathering SAR data.

  • Reachability – Easy content accessibility to a wide and diverse user community
  • Utility – Maximize the utilization of content to create insights/concepts
  • Applicability – Selective utilization of content in diverse early discovery programs targeting unmet medical needs
  • Reliability – Standardized and normalized content to support traditional as well as AI/ML driven discovery programs

Try GOSTAR today. To schedule a free demo, write to us at:

For more information on GOSTAR, visit:


G-Protein Coupled Receptors: Structures, Research Landscape and Trends

A brief review of GPCR family: The largest family of druggable targets

G protein-coupled receptors (GPCRs) have become a hot frontier in basic life-science research and in the discovery of translational medicines, and are widely pursued in both academic and industrial drug discovery. They represent an important opportunity for both small molecule-based and antibody-based therapeutics and are the largest family of targets for approved drugs. Discovering a diverse set of molecules that target this family could yield valuable assets and open up unexplored areas, such as establishing the biological functions and disease relevance of individual targets.

GPCR structures and families

GPCRs are the largest family of proteins involved in membrane signal transduction and are also the most intensively studied drug targets, largely due to their substantial involvement in human pathophysiology. The pharmacological modulation of GPCRs provides leverage for the treatment of diseases of the central nervous system (CNS), cancer, viral infections, inflammatory disorders, metabolic disorders, etc.

The superfamily is divided into six classes based on amino acid sequence similarity: Class A (rhodopsin-like family), Class B (secretin receptor family), Class C (glutamate receptor family), Class D (fungal mating pheromone receptors), Class E (cAMP receptors) and Class F (frizzled or smoothened receptors). Only four of these (A, B, C and F) are found in humans.

GPCRs are involved in various biological processes and disease indications, which makes them excellent drug targets (Fig 2). Some GPCRs have been linked to cancer development and progression, based on their overexpression and/or up-regulation by diverse factors. Higher expression of GPR49 was found to be involved in the formation and proliferation of basal cell carcinoma, the N-arachidonyl glycine receptor GPR18 was found to be associated with melanoma metastases, and high levels of GPR87 were found to be associated with squamous cell carcinomas of the lung, cervix, skin, urinary bladder, testis, and head and neck.

Recently, orphan GPCRs have become potential novel targets for the treatment of a diverse set of indications, such as GPR119 for treatment of diabetes, leucine-rich repeat-containing G protein-coupled receptors 4 & 5 (LGR4/5) for treatment of gastrointestinal disease, GPR35 for treatment of an allergic inflammatory condition, GPR55 as an antispasmodic target, proto-oncogene Mas for treatment of thrombocytopenia, and GPR84 for treatment of ulcerative colitis.

Landscape of GPCR research and drug development

GPCRs are the largest ‘target’ class of the ‘druggable genome’, representing approximately 19% of the currently available drug targets. In humans, the GPCR superfamily consists of 827 distinct members, of which 406 are non-olfactory. However, current human therapeutics target only about 25% of potentially druggable GPCRs: 103 out of a possible 403 GPCR targets have at least one marketed drug.

Current literature analysis shows that GPCRs have traditionally been regarded as the domain of small-molecule drugs and that very few targets are well studied. More than 30% of US Food and Drug Administration (FDA)-approved drugs target GPCRs, which makes them the largest druggable class of biomolecules (Fig 4).

Enormous efforts have been expended to find relevant and potent GPCR ligands as lead compounds. Non-olfactory GPCRs constitute more than half of the human genome-encoded targets that are not yet exploited for any therapeutic use, and knowledge in the scientific literature is disproportionately focused on a small number of receptors. Preliminary studies highlight that these receptors have functions in genetic and immune system disorders.

While the drugs that currently target GPCRs are primarily small molecules and peptides, GPCRs also recognize diverse ligands, including inorganic ions, amino acids, proteins, steroids, lipids, nucleosides, nucleotides, and small molecules (Fig 5).

The latest trends in GPCR research indicate that modalities other than small molecules are becoming more popular as GPCR-targeting agents, with the entry of monoclonal antibodies, peptide drugs and allosteric modulators into early-stage clinical trials. For instance, GLP1 receptor-targeting biologics such as exenatide, liraglutide, and dulaglutide have been approved for type 2 diabetes, and erenumab, which targets the CGRP receptor, is used in the treatment of chronic migraine. Many other peptide drugs targeting various GPCRs are also in development.

Current trends in GPCR research

In recent years, there has been a significant increase in the information available about the sequences, structures and signaling networks of GPCRs and G proteins, owing to breakthroughs in X-ray crystallography and cryo-electron microscopy (cryo-EM). This has led to a much better understanding of GPCR-G protein interactions. The growing body of information on these interactions is being explored using several bioinformatics resources and software tools, including the Protein Data Bank, GPCRdb, gpDB, human-gpDB and many more.

Because experimental studies are costly and limited in spatial resolution, computational modeling techniques such as bioinformatics, protein-protein docking and molecular dynamics simulations play an important role in exploring GPCR-G protein interactions. Determining the three-dimensional structures of unexplored orphan receptors and their ligand-bound complexes has become an exciting avenue in GPCR research. Such structures improve our understanding of molecular recognition and activation mechanisms and support pharmaceutical research into new diseases across a variety of therapeutic areas.

As current human therapeutics cover only 25% of potentially druggable GPCRs, a relatively large share of GPCRs remain ‘orphan’ and therapeutically unexploited. The prediction and identification of ligands for these orphan receptors is an active area of research and of interest to the pharmaceutical industry.



  • Hutchings CJ. A review of antibody-based therapeutics targeting G protein-coupled receptors: an update. Expert Opin Biol Ther. 2020 Aug;20(8):925-935
  • Ellaithy A, Gonzalez-Maeso J, Logothetis DA, Levitz J. Structural and Biophysical Mechanisms of Class C G Protein-Coupled Receptor Function. Trends Biochem Sci. 2020 Dec;45(12):1049-1064
  • Sriram K, Insel PA. G Protein-Coupled Receptors as Targets for Approved Drugs: How Many Targets and How Many Drugs? Mol Pharmacol. 2018 Apr;93(4):251-258
  • Hauser AS, Attwood MM, Rask-Andersen M, Schioth HB, Gloriam DE. Trends in GPCR drug discovery: new agents, targets and indications. Nat Rev Drug Discov. 2017 Dec;16(12):829-842
  • Rask-Andersen M, Masuram S, Schioth HB. The druggable genome: evaluation of drug targets in clinical trials suggests major shifts in molecular class and indication. Annu Rev Pharmacol Toxicol. 2014;54:9-26.
  • Lu S, Zhang J. Small molecule allosteric modulators of G-protein-coupled receptors: drug-target interactions. J Med Chem. 2018
  • Gugger M, White R, Song S, Waser B, Cescato R, Rivière P, Reubi JC. GPR87 is an overexpressed G-protein coupled receptor in squamous cell carcinoma of the lung. Dis Markers. 2008;24(1):41-50
  • H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne. The Protein Data Bank. Nucleic Acids Research, 2000;28:235-242
  • Margarita C Theodoropoulou, Pantelis G Bagos, Ioannis C Spyropoulos and Stavros J Hamodrakas. “gpDB: A database of GPCRs, G-proteins, Effectors and their interactions.” Bioinformatics. 2008 Jun 15;24(12):1471-2
  • Satagopam, V.P., Theodoropoulou, M.C., Stampolakis, C.K., Pavlopoulos, G.A., Papandreou, N.C., Bagos, P.G., Schneider, R. & Hamodrakas, S.J. GPCRs, G-proteins, effectors and their interactions: human-gpDB, a database employing visualization tools and data integration techniques. Database (Oxford) 2010;baq019
  • Kooistra AJ, Mordalski S, Pándy-Szekeres G, Esguerra M, Mamyrbekov A, Munk C, Keserű GM, Gloriam DE. GPCRdb in 2021: integrating GPCR sequence, structure and function. Nucleic Acids Research, 2020;49:D335-D343