To address pharmacokinetic and toxicological issues in drug development, once the main cause of late-stage attrition of drug candidates, many pharmaceutical companies now run early DMPK (Drug Metabolism and Pharmacokinetics) or early toxicological studies. Such approaches, however, are difficult to replicate in an academic drug discovery environment. We therefore launched an initiative, “Development of a Drug Discovery Informatics System,” in collaboration with several other research groups. Its main aim is to develop more accurate prediction systems for DMPK and toxicological properties, intended primarily for academic scientists. Our group focuses on developing a pharmacokinetics database and prediction models.
Any good prediction system depends on large, high-quality training datasets. We collected pharmacokinetic and physicochemical parameters from the public bioactivity database ChEMBL. Because ChEMBL compiles data obtained under different experimental conditions, we developed a curation workflow that selects data measured under compatible conditions and reformats the results for our prediction system.
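The selection step of such a curation workflow can be illustrated with a minimal sketch. The `Record` fields, the `is_compatible` and `curate` helpers, and the acceptance criteria (matching parameter, units, pH within a tolerance, optional species) are hypothetical names and rules chosen for illustration; they are not the ChEMBL schema and do not reproduce the actual workflow used for this database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Record:
    """One measurement pulled from a public source (hypothetical layout)."""
    compound_id: str
    parameter: str            # e.g. "solubility"
    value: float
    units: str                # e.g. "ug/mL"
    ph: Optional[float]       # assay pH, None if not reported
    species: Optional[str]    # e.g. "Human", None if not applicable


def is_compatible(rec: Record, parameter: str, units: str,
                  ph: Optional[float] = None, ph_tolerance: float = 0.1,
                  species: Optional[str] = None) -> bool:
    """Return True if the record was measured under the requested conditions."""
    if rec.parameter != parameter or rec.units != units:
        return False
    # When a pH is requested, records without a reported pH are rejected.
    if ph is not None and (rec.ph is None or abs(rec.ph - ph) > ph_tolerance):
        return False
    if species is not None and rec.species != species:
        return False
    return True


def curate(records: Iterable[Record], **criteria) -> List[Record]:
    """Keep only the records that satisfy every criterion."""
    return [r for r in records if is_compatible(r, **criteria)]
```

Calling `curate(records, parameter="solubility", units="ug/mL", ph=7.4)` would, under these assumptions, keep only solubility values reported in µg/mL at about pH 7.4. A real workflow would also need unit conversion and handling of duplicate measurements, which this sketch omits.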
In addition to the public data, we have acquired both in vitro and in vivo experimental data under unified protocols. The in vitro data cover physicochemical parameters, such as solubility and distribution coefficient, and pharmacokinetic parameters, such as metabolic stability, protein binding in plasma, protein binding in brain homogenate, and blood-to-plasma concentration ratio. We also collected the efflux ratio for P-glycoprotein (P-gp), a major efflux transporter in the gut and brain. The in vivo data include drug concentrations in plasma and tissues after oral or intravenous administration, together with the pharmacokinetic parameters calculated from them.
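To show how pharmacokinetic parameters can be derived from a plasma concentration-time profile, here is a minimal sketch of a noncompartmental calculation. The source does not state which method was used; the linear trapezoidal rule, the log-linear fit over the last few sampling points, and the function names `trapezoid` and `nca_summary` are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def trapezoid(x: Sequence[float], y: Sequence[float]) -> float:
    """Area under y(x) by the linear trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def nca_summary(times: Sequence[float], conc: Sequence[float],
                n_terminal: int = 3) -> Dict[str, float]:
    """Cmax, Tmax, AUC(0-tlast), MRT(0-tlast) and terminal half-life.

    `times` must be increasing; the last `n_terminal` concentrations must be
    positive because the terminal slope is fitted on their logarithms.
    """
    if len(times) != len(conc) or n_terminal < 2 or len(times) < n_terminal:
        raise ValueError("need matching series with enough terminal points")
    conc = list(conc)
    cmax = max(conc)
    tmax = times[conc.index(cmax)]            # first time Cmax is observed
    auc = trapezoid(times, conc)
    # AUMC is the area under the (time x concentration) curve.
    aumc = trapezoid(times, [t * c for t, c in zip(times, conc)])
    mrt = aumc / auc

    # Least-squares slope of ln(C) versus t over the terminal phase.
    t_term = list(times[-n_terminal:])
    c_term = conc[-n_terminal:]
    if min(c_term) <= 0:
        raise ValueError("terminal concentrations must be positive")
    ln_c = [math.log(c) for c in c_term]
    t_mean = sum(t_term) / n_terminal
    l_mean = sum(ln_c) / n_terminal
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(t_term, ln_c))
             / sum((t - t_mean) ** 2 for t in t_term))
    half_life = math.log(2) / -slope          # T1/2 = ln 2 / lambda_z

    return {"Cmax": cmax, "Tmax": tmax, "AUC": auc,
            "MRT": mrt, "T1/2": half_life}
```

The sketch stops at the last sampling point and does not extrapolate AUC to infinity, so values from a dedicated pharmacokinetics package may differ.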
We are currently developing several prediction models using these data and intend to release them in stages.
The current version contains the following compounds and activity data.
|Number of records|
|All registered compounds||30,391|
|Free-base compounds with distinct connectivity||25,277|
|Parameter||Species||in-house data||curated public data||predicted data|
|Name||Type||current||to be released||current||to be released||current|
|Solubility (pH 7.4)||Sol7.4||20||165||367||17,886|
|Solubility (pH 1.2)||Sol1.2||20||165|
|Distribution coefficient (pH 7.4)||logD7.4||20||120|
|In vitro parameters|
|Unbound fraction in plasma||Fu,p||Human||20||459||2,319||17,886|
|Unbound fraction in brain homogenate||Fu,brain||Rat||20||459|
|Blood-to-plasma concentration ratio||Rb||Human|
|Permeability coefficient (LLC-PK1)||Papp||Human||468|
|Permeability coefficient (Caco-2)||Papp||Human||4,408||17,886|
|P-gp net efflux ratio||NER||Human||468|
|Metabolic stability in liver microsome||CLint||Human||20||163||5,275|
|In vivo parameters|
|Drug concentration in plasma (p.o., i.v.)||C||Rat||20||100|
|Drug distribution in tissues (brain, CSF, heart, kidney, liver, lung, muscle, plasma)||C||Rat||20||100|
|Initial drug concentration in plasma||C0||Rat||20||39|
|Maximum drug concentration||Cmax||Rat||20||96|
|Elimination half-life of a drug||T1/2||Rat||20||100|
|Time to reach maximum drug concentration||Tmax||Rat||20||96|
|Area under the drug concentration-time curve||AUC||Rat||20||100|
|Mean residence time of a drug||MRT||Rat||20||100|
|Tissue-to-plasma concentration ratio||Kp||Rat||20||100|
|Apparent volume of distribution||Vd||Rat||20||39|
|Apparent volume of distribution at oral administration||Vd/F||Rat||20||96|
|Renal clearance ratio||CR||Human||17,886|
|Fraction excreted unchanged in urine||Fe||Human||343||17,886|
|IC50 for hERG channel||IC50||9,114|
|IC50 for Cav1.2 channel||IC50||204|
|IC50 for Kv1.5 channel||IC50||686|
|IC50 for Nav1.5 channel||IC50||1,321|
|Link to Hepatotoxicity database||606|