{"pk":65506,"title":"Graph-based featurization methods for classifying small molecule compounds","subtitle":null,"abstract":"For over a decade, drug-induced liver injury (DILI) has been a significant obstacle in the synthesis and development of drugs, and it remains a serious concern. Because existing preclinical models have had only limited success, DILI is one of the main causes of drug withdrawal or termination from the market. In particular, this withdrawal occurs during the late stages of drug development (Kullak-Ublick, 2017). Since DILI is difficult to diagnose and treat, it has become an obstacle in the drug production market that in turn affects clinicians, pharmaceutical companies, and consumers. We propose a method for learning features of DILI-positive drugs based on the graphical relationships and patterns they possess within a network of biological databases. We also train various statistical and machine learning models on these learned features to classify the drugs as DILI-positive or DILI-negative. Our methods include random forest, neural network, and logistic regression classifiers. We use labeled DILI-positive and DILI-negative datasets, which were developed by the FDA and the National Center for Toxicological Research, as well as additional literature datasets (Thakkar, 2020), to validate our results and assess the accuracy of our featurization and models. 
\nKeywords: liver toxicity, hepatotoxic drug analysis, drug classification, FDA clinical trials, graph databases, data processing, graph embeddings, classification models, machine-learning featurization, model comparison.","language":"en","license":null,"keywords":[{"word":"liver toxicity, hepatotoxic drug analysis, drug classification, FDA clinical trials, graph databases, data processing, graph embeddings, classification models, machine-learning featurization,.."}],"section":"Natural Sciences","is_remote":true,"remote_url":"https://escholarship.org/uc/item/4q43j852","frozenauthors":[{"first_name":"Randy","middle_name":"","last_name":"Posada","name_suffix":"","institution":"","department":""},{"first_name":"Mary","middle_name":"","last_name":"Silva","name_suffix":"","institution":"","department":""},{"first_name":"Marisa","middle_name":"","last_name":"Torres","name_suffix":"","institution":"","department":""},{"first_name":"Jonathan","middle_name":"","last_name":"Allen","name_suffix":"","institution":"","department":""},{"first_name":"Jeff","middle_name":"","last_name":"Drocco","name_suffix":"","institution":"","department":""},{"first_name":"Sarah","middle_name":"","last_name":"Sandholtz","name_suffix":"","institution":"","department":""},{"first_name":"Adam","middle_name":"","last_name":"Zemla","name_suffix":"","institution":"","department":""},{"first_name":"UCSF SPOKE","middle_name":"","last_name":"investigative teams","name_suffix":"","institution":"","department":""}],"date_submitted":"2022-05-05T21:20:58Z","date_accepted":"2022-05-05T21:20:58Z","date_published":"2022-05-05T07:00:00Z","render_galley":null,"galleys":[{"label":"","type":"pdf","path":"https://journalpub.escholarship.org/ucm_mwp_ucmurj/article/65506/galley/50139/download/"}]}