Algorithms  /  Vol: 17 Par: 1 (2024)
ARTICLE
TITLE

Machine Learning Model for Multiomics Biomarkers Identification for Menopause Status in Breast Cancer

SUMMARY

Identifying menopause-related breast cancer biomarkers is crucial for enhancing diagnosis, prognosis, and personalized treatment at that stage of the patient’s life. In this paper, we present a comprehensive framework for extracting multiomics biomarkers specifically related to breast cancer incidence before and after menopause. Our approach integrates DNA methylation, gene expression, and copy number alteration data using a systematic pipeline encompassing data preprocessing, class imbalance handling, dimensionality reduction, and classification. The framework starts with MutSigCV for data preprocessing and data quality assurance. The Synthetic Minority Over-sampling Technique (SMOTE) is then applied to address class imbalance. Next, Principal Component Analysis (PCA) transforms the DNA methylation, gene expression, and copy number alteration data into a latent space, discarding irrelevant variation and retaining relevant information. Finally, a classification model is built on the transformed multiomics data, combined into a unified representation. The framework contributes to understanding the complex interplay between menopause and breast cancer, thereby pointing toward more precise diagnostic and therapeutic strategies in the future. A Shapley-based explainable artificial intelligence analysis of an XGBoost regressor showed the power of the selected gene expressions for predicting menopause status; the potential biomarkers included RUNX1, PTEN, MAP3K1, and CDH1. These findings are consistent with the literature.
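
The oversampling and dimensionality reduction steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The data are random placeholders, the label coding (0 and 1 for the two menopause groups) is arbitrary, the block sizes and the choice of five components per omics layer are assumptions, and `smote_like_oversample` is a hypothetical helper that only imitates the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours. Oversampling the concatenated matrix before splitting it back into omics blocks is also an assumption, made here so that synthetic samples stay aligned across layers. MutSigCV preprocessing, the classifier, and the Shapley analysis are not shown.

```python
import numpy as np

def smote_like_oversample(X, y, minority_label, k=3, seed=0):
    """Balance two classes by interpolating between minority samples
    and their k nearest minority neighbours (SMOTE-style sketch)."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_needed = int((y != minority_label).sum() - len(X_min))
    if n_needed <= 0:
        return X, y
    # Pairwise Euclidean distances among minority samples only.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    k = min(k, len(X_min) - 1)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_needed)
    chosen = neighbours[base, rng.integers(0, k, size=n_needed)]
    gap = rng.random((n_needed, 1))  # interpolation weight in [0, 1)
    synthetic = X_min[base] + gap * (X_min[chosen] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_needed, minority_label)])
    return X_out, y_out

def pca_transform(X, n_components):
    """Project centred data onto its leading principal components via SVD."""
    X_centred = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centred, full_matrices=False)
    return X_centred @ Vt[:n_components].T

# Placeholder data: 40 majority and 10 minority samples, three omics blocks.
rng = np.random.default_rng(42)
y = np.array([0] * 40 + [1] * 10)
blocks = {
    "methylation": rng.normal(size=(50, 30)),
    "expression": rng.normal(size=(50, 20)),
    "copy_number": rng.normal(size=(50, 15)),
}

# Oversample on the concatenated features so every omics layer
# receives the same synthetic samples, then split back into blocks.
widths = [b.shape[1] for b in blocks.values()]
X_all = np.hstack(list(blocks.values()))
X_bal, y_bal = smote_like_oversample(X_all, y, minority_label=1)
balanced_blocks = np.split(X_bal, np.cumsum(widths)[:-1], axis=1)

# Reduce each omics block separately, then concatenate the latent
# features into one unified representation for a downstream classifier.
latent = np.hstack([pca_transform(b, n_components=5) for b in balanced_blocks])
```

Any classifier can then be fit on `latent` and `y_bal`. In practice, oversampling should be applied to the training split only, so that synthetic samples do not leak into evaluation data.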
