EPS: automated feature selection in case-control studies using extreme pseudo-sampling - Citegraph

Paper Info

Title
EPS: automated feature selection in case-control studies using extreme pseudo-sampling

Abstract
Finding informative predictive features in high-dimensional biological case-control datasets is challenging. The Extreme Pseudo-Sampling (EPS) algorithm offers a solution to the challenge of feature selection via a combination of deep learning and linear regression models. First, using a variational autoencoder, it generates complex latent representations for the samples. Second, it classifies the latent representations of cases and controls via logistic regression. Third, it generates new samples (pseudo-samples) around the extreme cases and controls in the regression model. Finally, it trains a new regression model over the upsampled space. The most significant variables in this regression are selected. We present an open-source implementation of the algorithm that is easy to set up, use and customize. Our package enhances the original algorithm by providing new features and customizability for data preparation, model training and classification functionalities. We believe the new features will enable the adoption of the algorithm for a diverse range of datasets.
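The four steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' package or its API: PCA stands in for the variational autoencoder, logistic regression is fitted by plain gradient descent, and absolute coefficient size stands in for statistical significance. The names `fit_logistic` and `eps_sketch`, all parameter names, and the toy data are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, lr=0.1, steps=500):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        logits = np.clip(X @ w + b, -30.0, 30.0)  # clip to keep exp() finite
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b


def eps_sketch(X, y, latent_dim=3, n_extreme=10, n_pseudo=20,
               noise=0.1, top_k=2, seed=0):
    """Hypothetical EPS-style pipeline; y holds 0.0 (control) / 1.0 (case)."""
    rng = np.random.default_rng(seed)

    # Step 1 (ASSUMPTION: PCA replaces the variational autoencoder).
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:latent_dim]            # shape (latent_dim, n_features)
    Z = (X - mean) @ components.T           # "encode" samples to latent space

    # Step 2: classify the latent representations of cases and controls.
    w, b = fit_logistic(Z, y)
    scores = Z @ w + b

    # Step 3: pick the most extreme controls (lowest scores) and cases
    # (highest scores), then draw noisy pseudo-samples around each of them.
    order = np.argsort(scores)
    controls = [i for i in order if y[i] == 0][:n_extreme]
    cases = [i for i in order[::-1] if y[i] == 1][:n_extreme]
    idx = np.array(controls + cases)
    Z_pseudo = np.repeat(Z[idx], n_pseudo, axis=0)
    Z_pseudo = Z_pseudo + noise * rng.normal(size=Z_pseudo.shape)
    y_pseudo = np.repeat(y[idx], n_pseudo)
    X_pseudo = Z_pseudo @ components + mean  # "decode" back to feature space

    # Step 4: fit a new regression on the pseudo-samples and rank features.
    # ASSUMPTION: |coefficient| is used as a proxy for significance.
    X_centered = X_pseudo - X_pseudo.mean(axis=0)
    w_feat, _ = fit_logistic(X_centered, y_pseudo)
    ranking = np.argsort(-np.abs(w_feat))
    return [int(j) for j in ranking[:top_k]]


# Toy usage: 100 controls, 100 cases, 20 features; only columns 0 and 1
# differ between the groups.
toy_rng = np.random.default_rng(1)
toy_y = np.repeat([0.0, 1.0], 100)
toy_X = toy_rng.normal(size=(200, 20))
toy_X[toy_y == 1, :2] += 3.0
print(eps_sketch(toy_X, toy_y))
```

In the toy data the two shifted columns are the ones the sketch is meant to recover. A real analysis would replace the PCA stand-in with a trained variational autoencoder and use a proper significance test in the final regression.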

Field	Value
Year	2021
DOI	10.1093/bioinformatics/btab214
Venue	BIOINFORMATICS
DocType	Journal
Volume	37
Issue	19
ISSN	1367-4803
Citations	0
PageRank	0.34
References	0
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Ruhollah Shemirani	1	0	0.34
Stephane Wenric	2	0	0.34
Eimear Kenny	3	0	0.34
José Luis Ambite	4	958	110.89
