Neural Network for ETF Analysis
Data Preprocessing
ETF data was sourced from justetf.com, focusing on the top holdings of each fund. For each holding, the following metrics were collected: Basic EPS, EBITDA, EBIT, Gross Profit, Revenue, Net Income, and Operating Margin. The main challenge was harmonizing these features across funds while handling missing values (NA). Missing values were imputed and the features were normalized so that holdings are comparable across funds.
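The imputation and normalization step could look roughly like the sketch below. The text does not say which imputation or scaling method was used, so per-feature median imputation and z-score normalization are assumptions here, as are the feature keys and the `impute_and_normalize` helper name.

```python
from statistics import mean, median, pstdev

# Hypothetical feature keys; the real dataset's column names may differ.
FEATURES = ["basic_eps", "ebitda", "operating_margin"]


def impute_and_normalize(rows):
    """Fill missing values (None) with the per-feature median, then z-score each feature.

    Assumed approach: the source only states that imputation and
    normalization were applied, not which variants.
    """
    cleaned = [dict(row) for row in rows]  # copy so the input is not mutated
    for feature in FEATURES:
        observed = [r[feature] for r in rows if r.get(feature) is not None]
        fill = median(observed) if observed else 0.0
        values = [r[feature] if r.get(feature) is not None else fill for r in rows]
        mu = mean(values)
        sigma = pstdev(values) or 1.0  # constant feature: avoid division by zero
        for r, v in zip(cleaned, values):
            r[feature] = (v - mu) / sigma
    return cleaned


holdings = [
    {"basic_eps": 2.0, "ebitda": 100.0, "operating_margin": 0.25},
    {"basic_eps": None, "ebitda": 300.0, "operating_margin": 0.10},
    {"basic_eps": 4.0, "ebitda": None, "operating_margin": 0.40},
]
normalized = impute_and_normalize(holdings)
```

Median imputation is used in this sketch because it is less sensitive to outliers than the mean, which matters for heavy-tailed metrics such as EBITDA. Whether that matches the repository's choice is not stated.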
First Neural Network: Top Holdings Scoring
The first neural network takes as input the financial metrics of each ETF's top holdings. It scores each holding from these metrics, and the holding scores are aggregated into a single score per ETF. The model was trained by minimizing mean squared error with an added regularization term, with the aim of generalizing across diverse ETFs.
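To make the scoring and loss concrete, here is a minimal, dependency-free sketch of a forward pass and training objective. The architecture (one tanh hidden layer), the weight-proportional aggregation of holding scores, and the L2 form of the regularization are all assumptions, since the text specifies only "mean squared error and regularization". All function names are hypothetical.

```python
import math
import random


def init_params(n_features, n_hidden, seed=0):
    """Small random weights for a one-hidden-layer network (assumed architecture)."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.gauss(0, 0.1) for _ in range(n_features)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.gauss(0, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def score_holding(params, x):
    """Score a single holding from its (normalized) financial metrics."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    return sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]


def score_etf(params, holdings, weights):
    """Aggregate holding scores into one ETF score, weighted by portfolio weight (assumption)."""
    total = sum(weights)
    return sum(wt * score_holding(params, x) for x, wt in zip(holdings, weights)) / total


def regularized_mse(params, etfs, targets, l2=1e-3):
    """Mean squared error over ETFs plus an L2 penalty on the weights (assumed regularizer)."""
    preds = [score_etf(params, holdings, weights) for holdings, weights in etfs]
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
    penalty = sum(w * w for row in params["W1"] for w in row)
    penalty += sum(w * w for w in params["w2"])
    return mse + l2 * penalty


params = init_params(n_features=3, n_hidden=4)
etfs = [([[0.1, 0.2, 0.3], [0.0, -0.1, 0.5]], [0.6, 0.4])]  # one ETF with two holdings
loss = regularized_mse(params, etfs, targets=[0.5])
```

In practice a framework with automatic differentiation would compute the gradients of this loss. The sketch only shows how the per-holding scores, the ETF-level aggregate, and the regularized objective fit together.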
Second Neural Network: ETF Selection
The second neural network takes the ETF scores produced by the first network, together with additional fund-level features, and determines which ETFs to select. It is trained by minimizing a custom loss function tailored to portfolio optimization, so that the selected ETFs align with specific investment goals.
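The custom loss is not spelled out in this document, so the sketch below substitutes a standard mean-variance objective as a stand-in. It shows a single linear layer with a softmax turning per-ETF inputs (first-network score plus fund-level features) into allocation weights, and a loss that trades expected return against variance. The feature layout, the `risk_aversion` parameter, and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax: maps logits to positive weights summing to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def allocation(weights, etf_inputs):
    """Linear layer plus softmax over ETFs (a deliberately minimal selection network)."""
    logits = [sum(w * x for w, x in zip(weights, row)) for row in etf_inputs]
    return softmax(logits)


def portfolio_loss(alloc, expected_returns, covariance, risk_aversion=1.0):
    """Mean-variance stand-in for the custom loss: -return + risk_aversion * variance."""
    ret = sum(a * r for a, r in zip(alloc, expected_returns))
    n = len(alloc)
    var = sum(alloc[i] * covariance[i][j] * alloc[j] for i in range(n) for j in range(n))
    return -ret + risk_aversion * var


# Each row: [score from the first network, expense ratio in percent] (hypothetical features)
etf_inputs = [[0.8, 0.2], [0.3, 0.1]]
alloc = allocation([1.0, -1.0], etf_inputs)
loss = portfolio_loss(alloc, [0.10, 0.05], [[0.04, 0.0], [0.0, 0.01]])
```

A softmax output yields a soft allocation across ETFs. A hard selection can be read off by taking the ETFs with the largest weights.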
Conclusion
The dual-neural-network framework addresses the complexity of ETF analysis by combining holding-level scoring with fund-level selection. Future work will focus on improving the interpretability of the models and incorporating real-time market data to refine predictions. The repository provides details and scripts for replication and further development.