Neural Network for ETF Analysis

Table of Contents

Data Preprocessing
First Neural Network: Top Holdings Scoring
Second Neural Network: ETF Selection
Conclusion

Data Preprocessing

ETF data was sourced from justetf.com, focusing on the top holdings of each fund. For each holding, the following metrics were collected: Basic EPS, EBITDA, EBIT, Gross Profit, Revenue, Net Income, and Operating Margin. The main challenge was harmonizing these features across funds while handling missing values (NA). Missing values were imputed and the features were normalized so that holdings from different funds are comparable.
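
The imputation and normalization step can be sketched as follows. This is a minimal illustration, not the repository's actual pipeline: the source does not say which imputation or scaling method was used, so median imputation and z-score scaling are assumptions here, as are the function name `impute_and_normalize` and the lowercase feature keys.

```python
import statistics

# Hypothetical keys for the metrics listed above (names are assumptions).
FEATURES = ["basic_eps", "ebitda", "ebit", "gross_profit",
            "revenue", "net_income", "operating_margin"]


def impute_and_normalize(rows, features=FEATURES):
    """Fill missing values (None) with the column median, then z-score each column.

    rows: list of dicts, one per holding. Returns a new list of dicts;
    the input rows are not modified.
    """
    if not rows:
        return []
    cleaned = [dict(row) for row in rows]
    for name in features:
        observed = [r[name] for r in rows if r.get(name) is not None]
        # Assumption: a column with no observed values at all is filled with 0.0.
        median = statistics.median(observed) if observed else 0.0
        filled = [r[name] if r.get(name) is not None else median for r in rows]
        mean = statistics.fmean(filled)
        std = statistics.pstdev(filled)
        for r, value in zip(cleaned, filled):
            # A constant column carries no information, so map it to 0.0.
            r[name] = (value - mean) / std if std > 0 else 0.0
    return cleaned


holdings = [{"revenue": 1.0}, {"revenue": None}, {"revenue": 3.0}]
normalized = impute_and_normalize(holdings, features=["revenue"])
```

In this small example the missing revenue is replaced by the median of the observed values before scaling, so it lands at the centre of the normalized column.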

First Neural Network: Top Holdings Scoring

The first neural network takes as input the features derived from the top holdings of each ETF. Each holding is scored from its financial metrics, and the holding scores are then aggregated into a single score per ETF. The model was trained with a mean squared error loss combined with regularization, so that it generalizes across diverse ETFs.

Figure 2: Training and validation of the first neural network (top holdings scoring).
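
As a rough sketch of the training objective and the aggregation step for the first network: the source names mean squared error and regularization but not the regularizer or the aggregation rule, so the L2 penalty, the weighted average by holding weight, and the names `mse_with_l2` and `aggregate_etf_score` are all assumptions made for illustration.

```python
def mse_with_l2(predictions, targets, weights, l2=1e-3):
    """Mean squared error plus an L2 penalty on the model weights.

    The L2 form of the penalty is an assumption; the source only says
    that regularization was used.
    """
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and equal length")
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)
    penalty = l2 * sum(w * w for w in weights)
    return mse + penalty


def aggregate_etf_score(holding_scores, holding_weights):
    """Average the holding scores of one ETF, weighted by holding weight.

    Weighting by portfolio weight is an assumption; the weights need not sum to 1.
    """
    if len(holding_scores) != len(holding_weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(holding_weights)
    if total <= 0:
        raise ValueError("holding weights must sum to a positive value")
    return sum(s * w for s, w in zip(holding_scores, holding_weights)) / total


loss = mse_with_l2([1.0, 2.0], [1.0, 4.0], weights=[1.0, 2.0], l2=0.1)
etf_score = aggregate_etf_score([0.8, 0.2], [3.0, 1.0])
```

A weighted average keeps the ETF score on the same scale as the holding scores, which makes scores comparable between funds with different numbers of top holdings.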

Second Neural Network: ETF Selection

The second neural network takes the output scores from the first network and additional fund-level features as inputs to determine the optimal ETF selection. By minimizing a custom loss function tailored to portfolio optimization, the model identifies ETFs best aligned with specific investment goals.

Figure 3: Neural network architecture for ETF selection.
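
The custom loss is not specified in this document. Purely as an illustration of what a portfolio-oriented loss could look like, the sketch below turns per-ETF outputs into allocation weights with a softmax and penalizes risk while rewarding expected return. Everything here is an assumption: the names `softmax` and `selection_loss`, the `risk_aversion` parameter, the inputs, and the simplified risk term, which ignores correlations between ETFs.

```python
import math


def softmax(logits):
    """Convert raw network outputs into allocation weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selection_loss(logits, expected_returns, volatilities, risk_aversion=1.0):
    """Hypothetical loss: risk penalty minus expected portfolio return.

    Risk is approximated as the sum of squared (weight * volatility) terms,
    which treats the ETFs as uncorrelated. Lower values are better.
    """
    weights = softmax(logits)
    portfolio_return = sum(w * r for w, r in zip(weights, expected_returns))
    portfolio_risk = sum((w * v) ** 2 for w, v in zip(weights, volatilities))
    return risk_aversion * portfolio_risk - portfolio_return


allocation = softmax([0.0, 0.0])
example_loss = selection_loss([0.0, 0.0], [0.10, 0.05], [0.20, 0.10])
```

Raising `risk_aversion` shifts the minimum toward lower-volatility ETFs, which is one way a loss of this kind could be tuned to a specific investment goal.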

Conclusion

The dual-neural-network framework approaches ETF analysis in two stages: scoring the top holdings of each fund, then selecting ETFs at the fund level. Future work will focus on making the models more interpretable and on incorporating real-time market data to refine predictions. The repository contains the details and scripts needed for replication and further development.
