DATS6101-TeamNeo/time-series-analysis


Impact of Global Events on S&P 500 Index


🎯 Overview

Understanding how global crises shape financial markets is crucial for investors, policymakers, and financial analysts. This project analyzes S&P 500 data from 2000 to 2024 to show how major historical events affect different market sectors and to forecast the index using statistical and machine learning techniques.

Why this matters:

  • 📊 For Investors: Identify sectors that demonstrate resilience during crises and optimize portfolio diversification strategies
  • 🏛️ For Policymakers: Understand differential recovery patterns across industries to design targeted economic interventions
  • 🔮 For Analysts: Leverage predictive models to forecast market responses to future global disruptions
  • 📚 For Researchers: Access reproducible analysis combining classical econometrics (ARIMA) with modern deep learning (LSTM)

🔍 Research Questions

  1. Which sectors drive the S&P 500 index?
    • Using ANOVA and linear regression to identify primary market drivers
  2. How do different industries respond to major historical events?
    • Statistical comparison of sector performance before/after crises
  3. Can we forecast the S&P 500 index effectively?
    • Comparative evaluation of ARIMA vs. LSTM models
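
To make the first question concrete, here is a minimal sketch of the one-way ANOVA F statistic, which compares variation between sector means with variation inside each sector. The project's notebooks are in R; this sketch uses plain Python only to stay dependency-free. The function name `one_way_anova_f` and the toy returns are assumptions for illustration, not project code or project data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a label (for example a sector name) to a list of observations.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical daily returns (%) for three sectors; values are illustrative only.
toy_returns = {
    "Information Technology": [1.2, 0.8, 1.5, 1.1],
    "Real Estate": [-0.4, -0.9, -0.2, -0.5],
    "Utilities": [0.1, 0.3, -0.1, 0.1],
}
f_stat, df_b, df_w = one_way_anova_f(toy_returns)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with those degrees of freedom suggests that at least one sector mean differs from the others; it does not say which one, so a follow-up comparison is still needed.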

🌍 Events Analyzed

The Great Recession (2007-2009)

Financial crisis triggered by subprime mortgage collapse

  • Impact: Financials, Energy, and Real Estate sectors devastated
  • Insight: Banking sector took 3+ years to recover

COVID-19 Pandemic (2020-2023)

Global health crisis causing unprecedented economic disruption

  • Impact: Technology sector surged; Real Estate and Utilities declined
  • Insight: Digital transformation accelerated by 5-7 years

Russia-Ukraine Invasion (2022-present)

Geopolitical conflict affecting global supply chains

  • Impact: Energy and Industrials gained; Technology remained resilient
  • Insight: Energy sector benefited from supply disruptions and pricing power
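
One way to quantify sector impact inside an event window is to compute the cumulative return and the maximum drawdown between the window's start and end dates. The sketch below is an assumption about how such a comparison could be set up, not the project's implementation; `window_stats` and the toy price levels are hypothetical.

```python
from datetime import date


def window_stats(series, start, end):
    """Cumulative return and maximum drawdown of `series` within [start, end].

    `series` is a list of (date, closing_price) pairs sorted by date.
    """
    prices = [p for d, p in series if start <= d <= end]
    if len(prices) < 2:
        raise ValueError("need at least two observations in the window")

    cumulative_return = prices[-1] / prices[0] - 1.0

    # Track the running peak and the deepest fall below it.
    peak = prices[0]
    max_drawdown = 0.0
    for p in prices:
        peak = max(peak, p)
        max_drawdown = min(max_drawdown, p / peak - 1.0)
    return cumulative_return, max_drawdown


# Hypothetical sector index levels; not taken from the project's data.
toy = [
    (date(2020, 2, 3), 100.0),
    (date(2020, 2, 18), 104.0),
    (date(2020, 3, 23), 70.0),
    (date(2020, 6, 1), 91.0),
]
ret, mdd = window_stats(toy, date(2020, 2, 1), date(2020, 6, 30))
print(f"return={ret:.1%}, max drawdown={mdd:.1%}")
```

Running the same function per sector over each event window gives directly comparable numbers for a resilience ranking.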

📊 Key Findings

Sector Resilience Rankings

| Crisis | Most Resilient | Most Vulnerable |
|---|---|---|
| Great Recession | Health Care, Consumer Staples | Financials, Real Estate, Energy |
| COVID-19 | Information Technology, Energy | Real Estate, Utilities |
| Russia-Ukraine | Information Technology, Energy | Real Estate |

Cross-Event Pattern: Real Estate ranked among the most vulnerable sectors in all three crises, while Information Technology ranked among the most resilient in the two most recent ones (COVID-19 and Russia-Ukraine).

Model Performance

Our analysis reveals that LSTM neural networks outperform traditional ARIMA models in forecasting accuracy:

  • ARIMA: Fast computation (~minutes), moderate accuracy
  • LSTM: High accuracy, captures complex patterns, requires more computational resources (~hours)

Practical Implication: For short-term trading decisions requiring rapid updates, ARIMA suffices. For strategic portfolio management, LSTM's superior accuracy justifies the computational cost.
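
A model comparison like this rests on scoring both forecasts against the same held-out period. This README does not state which error metric was used, so the sketch below assumes RMSE and MAE; the model names and all numbers are hypothetical placeholders.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more heavily."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the index."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def compare_forecasts(actual, forecasts):
    """Score each named forecast against the same held-out values."""
    return {
        name: {"rmse": rmse(actual, preds), "mae": mae(actual, preds)}
        for name, preds in forecasts.items()
    }


# Hypothetical held-out index levels and model outputs (illustrative only).
held_out = [4500.0, 4520.0, 4490.0, 4550.0]
scores = compare_forecasts(
    held_out,
    {
        "model_a": [4510.0, 4500.0, 4500.0, 4530.0],
        "model_b": [4502.0, 4515.0, 4495.0, 4545.0],
    },
)
for name, s in scores.items():
    print(f"{name}: RMSE={s['rmse']:.2f}  MAE={s['mae']:.2f}")
```

Both models must be scored on identical dates that neither saw during fitting; otherwise the comparison favours whichever model had access to more recent data.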

Statistical Validation

  • ✅ All sector impacts on closing prices are highly significant (p < 2.2e-16)
  • ✅ T-tests confirm statistically significant changes in sector performance post-crisis (p < 0.05)
  • ✅ Information Technology and Consumer Discretionary are primary index drivers
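
For the before/after comparisons, a two-sample t-test asks whether mean sector performance shifted after a crisis began. The README does not say which variant was run; the sketch below assumes Welch's unequal-variance form, which is the default of R's `t.test`. The function name and the sample returns are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(before, after):
    """Welch's t statistic and approximate degrees of freedom."""
    n1, n2 = len(before), len(after)
    v1, v2 = variance(before), variance(after)  # sample variances (n - 1 denominator)
    se2 = v1 / n1 + v2 / n2                     # squared standard error of the difference
    t = (mean(after) - mean(before)) / sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical daily returns (%) before and after an event date (illustrative only).
before = [0.4, 0.1, 0.3, 0.2, 0.5]
after = [-0.6, -0.2, -0.9, -0.4, -0.3]
t_stat, dof = welch_t(before, after)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Turning the statistic into a p-value requires the CDF of the t distribution with `df` degrees of freedom, which a statistics library provides.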

🛠️ Technologies

Statistical Analysis: R (≥ 4.0) with ANOVA, t-tests, time series analysis
Machine Learning: LSTM neural networks (Keras/TensorFlow)
Data: 500+ companies, 11 GICS sectors, 6,000+ trading days
Reproducibility: R Markdown notebooks with full documentation
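
An LSTM is usually trained on fixed-length windows of past values paired with the next value to predict. The sketch below shows that reshaping step together with a chronological train/test split. The lookback length, the split fraction, and both function names are assumptions; the README does not state the settings the project used.

```python
def chronological_split(values, train_fraction=0.8):
    """Split a series by time order so the test set lies strictly in the future."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]


def make_windows(values, lookback):
    """Turn a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for i in range(len(values) - lookback):
        inputs.append(values[i : i + lookback])  # the `lookback` most recent values
        targets.append(values[i + lookback])     # the value that follows them
    return inputs, targets


# Hypothetical index levels (illustrative only).
series = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0, 102.7, 105.2, 106.0, 107.4]
train, test = chronological_split(series, train_fraction=0.8)
x_train, y_train = make_windows(train, lookback=3)
print(len(x_train), "training windows; first:", x_train[0], "->", y_train[0])
```

Splitting before windowing keeps every training window free of test-period values. Any scaling should likewise be fitted on the training portion only.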

📖 See docs/INSTALLATION.md for setup instructions


🚀 Quick Start

# Clone the repository
git clone https://github.com/DATS6101-TeamNeo/final-project.git
cd final-project

# Open analysis/Main.Rmd in RStudio and knit to generate the full report

For detailed installation and usage instructions, see docs/INSTALLATION.md.


📁 Project Structure

├── analysis/                     # R Markdown analysis notebooks
│   ├── Main.Rmd                  # 🎯 Complete analysis (START HERE)
│   ├── Summary.Rmd               # Executive summary
│   ├── EDA.Rmd                   # Exploratory data analysis
│   └── ARIMA_baseline.Rmd        # ARIMA forecasting model
├── reports/                      # Generated HTML reports
│   ├── Main.html
│   ├── Summary.html
│   ├── EDA.html
│   └── ARIMA_baseline.html
├── data/
│   ├── raw/                      # Historical datasets
│   │   ├── sp500_companies.csv
│   │   └── sp500_index.csv
│   └── scripts/                  # Data collection scripts
│       └── Generate_SandP500.Rmd
├── models/
│   └── checkpoints.h5            # Pre-trained LSTM model
├── figures/                      # Generated plots and images
├── docs/
│   ├── INSTALLATION.md           # Setup guide
│   └── Final-Project-Proposal.pdf
├── README.md
└── LICENSE

💡 Practical Applications

For Portfolio Managers

  • Diversification Strategy: Allocate higher weights to Health Care and Technology during periods of high economic uncertainty
  • Crisis Hedging: Reduce exposure to Real Estate and traditional Energy before anticipated market disruptions

For Risk Analysts

  • Stress Testing: Use sector-specific volatility patterns from historical crises to model portfolio risk
  • Recovery Timelines: Plan liquidity based on observed sector recovery periods (Financials: 3+ years; Technology: <1 year)
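
A recovery period can be measured as the number of trading days from the deepest drawdown trough until the price regains its earlier peak. The sketch below illustrates that definition; it is an assumption about how such timelines could be computed, not the project's method, and the price path is hypothetical.

```python
def recovery_time(prices):
    """Trading days from the deepest drawdown trough back to the prior peak.

    Returns None when the series never regains that peak.
    """
    peak = prices[0]
    worst, trough_idx, peak_at_trough = 0.0, 0, prices[0]
    for i, p in enumerate(prices):
        peak = max(peak, p)
        drawdown = p / peak - 1.0
        if drawdown < worst:
            # Remember where the deepest fall happened and which peak it fell from.
            worst, trough_idx, peak_at_trough = drawdown, i, peak

    for j in range(trough_idx, len(prices)):
        if prices[j] >= peak_at_trough:
            return j - trough_idx
    return None


# Hypothetical sector index levels (illustrative only).
path = [100.0, 110.0, 80.0, 90.0, 105.0, 112.0, 120.0]
print("days to recover:", recovery_time(path))
```

A result of `None` means the sector had not recovered by the end of the sample, which should be reported as censored rather than treated as zero.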

For Economic Policymakers

  • Targeted Stimulus: Prioritize support for Real Estate and Financials during financial crises; support travel, hospitality, and retail during pandemic-type events
  • Industry Monitoring: Focus regulatory attention on sectors showing abnormal volatility patterns

🎓 Academic Context

Course: DATS 6101 - Introduction to Data Science
Institution: The George Washington University
Team: Phanindra Kumar Kalaga, Prudhvi Chekuri, Bharat Khandelwal, Dinesh Chandra Gaddam

This project demonstrates:

  • Integration of classical statistics with modern machine learning
  • Reproducible research practices using R Markdown
  • Real-world application of data science to financial markets
  • Rigorous hypothesis testing and model validation

👥 Authors & Contact

| Name | Email |
|---|---|
| Phanindra Kumar Kalaga | [email protected] |
| Prudhvi Chekuri | [email protected] |
| Bharat Khandelwal | [email protected] |
| Dinesh Chandra Gaddam | [email protected] |

📖 Documentation


🔗 Resources


📜 License

MIT License - see LICENSE file for details

Copyright (c) 2024 Team Neo


🙏 Acknowledgments

  • Professor and TAs of DATS 6101 at The George Washington University
  • Yahoo Finance for comprehensive historical data
  • R and TensorFlow communities for excellent open-source tools

⭐ If this research helped your work, please star this repository!

Last Updated: November 2024
