Market Cap Groupings in factorsSPGMI and stocksCRSP #87

Open
spinnj opened this issue Apr 2, 2022 · 1 comment
Comments

@spinnj
Contributor

spinnj commented Apr 2, 2022

Market cap groupings (Large Cap, Mid Cap, Small Cap, Micro Cap) were apparently assigned by ??? and are not official data pulled from CRSP or S&P Global Market Intelligence. Two issues have been identified:

  1. The assignments themselves may not be "correct" in some sense and the methodology used is not known to me.
  2. The assignments appear to be constant over time. The market capitalization of "AMD" varied enough over the sample period that it would likely have been small, mid, and large cap at different times, yet it sits in the "MidCap" group for the entire sample. This will make future merges of the factorsSPGMI data with other data sets difficult, because vendor data sets let the cap grouping change over time.
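
One way to see how widespread the second issue is would be to list every ticker whose group never changes. The following is only a minimal sketch: the `(date, ticker, cap_group)` row layout, the function name, and the sample rows are all assumptions made for illustration, not the actual structure or contents of factorsSPGMI.

```python
from collections import defaultdict


def constant_cap_groups(rows):
    """Return the tickers whose cap group is identical on every date.

    rows: iterable of (date, ticker, cap_group) tuples (hypothetical layout).
    """
    groups_seen = defaultdict(set)
    for _date, ticker, cap_group in rows:
        groups_seen[ticker].add(cap_group)
    # A ticker with exactly one distinct group never migrated.
    return sorted(t for t, seen in groups_seen.items() if len(seen) == 1)


# Made-up rows for illustration only; not taken from factorsSPGMI.
sample = [
    ("2010-01-31", "AAA", "MidCap"),
    ("2015-01-31", "AAA", "MidCap"),
    ("2010-01-31", "BBB", "SmallCap"),
    ("2015-01-31", "BBB", "MidCap"),
]
print(constant_cap_groups(sample))
```

If every ticker in the real data shows up in this list, the groupings are static labels rather than point-in-time assignments.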

CRSP reconstitutes membership in its cap groupings quarterly, using breakpoints at 70%, 85%, and 98% of cumulative market capitalization coverage, and it also has a method for handling names that sit on the border between groups.
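
As a rough illustration of the breakpoint idea, the sketch below assigns groups for a single date from cumulative market cap coverage. It is not CRSP's actual methodology: the mapping of the 70%/85%/98% breakpoints onto the four labels, the convention of classifying a stock by the coverage of all larger stocks, the function name, and the sample values are all assumptions. It also leaves out the handling of border names and would need to be re-run on each reconstitution date.

```python
def assign_cap_groups(market_caps,
                      breakpoints=(0.70, 0.85, 0.98),
                      labels=("LargeCap", "MidCap", "SmallCap", "MicroCap")):
    """Assign a cap group to each ticker for one cross-section.

    market_caps: dict of ticker -> market capitalization on a single date.
    Assumed convention: a stock is classified by the share of total market
    cap held by all larger stocks, i.e. the coverage before adding it.
    """
    total = sum(market_caps.values())
    ranked = sorted(market_caps.items(), key=lambda kv: kv[1], reverse=True)
    groups = {}
    cumulative = 0.0
    for ticker, cap in ranked:
        share_before = cumulative / total
        # Number of breakpoints already passed selects the label.
        index = sum(share_before >= b for b in breakpoints)
        groups[ticker] = labels[index]
        cumulative += cap
    return groups


# Made-up market caps for illustration only.
caps = {"A": 40, "B": 25, "C": 15, "D": 10, "E": 6, "F": 3, "G": 1}
print(assign_cap_groups(caps))
```

Running this once per reconstitution date would give a time-varying grouping, so a stock could move between groups as its market cap changes.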

I'm highlighting this issue for @JustinMShea to see if there's any desire to clean up the cap groupings (e.g. by applying an industry-standard approach for a point-in-time assignment of stocks to groupings, similar to CRSP) or if the current data are "good enough".

@JustinMShea
Collaborator

I think there is a strong desire to clean this up. Great catch, @spinnj. The market cap changes over time might make for good questions on assignments, so including this would be helpful. Again, I think @braverock believes the data set should be as close to reality as possible, which I agree with.

We may need to get the original files to do this, and ultimately recreate the stocksCRSP and factorsSPGMI objects to be sure everything is correct. The scripts to do so could be included in the package as a vignette, which would both document the entire process we used and give those with access to the raw datasets the ability to quickly load and transform them in R.
