505 Companies #18

Open
tjradcliffe opened this issue Aug 3, 2017 · 4 comments

Comments

@tjradcliffe

This listing for the S&P500 has 505 companies in it. There should probably be some kind of invariant imposed so updates are rejected if they include something other than 500 companies.
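A minimal sketch of such an invariant, assuming the listing is a CSV file with a header row and one constituent per data row. The function name `check_row_count`, the `expected_rows` parameter, and the sample rows are hypothetical; the expected count is left as a parameter rather than hard-coded.

```python
import csv
import io

# Illustrative sample only; the header names and rows are assumptions.
SAMPLE = """Symbol,Name
GOOGL,Alphabet Inc Class A
GOOG,Alphabet Inc Class C
MMM,3M Company
"""


def check_row_count(csv_text, expected_rows):
    """Return the number of data rows, or raise ValueError if it differs
    from expected_rows."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    data_rows = [row for row in reader if row]  # ignore blank lines
    if len(data_rows) != expected_rows:
        raise ValueError(
            f"expected {expected_rows} rows, found {len(data_rows)}"
        )
    return len(data_rows)


print(check_row_count(SAMPLE, 3))  # → 3
```

An update step could call this before accepting a new version of the file and reject the update when `ValueError` is raised.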

@gh-isoar

gh-isoar commented Aug 3, 2017 via email

@rufuspollock
Member

@gh-isoar @tjradcliffe but should we eliminate those dual symbols so we have just 500 companies (perhaps with multiple symbols)?

@gh-isoar and thanks for the info 😄

rufuspollock added a commit that referenced this issue Apr 2, 2018
* Note we have 505 companies atm - as per #18.
@gh-isoar

gh-isoar commented Apr 10, 2018

@rufuspollock "The S&P 500" trademark is owned by S&P which is the sole arbiter of its definition; obviously this list must accurately reflect that definition. That definition comprises not just 500 companies, but the specific stock tickers that S&P has decided accurately represent those companies. In some cases the correct subset of a company's tickers is obscure; S&P's relevant decisions are documented when made but are impossible to reliably infer from non-S&P data.

I see two categories of usage that the list should support well:

  • calculation of values that can be compared, or used in conjunction, with those calculated by other users of "The S&P 500" definition such as investment houses; such uses must be driven by the first column
  • determination of presence or absence of a company in the list; such uses must operate by matching a substring of the second column

Eliminating the dual symbols would make the first category of usage impossible. Keeping them by putting multiple symbols in the first column would be "de-normalizing" the column - usage would be possible for only the most sophisticated and determined users.

In contrast, redundancy in the second column is easily handled by most users.
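The two categories of usage can be sketched as follows. This is only an illustration, assuming a CSV with a header row, ticker symbols in the first column and company names in the second; the header names, the helper names (`load_rows`, `symbols`, `contains_company`), and the three sample rows are assumptions, not taken from the dataset file.

```python
import csv
import io

# Illustrative sample only: one company listed under two tickers, plus one other.
SAMPLE = """Symbol,Name
GOOGL,Alphabet Inc Class A
GOOG,Alphabet Inc Class C
MMM,3M Company
"""


def load_rows(csv_text):
    """Parse the CSV text and return the data rows without the header."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return [row for row in reader if row]


def symbols(rows):
    """First category: one entry per ticker, taken from the first column."""
    return [row[0] for row in rows]


def contains_company(rows, fragment):
    """Second category: case-insensitive substring match on the second column."""
    needle = fragment.lower()
    return any(needle in row[1].lower() for row in rows)


rows = load_rows(SAMPLE)
print(symbols(rows))                      # → ['GOOGL', 'GOOG', 'MMM']
print(contains_company(rows, "alphabet"))  # → True
print(contains_company(rows, "Acme"))      # → False
```

With one ticker per row, the first column can be consumed directly, and the repeated company name in the second column does not affect the presence check.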

@rufuspollock
Member

@gh-isoar super useful and clear - and something I think we will add to the README. Thank you again for your clarifications.
