Initial paper outline #11

Open · nicholasturner1 opened this issue May 9, 2022 · 3 comments

nicholasturner1 commented May 9, 2022

TO DO

  • Debug and finish the logit attribution function (sketch after this list)
  • Functionally analyze the heads that receive many of the top composition weights in Neo125M
  • Generalize parameter matrix functions to work directly on different devices (e.g., GPUs)
  • Finish running composition values for larger models on larger machines
  • Functionally analyze as many interesting heads as possible before the conference
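
For reference, a minimal sketch of the direct logit attribution computation; the names and shapes here are assumptions, not the repo's actual function:

```python
import torch

def logit_attribution(component_outputs: torch.Tensor,
                      W_U: torch.Tensor,
                      target_token: int) -> torch.Tensor:
    """Direct logit attribution: because the residual stream is a sum of
    component outputs, a target logit decomposes into a sum of
    per-component contributions through the unembedding.

    component_outputs: (n_components, d_model) stacked contributions
        (embedding, each attention head, each MLP) at the final position.
    W_U: (d_model, d_vocab) unembedding matrix.
    """
    # Each component's additive contribution to the target token's logit.
    # NOTE: this ignores the final LayerNorm; fold or freeze it first.
    return component_outputs @ W_U[:, target_token]
```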

OUTLINE

Figure 1: Contributions to the residual stream and the datasets

  • Architecture schematic with the residual stream
  • Subspace vector diagram (contributions to the stream add within any given unit subspace)
  • Table of the number of input/output pairs of different types across models
  • Basic value distributions

Figure: Breakdown by type (QKV) and layer

  • Percentile vs. type histogram (w/ and w/o baselines; composition-score sketch after this list)
  • Percentile vs. layer histogram (note: this needs to be normalized)
  • Where do the top values point?
  • Biases vs. attention head layer distances
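
For the breakdown above, a minimal sketch of the Q/K/V composition score in the sense of Elhage et al.'s "A Mathematical Framework for Transformer Circuits"; the exact matrix orientation is an assumption, adjust to the repo's conventions:

```python
import torch

def composition_score(W_out: torch.Tensor, W_in: torch.Tensor) -> float:
    """How strongly a later head reads from an earlier head's output.

    W_out: (d_model, d_model) OV circuit of the earlier head (W_O @ W_V).
    W_in:  (d_model, d_model) the later head's W_QK for Q/K composition,
           or its OV circuit for V composition.
    """
    num = torch.linalg.matrix_norm(W_in @ W_out)  # Frobenius norm by default
    den = torch.linalg.matrix_norm(W_in) * torch.linalg.matrix_norm(W_out)
    return (num / den).item()
```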

Figure: Can we extract interesting functional properties?

  • Can we extract induction heads? (sketch after this list)
  • What do the heads with many top values do?
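
One common diagnostic, sketched here under the assumption that we can pull per-head attention patterns: repeat a random token sequence twice and check how much second-half attention lands on the "previous occurrence plus one" position.

```python
import torch

def induction_score(attn_pattern: torch.Tensor, seq_len: int) -> float:
    """Average attention from second-half queries to the induction target.

    attn_pattern: (2 * seq_len, 2 * seq_len) attention pattern for one
        head, run on a random sequence repeated twice.
    """
    queries = torch.arange(seq_len, 2 * seq_len)
    # The induction target for position t is t - seq_len + 1: the token
    # that followed the earlier occurrence of the current token.
    targets = queries - seq_len + 1
    return attn_pattern[queries, targets].mean().item()
```

Heads that score high on this (and low on a non-repeated control) are induction-head candidates.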

Figure: Higher order terms

  • Input path complexity cartoon (w/ and w/o baselines)
  • Input path complexity plots vs. random

Supplementary Figures

  • Basic value distributions with the original denominator
  • Head-by-head term value plots (one w/ high values and one without)
  • Scatterplot of old denominator vs. new denominator

NICE TO HAVE

  • Measuring composition with the embedding & unembedding weights
  • Think about fusing LN weights with input/output matrices to speed things up (folding sketch after this list)
  • Include MLPs in path analysis
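
A minimal sketch of what "fusing LN weights" could look like: fold the LayerNorm's affine parameters into the linear map that reads from it, so composition terms are computed on a single effective matrix. Shapes are assumptions:

```python
import torch

def fold_ln_into_linear(W: torch.Tensor, b: torch.Tensor,
                        ln_weight: torch.Tensor, ln_bias: torch.Tensor):
    """The layer computes W @ (ln_weight * norm(x) + ln_bias) + b, which
    equals W_eff @ norm(x) + b_eff.

    W: (d_out, d_in), b: (d_out,), ln_weight/ln_bias: (d_in,)
    """
    W_eff = W * ln_weight    # scales each input column by its LN gain
    b_eff = b + W @ ln_bias  # the LN bias becomes a constant term
    return W_eff, b_eff
```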

FUTURE WORK

  • See if paths functionally maintain signals by injecting noise at an early point and measuring downstream effects
  • Look at the maximum value of the reverse edges and see where it pops up
  • Measure network performance before & after knocking out low-composition reads and high-composition reads (ablation sketch after this list)
  • More meta: do a deeper dive into orthogonal vectors, bases, and subspaces. Investigate what mental tools might be useful (almost-orthogonal vectors, etc.).
  • Reserve some portion of the residual stream for embeddings, and then measure how "inflated" the remaining subspaces of the attention head are
  • Major singular values and where they point
  • More work on baselines
  • Rank of input and output weights (by head and layer)
  • Think about the baseline more
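
For the knockout item above, a sketch of one way to run it with a plain PyTorch forward hook; the HuggingFace-style `model(tokens, labels=tokens).loss` call and the head slicing are assumptions about the model's interface:

```python
import torch

def knockout_loss(model, tokens, attn_out_module, head_slice):
    """LM loss with one attention head's output zero-ablated.

    attn_out_module: the module whose output contains the head's channels.
    head_slice: slice selecting that head's channels in the output.
    """
    def zero_head(module, inputs, output):
        output = output.clone()
        output[..., head_slice] = 0.0
        return output  # a returned value replaces the module's output

    handle = attn_out_module.register_forward_hook(zero_head)
    try:
        loss = model(tokens, labels=tokens).loss
    finally:
        handle.remove()
    return loss.item()
```

Comparing this against the unhooked loss, separately for low- and high-composition reads, gives the before/after measurement.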

Figure: The baseline

  • Computing reverse edges cartoon
  • "95% confidence" and "non-random" thresholds
  • Heatmap of the number of sent edges (by attention head)
  • Heatmap of the number of received edges (by attention head)

Figure: Individual singular values

  • Do large-bandwidth terms mean one large singular value, or several?
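
A sketch of the underlying check, reusing the composition-term convention from the earlier sketch:

```python
import torch

def singular_spectrum(W_in: torch.Tensor, W_out: torch.Tensor, k: int = 10):
    """Top-k singular values of a composition term, as fractions of the
    full spectrum: one dominant fraction suggests a single large
    direction; several comparable fractions suggest a shared subspace."""
    S = torch.linalg.svdvals(W_in @ W_out)  # sorted in descending order
    return S[:k] / S.sum()
```
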
nicholasturner1 commented

Updated Figure 5 to work with IPC (input path complexity).

nicholasturner1 pinned this issue May 14, 2022
nicholasturner1 commented

Shuffled a bunch of things around after the Eleuther talk, including designating material for follow-up papers.


nicholasturner1 commented Jul 9, 2022

Action items from meeting with Neel Nanda

IMPORTANT

  • Check whether we can find induction heads using our values
  • Functionally analyze the heads that receive many of the top composition weights (particularly V composition)

NICE TO HAVE

  • Measuring composition with the embedding & unembedding weights
  • Scatterplot of old denominator vs. new denominator
  • Think about fusing LN weights with input/output matrices to speed things up (see the folding sketch above)
  • Include MLPs in path analysis

FUTURE WORK

  • See if paths functionally maintain signals by injecting noise at an early point and measuring downstream effects
  • Look at the maximum value of the reverse edges and see where it pops up
  • Measure network performance before & after knocking out low-composition reads and high-composition reads
  • More meta: do a deeper dive into orthogonal vectors, bases, and subspaces. Investigate what tools might be useful.
  • Reserve some portion of the residual stream for embeddings, and then measure how "inflated" the remaining subspaces of the attention head are
