Wrap for file bisect instead of relinking #313

Open
mikebentley15 opened this issue Dec 24, 2019 · 0 comments
Labels
c++, documentation, enhancement, make, python, tests

Feature Request

Describe the new feature:
This feature request is very similar to Issue #291 (Wrap for symbol bisect instead of weak linking).

Our current approach to file bisect is to build every object file under both compilations. Each search step then relinks a new executable, runs the test, and generates a comparison value.

For very large projects, linking is not a trivial step. For example, in a 1-million-line project, linking is single-threaded and may actually be the most costly part of the build. Relinking at every search step therefore causes significant overhead, especially if the link time is much larger than the test runtime.

Suggested change:
I believe it is worth investigating performing only one link and switching between the functions from both versions of each file at runtime.

What I'm proposing is to take the two versions of each object file and rename the public functions in each with a prefix unique to that version. Then synthesize a single C file that keeps the original function names and branches between the two renamed symbols based on some runtime value. There are a few options for specifying the runtime behavior:

  • command-line option for each file and which version to use
  • command-line option to pass in a file listing which version of each object file to use
  • environment variable(s) specifying each file and which version to use
  • environment variable passing in a file listing which version of each object file to use
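
As a rough illustration of the environment-variable option, the sketch below checks whether a given source file should use its second (variant) version. Everything named here is an assumption made for the example: the variable name `BISECT_VARIANT_FILES`, its comma-separated format, and both function names are hypothetical and not part of any existing tool. Files not listed are assumed to use the first (baseline) version.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return true if `file` appears as a whole entry in the comma-separated
 * list `spec`. A NULL or empty spec selects nothing. */
bool spec_selects_variant(const char *spec, const char *file) {
  if (spec == NULL || file == NULL) return false;
  size_t flen = strlen(file);
  if (flen == 0) return false;

  const char *p = spec;
  while (*p != '\0') {
    const char *end = strchr(p, ',');                  /* end of this entry */
    size_t len = end ? (size_t)(end - p) : strlen(p);
    if (len == flen && strncmp(p, file, len) == 0) return true;
    if (end == NULL) break;                            /* last entry checked */
    p = end + 1;
  }
  return false;
}

/* Hypothetical convenience: read the list from the environment.
 * BISECT_VARIANT_FILES is an assumed name, e.g. "a.cpp,b.cpp". */
bool file_uses_variant(const char *file) {
  return spec_selects_variant(getenv("BISECT_VARIANT_FILES"), file);
}
```

With this format, a search step would only change the environment variable between test runs, for example setting it to `a.cpp,b.cpp` for one run and `a.cpp` for the next, without relinking.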

I think it would be best to select which symbols to use in a single step at application startup, by initializing function pointers within the synthesized C file.
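
A minimal sketch of what such a synthesized file might look like for one public function follows. The prefixes `baseline_` and `variant_`, the function `area`, and the setup function `bisect_init` are all hypothetical names chosen for illustration. The two prefixed functions are defined inline here only so the example is self-contained; in the proposal they would come from the two renamed object files.

```c
#include <stdbool.h>

/* Stand-ins for the two renamed copies of one public function.
 * The variant deliberately returns a different value so that the
 * switch is observable. */
double baseline_area(double w, double h) { return w * h; }
double variant_area(double w, double h)  { return w * h + 0.5; }

/* One pointer per public symbol, defaulting to the baseline version. */
static double (*area_impl)(double, double) = baseline_area;

/* Hypothetical one-time setup, meant to run before any wrapped
 * function is called. */
void bisect_init(bool file_uses_variant) {
  area_impl = file_uses_variant ? variant_area : baseline_area;
}

/* The wrapper keeps the original public name, so callers are unchanged. */
double area(double w, double h) { return area_impl(w, h); }
```

Because the choice is resolved once, each call pays only for an indirect call rather than re-reading configuration. The setup function could be called at the top of `main`, or possibly from a function marked `__attribute__((constructor))` on GCC and Clang; which of these fits best is an open design question.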

Because of the concerns below, and because this approach may not work at all, I suggest first trying it manually to verify that it is viable.

Possible Downside
Suppose an application has so many symbols to resolve that linking is nontrivial. Then the one executable we want to link may itself become a significant problem. If we originally have 10,000 symbols, this approach would triple that to 30,000 (two renamed copies plus the wrapper for each). I am not sure how link time is affected by the number of symbols. Is it linear? Quadratic? Cubic? If it is anything more than linear, then it may not be worthwhile to use this approach. Even though we would perform far fewer links and generate only a single executable, that one link might take too long and perhaps cause the linker to run out of system memory.

That leaves two open questions about this approach:

  • How does link-time scale with the number of unresolved symbols?
  • How does system memory usage scale at link time with the number of unresolved symbols?

Need for -fPIC?
Could this cause problems if we do not compile with the -fPIC flag? I do not think so, since all functions from a chosen file will be used together, so we should not require interposition. I'm not 100% sure that this is the case, but I'd say I'm 95% sure.
