Wrap for file bisect instead of relinking #313
Labels: c++, documentation, enhancement, make, python, tests
Feature Request
Describe the new feature:
This feature request is very similar to Issue #291 (Wrap for symbol bisect instead of weak linking).
Our current approach to file bisect is to build every object file under both compilations. Each search step then relinks a new executable, runs the test, and generates a comparison value.
For very large projects, linking is not a trivial step. For example, in a project with 1 million lines, linking is single-threaded and may actually be the most costly part of the build. Relinking at every search step therefore adds significant overhead, especially when the link time is much larger than the test runtime.
Suggested change:
I believe it is worth investigating doing only one link phase and switching between the functions in both files at runtime.
What I'm proposing is to take the two versions of an object file and rename the public functions in each with a prefix unique to that version. Then synthesize a single C file that keeps the original symbol names and branches between the two renamed symbols based on some runtime value. There are a few options for specifying the runtime behavior:
I think it would be best to designate which symbol is to be used as a single large step at the beginning of the application by utilizing and initializing function pointers within the synthesized C file.
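As a rough illustration of the function-pointer idea, here is a minimal sketch of what one synthesized C file might look like. Every name in it is hypothetical: `compute` stands for a public function of the original file, `base_compute` and `test_compute` stand for the renamed copies from the two compilations (renaming could be done with something like `objcopy --redefine-sym`, though that is an assumption and not something tried here), and `BISECT_USE_TEST_FOO` is a made-up environment variable. The two renamed functions are given stand-in bodies only so the sketch compiles on its own; in the real setup they would be `extern` declarations resolved by the two object files.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the two renamed symbols (hypothetical names). In the real
 * setup these would be extern declarations, with the definitions coming
 * from the baseline object file and the object file under test. */
double base_compute(double x) { return x * 2.0; }
double test_compute(double x) { return x * 2.0 + 1.0; }

/* One function pointer per public symbol of the wrapped file.
 * It defaults to the baseline version. */
static double (*compute_impl)(double) = base_compute;

/* Switch every symbol of this file together, so that all functions of the
 * chosen object file version are used as one unit. */
void bisect_select(int use_test) {
  compute_impl = use_test ? test_compute : base_compute;
}

/* One possible way to pick the version as a single step at startup:
 * read a (hypothetical) environment variable once. */
void bisect_init_from_env(void) {
  const char *value = getenv("BISECT_USE_TEST_FOO");
  bisect_select(value != NULL && strcmp(value, "1") == 0);
}

/* The original public symbol. Callers link against this name unchanged,
 * and the call is forwarded to whichever version was selected. */
double compute(double x) {
  return compute_impl(x);
}
```

With this shape, each search step would only change the runtime selection (here, the environment variable) and rerun the same executable, so the link happens once. Each call pays for one extra indirect call, which is the trade-off of choosing function pointers over a branch at every call site.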
Because of the concerns below, and because this approach may not work at all, I suggest first trying it manually to verify that it is valid.
Possible Downside
Suppose we have an application with so many symbols to resolve that the linking step is nontrivial. Then we may have a significant problem with the one executable we want to link. If we have 10,000 symbols originally, this approach would triple that to 30,000 symbols. I am not sure how link time scales with the number of symbols. Is it linear? Quadratic? Cubic? If it is anything more than linear, then this approach may not be worthwhile. Even though we would perform far fewer links and generate only a single executable, that one link might take too long and perhaps cause the linker to run out of system memory.
Link time and linker memory use are the two concerns with this approach.
Need for fPIC?
It may be possible that this could cause problems if we do not use the -fPIC flag? I do not think so, since all functions from a chosen file will be used together. We should not require interposition in this case. I'm not 100% sure that this is the case, but I'd say I'm 95% sure.