Ever wondered what's making your ELF or Mach-O binary big? Bloaty McBloatface will show you a size profile of the binary so you can understand what's taking up space inside.
Bloaty works on binaries, shared objects, object files, and static libraries (.a files). It supports ELF/DWARF and Mach-O, though the Mach-O support is much more preliminary (it shells out to otool/symbols instead of parsing the file directly).
This is not an official Google product.
Bloaty uses CMake to build. All dependencies are included as Git submodules. To build, simply run:
$ cmake .
$ make -j6
To run tests, type:
$ make test
All the normal CMake features are available, like out-of-source builds:
$ mkdir build
$ cd build
$ cmake ..
$ make -j6
Run it directly on a binary target. For example, run it on itself.
$ ./bloaty bloaty
On Linux you'll see output something like:
VM SIZE FILE SIZE
-------------- --------------
0.0% 0 .debug_info 13.0Mi 37.6%
0.0% 0 .debug_loc 7.45Mi 21.5%
0.0% 0 .debug_str 5.14Mi 14.8%
40.1% 2.17Mi .text 2.17Mi 6.3%
0.0% 0 .debug_ranges 1.83Mi 5.3%
30.6% 1.66Mi .rodata 1.66Mi 4.8%
0.0% 0 .debug_line 878Ki 2.5%
0.0% 0 .strtab 458Ki 1.3%
7.1% 394Ki .rela.dyn 394Ki 1.1%
6.4% 357Ki .dynstr 357Ki 1.0%
5.5% 307Ki .data.rel.ro 307Ki 0.9%
0.0% 0 .debug_abbrev 283Ki 0.8%
4.2% 235Ki .eh_frame 235Ki 0.7%
0.0% 0 .symtab 187Ki 0.5%
2.2% 123Ki .dynsym 123Ki 0.3%
1.0% 54.1Ki .data 54.1Ki 0.2%
0.8% 44.6Ki .gcc_except_table 44.6Ki 0.1%
0.7% 39.6Ki .gnu.hash 39.6Ki 0.1%
0.7% 36.5Ki .eh_frame_hdr 36.5Ki 0.1%
0.5% 30.0Ki [24 Others] 29.7Ki 0.1%
0.0% 0 .debug_aranges 27.3Ki 0.1%
100.0% 5.42Mi TOTAL 34.7Mi 100.0%
The "VM SIZE" column tells you how much space the binary will take when it is loaded into memory. The "FILE SIZE" column tells you about how much space the binary is taking on disk. These two can be very different from each other:
- Some data lives in the file but isn't loaded into memory, like debug information.
- Some data is mapped into memory but doesn't exist in the file. This mainly applies to the .bss section (zero-initialized data).
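The two cases above can be illustrated with a small model. This is a simplified sketch, not Bloaty's implementation: the `Section` class, its `alloc` and `nobits` fields, and the byte counts are all assumptions made up for illustration, loosely modeled on the ELF `SHF_ALLOC` flag and `SHT_NOBITS` section type.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical, simplified view of one section (not a Bloaty type)."""
    name: str
    size: int     # size recorded in the section header, in bytes
    alloc: bool   # True if the section is mapped at run time
    nobits: bool  # True if the section occupies no space in the file


def vm_size(section: Section) -> int:
    # Only sections that are mapped contribute to the in-memory size.
    return section.size if section.alloc else 0


def file_size(section: Section) -> int:
    # Sections without file contents (such as .bss) take no space on disk.
    return 0 if section.nobits else section.size


# Made-up sizes, chosen only to show the three interesting combinations.
sections = [
    Section(".text", 2048, alloc=True, nobits=False),
    Section(".bss", 512, alloc=True, nobits=True),
    Section(".debug_info", 4096, alloc=False, nobits=False),
]

for s in sections:
    print(f"{s.name:12} vm={vm_size(s):5} file={file_size(s):5}")
```

In this model .debug_info counts only toward file size and .bss only toward VM size, which is the same asymmetry visible in the report above.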
The default breakdown in Bloaty is by sections, but many other ways of slicing the binary are supported such as symbols and segments. If you compiled with debug info, you can even break down by compile units and inlines!
$ ./bloaty bloaty -d compileunits
VM SIZE FILE SIZE
-------------- --------------
62.3% 3.04Mi [None] 31.1Mi 94.4%
11.2% 557Ki [91 Others] 556Ki 1.7%
3.7% 182Ki third_party/protobuf/src/google/protobuf/descriptor.cc 179Ki 0.5%
3.2% 162Ki third_party/protobuf/src/google/protobuf/descriptor.pb.cc 161Ki 0.5%
2.4% 117Ki third_party/capstone/arch/AArch64/AArch64InstPrinter.c 117Ki 0.3%
2.1% 103Ki third_party/capstone/arch/ARM/ARMDisassembler.c 103Ki 0.3%
1.9% 96.5Ki third_party/capstone/arch/Sparc/SparcInstPrinter.c 96.5Ki 0.3%
1.6% 82.1Ki third_party/demumble/third_party/libcxxabi/cxa_demangle.cpp 82.1Ki 0.2%
1.5% 74.7Ki third_party/capstone/arch/PowerPC/PPCInstPrinter.c 74.7Ki 0.2%
1.2% 61.8Ki third_party/protobuf/src/google/protobuf/generated_message_reflection.cc 61.8Ki 0.2%
1.2% 59.8Ki src/bloaty.cc 59.7Ki 0.2%
1.1% 55.1Ki third_party/protobuf/src/google/protobuf/text_format.cc 55.1Ki 0.2%
0.9% 43.3Ki third_party/capstone/arch/ARM/ARMInstPrinter.c 43.3Ki 0.1%
0.8% 41.9Ki third_party/re2/re2/parse.cc 41.9Ki 0.1%
0.8% 39.1Ki third_party/protobuf/src/google/protobuf/map_field.cc 39.1Ki 0.1%
0.7% 36.1Ki third_party/protobuf/src/google/protobuf/wire_format.cc 36.1Ki 0.1%
0.7% 36.0Ki src/dwarf.cc 36.0Ki 0.1%
0.7% 35.3Ki third_party/re2/re2/re2.cc 35.3Ki 0.1%
0.7% 33.8Ki third_party/protobuf/src/google/protobuf/extension_set.cc 33.8Ki 0.1%
0.6% 30.8Ki third_party/capstone/arch/AArch64/AArch64Disassembler.c 30.8Ki 0.1%
0.6% 29.4Ki third_party/re2/re2/dfa.cc 29.4Ki 0.1%
100.0% 4.87Mi TOTAL 32.9Mi 100.0%
Run Bloaty with --help to see a list of available options:
$ ./bloaty --help
Bloaty McBloatface: a size profiler for binaries.
USAGE: bloaty [options] file... [-- base_file...]
Options:
--csv Output in CSV format instead of human-readable.
-c <file> Load configuration from <file>.
-d <sources> Comma-separated list of sources to scan.
-C <mode> How to demangle symbols. Possible values are:
--demangle=<mode> --demangle=none no demangling, print raw symbols
--demangle=short demangle, but omit arg/return types
--demangle=full print full demangled type
The default is --demangle=short.
--disassemble=<function>
Disassemble this function (EXPERIMENTAL)
-n <num> How many rows to show per level before collapsing
other keys into '[Other]'. Set to '0' for unlimited.
Defaults to 20.
-s <sortby> Whether to sort by VM or File size. Possible values
are:
-s vm
-s file
-s both (the default: sorts by max(vm, file)).
-w Wide output; don't truncate long labels.
--help Display this message and exit.
--list-sources Show a list of available sources and exit.
Options for debugging Bloaty:
--debug-vmaddr=ADDR
--debug-fileoff=OFF
Print extended debugging information for the given
VM address and/or file offset.
-v Verbose output. Dumps warnings encountered during
processing and full VM/file maps at the end.
Add more v's (-vv, -vvv) for even more.
You can use Bloaty to see how the size of a binary changed. On the command-line, pass -- followed by the files you want to use as the diff base.
For example, here is a size diff between a couple different versions of Bloaty, showing how it grew when I added some features.
$ ./bloaty bloaty -- oldbloaty
VM SIZE FILE SIZE
-------------- --------------
[ = ] 0 .debug_loc +688Ki +9.9%
+19% +349Ki .text +349Ki +19%
[ = ] 0 .debug_ranges +180Ki +11%
[ = ] 0 .debug_info +120Ki +0.9%
+23% +73.5Ki .rela.dyn +73.5Ki +23%
+3.5% +57.1Ki .rodata +57.1Ki +3.5%
+28e3% +53.9Ki .data +53.9Ki +28e3%
[ = ] 0 .debug_line +40.2Ki +4.8%
+2.3% +5.35Ki .eh_frame +5.35Ki +2.3%
-6.0% -5 [Unmapped] +2.65Ki +215%
+0.5% +1.70Ki .dynstr +1.70Ki +0.5%
[ = ] 0 .symtab +1.59Ki +0.9%
[ = ] 0 .debug_abbrev +1.29Ki +0.5%
[ = ] 0 .strtab +1.26Ki +0.3%
+16% +992 .bss 0 [ = ]
+0.2% +642 [13 Others] +849 +0.2%
+0.6% +792 .dynsym +792 +0.6%
+16% +696 .rela.plt +696 +16%
+16% +464 .plt +464 +16%
+0.8% +312 .eh_frame_hdr +312 +0.8%
[ = ] 0 .debug_str -19.6Ki -0.4%
+11% +544Ki TOTAL +1.52Mi +4.6%
Each line shows how much each part changed compared to its previous size. Most sections grew, but one section at the bottom (.debug_str) shrank. The "TOTAL" line shows how much the size changed overall.
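The arithmetic behind a diff row can be sketched as follows. This is an illustrative approximation, not Bloaty's code: the function name `size_diff` and the section sizes are hypothetical, and Bloaty's own rounding and formatting rules are not reproduced.

```python
def size_diff(new, old):
    """Return {name: (delta_bytes, percent_change)} for every name in either input.

    percent_change is None when there is no previous size to compare against.
    """
    result = {}
    for name in sorted(set(new) | set(old)):
        before = old.get(name, 0)
        after = new.get(name, 0)
        delta = after - before
        percent = (100.0 * delta / before) if before else None
        result[name] = (delta, percent)
    return result


# Made-up sizes in bytes for an old and a new build.
old = {".text": 1000, ".debug_str": 500, ".bss": 64}
new = {".text": 1190, ".debug_str": 480, ".bss": 64, ".plt": 16}

for name, (delta, percent) in size_diff(new, old).items():
    shown = "n/a" if percent is None else f"{percent:+.1f}%"
    print(f"{name:12} {delta:+6d} {shown}")
```

The percentage is relative to the old size, so a part that did not exist in the base file has no meaningful percentage in this sketch.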
Bloaty supports breaking the binary down in lots of different ways. You can combine multiple data sources into a single hierarchical profile. For example, we can use the segments and sections data sources in a single report:
$ bloaty -d segments,sections bloaty
VM SIZE FILE SIZE
-------------- --------------
0.0% 0 [Unmapped] 7.31Mi 94.2%
-NAN% 0 .debug_info 2.97Mi 40.6%
-NAN% 0 .debug_loc 2.30Mi 31.5%
-NAN% 0 .debug_str 1.03Mi 14.2%
-NAN% 0 .debug_ranges 611Ki 8.2%
-NAN% 0 .debug_line 218Ki 2.9%
-NAN% 0 .debug_abbrev 85.4Ki 1.1%
-NAN% 0 .strtab 62.8Ki 0.8%
-NAN% 0 .symtab 27.8Ki 0.4%
-NAN% 0 .debug_aranges 13.5Ki 0.2%
-NAN% 0 [Unmapped] 2.82Ki 0.0%
-NAN% 0 .shstrtab 371 0.0%
-NAN% 0 .comment 43 0.0%
99.2% 452Ki LOAD [RX] 452Ki 5.7%
73.4% 332Ki .text 332Ki 73.4%
13.3% 60.0Ki .rodata 60.0Ki 13.3%
7.0% 31.8Ki .eh_frame 31.8Ki 7.0%
2.3% 10.5Ki .gcc_except_table 10.5Ki 2.3%
0.9% 4.18Ki .eh_frame_hdr 4.18Ki 0.9%
0.8% 3.54Ki .dynsym 3.54Ki 0.8%
0.8% 3.52Ki .dynstr 3.52Ki 0.8%
0.7% 2.98Ki .rela.plt 2.98Ki 0.7%
0.4% 2.00Ki .plt 2.00Ki 0.4%
0.1% 568 [ELF Headers] 568 0.1%
0.1% 408 .rela.dyn 408 0.1%
0.1% 304 .gnu.version_r 304 0.1%
0.1% 302 .gnu.version 302 0.1%
0.0% 216 .gnu.hash 216 0.0%
0.0% 36 .note.gnu.build-id 36 0.0%
0.0% 32 .note.ABI-tag 32 0.0%
0.0% 28 .interp 28 0.0%
0.0% 26 .init 26 0.0%
0.0% 18 [Unmapped] 18 0.0%
0.0% 9 .fini 9 0.0%
0.8% 3.46Ki LOAD [RW] 1.88Ki 0.0%
45.6% 1.58Ki .bss 0 0.0%
29.3% 1.02Ki .got.plt 1.02Ki 54.1%
14.9% 528 .dynamic 528 27.4%
7.1% 252 .data 252 13.1%
1.4% 48 .init_array 48 2.5%
0.7% 24 .got 24 1.2%
0.5% 16 [Unmapped] 16 0.8%
0.2% 8 .fini_array 8 0.4%
0.2% 8 .jcr 8 0.4%
0.1% 4 [None] 0 0.0%
0.0% 0 [ELF Headers] 2.38Ki 0.0%
100.0% 456Ki TOTAL 7.75Mi 100.0%
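One way to think about a hierarchical profile is as a two-level rollup in which each child's percentage is relative to its parent, as in the report above. The sketch below assumes that reading; the `rollup` and `percent` helpers and the sizes are hypothetical and only track a single size column.

```python
from collections import defaultdict


def rollup(records):
    """Group (segment, section, size) records into a two-level tree."""
    tree = defaultdict(lambda: {"total": 0, "children": defaultdict(int)})
    for segment, section, size in records:
        node = tree[segment]
        node["total"] += size
        node["children"][section] += size
    return tree


def percent(part, whole):
    # Return 0.0 for an empty parent to avoid dividing by zero.
    return 100.0 * part / whole if whole else 0.0


# Made-up sizes in bytes.
records = [
    ("LOAD [RX]", ".text", 3000),
    ("LOAD [RX]", ".rodata", 1000),
    ("LOAD [RW]", ".data", 300),
    ("LOAD [RW]", ".got", 100),
]

tree = rollup(records)
grand_total = sum(node["total"] for node in tree.values())

for segment, node in tree.items():
    # Top-level rows are percentages of the whole binary.
    print(f"{percent(node['total'], grand_total):5.1f}% {node['total']:6} {segment}")
    for section, size in node["children"].items():
        # Nested rows are percentages of their parent segment.
        print(f"  {percent(size, node['total']):5.1f}% {size:6} {section}")
```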
Bloaty displays a maximum of 20 lines for each level; other values are grouped into an [Other] bin. Use -n <num> to override this setting. If you pass -n 0, all data will be output without collapsing anything into [Other].
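The collapsing behaviour can be sketched as below. The function `collapse_rows` is hypothetical and simplified: it ranks by a single size and appends the collapsed bucket at the end, whereas Bloaty's reports sort that row in with the others.

```python
def collapse_rows(sizes, max_rows=20):
    """Keep the max_rows largest entries and fold the rest into one bucket.

    max_rows=0 means unlimited, mirroring the meaning of -n 0.
    """
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    if max_rows == 0 or len(ranked) <= max_rows:
        return ranked
    kept, rest = ranked[:max_rows], ranked[max_rows:]
    other_total = sum(size for _, size in rest)
    return kept + [(f"[{len(rest)} Others]", other_total)]


# Made-up sizes in bytes.
sizes = {".text": 50, ".rodata": 30, ".data": 10, ".got": 5}

for name, size in collapse_rows(sizes, max_rows=2):
    print(f"{size:4} {name}")
```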
Bloaty supports reading debuginfo/symbols from separate binaries. This lets you profile a stripped binary, even for data sources like "compileunits" or "symbols" that require this extra information.
Bloaty uses build IDs to verify that the binary and the debug file match. Otherwise the results would be nonsense (this kind of mismatch might sound unlikely but it's a very easy mistake to make, and one that I made several times even as Bloaty's author!).
Make sure you are compiling with build IDs enabled. For gcc this happens automatically, but Clang decided not to make this the default, since it makes the link slower.
For Clang add -Wl,--build-id to your link line. (If you want a slightly faster link and don't care about reproducibility, you can use -Wl,--build-id=uuid instead).
Then you can strip the binary and use the unstripped binary as your debug file. For example, with bloaty itself:
$ cp bloaty bloaty.stripped
$ strip bloaty.stripped
$ ./bloaty -d compileunits --debug-file=bloaty bloaty.stripped
It is also possible to remove debug sections only (see objcopy --strip-debug) while keeping the symbol table.
You can also create a debug file that contains only debug info (see objcopy --only-keep-debug).
Bloaty does not currently support the GNU debuglink or looking up debug files by build ID, which are the methods GDB uses to find debug files.
If there are use cases where Bloaty's --debug-file option won't work, we can reconsider implementing these.
Any options that you can specify on the command-line, you can put into a configuration file instead. Then you can use -c FILE to load those options from the config file. Also, a few features are only available with configuration files and cannot be specified on the command-line.
The configuration file is in Protocol Buffers text format. The schema is the Options message in src/bloaty.proto.
The two most useful cases for configuration files are:
- You have too many input files to put on the command-line. At Google we sometimes run Bloaty over thousands of input files. This can cause the overall command-line to exceed OS limits. With a config file, we can avoid this:
filename: "path/to/long_filename_a.o"
filename: "path/to/long_filename_b.o"
filename: "path/to/long_filename_c.o"
# ...repeat for thousands of files.
- For custom data sources, it can be very useful to put them in a config file, for greater reusability. For example, see the custom data sources defined in custom_sources.bloaty. Also read more about custom data sources below.
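For the first case, a config file with one filename entry per input can be generated rather than written by hand. The script below is a minimal sketch: the helper names are hypothetical, and the escaping covers only backslashes and double quotes, which is an assumption that may not be sufficient for unusual paths.

```python
def text_format_string(value):
    """Quote a string for a text-format config, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_config(paths):
    """Return config text with one 'filename' entry per input path."""
    return "".join(f"filename: {text_format_string(p)}\n" for p in paths)


# Illustrative paths; in practice these might come from a build system listing.
paths = [
    "path/to/long_filename_a.o",
    "path/to/long_filename_b.o",
]
config_text = build_config(paths)
print(config_text, end="")
```

The resulting text can be written to a file and passed to Bloaty with -c.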
Bloaty has many data sources built in. These all provide different ways of looking at the binary. You can also create your own data sources by applying regexes to the built-in data sources (see "Custom Data Sources" below).
While Bloaty works on binaries, shared objects, object files, and static libraries (.a files), some of the data sources don't work on object files. This applies especially to data sources that read debug info.
Segments are what the run-time loader uses to determine what parts of the binary need to be loaded/mapped into memory. There are usually just a few segments: one for each set of mmap() permissions required:
$ bloaty -d segments bloaty
VM SIZE FILE SIZE
-------------- --------------
0.0% 0 [Unmapped] 7.31Mi 94.2%
99.2% 452Ki LOAD [RX] 452Ki 5.7%
0.8% 3.46Ki LOAD [RW] 1.88Ki 0.0%
0.0% 0 [ELF Headers] 2.38Ki 0.0%
100.0% 456Ki TOTAL 7.75Mi 100.0%
Here we see one segment mapped [RX] (read/execute) and one segment mapped [RW] (read/write). A large part of the binary is not loaded into memory, which we see as [Unmapped].
Object files and static libraries don't have segments. However, we fake it by grouping sections by their flags, which gives a breakdown roughly like real segments.
$ ./bloaty bloaty -d segments src/bloaty.o
VM SIZE FILE SIZE
-------------- --------------
0.0% 0 [Unmapped] 7.31Mi 67.6%
0.0% 0 Section [] 2.95Mi 27.3%
85.2% 452Ki LOAD [RX] 452Ki 4.1%
11.3% 59.8Ki Section [AX] 59.8Ki 0.5%
0.0% 0 [ELF Headers] 28.3Ki 0.3%
2.9% 15.4Ki Section [A] 15.4Ki 0.1%
0.7% 3.46Ki LOAD [RW] 1.88Ki 0.0%
0.0% 41 Section [AW] 20 0.0%
100.0% 531Ki TOTAL 10.8Mi 100.0%
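Grouping sections by their flags can be sketched like this. The flag bit values are the standard ELF section flags; everything else (the function names, the letter order in the label, and the sizes) is an assumption chosen to resemble the "Section [AX]" labels in the report above, not Bloaty's actual code.

```python
from collections import defaultdict

# Standard ELF section header flag bits.
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4


def flag_label(flags):
    """Build a pseudo-segment label from a section's flags."""
    letters = ""
    if flags & SHF_ALLOC:
        letters += "A"
    if flags & SHF_WRITE:
        letters += "W"
    if flags & SHF_EXECINSTR:
        letters += "X"
    return f"Section [{letters}]"


def group_by_flags(sections):
    """Sum the sizes of (name, flags, size) sections under each flag label."""
    totals = defaultdict(int)
    for _name, flags, size in sections:
        totals[flag_label(flags)] += size
    return dict(totals)


# Made-up sections from a hypothetical object file.
sections = [
    (".text", SHF_ALLOC | SHF_EXECINSTR, 600),
    (".text.startup", SHF_ALLOC | SHF_EXECINSTR, 40),
    (".rodata", SHF_ALLOC, 150),
    (".data", SHF_ALLOC | SHF_WRITE, 20),
    (".debug_info", 0, 3000),
]
print(group_by_flags(sections))
```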
Sections give us a bit more granular look into the binary. If we want to find the symbol table, the unwind information, or the debug information, each kind of information lives in its own section. Bloaty's default output is sections.
$ bloaty -d sections bloaty
VM SIZE FILE SIZE
-------------- --------------
0.0% 0 .debug_info 2.97Mi 38.3%
0.0% 0 .debug_loc 2.30Mi 29.7%
0.0% 0 .debug_str 1.03Mi 13.3%
0.0% 0 .debug_ranges 611Ki 7.7%
72.8% 332Ki .text 332Ki 4.2%
0.0% 0 .debug_line 218Ki 2.8%
0.0% 0 .debug_abbrev 85.4Ki 1.1%
0.0% 0 .strtab 62.8Ki 0.8%
13.2% 60.0Ki .rodata 60.0Ki 0.8%
7.0% 31.8Ki .eh_frame 31.8Ki 0.4%
0.0% 0 .symtab 27.8Ki 0.3%
0.0% 0 .debug_aranges 13.5Ki 0.2%
2.3% 10.5Ki .gcc_except_table 10.5Ki 0.1%
1.5% 6.77Ki [Other] 5.60Ki 0.1%
0.9% 4.18Ki .eh_frame_hdr 4.18Ki 0.1%
0.8% 3.54Ki .dynsym 3.54Ki 0.0%
0.8% 3.52Ki .dynstr 3.52Ki 0.0%
0.7% 2.98Ki .rela.plt 2.98Ki 0.0%
0.1% 568 [ELF Headers] 2.93Ki 0.0%
0.0% 34 [Unmapped] 2.85Ki 0.0%
0.0% 4 [None] 0 0.0%
100.0% 456Ki TOTAL 7.75Mi 100.0%
Symbols come from the symbol table, and represent individual functions or variables.
$ ./bloaty -d symbols bloaty
VM SIZE FILE SIZE
-------------- --------------
17.9% 81.9Ki [Unmapped] 7.39Mi 95.3%
62.3% 283Ki [Other] 284Ki 3.6%
2.7% 12.3Ki re2::RE2::Match(re2::StringPiece const&, int, int, re2::RE2::Anchor, re2::String 12.3Ki 0.2%
1.7% 7.83Ki re2::unicode_groups 7.83Ki 0.1%
1.7% 7.56Ki re2::NFA::Search 7.56Ki 0.1%
1.3% 5.76Ki re2::BitState::TrySearch 5.76Ki 0.1%
1.2% 5.43Ki bloaty::Bloaty::ScanAndRollupFile 5.43Ki 0.1%
1.0% 4.49Ki re2::DFA::DFA 4.49Ki 0.1%
1.0% 4.35Ki bool bloaty::(anonymous namespace)::ForEachElf<bloaty::(anonymous namespace)::Do 4.35Ki 0.1%
1.0% 4.34Ki re2::Regexp::Parse 4.34Ki 0.1%
0.9% 4.20Ki re2::RE2::Init 4.20Ki 0.1%
0.9% 4.09Ki re2::Prog::IsOnePass 4.09Ki 0.1%
0.9% 4.04Ki re2::Compiler::PostVisit 4.04Ki 0.1%
0.9% 4.04Ki bloaty::ReadDWARFInlines 4.04Ki 0.1%
0.9% 3.91Ki re2::Regexp::FactorAlternationRecursive 3.91Ki 0.0%
0.8% 3.77Ki re2::DFA::RunStateOnByte 3.77Ki 0.0%
0.8% 3.68Ki re2::unicode_casefold 3.68Ki 0.0%
0.8% 3.52Ki bloaty::ElfFileHandler::ProcessFile 3.52Ki 0.0%
0.7% 3.40Ki re2::DFA::InlinedSearchLoop(re2::DFA::SearchParams*, bool, bool, bool) [clone .c 3.40Ki 0.0%
0.7% 3.38Ki re2::DFA::InlinedSearchLoop(re2::DFA::SearchParams*, bool, bool, bool) [clone .c 3.38Ki 0.0%
0.0% 165 [None] 0 0.0%
100.0% 456Ki TOTAL 7.75Mi 100.0%
You can control how symbols are demangled with the -C MODE or --demangle=MODE flag. You can also specify the demangling mode explicitly in the -d switch. We have three different demangling modes:
- -C none or -d rawsymbols: no demangling.
- -C short or -d shortsymbols: short demangling: return types, template parameters, and function parameter types are omitted. For example: bloaty::dwarf::FormReader<>::GetFunctionForForm<>(). This is the default.
- -C full or -d fullsymbols: full demangling.
One very handy thing about -C short (the default) is that it groups all template instantiations together, regardless of their parameters. You can use this to determine how much code size you are paying by doing multiple instantiations of templates. Try bloaty -d shortsymbols,fullsymbols.
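To see why short names group template instantiations, consider a rough approximation that blanks out everything inside angle brackets and parentheses. This is not Bloaty's demangler: `short_name` is a hypothetical helper, it does not strip return types, and it mishandles names such as operator< or "(anonymous namespace)". The symbol names and sizes are made up.

```python
from collections import defaultdict


def short_name(demangled):
    """Drop the contents of <...> and (...) while keeping the outer brackets."""
    out = []
    depth = 0
    for ch in demangled:
        if ch in "<(":
            if depth == 0:
                out.append(ch)
            depth += 1
        elif ch in ">)":
            depth -= 1
            if depth == 0:
                out.append(ch)
        elif depth == 0:
            out.append(ch)
    return "".join(out)


# Two instantiations of the same template plus one unrelated function.
full_sizes = {
    "Stack<int>::push(int const&)": 120,
    "Stack<std::pair<int, int>>::push(std::pair<int, int> const&)": 180,
    "main(int, char**)": 64,
}

grouped = defaultdict(int)
for name, size in full_sizes.items():
    grouped[short_name(name)] += size

for name, size in grouped.items():
    print(f"{size:5} {name}")
```

Both Stack instantiations collapse into a single row, so the combined cost of the template becomes visible.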
When you pass multiple files to Bloaty, the inputfiles source will let you break it down by input file:
$ ./bloaty -d inputfiles src/*.o
VM SIZE FILE SIZE
-------------- --------------
51.8% 75.2Ki src/bloaty.o 3.05Mi 48.2%
28.2% 40.9Ki src/dwarf.o 2.04Mi 32.2%
12.1% 17.5Ki src/elf.o 579Ki 8.9%
5.5% 7.99Ki src/macho.o 415Ki 6.4%
2.5% 3.57Ki src/main.o 279Ki 4.3%
100.0% 145Ki TOTAL 6.34Mi 100.0%
When you are running Bloaty on a .a file, the armembers source will let you break it down by .o file inside the archive.
$ ./bloaty -d armembers src/libbloaty.a
VM SIZE FILE SIZE
-------------- --------------
53.1% 75.2Ki bloaty.o 3.05Mi 50.1%
28.9% 40.9Ki dwarf.o 2.04Mi 33.5%
12.4% 17.5Ki elf.o 579Ki 9.3%
5.6% 7.99Ki macho.o 415Ki 6.7%
0.0% 0 [AR Symbol Table] 27.3Ki 0.4%
0.0% 0 [AR Headers] 308 0.0%
100.0% 141Ki TOTAL 6.10Mi 100.0%
You are free to use this data source even for non-.a files, but it won't be very useful since it will always just resolve to the input file (the .a file).
Using debug information, we can tell what compile unit (and corresponding source file) each bit of the binary came from. There are a couple different places in DWARF we can look for this information; currently we mainly use the .debug_aranges section. It's not perfect and sometimes you'll see some of the binary show up as [None] if it's not mentioned in aranges (improving this is a TODO). But it can tell us a lot.
$ ./bloaty -d compileunits bloaty
VM SIZE FILE SIZE
-------------- --------------
27.9% 128Ki [None] 7.43Mi 95.9%
12.9% 59.2Ki src/bloaty.cc 59.0Ki 0.7%
7.3% 33.4Ki re2/re2.cc 32.3Ki 0.4%
6.9% 31.6Ki re2/dfa.cc 31.6Ki 0.4%
6.8% 31.4Ki re2/parse.cc 31.4Ki 0.4%
6.7% 30.9Ki src/dwarf.cc 30.9Ki 0.4%
6.7% 30.6Ki re2/regexp.cc 27.8Ki 0.4%
5.1% 23.7Ki re2/compile.cc 23.7Ki 0.3%
4.3% 19.7Ki re2/simplify.cc 19.7Ki 0.2%
3.2% 14.8Ki src/elf.cc 14.8Ki 0.2%
3.1% 14.2Ki re2/nfa.cc 14.2Ki 0.2%
1.8% 8.34Ki re2/bitstate.cc 8.34Ki 0.1%
1.7% 7.84Ki re2/prog.cc 7.84Ki 0.1%
1.6% 7.13Ki re2/tostring.cc 7.13Ki 0.1%
1.5% 6.67Ki re2/onepass.cc 6.67Ki 0.1%
1.4% 6.58Ki src/macho.cc 6.58Ki 0.1%
0.7% 3.27Ki src/main.cc 3.27Ki 0.0%
0.2% 797 [Other] 797 0.0%
0.1% 666 util/stringprintf.cc 666 0.0%
0.1% 573 util/strutil.cc 573 0.0%
0.1% 476 util/rune.cc 476 0.0%
100.0% 460Ki TOTAL 7.75Mi 100.0%
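Conceptually, attributing bytes to compile units from address ranges is an interval lookup: each range maps to one compile unit, and an address covered by no range falls into [None]. The sketch below assumes sorted, non-overlapping ranges; the function names, addresses, and data layout are hypothetical and do not reflect how Bloaty parses .debug_aranges.

```python
import bisect


def build_index(ranges):
    """ranges: list of (start, end_exclusive, compile_unit), assumed non-overlapping."""
    ordered = sorted(ranges)
    starts = [r[0] for r in ordered]
    return starts, ordered


def lookup(index, address):
    """Return the compile unit covering address, or '[None]' if no range does."""
    starts, ordered = index
    # Find the last range whose start is <= address.
    i = bisect.bisect_right(starts, address) - 1
    if i >= 0:
        start, end, unit = ordered[i]
        if start <= address < end:
            return unit
    return "[None]"


# Made-up address ranges with a gap between them.
index = build_index([
    (0x1000, 0x1800, "src/bloaty.cc"),
    (0x2000, 0x2400, "src/dwarf.cc"),
])

for addr in (0x1000, 0x1900, 0x2100):
    print(hex(addr), lookup(index, addr))
```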
The DWARF debugging information also contains "line info" that understands inlining. Within a function, it records which instructions came from a function inlined from a header file. This is the information the debugger uses to point at a specific source line as you step through a program.
$ ./bloaty -d inlines bloaty
VM SIZE FILE SIZE
-------------- --------------
2.4% 110Ki [None] 7.42Mi 95.6%
90.3% 4.01Mi /usr/include/c++/4.8/bitsstl_vector.h:414 15.3Ki 0.2%
5.5% 250Ki [Other] 250Ki 3.2%
0.3% 11.4Ki /usr/include/c++/4.8/bitsbasic_string.h:539 11.4Ki 0.1%
0.2% 8.81Ki /usr/include/c++/4.8ostream:535 8.81Ki 0.1%
0.2% 7.59Ki /usr/include/c++/4.8/bitsbasic_ios.h:456 7.59Ki 0.1%
0.1% 6.20Ki /usr/include/c++/4.8streambuf:466 6.20Ki 0.1%
0.1% 6.06Ki /usr/include/c++/4.8/bitsbasic_string.h:249 6.06Ki 0.1%
0.1% 4.24Ki /usr/include/c++/4.8/bitsbasic_string.h:240 4.24Ki 0.1%
0.1% 3.61Ki /usr/include/c++/4.8/bitsbasic_ios.h:276 3.61Ki 0.0%
0.1% 3.51Ki /usr/include/c++/4.8/extatomicity.h:81 3.51Ki 0.0%
0.1% 3.19Ki /usr/include/c++/4.8/bitsbasic_string.h:583 3.19Ki 0.0%
0.1% 3.06Ki /usr/include/c++/4.8/bitsbasic_string.h:293 3.06Ki 0.0%
0.1% 2.94Ki /usr/include/c++/4.8/extnew_allocator.h:110 2.94Ki 0.0%
0.1% 2.89Ki /usr/include/c++/4.8ostream:385 2.89Ki 0.0%
0.1% 2.87Ki /usr/include/c++/4.8/bitsstl_construct.h:102 2.87Ki 0.0%
0.1% 2.86Ki /usr/include/c++/4.8/extatomicity.h:84 2.86Ki 0.0%
0.1% 2.76Ki /usr/include/c++/4.8/extatomicity.h:49 2.76Ki 0.0%
0.1% 2.70Ki /usr/include/c++/4.8/bitschar_traits.h:271 2.70Ki 0.0%
0.1% 2.62Ki /usr/include/c++/4.8/bitsbasic_string.h:275 2.62Ki 0.0%
0.1% 2.58Ki /usr/include/c++/4.8ostream:93 2.58Ki 0.0%
100.0% 4.45Mi TOTAL 7.75Mi 100.0%
Sometimes you want to munge the labels from an existing data source. For example, when we use "compileunits" on Bloaty itself, we see files from all our dependencies mixed together:
$ ./bloaty -d compileunits bloaty
VM SIZE FILE SIZE
-------------- --------------
65.5% 3.21Mi [130 Others] 12.3Mi 37.0%
4.6% 232Ki third_party/protobuf/src/google/protobuf/descriptor.cc 3.74Mi 11.2%
5.6% 281Ki third_party/protobuf/src/google/protobuf/descriptor.pb.cc 2.34Mi 7.0%
1.8% 90.4Ki src/bloaty.cc 2.15Mi 6.5%
6.7% 335Ki third_party/capstone/arch/ARM/ARMDisassembler.c 1.64Mi 4.9%
1.3% 63.9Ki src/dwarf.cc 1.32Mi 4.0%
1.6% 82.2Ki third_party/demumble/third_party/libcxxabi/cxa_demangle.cpp 1.17Mi 3.5%
1.5% 73.9Ki third_party/protobuf/src/google/protobuf/text_format.cc 997Ki 2.9%
1.7% 83.5Ki third_party/protobuf/src/google/protobuf/generated_message_reflection.cc 938Ki 2.7%
0.6% 31.3Ki third_party/protobuf/src/google/protobuf/descriptor_database.cc 766Ki 2.2%
1.0% 50.9Ki third_party/protobuf/src/google/protobuf/message.cc 746Ki 2.2%
0.7% 36.4Ki third_party/re2/re2/dfa.cc 621Ki 1.8%
0.8% 42.3Ki third_party/re2/re2/re2.cc 618Ki 1.8%
1.0% 48.3Ki third_party/protobuf/src/google/protobuf/extension_set.cc 608Ki 1.8%
0.9% 46.4Ki third_party/protobuf/src/google/protobuf/map_field.cc 545Ki 1.6%
0.7% 36.1Ki third_party/re2/re2/regexp.cc 538Ki 1.6%
1.7% 86.9Ki third_party/capstone/arch/AArch64/AArch64Disassembler.c 517Ki 1.5%
0.8% 41.8Ki third_party/protobuf/src/google/protobuf/wire_format.cc 513Ki 1.5%
0.5% 25.4Ki third_party/protobuf/src/google/protobuf/generated_message_util.cc 511Ki 1.5%
0.1% 4.33Ki src/main.cc 483Ki 1.4%
0.8% 41.3Ki src/bloaty.pb.cc 465Ki 1.4%
100.0% 4.91Mi TOTAL 33.4Mi 100.0%
If we want to bucket all of these by which library they came from, we can write a custom data source. It specifies the base data source and a set of regexes to apply to it. The regexes are tried in order, and the first matching regex will cause the entire label to be rewritten to the replacement text. Regexes follow RE2 syntax and the replacement can refer to capture groups.
custom_data_source: {
name: "bloaty_package"
base_data_source: "compileunits"
rewrite: {
pattern: "^(\\.\\./)?src"
replacement: "src"
}
rewrite: {
pattern: "^(\\.\\./)?(third_party/\\w+)"
replacement: "\\2"
}
}
Then use the data source like so:
$ ./bloaty -c config.bloaty -d bloaty_package bloaty
VM SIZE FILE SIZE
-------------- --------------
21.7% 1.06Mi third_party/protobuf 14.2Mi 42.6%
42.4% 2.08Mi third_party/capstone 6.88Mi 20.6%
5.1% 256Ki src 5.30Mi 15.9%
5.5% 274Ki third_party/re2 3.97Mi 11.9%
1.6% 82.2Ki third_party/demumble 1.17Mi 3.5%
0.8% 38.0Ki third_party/abseil 526Ki 1.5%
7.8% 390Ki [section .rodata] 390Ki 1.1%
6.4% 320Ki [section .rela.dyn] 320Ki 0.9%
4.6% 231Ki [section .eh_frame] 231Ki 0.7%
0.0% 0 [section .debug_str] 82.7Ki 0.2%
0.9% 44.8Ki [section .gcc_except_table] 44.8Ki 0.1%
0.0% 0 [section .strtab] 40.5Ki 0.1%
0.6% 31.5Ki [23 Others] 38.8Ki 0.1%
0.8% 38.2Ki [section .gnu.hash] 38.2Ki 0.1%
0.7% 36.4Ki [section .eh_frame_hdr] 36.4Ki 0.1%
0.0% 0 [section .debug_aranges] 27.6Ki 0.1%
0.5% 26.4Ki [section .dynstr] 26.4Ki 0.1%
0.0% 0 [section .symtab] 24.9Ki 0.1%
0.4% 20.0Ki [section .data.rel.ro] 20.0Ki 0.1%
0.0% 0 [section .debug_loc] 19.6Ki 0.1%
0.3% 15.4Ki [section .dynsym] 15.4Ki 0.0%
100.0% 4.91Mi TOTAL 33.4Mi 100.0%
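The first-match-wins rewriting described above can be approximated with Python's re module. This is only a sketch: Bloaty uses RE2, whose syntax differs from Python's in places, and the sketch assumes an unanchored search and that labels matching no pattern are left unchanged. The two patterns are the ones from the config above, written without the extra layer of text-format escaping.

```python
import re

# (pattern, replacement) pairs, tried in order.
REWRITES = [
    (r"^(\.\./)?src", r"src"),
    (r"^(\.\./)?(third_party/\w+)", r"\2"),
]


def rewrite_label(label, rewrites=REWRITES):
    """Replace the whole label with the expansion of the first matching rule."""
    for pattern, replacement in rewrites:
        match = re.search(pattern, label)
        if match:
            # The entire label becomes the replacement, with capture groups filled in.
            return match.expand(replacement)
    return label  # Assumption: unmatched labels pass through as-is.


for label in (
    "third_party/protobuf/src/google/protobuf/descriptor.cc",
    "../src/bloaty.cc",
    "third_party/abseil-cpp/absl/strings/escaping.cc",
    "[section .rodata]",
):
    print(f"{label} -> {rewrite_label(label)}")
```

Because \w does not match a hyphen, the abseil-cpp paths collapse to third_party/abseil, which is the label that appears in the report.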
We can get an even richer report by combining the bloaty_package source with the original compileunits source:
$ ./bloaty -c config.bloaty -d bloaty_package,compileunits bloaty
VM SIZE FILE SIZE
-------------- --------------
21.7% 1.06Mi third_party/protobuf 14.2Mi 42.6%
21.3% 232Ki third_party/protobuf/src/google/protobuf/descriptor.cc 3.74Mi 26.3%
25.9% 281Ki third_party/protobuf/src/google/protobuf/descriptor.pb.cc 2.34Mi 16.5%
6.8% 73.9Ki third_party/protobuf/src/google/protobuf/text_format.cc 997Ki 6.9%
7.7% 83.5Ki third_party/protobuf/src/google/protobuf/generated_message_reflection.cc 938Ki 6.4%
2.9% 31.3Ki third_party/protobuf/src/google/protobuf/descriptor_database.cc 766Ki 5.3%
4.7% 50.9Ki third_party/protobuf/src/google/protobuf/message.cc 746Ki 5.1%
4.4% 47.8Ki [14 Others] 686Ki 4.7%
4.4% 48.3Ki third_party/protobuf/src/google/protobuf/extension_set.cc 608Ki 4.2%
4.3% 46.4Ki third_party/protobuf/src/google/protobuf/map_field.cc 545Ki 3.7%
3.8% 41.8Ki third_party/protobuf/src/google/protobuf/wire_format.cc 513Ki 3.5%
2.3% 25.4Ki third_party/protobuf/src/google/protobuf/generated_message_util.cc 511Ki 3.5%
1.2% 12.9Ki third_party/protobuf/src/google/protobuf/dynamic_message.cc 316Ki 2.2%
1.6% 17.4Ki third_party/protobuf/src/google/protobuf/extension_set_heavy.cc 288Ki 2.0%
2.3% 25.3Ki third_party/protobuf/src/google/protobuf/stubs/strutil.cc 263Ki 1.8%
1.2% 12.8Ki third_party/protobuf/src/google/protobuf/stubs/common.cc 218Ki 1.5%
1.5% 16.8Ki third_party/protobuf/src/google/protobuf/wire_format_lite.cc 194Ki 1.3%
0.8% 9.22Ki third_party/protobuf/src/google/protobuf/reflection_ops.cc 183Ki 1.3%
1.2% 12.9Ki third_party/protobuf/src/google/protobuf/io/tokenizer.cc 162Ki 1.1%
0.6% 6.90Ki third_party/protobuf/src/google/protobuf/unknown_field_set.cc 150Ki 1.0%
0.3% 3.00Ki third_party/protobuf/src/google/protobuf/any.cc 117Ki 0.8%
0.8% 9.15Ki third_party/protobuf/src/google/protobuf/message_lite.cc 114Ki 0.8%
42.4% 2.08Mi third_party/capstone 6.88Mi 20.6%
15.8% 335Ki third_party/capstone/arch/ARM/ARMDisassembler.c 1.64Mi 23.8%
4.7% 100Ki [22 Others] 579Ki 8.2%
4.1% 86.9Ki third_party/capstone/arch/AArch64/AArch64Disassembler.c 517Ki 7.3%
15.4% 328Ki third_party/capstone/arch/X86/X86DisassemblerDecoder.c 427Ki 6.1%
6.5% 139Ki third_party/capstone/arch/AArch64/AArch64InstPrinter.c 423Ki 6.0%
2.6% 55.6Ki third_party/capstone/arch/Mips/MipsDisassembler.c 408Ki 5.8%
14.1% 299Ki third_party/capstone/arch/X86/X86Mapping.c 380Ki 5.4%
3.5% 73.9Ki third_party/capstone/arch/ARM/ARMInstPrinter.c 293Ki 4.2%
4.5% 96.6Ki third_party/capstone/arch/Sparc/SparcInstPrinter.c 287Ki 4.1%
0.7% 14.4Ki third_party/capstone/arch/X86/X86ATTInstPrinter.c 276Ki 3.9%
3.5% 74.8Ki third_party/capstone/arch/PowerPC/PPCInstPrinter.c 273Ki 3.9%
1.3% 27.8Ki third_party/capstone/arch/PowerPC/PPCDisassembler.c 241Ki 3.4%
1.2% 25.4Ki third_party/capstone/arch/SystemZ/SystemZDisassembler.c 223Ki 3.2%
0.6% 13.3Ki third_party/capstone/arch/X86/X86IntelInstPrinter.c 187Ki 2.7%
5.6% 118Ki third_party/capstone/arch/AArch64/AArch64Mapping.c 154Ki 2.2%
5.2% 111Ki third_party/capstone/arch/ARM/ARMMapping.c 148Ki 2.1%
1.0% 20.3Ki third_party/capstone/arch/X86/X86Disassembler.c 130Ki 1.9%
3.8% 81.5Ki third_party/capstone/arch/Mips/MipsMapping.c 120Ki 1.7%
0.5% 11.3Ki third_party/capstone/arch/XCore/XCoreDisassembler.c 103Ki 1.5%
3.3% 71.0Ki third_party/capstone/arch/PowerPC/PPCMapping.c 100Ki 1.4%
2.1% 44.1Ki third_party/capstone/arch/SystemZ/SystemZMapping.c 91.5Ki 1.3%
5.1% 256Ki src 5.30Mi 15.9%
35.3% 90.4Ki src/bloaty.cc 2.15Mi 40.7%
24.9% 63.9Ki src/dwarf.cc 1.32Mi 25.0%
1.7% 4.33Ki src/main.cc 483Ki 8.9%
16.1% 41.3Ki src/bloaty.pb.cc 465Ki 8.6%
10.3% 26.3Ki src/elf.cc 397Ki 7.3%
2.3% 5.81Ki src/disassemble.cc 204Ki 3.8%
3.2% 8.25Ki src/macho.cc 191Ki 3.5%
6.3% 16.2Ki src/demangle.cc 119Ki 2.2%
5.5% 274Ki third_party/re2 3.97Mi 11.9%
13.3% 36.4Ki third_party/re2/re2/dfa.cc 621Ki 15.3%
15.4% 42.3Ki third_party/re2/re2/re2.cc 618Ki 15.2%
13.2% 36.1Ki third_party/re2/re2/regexp.cc 538Ki 13.2%
9.4% 25.7Ki third_party/re2/re2/compile.cc 363Ki 9.0%
7.0% 19.3Ki third_party/re2/re2/prog.cc 341Ki 8.4%
16.9% 46.2Ki third_party/re2/re2/parse.cc 336Ki 8.3%
8.6% 23.5Ki third_party/re2/re2/simplify.cc 298Ki 7.3%
6.4% 17.5Ki third_party/re2/re2/nfa.cc 267Ki 6.6%
2.4% 6.52Ki third_party/re2/re2/tostring.cc 176Ki 4.3%
2.5% 6.92Ki third_party/re2/re2/onepass.cc 148Ki 3.7%
3.2% 8.65Ki third_party/re2/re2/bitstate.cc 140Ki 3.4%
0.0% 0 third_party/re2/re2/unicode_groups.cc 60.3Ki 1.5%
0.0% 0 third_party/re2/re2/perl_groups.cc 41.1Ki 1.0%
0.7% 1.94Ki third_party/re2/re2/stringpiece.cc 41.0Ki 1.0%
0.8% 2.21Ki third_party/re2/util/strutil.cc 39.8Ki 1.0%
0.0% 0 third_party/re2/re2/unicode_casefold.cc 26.3Ki 0.6%
0.4% 1020 third_party/re2/util/rune.cc 4.77Ki 0.1%
1.6% 82.2Ki third_party/demumble 1.17Mi 3.5%
100.0% 82.2Ki third_party/demumble/third_party/libcxxabi/cxa_demangle.cpp 1.17Mi 100.0%
0.8% 38.0Ki third_party/abseil 526Ki 1.5%
26.5% 10.1Ki third_party/abseil-cpp/absl/strings/escaping.cc 134Ki 25.6%
24.6% 9.35Ki third_party/abseil-cpp/absl/strings/numbers.cc 80.8Ki 15.4%
11.5% 4.35Ki third_party/abseil-cpp/absl/strings/str_cat.cc 58.9Ki 11.2%
10.1% 3.84Ki third_party/abseil-cpp/absl/strings/string_view.cc 44.2Ki 8.4%
4.1% 1.55Ki third_party/abseil-cpp/absl/strings/str_split.cc 41.6Ki 7.9%
4.0% 1.51Ki third_party/abseil-cpp/absl/strings/ascii.cc 40.0Ki 7.6%
3.4% 1.27Ki third_party/abseil-cpp/absl/strings/substitute.cc 38.6Ki 7.3%
9.3% 3.55Ki third_party/abseil-cpp/absl/base/internal/throw_delegate.cc 38.2Ki 7.3%
3.5% 1.33Ki third_party/abseil-cpp/absl/base/internal/raw_logging.cc 31.9Ki 6.1%
2.5% 985 third_party/abseil-cpp/absl/strings/internal/memutil.cc 15.1Ki 2.9%
0.6% 230 third_party/abseil-cpp/absl/strings/internal/utf8.cc 1.85Ki 0.4%
7.8% 390Ki [section .rodata] 390Ki 1.1%
6.4% 320Ki [section .rela.dyn] 320Ki 0.9%
4.6% 231Ki [section .eh_frame] 231Ki 0.7%
0.0% 0 [section .debug_str] 82.7Ki 0.2%
0.9% 44.8Ki [section .gcc_except_table] 44.8Ki 0.1%
0.0% 0 [section .strtab] 40.5Ki 0.1%
0.6% 31.5Ki [23 Others] 38.8Ki 0.1%
0.8% 38.2Ki [section .gnu.hash] 38.2Ki 0.1%
0.7% 36.4Ki [section .eh_frame_hdr] 36.4Ki 0.1%
0.0% 0 [section .debug_aranges] 27.6Ki 0.1%
0.5% 26.4Ki [section .dynstr] 26.4Ki 0.1%
0.0% 0 [section .symtab] 24.9Ki 0.1%
0.4% 20.0Ki [section .data.rel.ro] 20.0Ki 0.1%
0.0% 0 [section .debug_loc] 19.6Ki 0.1%
0.3% 15.4Ki [section .dynsym] 15.4Ki 0.0%
100.0% 4.91Mi TOTAL 33.4Mi 100.0%
Here are some tentative plans for future features.
If we can analyze references between symbols, this would enable a lot of features:
- Detect garbage symbols (i.e. how much would the binary shrink if we compiled with -ffunction-sections -fdata-sections -Wl,-gc-sections).
- Understand why a particular symbol can't be garbage-collected (like ld -why_live on OS X).
- Visualize the dependency tree of symbols (probably as a dominator tree) so users can see the weight of their binary in this way.
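As a rough illustration of the first idea, garbage detection can be framed as graph reachability: any symbol not reachable from a root is a candidate for removal. This is a hypothetical sketch of the concept only; Bloaty does not currently provide this, and the symbol names, sizes, and reference graph are made up.

```python
def reachable(references, roots):
    """Return the set of symbols reachable from roots by following references."""
    seen = set()
    stack = list(roots)
    while stack:
        symbol = stack.pop()
        if symbol in seen:
            continue
        seen.add(symbol)
        stack.extend(references.get(symbol, ()))
    return seen


def garbage_bytes(sizes, references, roots):
    """Sum the sizes of symbols that no root can reach."""
    live = reachable(references, roots)
    return sum(size for symbol, size in sizes.items() if symbol not in live)


# Made-up symbols: dead_a references live code, but nothing references dead_a.
sizes = {"main": 100, "used_helper": 40, "dead_a": 30, "dead_b": 10}
references = {
    "main": ["used_helper"],
    "dead_a": ["dead_b", "used_helper"],
}

print(garbage_bytes(sizes, references, roots=["main"]))
```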
One of the things we have to do in Bloaty is deal with incomplete information. For example, .debug_aranges, which we use for the compileunits data source, is often missing or incomplete. Refining the input sources to be more complete and accurate will help make Bloaty's numbers even more accurate.