|
Bug: 151665001
Test: run simpleperf_unit_test.
Change-Id: I0c974ee48c145ce4fc8adf533f445fa614d60216
|
|
Each time the kernel generates a block of etm data, it also generates a
PERF_RECORD_AUX record. An Aux record contains a timestamp showing when
the block of etm data was generated. It can be used to synchronize etm
data with other records (like mmap and comm records). So we want to
parse etm data each time we see an Aux record (as in the dump cmd). That
requires knowing the locations of etm data in perf.data without reading
the whole file. To make that possible, this CL also adds an AUXTRACE
feature section, in the same format as in linux perf.
Also dump AUX records and their corresponding etm data in the dump cmd.
Bug: 135204414
Test: run simpleperf_unit_test.
Test: run `simpleperf record -e cs-etm xxx` && `perf report -D --stdio`.
Change-Id: Ifae716a10fefe0f3d4822a0214384b40ada9da45
|
|
1. In event_fd.cpp, add functions to create an aux buffer and read etm
data.
2. In record.h, add AuxTraceRecord.
3. In RecordReadThread.cpp, wrap etm data into AuxTraceRecords.
4. Add logic to read and write AuxTraceRecords in perf.data.
5. Show the recorded etm data size after recording.
Bug: 135204414
Test: run simpleperf_unit_test.
Change-Id: I3b20fe8f3c786f130f38e34962ca9f86a31fc584
|
|
To record on device and report on host, simpleperf keeps the arch type
and event types in perf.data. Currently, users of RecordFileReader
have to read and set the arch type and event types themselves.
This patch lets RecordFileReader set the arch type and event types,
so its users no longer repeat that work.
Bug: none
Test: run simpleperf_unit_test.
Test: run simpleperf record on device, and report it on host.
Change-Id: Ief637ca6e2c3acfbf74b6447ef7ff0679439ca1d
|
|
Apps may run with libraries that have multiple executable segments.
Symbolizing ip addresses in these libraries needs to use map.pgoff.
The old formula converting ip to vaddr_in_file:
vaddr_in_file = ip - map.start + min_executable_vaddr
The new formula converting ip to vaddr_in_file:
offset_in_file = ip - map.start + map.pgoff
vaddr_in_file = offset_in_file - file_offset_of_min_executable_vaddr
+ min_executable_vaddr
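A minimal sketch of the two formulas above, to show where they diverge.
The Map class, the function names and the sample addresses are
hypothetical and only for illustration; they are not simpleperf code. The
sketch assumes the library's second executable segment is mapped
separately, with its own map.start and map.pgoff.

```python
from dataclasses import dataclass


@dataclass
class Map:
    """Hypothetical stand-in for a mmap record."""
    start: int  # start address of the mapping in memory
    pgoff: int  # file offset at which the mapping starts


def old_vaddr_in_file(ip: int, m: Map, min_executable_vaddr: int) -> int:
    # Old formula: ignores map.pgoff, so it is only right for the
    # mapping that begins at min_executable_vaddr.
    return ip - m.start + min_executable_vaddr


def new_vaddr_in_file(ip: int, m: Map, min_executable_vaddr: int,
                      file_offset_of_min_executable_vaddr: int) -> int:
    # New formula: go through the file offset first, then to vaddr.
    offset_in_file = ip - m.start + m.pgoff
    return (offset_in_file - file_offset_of_min_executable_vaddr
            + min_executable_vaddr)


# Assumed layout: first executable segment at vaddr 0x2000 / file offset
# 0x1000, second executable segment at vaddr 0x6000 / file offset 0x5000.
MIN_EXEC_VADDR = 0x2000
MIN_EXEC_FILE_OFFSET = 0x1000
first_map = Map(start=0x70000000, pgoff=0x1000)
second_map = Map(start=0x70008000, pgoff=0x5000)

ip_in_first = 0x70000010
ip_in_second = 0x70008010

first_old = old_vaddr_in_file(ip_in_first, first_map, MIN_EXEC_VADDR)
first_new = new_vaddr_in_file(ip_in_first, first_map, MIN_EXEC_VADDR,
                              MIN_EXEC_FILE_OFFSET)
second_old = old_vaddr_in_file(ip_in_second, second_map, MIN_EXEC_VADDR)
second_new = new_vaddr_in_file(ip_in_second, second_map, MIN_EXEC_VADDR,
                               MIN_EXEC_FILE_OFFSET)
```

With these assumed values both formulas agree for the first segment,
while for the second segment only the new formula lands inside that
segment's vaddr range.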
Bug: 124056476
Test: run simpleperf_unit_test.
Test: use simpleperf to profile facebook app, ip addresses hitting libc.so
Test: and libart.so are symbolized correctly.
Change-Id: I5fd3ed822a916c4d04a9868d6d209c43ee190c5b
|
|
When using the debug-unwind cmd on a system wide profiling result, I
found it might take a lot of memory because RecordCache cached
too many samples. Since simpleperf no longer relies on RecordCache
to sort records, I think it is fine to remove RecordCache instead
of fixing it.
Bug: none.
Test: run simpleperf_unit_test.
Change-Id: Ie28ce17b4158add455004a56bbdac745f9d05f19
|
|
--size-limit option stops recording when the recorded data
reaches the size limit. It is used by
run_simpleperf_without_usb_connection.py to avoid taking too much disk
space.
Bug: http://b/74198167
Test: run simpleperf_unit_test.
Test: run test.py.
Change-Id: I11f0023c342c50e1cf8035430e6af1b3caa329e7
|
|
To convert from a dex_pc (returned by libunwindstack) to a symbol name,
we need the following:
1. The mapping info of the vdex file containing the dex_pc.
2. The offsets of dex files in the vdex file.
So make the following changes:
1. Record non-executable maps when profiling java code.
2. Refactor dso code to add a new type for dex files, using DexFileDso
to store dex file offsets in a vdex file, and load symbols from that
vdex file.
3. Add read_dex_file.cpp to read java symbols using libdexfile.
4. Change the format of the file section in record_file_format.h, to
store dex file offsets in vdex files.
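A rough sketch of why both pieces of information are needed. It assumes
the dex_pc is a memory address inside a mapped vdex file; the function
name, parameters and sample values are hypothetical, and the real lookup
in simpleperf may differ.

```python
import bisect
from typing import List, Optional, Tuple


def locate_dex_pc(dex_pc: int, map_start: int, map_len: int, map_pgoff: int,
                  dex_file_offsets: List[int]) -> Optional[Tuple[int, int]]:
    """Return (dex_file_offset, offset_in_dex_file), or None if not found."""
    # Step 1: use the mapping info to turn the address into a vdex offset.
    if not (map_start <= dex_pc < map_start + map_len):
        return None
    offset_in_vdex = dex_pc - map_start + map_pgoff
    # Step 2: pick the last dex file that starts at or before that offset.
    offsets = sorted(dex_file_offsets)
    index = bisect.bisect_right(offsets, offset_in_vdex) - 1
    if index < 0:
        return None  # the offset is before the first dex file
    return offsets[index], offset_in_vdex - offsets[index]


# Assumed example: a vdex file mapped at 0x70000000 holding two dex files.
inside_second = locate_dex_pc(0x70004100, 0x70000000, 0x10000, 0,
                              [0x28, 0x4000])
before_first = locate_dex_pc(0x70000010, 0x70000000, 0x10000, 0,
                             [0x28, 0x4000])
outside_map = locate_dex_pc(0x60000000, 0x70000000, 0x10000, 0,
                            [0x28, 0x4000])
```

The offset inside the dex file is what a dex symbol reader would then
use to find the enclosing method.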
Bug: http://b/73126888
Bug: http://b/77236599
Test: Run simpleperf to profile several apps manually, can see
Test: callstacks of both java code and native code.
Test: Run simpleperf_unit_test.
Change-Id: I08005a03beb3df1a70db034bc463f555934856ba
|
|
Currently, we use the --log debug option in record cmd to debug offline
unwinding. However, it has the following disadvantages:
1. It adds extra complexity in record cmd.
2. It doesn't keep reg/stack data of samples.
3. It isn't convenient to reproduce problematic samples, because each
recording gets different samples.
4. It isn't very suitable for performance tests of unwinding, for the
same reason as item 3.
So instead, this CL adds a debug-unwind cmd focusing on debugging and
testing offline unwinding. It solves the problems mentioned above.
Also change unwinding_result_reporter.py to make it work with perf.data
generated by the debug-unwind cmd.
Bug: http://b/72556486
Test: run simpleperf_unit_test.
Test: run unwinding_result_reporter.py manually.
Change-Id: I11cdf1eba993f48d61ef9891ad1be54d29679fdb
|
|
When recording google.sample.tunnel app for 30s:
It took 3s to unwind samples and write unwound samples to file.
It took 0.3s to write samples containing stack/reg data to file.
The result shows recording with post unwinding consumes much
less time than unwinding samples immediately. This means we can
record at a higher frequency and get a lower loss rate when using
post unwinding. So make the following changes:
1. Make post unwinding the default.
2. Replace --post-unwind with --no-post-unwind option.
3. Make --trace-offcpu and callchain joiner work with post unwinding.
4. Remove special operations in --log debug mode. Those will be
supported in a new command.
Bug: http://b/72556486
Test: run simpleperf_unit_test.
Test: run python test.py.
Change-Id: I9a5a5defda9d040985e674c43db19ee68e7aa305
|
|
The META_INFO section can be used to pass small pieces of information
in perf.data.
Add simpleperf_version to the META_INFO section for debugging.
Bug: http://b/37960318
Test: run simpleperf_unit_test.
Change-Id: If17a147bbc77b5af063fbf77e02ca81430afb8a5
|
|
Bug: http://b/35475170
Test: run simpleperf_unit_test.
Test: run report.py.
Change-Id: Ie9329a64c701bce38f7b440c16cb47e99e83db45
|
|
Instead of dumping all symbols in the hit elf files, dumping only the
needed symbols can save a lot of space. To do so, read perf.data
after recording to collect hit file and symbol information.
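The idea can be sketched as a pass over the recorded samples that keeps
only the symbols they hit. The data shapes below (samples as
(path, vaddr) pairs, symbols as (start, length, name) tuples sorted by
start) and the function name are assumptions for illustration, not the
actual perf.data layout.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple

Symbol = Tuple[int, int, str]  # (start vaddr, length, name)


def collect_hit_symbols(samples: List[Tuple[str, int]],
                        symbol_tables: Dict[str, List[Symbol]]
                        ) -> Dict[str, List[str]]:
    """Return, per file, the names of the symbols hit by at least one sample."""
    # Precompute the start addresses once per file for binary search.
    starts = {path: [s[0] for s in symbols]
              for path, symbols in symbol_tables.items()}
    hit = defaultdict(set)
    for path, vaddr in samples:
        if path not in symbol_tables:
            continue  # no symbol table for this file
        index = bisect.bisect_right(starts[path], vaddr) - 1
        if index < 0:
            continue  # vaddr is before the first symbol
        start, length, name = symbol_tables[path][index]
        if vaddr < start + length:
            hit[path].add(name)
    return {path: sorted(names) for path, names in hit.items()}


tables = {"libfoo.so": [(0x1000, 0x100, "a"), (0x1100, 0x80, "b"),
                        (0x2000, 0x10, "c")]}
recorded = [("libfoo.so", 0x1010), ("libfoo.so", 0x10ff),
            ("libfoo.so", 0x1190), ("libbar.so", 0x10)]
hit_symbols = collect_hit_symbols(recorded, tables)
```

In this made-up input only symbol "a" is hit, so "b" and "c" would not
need to be dumped.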
Bug: http://b/32340274
Test: test using `simpleperf record --dump-symbols` manually.
Test: run simpleperf_unit_test.
Change-Id: I480f3e2e7ccebfbb5df16a597724f5f40d62c821
|
|
Bug: http://b/32340274
Test: run simpleperf_unit_test.
Change-Id: I0bed466c145fdbb2988308f56a031c06bad16352
|
|
For the `record --dump-symbols` option, change from dumping
DsoRecord and SymbolRecord to dumping a file feature section.
This avoids reading symbols from elf files during recording,
which takes a lot of time. It also avoids mixing optional
data (the symbol tables) with necessary data (the profiling records).
Bug: http://b/32340274
Test: run simpleperf_unit_test.
Test: run simpleperf runtest.py.
Change-Id: I0a387de243afac93486fc885f223a58060ec07f4
|
|
Also stop setting the low watermark for dwarf callgraph recording.
Bug: http://b/32343227
Test: run simpleperf runtest.py.
Test: run simpleperf_unit_test.
Change-Id: I57c0146b0a52cc1bb940a54f685058fe00677992
|
|
And some tiny improvements.
Bug: http://b/30974760
Test: run simpleperf_unit_test.
Change-Id: Ie2d46c8ab9ee763d107527c9a54590f845569da4
|
|
1. Build libsimpleperf_report.so on host, which exports functions
to access samples.
2. Add simpleperf_report_lib.py to wrap libsimpleperf_report.so.
3. Write report_sample.py to test simpleperf_report_lib.py. The
output format of report_sample.py matches the need of building
FlameGraph.
Bug: http://b/31069528
Test: run report_sample.py on perf.data.
Test: run simpleperf_unit_test.
Change-Id: I4949f8ea506f12101a9c4fb4c896957c96676853
|
|
1. When a cpu goes down, read records from event files on that cpu,
then close those event files.
2. When a cpu comes up, open event files on that cpu, and create
a mapped buffer for those event files to dump records.
3. Instead of creating a mapped buffer for each event type on each
cpu, we can just create one mapped buffer for all event types on
each cpu.
4. When new event files are created, store an EventIdRecord in
perf.data to notify record_file_reader.cpp.
Bug: http://b/29245608
Test: run simpleperf record cmd and make cpu offline and online.
Test: run simpleperf_unit_test.
Change-Id: Ib97a24b6292fa143e9b35cb105bdddf1e826d60a
|
|
Avoid binary allocation and memory copy in ReadRecordsFromBuffer(),
thus reducing Record construction overhead in
EventSelectionSet::ReadMmapEventDataForFd().
Remove the RecordCache used while recording. Replace it with
RecordFileWriter::SortDataSection(). For unwinding while
recording, use a low watermark to make records almost sorted
when dumped from the kernel.
Bug: 30649868
Test: run simpleperf_unit_test.
Change-Id: Ie5fb942046900a5960b3c990cf4177c026eaadfb
|
|
Wrap libevent in IOEventLoop, use IOEventLoop in stat command.
Add corresponding tests.
Bug: http://b/30405638
Change-Id: I78b79e0eff1365ab46dde29c2a24a2def586af79
Test: run simpleperf_unit_test.
|
|
Previously we split KernelSymbolRecord because its size is > 65535. Then
I found TracingDataRecord can also be > 65535. So it is better to
handle big records when reading and writing perf.data.
record_file_writer.cpp splits a big record into multiple SPLIT
records followed by a SPLIT_END record, and record_file_reader.cpp
restores the big record when reading SPLIT and SPLIT_END records.
Also add RecordHeader to represent records with size > 65535.
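The split/restore scheme can be sketched as below. This models records
as (type, payload) tuples instead of the real binary layout, and the
function names and the max_chunk parameter are assumptions for
illustration only.

```python
from typing import Iterable, Iterator, List, Tuple

SPLIT = "SPLIT"
SPLIT_END = "SPLIT_END"

Record = Tuple[str, bytes]  # (record type, payload)


def split_record(data: bytes, max_chunk: int) -> List[Record]:
    """Cut one big record into SPLIT records followed by a SPLIT_END."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    records = [(SPLIT, data[i:i + max_chunk])
               for i in range(0, len(data), max_chunk)]
    records.append((SPLIT_END, b""))
    return records


def restore_records(records: Iterable[Record]) -> Iterator[bytes]:
    """Rebuild each big record from its SPLIT ... SPLIT_END sequence."""
    buffer = bytearray()
    for record_type, payload in records:
        if record_type == SPLIT:
            buffer += payload
        elif record_type == SPLIT_END:
            yield bytes(buffer)  # copy out before reusing the buffer
            buffer.clear()
        else:
            raise ValueError("unexpected record type: " + record_type)


big = bytes(range(256)) * 1000       # 256000 bytes, larger than 65535
pieces = split_record(big, 65000)    # each piece fits a 16-bit size field
restored = list(restore_records(pieces))
```

Because SPLIT_END marks the boundary, several big records written back
to back can still be told apart by the reader.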
Bug: 29581559
Change-Id: I0b4556988f77b3431c7f1a28fce65cf225d6a067
Test: run simpleperf_unit_test.
|
|
When monitoring tracepoint events, dumping tracing data to perf.data
enables reporting on a different machine.
Bug: 27403614
Change-Id: Ie1af624717a245cacbeb44b4c1bcd499fc9ad8db
|
|
When sampling kernel trace points, it is likely that more than one
event type is sampled, like
`simpleperf record -e kmem:mm_page_alloc,kmem:mm_page_free`.
1. change record command to dump event_id for all records.
2. change report command and record reader to support multiple
event attrs.
3. hide record_cache inside EventSelectionSet.
4. add test to report multiple event types.
Bug: 27403614
Change-Id: Ic22a5527d68e7a843e3cf95e85381f8ad6bcb196
|
|
Change-Id: Ic15d4778c7accd1382de0b440a437aba2cf67016
|
|
perf.data can be too large to be loaded into memory.
To avoid this, use fread() instead of mmap() to read perf.data,
and always use RecordCache to sort records.
Fix unit test failures caused by the previous change.
Bug: 25194400
Change-Id: If29dc0bb0ed992ba34202c2cb1a204a1d9123b7a
|
|
As libbacktrace only supports unwinding for the same architecture it is running on, simpleperf
report command running on host can't unwind perf.data collected on device. So we'd better do
unwinding work in record command on device.
Bug: 22229391
Change-Id: I085ca074ea83dab79f08563523bdbc7a36650a64
|
|
Change-Id: I0aa0e8c9491370b5e4fafdaf8cdc5613c26c78f5
|
|
The new method is more accurate and has lower time complexity.
Bug: 22229391
Change-Id: I8b3016798b8a0e20335adeb7ec5dda0068044142
|
|
1. add OS_RELEASE and ARCH features in perf.data. The ARCH feature is used when parsing
recorded user registers.
2. support `--call-graph dwarf` option in record command.
Bug: 22229391
Change-Id: I56dbdd101338658ce6a9b59aa8be90e712e007f5
|
|
1. refactor BuildId type.
2. check build id before parsing symbols in report command.
Bug: 22179177
Change-Id: Iefc797a88d4a168e109db786105120c8d6914369
|
|
Bug: 19483574
Change-Id: I92f16d6616f274f31ea54e305fe1de10049baf02
|
|
Also add the function to remove old perf.data.
Bug: 19483574
Change-Id: I605bb637674d4674f95503a160de8c530fe87812
|
|
Bug: 19483574
Change-Id: Ie2acd8a157bca9ad3c01a2e4b37e139aba89670f
|
|
Bug: 19483574
Change-Id: Id879713a75c2d3a6289d8847b95ee0bb4a2cc8a0
|