aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-06-29Revert "Add support for trace event processing using callacks"Joel Fernandes
This reverts commit 2b9b25c74416328a0f68af5b60b73534e4ccbd02.
2017-06-29Merge branch 'joelagnel-for-unique-word-speedup'KP Singh
2017-06-29trappy/ftrace: refactor getting class for unique wordJoel Fernandes
Lets move this out of the main parsing loop so that the code is easier to read. Suggested-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29trappy: optimize integer conversion in generate_data_dictJoel Fernandes
generate_data_dict uses try/except for integer conversion. This is expensive consider the frequency of the calls. Optimize it by using a slightly more uglier but much fast version of the same. With this I get a speed of ~8.5% in parsing. Change-Id: I909ad170756fd284c7d924950945e755880ceb5f Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29trappy: skip empty array regex sub for common caseJoel Fernandes
I found that trappy spends a lot of time looking for the array pattern "{}". Vast majority of trace lines don't have them. Instead of running the regex every time, check for the ={} pattern using the 'in' operator which is much faster than the regex for the common case, and then do the regex sub if needed. The speed up is huge and in my test run, I saw parsing stage on a 100MB trace speed up from 13s -> 10.5s !! Change-Id: I7ae68efc99eaf8844624871be2a4d66a7820a9b0 Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29trappy: remove double check for unique wordJoel Fernandes
Its not necessary to check for unique word twice. Reusing the 'class detection' path for this purpose is sufficient. This improve performance by 3-6% of the parsing stage. Change-Id: Iff8ebd0086d2ccc10bec14e3d40a63f3c04aaf1a Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29Merge branch 'cobrien7-event-callbacks'KP Singh
2017-06-29Add support for trace event processing using callbacksConnor O'Brien
Some trace analysis tasks are most easily expressed by iterating over events in chronological order. Add an apply_callbacks() method to the GenericFtrace class that will run user-defined functions on trace events in the order they occurred. Signed-off-by: Connor O'Brien <connoro@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29Merge branch 'joelagnel-for-trappy-cache'KP Singh
2017-06-29tests: Caching testsBrendan Jackman
Add test for invalid cache Add test for caching with extra events Add test for normalize_time and window parameters Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29ftrace: Store and retrieve window params in trace cacheBrendan Jackman
Store a JSON file with the parameters used to parse a trace in the cache. If, on retrieval, we find a mismatch then simply invalidate the cache. A richer implementation might detect whether we can still use the cache (e.g. if the requested window is a subset of the cached one), but for now I don't think that's worth the effort/complexity. Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-29trappy: Speed up trappy by caching trace parsingJoel Fernandes
Pandas is extremely fast at parsing csv to data frames. Astonishingly it takes < 1s to serialize/deserialize a 100MB work of traces with 430000 events to/from csv. We leverage this and write out a data frames into a csv file when they are created for the first time. Next time we read it out if it exists. To make sure, the cache isn't stale, we take the md5sum of the trace file and also ensure all CSVs exist before reading from the cache. I get a speed up of 16s to 1s when parsing a 100MB trace. Co-developed-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-27Merge remote-tracking branch 'goog/mirror-upstream-master' into HEADJoel Fernandes
2017-06-27Merge branch 'credp-remove_raw_trace'KP Singh
2017-06-27trappy/test: fix tests for removal of raw.txt trace parsingChris Redpath
tests/test_ftrace.py: Stop creating and checking for existence of <trace>.raw.txt files. tests/test_sched.py: Stop creating raw.txt file. tests/raw_trace.raw.txt: Remove the file as it is no longer needed. The raw-formatted events are moved into raw_trace.txt tests/raw_trace.txt: Replace the formatted sched_switch events with raw formatted events, like we get when parsing a trace file. tests/trace_empty.txt: Remove the default sched_ events we would expect to parse in raw format - since we want to have events here which we are not looking for. Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-27trappy/parsing: remove unnecessary raw flags from API and fix commentsChris Redpath
Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-27trappy/ftrace: Raise an exception if attempting to parse .raw.txt filesChris Redpath
Since we no longer generate these, we should let the user either remove their text files manually or merge the files if they no longer have the .dat file. The exception describes how to deal with this format change and which events should be moved from the raw to the formatted text file if the user no longer has the .dat file. Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-27trappy/ftrace: Stop generating trace.raw.txtChris Redpath
Now that we have events constructed before generating trace files, we can scan them and determine which events need to be formatted and which should be unformatted (raw). trace-cmd report lets us specify events we want to print raw. Use this to produce a single trace.txt which has all the necessary formatted & raw events instead of calling trace-cmd report twice. Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-27trappy/Parsing: Split trace class init into two phasesChris Redpath
This patch set is to modify the parsing step so that we don't need to build a raw and a formatted trace file. To do that, we need to know which events should have raw output and which should use their default formatting. The events indicate which are which, but currently we generate the trace files before we populate the events. Splitting the initialisation into two parts means that we can populate the events so that a later patch can create the text trace with each event either formatted or raw as required. Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-27trappy/parsing: Ensure parse_raw is set for sched_switch eventsChris Redpath
When the sched_switch event is added to the parser the constructor is not called so we never set the parse_raw flag. Ensure there is a class variable to hold the correct value and use it in the constructor. We use the raw file to extract the events since we actually do construct one of these objects during parsing. Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-25trappy: Optimize integer conversion in generate_data_dictJoel Fernandes
generate_data_dict uses try/except for integer conversion. This is expensive consider the frequency of the calls. Optimize it by using a slightly more uglier but much fast version of the same. With this I get a speed of ~8.5% in parsing. Change-Id: I909ad170756fd284c7d924950945e755880ceb5f Signed-off-by: Joel Fernandes <joelaf@google.com>
2017-06-25trappy: Speed up trappy parsing by 20% by skipping regex subJoel Fernandes
I found that trappy spends a lot of time looking for the array pattern "{}". Vast majority of trace lines don't have them. Instead of running the regex every time, check for the ={} pattern using the 'in' operator which is much faster than the regex for the common case. The speed up is huge and in my test run, I saw parsing stage on a 100MB trace speed up from 13s -> 10.5s !! Change-Id: I7ae68efc99eaf8844624871be2a4d66a7820a9b0 Signed-off-by: Joel Fernandes <joelaf@google.com>
2017-06-25trappy: Remove double check for unique wordJoel Fernandes
Its not necessary to check for unique word twice. Reusing the 'class detection' path for this purpose is sufficient. This improve performance by 3-6% of the parsing stage. Change-Id: Iff8ebd0086d2ccc10bec14e3d40a63f3c04aaf1a Signed-off-by: Joel Fernandes <joelaf@google.com>
2017-06-23Merge remote-tracking branch 'origin/master' into HEADJoel Fernandes
2017-06-21Add support for trace event processing using callacksConnor O'Brien
Some trace analysis tasks are most easily expressed by iterating over events in chronological order. Add a run_event_callbacks() method to the GenericFTrace class that will run user-defined functions on trace events in the order they occurred. Signed-off-by: Connor O'Brien <connoro@google.com>
2017-06-21Merge branch 'joelagnel-systrace-line-test'KP Singh
2017-06-21systrace: Store pid as int typeJoel Fernandes
Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-21trappy: Fix systrace userspace parsing testsJoel Fernandes
Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-21systrace: Fix broken regexJoel Fernandes
The regex is broken for strings like this: C|594|HW_VSYNC_ON_0|1 This is because the "1" (data) is optional and so the greedy operator globs everything causing func = "HW_VSYNC_ON_0|1" and data = NaN Change the data to be greedy until '|' is encountered so with that: func = "HW_VSYNC_ON_0" and data = "1". Change-Id: I3695a8da8ad10598192cea08ca92e27271565ccf Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-21tests: Add a test for line numbers in systraceJoel Fernandes
I promised @derkling I would write this so here you go. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-21Merge branch 'derkling-add-ns-clock-support'KP Singh
2017-06-21trappy/ftrace: add support for boot clock timestampPatrick Bellasi
FTrace can be configured to report events timestamp using different clock sources which can be selected via the trace_clock sysfs attribute as described in: https://www.kernel.org/doc/Documentation/trace/ftrace.txt The global clock source reports time in [s] with a [us] resolution. Other sources instead, like for example the boot clock, uses [ns]. Thus, in these last cases we do not have decimals in the timestamp. Let's update the special fields regexp to match both [s] and [ns] formatted times and do the required pre-processing to ensure that DataFrames are alwasy expressed using a [s].[decimals] format. This also update the base test to add a set of trace events which are expressed in [ns] resolution. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14Merge remote-tracking branch 'origin/master' into HEADJoel Fernandes
2017-06-15Merge branch 'joelagnel-for-line'KP Singh
2017-06-15trappy/tests: Fix test for checking invalid lineJoel Fernandes
Due to the recent patches to skip invalid lines using a regex, this test case fails. Fix it by appending the invalid line to the end of the test trace. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-15systrace: Make systrace reuse hasnt_started from base classJoel Fernandes
Since Patrick changed hasnt_started to use the regex for checking if a line is valid, its better if we don't use the hardcoded logic we have now that checks if the trace line begins with '#'. Instead lets piggy back onto the base class if systrace thinks the line shouldn't be skipped. This makes it a more consistent design. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-15trappy/tests: add a new test for new line number parsingPatrick Bellasi
Let's test that the parsed events have the proper line number reported in their DF. Signed-off-by: Joel Fernandes <joelaf@google.com> Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-15trappy/ftrace: add a line number columnJoel Fernandes
Useful for joining DataFrames that have timestamp collisions or for iterating through a group of DataFrames in line order. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-15trappy/ftrace: add support to account line numbersPatrick Bellasi
Knowing the line number of a trace event into the original trace file can be super useful for two main reasons: 1. it can be used as a truly unique ID of each DF entry, since it is not subject to the possible index collisions due to the timestamp resoluton 2. it can be super-useful to correlate DF entries with the original trace If this ID is also tracked the same way tools like kernelshark does, than we have a definitive and easy tool to match DF records with kernelshark. Since kernelshark uses a 0-based indexing for trace events, let's account for proper trace events line numbers starting from 0. This number is available as a class property, in case we are interested to know to real total number of events which are available in the original trace. Moreover, it will be used by a following patch to add a unique ID to each DF entry. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-15trappy/ftrace: use special fileds RE to drive trace_hasnt_startedPatrick Bellasi
Currently we open an FTrace two times: one to parse metadata and another to actually parse trace events. The second time we open the trace we don't skip the metadata since they are filtered out by the attention words match. Since trappy is now empowered by the trace_hasnt_started() iterator filter, let's use this to skip the metadata before entering the parsing loop. This change is preparatory for a proper accounting of trace events from the parsing loop. While we are at that, let's be eco-friendly and recycle the special field regexp also for the exit condition of __populate_metadata(). Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14Merge branch 'joelagnel-for-systrace-parsing' into fixit2KP Singh
2017-06-14trappy/systrace: Add test for parsing userspace eventsJoel Fernandes
Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14trappy: Handle the case for special events injected with tracing_mark_writeJoel Fernandes
Certain fake events can be injected using tracing_mark_write and should be classified into their own event classes than tracing_mark_write. One way to handle this is to use tracing_mark_write only as a backup. For this reason lets introduced the concept of backup event classes and classify TracingMarkWrite as that. This case normally doesn't happen because dynamic events are inserted into the beginning of the event list from what I've seen. Since tracing_mark_write is a built-in event, and not dynamic, its always at after and so is less preferred. However we should protect for the future case where there might be a built-in which happens to be after TrackingMarkWrite class in the list, so lets handle it now. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14trappy/systrace: Provide custom data string parsingJoel Fernandes
Here we provide custom data string parsing for Android userspace events. Borrowed the regex and parsing logic from Patrick's patch. Co-developed-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14trappy: Add tracing_mark_write as a new eventJoel Fernandes
Create tracing_mark_write as an event and treat it like such. It overrides data string processing since it cannot use 'key=value' parsing. Instead it relies on the tracer to provide one (such as systrace has custom trace format from userspace events). If tracer cannot provide one, then it just returns a default dictionary with key 'string' and value as the string in the event. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14trappy: Split out parsing of trace string into separate functionJoel Fernandes
In preparation of being able to parse systrace, split out data string processing into separate function so it can be overridden by sub classes. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-14trappy/ftrace: Parse data string along with special fieldsJoel Fernandes
This is needed so that we're not dependent anymore on making sure data starts at something with '=' and will allow to parse custom systrace tracing_mark_write events which don't have an '=' in them. The new regex will also work for lines that do already have an '=' and doesn't change their behavior. The last line of the new regex in this patch works like this: r"(?P<timestamp>[0-9]+\.[0-9]+): (\w+:\s+)+(?P<data>.+)" We use skip over '<string>:<whitespace> ' and the next thing we find is considered the start of the data. This works well to find the start of a line for existing traces and is needed for systraces which don't have '='. Signed-off-by: Joel Fernandes <joelaf@google.com> Reviewed-by: KP Singh <kpsingh@google.com>
2017-06-13Revert "Merge remote-tracking branch 'joel-local/trappy-event-callbacks' ↵Joel Fernandes
into HEAD" This reverts commit fdd827ae1b164ce7e05c5131d0c32f8028e395a5, reversing changes made to 3bcad0b54476cdb1162a71ffdab54bfc4b80f9bf.
2017-06-13Revert "trappy: Add tgid information from systrace"Joel Fernandes
This reverts commit 0cf2110dc3824e4fdf1463b5b785b06c36474d7c.
2017-06-13Revert "remove spurious print"Joel Fernandes
This reverts commit da27bee4e96d3e2f4513d1f3394d781642ec2ce5.