Address the slow growth of memory utilization with perf #1000
base: master
Conversation
1c26438 to 7f7610d
```python
should_restart_time_based = (
    time_elapsed >= self._RESTART_AFTER_S and current_rss >= self._PERF_MEMORY_USAGE_THRESHOLD
)
should_restart_growth_based = memory_growth > self._RSS_GROWTH_THRESHOLD
```
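For illustration, here is a minimal sketch of how these two conditions might feed a single restart decision; the method name `_should_restart_perf` and the values used for `_RESTART_AFTER_S` and `_PERF_MEMORY_USAGE_THRESHOLD` are assumptions, not taken from this PR:

```python
import time


class PerfProcess:
    _RESTART_AFTER_S = 3600  # assumed value: time-based restart interval
    _PERF_MEMORY_USAGE_THRESHOLD = 512 * 1024 * 1024  # assumed value: 512MB
    _RSS_GROWTH_THRESHOLD = 100 * 1024 * 1024  # 100MB, as in the diff

    def _should_restart_perf(self, start_time: float, baseline_rss: int, current_rss: int) -> bool:
        """Hypothetical helper combining the time-based and growth-based restart checks."""
        time_elapsed = time.monotonic() - start_time
        memory_growth = current_rss - baseline_rss
        should_restart_time_based = (
            time_elapsed >= self._RESTART_AFTER_S and current_rss >= self._PERF_MEMORY_USAGE_THRESHOLD
        )
        should_restart_growth_based = memory_growth > self._RSS_GROWTH_THRESHOLD
        return should_restart_time_based or should_restart_growth_based
```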
thanks, looks good
can we get some metrics behind this, i.e. the average growth rate of perf's memory from initialization to post-profiling on a busy vs. non-busy system?
The system I tested on is an AWS instance with a small number of CPUs, so our tests may not represent your use case. On an idle system I see around 50MB as the baseline RSS and 1-2MB of growth every collection duration. I tried running more CPU-intensive workloads, but they did not increase the RSS much because the system configuration has limited memory and CPUs. So it would be good if you could evaluate this change on your side with real use cases. Can you try that?
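For anyone collecting these numbers, a rough sketch of sampling the perf process's RSS over time; this script is illustrative only (it assumes `psutil` is installed and that the perf PID is known) and is not part of this change:

```python
import time

import psutil


def sample_perf_rss(perf_pid: int, interval_s: float = 60.0, samples: int = 30) -> list:
    """Sample the RSS (in bytes) of the running perf process every `interval_s` seconds."""
    proc = psutil.Process(perf_pid)
    rss_samples = []
    for _ in range(samples):
        rss_samples.append(proc.memory_info().rss)
        time.sleep(interval_s)
    return rss_samples


# Example usage (PID is hypothetical):
# rss = sample_perf_rss(perf_pid=12345)
# print(f"baseline={rss[0] / 1024 / 1024:.1f}MB, growth={(rss[-1] - rss[0]) / 1024 / 1024:.1f}MB")
```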
yes, this would need testing in environments where there are a lot of processes (over 1k-1.5k) running. I would suggest we start by testing the root-cause fix in #1002 and then test this fix.
@ashokbytebytego ^ ^
```python
# we use double for dwarf.
_MMAP_SIZES = {"fp": 129, "dwarf": 257}
_RSS_GROWTH_THRESHOLD = 100 * 1024 * 1024  # 100MB in bytes
_BASELINE_COLLECTION_COUNT = 3  # Number of function calls to collect RSS before setting baseline
```
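A minimal sketch of how the first `_BASELINE_COLLECTION_COUNT` readings could be folded into a baseline; averaging is one plausible policy, and the method and attribute names here are hypothetical, not the PR's actual code:

```python
from typing import List, Optional


class PerfProcess:
    _BASELINE_COLLECTION_COUNT = 3  # number of RSS readings to collect before fixing the baseline

    def __init__(self) -> None:
        self._rss_readings: List[int] = []
        self._baseline_rss: Optional[int] = None

    def _update_baseline(self, current_rss: int) -> None:
        """Collect the first few RSS readings and use their average as the growth baseline."""
        if self._baseline_rss is not None:
            return  # baseline already established
        self._rss_readings.append(current_rss)
        if len(self._rss_readings) >= self._BASELINE_COLLECTION_COUNT:
            self._baseline_rss = sum(self._rss_readings) // len(self._rss_readings)
```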
thanks, I would suggest having some data to back the choice of this collection count baseline
Signed-off-by: Min Yeol Lim <[email protected]>
7f7610d to 495eaae

It seems perf has a known issue with memory utilization when it runs continuously. Since this is observed on an idle system, where not many processes are running and the output files are not huge, it looks like perf has to be restarted once its internal memory usage goes beyond a threshold.
The solution implemented here restarts the perf collection when the memory (RSS) growth exceeds 100MB compared to the first recorded RSS size. The threshold can be changed depending on use cases, and we could also expose it as a command-line argument.
The issue is different from what was reported in #990, which is related to perf output handling.
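For illustration, a rough sketch of what the restart itself could look like for a perf subprocess; this is not the PR's exact implementation, and the helper name `restart_perf` is hypothetical:

```python
import subprocess


def restart_perf(perf_process: subprocess.Popen, perf_cmd: list) -> subprocess.Popen:
    """Stop the running perf collection and start a fresh one to reclaim its memory."""
    perf_process.terminate()  # ask perf to flush its output and exit
    try:
        perf_process.wait(timeout=10)  # give it a chance to exit cleanly
    except subprocess.TimeoutExpired:
        perf_process.kill()
        perf_process.wait()
    return subprocess.Popen(perf_cmd)  # start a new perf with the same arguments
```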
Description
Related Issue
Motivation and Context
How Has This Been Tested?
Screenshots
Checklist: