Bug: 31559095
Test: attempt to build host bionic
Change-Id: I9e02fcd01f3f5c23fb4a64bd13d6b0a2fb2c1dbb
|
|
We build both static and dynamic libjemalloc and libjemalloc_jet for
the host. For the target, only the static libraries are built, just
as before.
Disable sanitizer for the unit/integration tests on the host.
Bug: 113365582
Test: 'mmma external/jemalloc', verify that it builds the same target
libraries, verify it builds host libraries and they can be used to
build ckati.
Change-Id: I84212d0c51995aab25f83deb087d1670ddfb00a8
|
|
This reverts commit fa9301657a7f5c1405cf2691d8e15e23eae113aa.
Reason for revert: Unknown build failure.
Change-Id: I0e29c414274568c70db361675014383f758acbc0
|
|
We build both static and dynamic libjemalloc and libjemalloc_jet for
the host. For the target, only the static libraries are built, just
as before.
Test: 'mmma external/jemalloc', verify that it builds the same target
libraries, verify it builds host libraries and they can be used to
build ckati.
Change-Id: I588b84fba7785d31dc5200a805a43b34bb0723d1
|
|
Default to always purging, and assume that people can change
this value if they are willing to leave a bit of PSS hanging around.
Bug: 36401135
Test: Built and booted bullhead.
Test: Ran jemalloc unit tests.
Test: Ran bionic unit tests.
Test: Ran a test that allocated and freed a large piece of memory,
Test: and verified that after changing the parameter, the PSS
Test: sticks around (decay timer set to 1) or is purged (decay
Test: timer set to 0).
Change-Id: I7819bd8b91a1df600967fa15eebc19fa382d7ab4
|
|
Test: Ran unit tests on bullhead.
Change-Id: I54b488114b54415263c0fa22cfe946ecc5883351
|
|
Synchronize tcaches with tcaches_mtx rather than ctl_mtx. Add missing
synchronization for tcache flushing. This bug was introduced by
1cb181ed632e7573fb4eab194e4d216867222d27 (Implement explicit tcache
support.), which was first released in 4.0.0.
(cherry picked from commit 3ecc3c84862ef3e66b20be8213b0301c06c692cc)
Bug: 35867477
Test: Booted angler using normal config and svelte config.
Test: Ran bionic unit tests/jemalloc tests.
Test: Ran the art ThreadStress tests on the normal config/svelte config.
Change-Id: I8de25e7eae8f0b055febafee1b5e4d170371bbcd
|
|
Merge remote-tracking branch 'aosp/upstream-new' into upgrade
Includes regenerating the necessary files.
Bug: 33321361
Test: Built for x86_64/arm64. Built the no tcache and tcache enabled
Test: configs (nexus 9/nexus 6p). Compared before/after results of
Test: running all dumps through memory replay and verified no unexpected
Test: increases. Ran bionic unit tests in both configs, 32 bit and 64 bit
Test: variants. Ran the jemalloc unit tests in both configs.
Change-Id: I2e8f3305cd1717c7efced69718fff90797f21068
|
|
This resolves #517.
|
|
Test: Built angler and verified the jemalloc syscalls are gone.
Test: Ran bionic unit tests and jemalloc unit tests on angler.
Change-Id: I55d12bdfc84832d031e875244c20cc63820c0c5c
|
|
Add the pages_[no]huge() functions, which toggle huge page state via
madvise(..., MADV_[NO]HUGEPAGE) calls.
The first time a page run is purged from within an arena chunk, call
pages_nohuge() to tell the kernel to make no further attempts to back
the chunk with huge pages. Upon arena chunk deletion, restore the
associated virtual memory to its original state via pages_huge().
This resolves #243.
|
|
This resolves #509.
|
|
This is only used when mmap fails, so it's almost completely
worthless.
Test: Booted on angler, bionic unit tests passed, jemalloc unit tests
Test: passed. Ran the bionic unit tests and verified that brk/sbrk
Test: are never called.
Change-Id: Ie71a31337678983220d78c40bbaff6694eb22560
|
|
Some versions of Android provide a pthreads library without providing
pthread_atfork(), so in practice a separate feature test is necessary
for the latter.
|
|
Add feature tests for the MADV_FREE and MADV_DONTNEED flags to
madvise(2), so that MADV_FREE is detected and used for Linux kernel
versions 4.5 and newer. Refactor pages_purge() so that on systems which
support both flags, MADV_FREE is preferred over MADV_DONTNEED.
This resolves #387.
|
|
|
|
Rather than relying on two's complement negation for alignment mask
generation, use bitwise not and addition. This dodges warnings from
MSVC, and should be strength-reduced by compiler optimization anyway.
|
|
Add extent serial numbers and use them where appropriate as a sort key
that is higher priority than address, so that the allocation policy
prefers older extents.
This resolves #147.
|
|
This fixes a regression caused by
40ee9aa9577ea5eb6616c10b9e6b0fa7e6796821 (Fix stats.cactive accounting
regression.) and first released in 4.1.0.
|
|
Included in this change are all of the updated generated files.
Bug: 32673024
Test: Built the angler build (normal config) and the volantis build
Test: (svelte config). Ran memory_replay 32 bit and 64 bit on both
Test: platforms before and after and verified results are similar.
Test: Ran bionic unit tests and jemalloc unit tests.
Test: Verified that two jemalloc unit test failures are due to the
Test: Android extension that puts all large chunks on arena 0.
Change-Id: I12428bdbe15f51383489c9a1d72d687499fff01b
|
|
This reverts commit af33e9a59735a2ee72132d3dd6e23fae6d296e34.
This resolves #495.
|
|
This resolves #495.
|
|
Fix chunk_alloc_cache() to support decommitted allocation, and use this
ability in arena_chunk_alloc_internal() and arena_stash_dirty(), so that
chunks don't get permanently stuck in a hybrid state.
This resolves #487.
|
|
|
|
|
|
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended
replacement.
|
|
Fix zone_force_unlock() to reinitialize mutexes rather than unlock them,
since OS X 10.12 cannot tolerate a child unlocking mutexes that were
locked by its parent.
Refactor; this was a side effect of experimenting with zone
{de,re}registration during fork(2).
|
|
This resolves #396.
|
|
The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC),
whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but
still has resolution (~1ms) that is adequate for our purposes.
This resolves #479.
|
|
glibc defines its malloc implementation with several weak and strong
symbols:
strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc)
strong_alias (__libc_free, __cfree) weak_alias (__libc_free, cfree)
strong_alias (__libc_free, __free) strong_alias (__libc_free, free)
strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)
The issue is not with the weak symbols, but that other parts of glibc
depend on __libc_malloc explicitly. Defining them in terms of jemalloc
API's allows the linker to drop glibc's malloc.o completely from the link,
and static linking no longer results in symbol collisions.
Another wrinkle: during initialization jemalloc calls sysconf to
get the number of CPUs. glibc allocates for the first time before
setting up the isspace (and other related) tables, which are used by
sysconf. Instead, use the pthread API to get the number of
CPUs under glibc, which seems to work.
This resolves #442.
|
|
Refactor tsd so that tsdn_fetch() does not trigger allocation, since
allocation could cause infinite recursion.
This resolves #458.
|
|
Rather than protecting dss operations with a mutex, use atomic
operations. This has negligible impact on synchronization overhead
during typical dss allocation, but is a substantial improvement for
chunk_in_dss() and the newly added chunk_dss_mergeable(), which can be
called multiple times during chunk deallocations.
This change also has the advantage of avoiding tsd in deallocation paths
associated with purging, which resolves potential deadlocks during
thread exit due to attempted tsd resurrection.
This resolves #425.
|
|
Add spin_t and spin_{init,adaptive}(), which provide a simple
abstraction for adaptive spinning.
Adaptively spin during busy waits in bootstrapping and rtree node
initialization.
|
|
Simplify decay-based purging attempts to only be triggered when the
epoch is advanced, rather than every time purgeable memory increases.
In a correctly functioning system (not previously the case; see below),
this only causes a behavior difference if during subsequent purge
attempts the least recently used (LRU) purgeable memory extent is
initially too large to be purged, but that memory is reused between
attempts and one or more of the next LRU purgeable memory extents are
small enough to be purged. In practice this is an arbitrary behavior
change that is within the set of acceptable behaviors.
As for the purging fix, ensure that arena->decay.ndirty is recorded
*after* the epoch advance and associated purging occurs. Prior to this
fix, it was possible for purging during epoch advance to cause a
substantially underrepresentative (arena->ndirty - arena->decay.ndirty),
i.e. the number of dirty pages attributed to the current epoch was too
low, and a series of unintended purges could result. This fix is also
relevant in the context of the simplification described above, but the
bug's impact would be limited to over-purging at epoch advances.
|
|
Instead, move the epoch backward in time. Additionally, add
nstime_monotonic() and use it in debug builds to assert that time only
goes backward if nstime_update() is using a non-monotonic time source.
|
|
|
|
Add the missing #include <time.h>. The critical time facilities appear
to have been transitively included via unistd.h and sys/time.h, but in
principle this omission could have caused
clock_gettime(CLOCK_MONOTONIC, ...) to be overlooked in favor of
gettimeofday(), which in turn could cause spurious non-monotonic time
updates.
Refactor nstime_get() out of nstime_update() and add configure tests for
all variants.
Add CLOCK_MONOTONIC_RAW support (Linux-specific) and
mach_absolute_time() support (OS X-specific).
Do not fall back to clock_gettime(CLOCK_REALTIME, ...). This was a
fragile Linux-specific workaround, which we're unlikely to use at all
now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we
have no choice besides non-monotonic clocks, gettimeofday() is only
incrementally worse.
|
|
|
|
Use pszind_t size classes rather than szind_t size classes, and always
reserve space for NPSIZES elements. This removes unused heaps that are
not multiples of the page size, and adds (currently) unused heaps for
all huge size classes, with the immediate benefit that the size of
arena_t allocations is constant (no longer dependent on chunk size).
|
|
These compute size classes and indices similarly to size2index(),
index2size() and s2u(), respectively, but using the subset of size
classes that are multiples of the page size. Note that pszind_t and
szind_t are not interchangeable.
|
|
|
|
They are used on all platforms in prng.h.
|
|
GCC 4.9.3 cross-compiled for sparc64 defines __sparc_v9__, not
__sparc64__ nor __sparcv9. This prevents LG_QUANTUM from being defined
properly. Adding this new value to the check solves the issue.
|
|
Some bug (either in the red-black tree code, or in the pgi compiler) seems to
cause red-black trees to become unbalanced. This issue seems to go away if we
don't use compact red-black trees. Since red-black trees don't seem to be used
much anymore, I opted for what seems to be an easy fix here instead of digging
in and trying to find the root cause of the bug.
Some context in case it's helpful:
I experienced a ton of segfaults while using pgi as Chapel's target compiler
with jemalloc 4.0.4. The little bit of debugging I did pointed me somewhere
deep in red-black tree manipulation, but I didn't get a chance to investigate
further. It looks like 4.2.0 replaced most uses of red-black trees with
pairing-heaps, which seems to avoid whatever bug I was hitting.
However, `make check_unit` was still failing on the rb test, so I figured the
core issue was just being masked. Here's the `make check_unit` failure:
```sh
=== test/unit/rb ===
test_rb_empty: pass
tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black
test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced
tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black
test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced
node_remove:test/unit/rb.c:190: Failed assertion: (imbalances) == (0) --> 2 != 0: Tree is unbalanced
<jemalloc>: test/unit/rb.c:43: Failed assertion: "pathp[-1].cmp < 0"
test/test.sh: line 22: 12926 Aborted
Test harness error
```
While starting to debug I saw the RB_COMPACT option and decided to check if
turning that off resolved the bug. It seems to have fixed it (`make check_unit`
passes and the segfaults under Chapel are gone), so it seems like an okay
work-around. I'd imagine this has performance implications for red-black trees
under pgi, but if they're not going to be used much anymore it's probably not a
big deal.
|
|
Add a configure check for __builtin_unreachable instead of basing its
availability on the __GNUC__ version. On OS X using gcc (a real gcc, not the
bundled version that's just a gcc front-end) leads to a linker assertion:
https://github.com/jemalloc/jemalloc/issues/266
It turns out that this is caused by a gcc bug resulting from the use of
__builtin_unreachable():
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438
To work around this bug, check that __builtin_unreachable() actually works at
configure time, and if it doesn't, use abort() instead. The check is based on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438#c21.
With this `make check` passes with a homebrew installed gcc-5 and gcc-6.
|
|
Bug: http://b/31496165
Change-Id: Ia7acd464fd8dcc9cb88e61b4f4d369db49f5de76
Test: mma
|
|
Change the decay time from zero to one second, to avoid the purge
interfering with allocations and making them take too long.
Also, decrease the arenas from 2 to 1 for the svelte config to offset
the PSS increase caused by the decay time change.
Bug: 30077848
(cherry picked from commit 08795324eae5f68d211dc5483746af51203dc661)
Change-Id: I8389ab5af322b6e40e6afd86193c6de8a738421b
|
|
Bug: 28860984
Change-Id: If12daed270ec0a85cd151aaaa432d178c8389757
|
|
Bug: 28860984
Change-Id: I9eaf67f53f9872177d068d660d1e051cecdc82a0
|
|
In the case where prof_alloc_prep() is called with an over-estimate of
allocation size, and sampling doesn't end up being triggered, the tctx
must be discarded.
|