path: root/src
2018-09-12  Build jemalloc for host, too.  (Christopher Ferris)
We build both static and dynamic libjemalloc and libjemalloc_jet for the host. For the target, only the static libraries are built, just as before.
Test: 'mmma external/jemalloc', verify that it builds the same target libraries, verify it builds host libraries and they can be used to build ckati.
Change-Id: I588b84fba7785d31dc5200a805a43b34bb0723d1
2018-08-22  bionic provides PR_SET_VMA now.  (Elliott Hughes) [tag: android-o-mr1-iot-release-1.0.4]
Bug: N/A
Test: builds
Change-Id: Idd5c63d26cdba56bdfd8080749d1320d481ef3fb
2017-03-01  Fix/refactor tcaches synchronization.  (Jason Evans) [tag: android-n-mr2-preview-2]
Synchronize tcaches with tcaches_mtx rather than ctl_mtx. Add missing synchronization for tcache flushing. This bug was introduced by 1cb181ed632e7573fb4eab194e4d216867222d27 (Implement explicit tcache support.), which was first released in 4.0.0.
(cherry picked from commit 3ecc3c84862ef3e66b20be8213b0301c06c692cc)
Bug: 35867477
Test: Booted angler using normal config and svelte config.
Test: Ran bionic unit tests/jemalloc tests.
Test: Ran the art ThreadStress tests on the normal config/svelte config.
Change-Id: I8de25e7eae8f0b055febafee1b5e4d170371bbcd
2016-12-12  Update to jemalloc 4.4.0.  (Christopher Ferris)
Merge remote-tracking branch 'aosp/upstream-new' into upgrade. Includes regenerating the necessary files.
Bug: 33321361
Test: Built for x86_64/arm64. Built the no-tcache and tcache-enabled configs (nexus 9/nexus 6p). Compared before and after runs of all dumps through memory replay and verified no unexpected increases. Ran bionic unit tests in both configs, 32-bit and 64-bit variants. Ran the jemalloc unit tests in both configs.
Change-Id: I2e8f3305cd1717c7efced69718fff90797f21068
2016-12-03  Add --disable-syscall.  (Jason Evans)
This resolves #517.
2016-12-03  Fix pages_purge() when using MADV_DONTNEED.  (Jason Evans)
This fixes a regression caused by e98a620c59ac20b13e2de796164cc67f050ed2bf (Mark partially purged arena chunks as non-hugepage.).
2016-11-24  Mark partially purged arena chunks as non-hugepage.  (Jason Evans)
Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls. The first time a page run is purged from within an arena chunk, call pages_nohuge() to tell the kernel to make no further attempts to back the chunk with huge pages. Upon arena chunk deletion, restore the associated virtual memory to its original state via pages_huge(). This resolves #243.
2016-11-17  Add pthread_atfork(3) feature test.  (Jason Evans)
Some versions of Android provide a pthreads library without providing pthread_atfork(), so in practice a separate feature test is necessary for the latter.
2016-11-17  Refactor madvise(2) configuration.  (Jason Evans)
Add feature tests for the MADV_FREE and MADV_DONTNEED flags to madvise(2), so that MADV_FREE is detected and used for Linux kernel versions 4.5 and newer. Refactor pages_purge() so that on systems which support both flags, MADV_FREE is preferred over MADV_DONTNEED. This resolves #387.
2016-11-16  Avoid gcc tautological-compare warnings.  (Jason Evans)
2016-11-16  Avoid gcc type-limits warnings.  (Jason Evans)
2016-11-15  Fix an MSVC compiler warning.  (Jason Evans)
2016-11-15  Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *).  (Jason Evans)
This avoids warnings in some cases, and is otherwise generally good hygiene.
2016-11-15  Consistently use size_t rather than uint64_t for extent serial numbers.  (Jason Evans)
2016-11-15  Add extent serial numbers.  (Jason Evans)
Add extent serial numbers and use them where appropriate as a sort key that is higher priority than address, so that the allocation policy prefers older extents. This resolves #147.
2016-11-11  Simplify extent_quantize().  (Jason Evans)
2cdf07aba971d1e21edc203e7d4073b6ce8e72b9 (Fix extent_quantize() to handle greater-than-huge-size extents.) solved a non-problem; the expression passed in to index2size() was never too large. However the expression could in principle underflow, so fix the actual (latent) bug and remove unnecessary complexity.
2016-11-11  Fix/simplify chunk_recycle() allocation size computations.  (Jason Evans)
Remove outer CHUNK_CEILING(s2u(...)) from alloc_size computation, since s2u() may overflow (and return 0), and CHUNK_CEILING() is only needed around the alignment portion of the computation. This fixes a regression caused by 5707d6f952c71baa2f19102479859012982ac821 (Quantize szad trees by size class.) and first released in 4.0.0. This resolves #497.
2016-11-11  Fix extent_quantize() to handle greater-than-huge-size extents.  (Jason Evans)
Allocation requests can't directly create extents that exceed HUGE_MAXCLASS, but extent merging can create them. This fixes a regression caused by 8a03cf039cd06f9fa6972711195055d865673966 (Implement cache index randomization for large allocations.) and first released in 4.0.0. This resolves #497.
2016-11-10  Merge remote-tracking branch 'aosp/upstream-new' into fix  (Christopher Ferris)
Included in this change are all of the updated generated files.
Bug: 32673024
Test: Built the angler build (normal config) and the volantis build (svelte config). Ran memory_replay 32 bit and 64 bit on both platforms before and after and verified results are similar. Ran bionic unit tests and jemalloc unit tests. Verified that two jemalloc unit test failures are due to Android extension that puts all large chunks on arena 0.
Change-Id: I12428bdbe15f51383489c9a1d72d687499fff01b
2016-11-07  Refactor prng to not use 64-bit atomics on 32-bit platforms.  (Jason Evans)
This resolves #495.
2016-11-07  Fix run leak.  (Jason Evans)
Fix arena_run_first_best_fit() to search all potentially non-empty runs_avail heaps, rather than ignoring the heap that contains runs larger than large_maxclass, but less than chunksize. This fixes a regression caused by f193fd80cf1f99bce2bc9f5f4a8b149219965da2 (Refactor runs_avail.). This resolves #493.
2016-11-04  Fix arena data structure size calculation.  (Jason Evans)
Fix paren placement so that QUANTUM_CEILING() applies to the correct portion of the expression that computes how much memory to base_alloc(). In practice this bug had no impact. This was caused by 5d8db15db91c85d47b343cfc07fc6ea736f0de48 (Simplify run quantization.), which in turn fixed an over-allocation regression caused by 3c4d92e82a31f652a7c77ca937a02d0185085b06 (Add per size class huge allocation statistics.).
2016-11-03  Fix large allocation to search optimal size class heap.  (Jason Evans)
Fix arena_run_alloc_large_helper() to not convert size to usize when searching for the first best fit via arena_run_first_best_fit(). This allows the search to consider the optimal quantized size class, so that e.g. allocating and deallocating 40 KiB in a tight loop can reuse the same memory. This regression was nominally caused by 5707d6f952c71baa2f19102479859012982ac821 (Quantize szad trees by size class.), but it did not commonly cause problems until 8a03cf039cd06f9fa6972711195055d865673966 (Implement cache index randomization for large allocations.). These regressions were first released in 4.0.0. This resolves #487.
2016-11-03  Fix chunk_alloc_cache() to support decommitted allocation.  (Jason Evans)
Fix chunk_alloc_cache() to support decommitted allocation, and use this ability in arena_chunk_alloc_internal() and arena_stash_dirty(), so that chunks don't get permanently stuck in a hybrid state. This resolves #487.
2016-11-02  Check for existence of CPU_COUNT macro before using it.  (Dave Watson)
This resolves #485.
2016-11-02  Do not use syscall(2) on OS X 10.12 (deprecated).  (Jason Evans)
2016-11-02  Add os_unfair_lock support.  (Jason Evans)
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended replacement.
2016-11-02  Fix/refactor zone allocator integration code.  (Jason Evans)
Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes, since OS X 10.12 cannot tolerate a child unlocking mutexes that were locked by its parent. Refactor; this was a side effect of experimenting with zone {de,re}registration during fork(2).
2016-11-01  Add "J" (JSON) support to malloc_stats_print().  (Jason Evans)
This resolves #474.
2016-10-29  Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW.  (Jason Evans)
The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC), whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but still has resolution (~1ms) that is adequate for our purposes. This resolves #479.
2016-10-29  Use syscall(2) rather than {open,read,close}(2) during boot.  (Jason Evans)
Some applications wrap various system calls, and if they call the allocator in their wrappers, unexpected reentry can result. This is not a general solution (many other syscalls are spread throughout the code), but this resolves a bootstrapping issue that is apparently common. This resolves #443.
2016-10-29  Do not mark malloc_conf as weak on Windows.  (Jason Evans)
This works around malloc_conf not being properly initialized by at least the cygwin toolchain. Prior build system changes to use -Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to work properly as a non-weak symbol (not tested).
2016-10-28  Do not mark malloc_conf as weak for unit tests.  (Jason Evans)
This is generally correct (no need for weak symbols since no jemalloc library is involved in the link phase), and avoids linking problems (apparently uninitialized non-NULL malloc_conf) when using cygwin with gcc.
2016-10-28  Support static linking of jemalloc with glibc.  (Dave Watson)
glibc defines its malloc implementation with several weak and strong symbols:
  strong_alias (__libc_calloc, __calloc)
  weak_alias (__libc_calloc, calloc)
  strong_alias (__libc_free, __cfree)
  weak_alias (__libc_free, cfree)
  strong_alias (__libc_free, __free)
  strong_alias (__libc_free, free)
  strong_alias (__libc_malloc, __malloc)
  strong_alias (__libc_malloc, malloc)
The issue is not with the weak symbols, but that other parts of glibc depend on __libc_malloc explicitly. Defining them in terms of jemalloc APIs allows the linker to drop glibc's malloc.o completely from the link, and static linking no longer results in symbol collisions.
Another wrinkle: jemalloc during initialization calls sysconf to get the number of CPUs. glibc allocates for the first time before setting up isspace (and other related) tables, which are used by sysconf. Instead, use the pthread API to get the number of CPUs with glibc, which seems to work.
This resolves #442.
2016-10-28  Fix over-sized allocation of rtree leaf nodes.  (Jason Evans)
Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).
2016-10-21  Do not (recursively) allocate within tsd_fetch().  (Jason Evans)
Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion. This resolves #458.
2016-10-13  Make dss operations lockless.  (Jason Evans)
Rather than protecting dss operations with a mutex, use atomic operations. This has negligible impact on synchronization overhead during typical dss allocation, but is a substantial improvement for chunk_in_dss() and the newly added chunk_dss_mergeable(), which can be called multiple times during chunk deallocations. This change also has the advantage of avoiding tsd in deallocation paths associated with purging, which resolves potential deadlocks during thread exit due to attempted tsd resurrection. This resolves #425.
2016-10-13  Add/use adaptive spinning.  (Jason Evans)
Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning. Adaptively spin during busy waits in bootstrapping and rtree node initialization.
2016-10-12  Disallow 0x5a junk filling when running in Valgrind.  (Jason Evans)
Explicitly disallow junk:true and junk:free runtime settings when running in Valgrind, since deallocation-time junk filling and redzone validation cause false positive Valgrind reports. This resolves #470.
2016-10-11  Fix and simplify decay-based purging.  (Jason Evans)
Simplify decay-based purging attempts to only be triggered when the epoch is advanced, rather than every time purgeable memory increases. In a correctly functioning system (not previously the case; see below), this only causes a behavior difference if during subsequent purge attempts the least recently used (LRU) purgeable memory extent is initially too large to be purged, but that memory is reused between attempts and one or more of the next LRU purgeable memory extents are small enough to be purged. In practice this is an arbitrary behavior change that is within the set of acceptable behaviors.
As for the purging fix, assure that arena->decay.ndirty is recorded *after* the epoch advance and associated purging occurs. Prior to this fix, it was possible for purging during epoch advance to cause a substantially underrepresentative (arena->ndirty - arena->decay.ndirty), i.e. the number of dirty pages attributed to the current epoch was too low, and a series of unintended purges could result. This fix is also relevant in the context of the simplification described above, but the bug's impact would be limited to over-purging at epoch advances.
2016-10-10  Do not advance decay epoch when time goes backwards.  (Jason Evans)
Instead, move the epoch backward in time. Additionally, add nstime_monotonic() and use it in debug builds to assert that time only goes backward if nstime_update() is using a non-monotonic time source.
2016-10-10  Refactor arena->decay_* into arena->decay.* (arena_decay_t).  (Jason Evans)
2016-10-10  Refine nstime_update().  (Jason Evans)
Add missing #include <time.h>. The critical time facilities appear to have been transitively included via unistd.h and sys/time.h, but in principle this omission was capable of having caused clock_gettime(CLOCK_MONOTONIC, ...) to have been overlooked in favor of gettimeofday(), which in turn could cause spurious non-monotonic time updates.
Refactor nstime_get() out of nstime_update() and add configure tests for all variants. Add CLOCK_MONOTONIC_RAW support (Linux-specific) and mach_absolute_time() support (OS X-specific).
Do not fall back to clock_gettime(CLOCK_REALTIME, ...). This was a fragile Linux-specific workaround, which we're unlikely to use at all now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we have no choice besides non-monotonic clocks, gettimeofday() is only incrementally worse.
2016-10-06  Simplify run quantization.  (Jason Evans)
2016-10-04  Refactor runs_avail.  (Jason Evans)
Use pszind_t size classes rather than szind_t size classes, and always reserve space for NPSIZES elements. This removes unused heaps that are not multiples of the page size, and adds (currently) unused heaps for all huge size classes, with the immediate benefit that the size of arena_t allocations is constant (no longer dependent on chunk size).
2016-10-04  Implement pz2ind(), pind2sz(), and psz2u().  (Jason Evans)
These compute size classes and indices similarly to size2index(), index2size() and s2u(), respectively, but using the subset of size classes that are multiples of the page size. Note that pszind_t and szind_t are not interchangeable.
2016-10-04  Use TSDN_NULL rather than NULL as appropriate.  (Jason Evans)
2016-09-26  Close file descriptor after reading "/proc/sys/vm/overcommit_memory".  (Jason Evans)
This bug was introduced by c2f970c32b527660a33fa513a76d913c812dcf7c (Modify pages_map() to support mapping uncommitted virtual memory.). This resolves #399.
2016-09-26  Formatting fixes.  (Jason Evans)
2016-09-26  Change how the default zone is found.  (Mike Hommey)
On OS X 10.12, malloc_default_zone returns a special zone that is not present in the list of registered zones. That zone uses a "lite zone" if one is present (apparently enabled when malloc stack logging is enabled), or the first registered zone otherwise. In practice this means unless malloc stack logging is enabled, the first registered zone is the default. So get the list of zones to get the first one, instead of relying on malloc_default_zone.