|
For protected FW sections avoid trying to grab a 2MB page
but continue to use small pages with a tight size.
If an allocation fails do not fail the whole device initialization,
but just treat this case similar to not finding a protected allocator -
remove the allocator reference from the device and continue.
Bug 264977054
Commit-Topic: R43P0_KMD
Change-Id: I024503ef833eb01d2e36e3075e39aea30d891a80
Signed-off-by: Debarshi Dutta <debarshid@google.com>
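
The fallback described in this commit can be sketched in userspace C. This is a minimal model, not the driver's code: `struct device`, `struct protected_allocator`, `pma_alloc_small_pages` and `device_init_protected_fw` are hypothetical names, and `calloc` stands in for the protected memory allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SMALL_PAGE_SIZE 4096u

/* Hypothetical stand-in for the protected memory allocator. */
struct protected_allocator {
    int fail_allocations; /* test knob: force every allocation to fail */
};

/* Hypothetical stand-in for the device object. */
struct device {
    struct protected_allocator *pma; /* NULL means "no protected allocator" */
    void *protected_fw;              /* backing for protected FW sections */
    size_t protected_fw_pages;
};

/* Allocate just enough small pages to cover `bytes`; no 2MB rounding. */
void *pma_alloc_small_pages(struct protected_allocator *pma, size_t bytes,
                            size_t *pages_out)
{
    size_t pages = (bytes + SMALL_PAGE_SIZE - 1) / SMALL_PAGE_SIZE;

    assert(pma != NULL && pages_out != NULL);
    if (pma->fail_allocations)
        return NULL;
    *pages_out = pages;
    return calloc(pages, SMALL_PAGE_SIZE);
}

/*
 * Always returns 0: a failed protected allocation degrades the device to
 * "no protected allocator" instead of aborting initialization.
 */
int device_init_protected_fw(struct device *dev, size_t fw_bytes)
{
    if (dev->pma == NULL)
        return 0; /* nothing to do without an allocator */

    dev->protected_fw = pma_alloc_small_pages(dev->pma, fw_bytes,
                                              &dev->protected_fw_pages);
    if (dev->protected_fw == NULL) {
        /* Same outcome as never finding an allocator: drop the reference. */
        dev->pma = NULL;
        dev->protected_fw_pages = 0;
    }
    return 0;
}
```

With a 5000-byte section the sketch requests two 4 KiB pages rather than a 2MB page; with a failing allocator the device ends up with `pma == NULL` and initialization still succeeds.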
|
|
Merge DDK version R43P0 from upstream branch
Provenance: 48a9c7e25986318c8475bc245de51e7bec2606e8 (ipdelivery/EAC/v_r43p0)
VX504X08X-BU-00000-r43p0-01eac0 - Valhall Android DDK
VX504X08X-BU-60000-r43p0-01eac0 - Valhall Android Document Bundle
VX504X08X-DC-11001-r43p0-01eac0 - Valhall Android DDK Software Errata
VX504X08X-SW-99006-r43p0-01eac0 - Valhall Android Renderscript AOSP parts
Bug 278174418
Commit-Topic: R43P0_KMD
Signed-off-by: Debarshi Dutta <debarshid@google.com>
Change-Id: I84fb19e7ce5f28e735d44a4993d51bd985aac80b
|
|
Remove differentiation between kernel thread and ioctl triggered
allocations - if the owner has a kill signal pending, stop requesting
pages.
Bug: 265224675
Change-Id: I70acfc9f3e6dc07dc040c456f11e3ddac5d49494
|
|
SBMerger: 526756187
Change-Id: I3cbd7d81818ce93bc2ab9d95bc2cc3dd8d2aaa61
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
Bug: 276704984
Change-Id: Id86861197e8f0929b3594fa28d21b8e3b6bee0f9
Signed-off-by: Varad Gautam <varadgautam@google.com>
|
|
SBMerger: 526756187
Change-Id: I2aef3b329e47c52ef205c6849552ab82feab7675
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
This commit fixes a race condition in kbase_mmu_page_fault_worker when
a memory pool is required to grow. It addresses a potential racing
window where the worker is dealing with a given region's growable
pages on fault recovery yet the application side triggers a buffer
close on the specific region.
Change-Id: I25234396defd874ade30cf5075ed918e1142d96c
Bug: 287629203
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5549
(cherry picked from commit 221aa13af3d02f6b820adba0f50db7d203c41ba6)
|
|
If reset failed, both KMD and the hardware are in an unrecoverable
state. Any future attempts to process work or reset the GPU will fail,
and it may take a long time (30mins) for the device to reboot and return
to normal.
Collect a system ramdump and reboot the device immediately when reset
fails.
Bug: 276855700
Test: Simulated failed reset and checked that a ramdump was generated.
Change-Id: Iba901e1654d150b834303e0caa8fba2dc468b5ac
Signed-off-by: Varad Gautam <varadgautam@google.com>
|
|
SBMerger: 526756187
Change-Id: Iddb56c0a11fefedd9d44d653ddf327d075e4d919
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
Commit 0935897 (pa/1761483) added two additional katom flags, but
updates to these new flags were not protected by hwaccess_lock, and
could thus race with other updates and ultimately corrupt atom_flags.
Bug: 265931966
Test: SST soak test
Change-Id: I95acc5e335d8013394b11149abf5d9b793648c6f
|
|
GPUCORE-35974: Add Memory Barrier between CS_REQ/ACK and CSG_DB_REQ/ACK
The access to GLB_DB_REQ/ACK needs to be ordered with respect to
CSG_REQ/ACK and CSG_DB_REQ/ACK to avoid a scenario where a CSI
request overlaps with a CSG request or 2 CSI requests overlap and
FW ends up missing the 2nd request. Memory barrier is required,
both on Host and FW side, to guarantee the ordering.
Bug: 286056062
Test: SST soak test
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/4688
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5435
Change-Id: I4de23e3f37b81749c6d668952b4f8dd21c669fea
|
|
SBMerger: 526756187
Change-Id: I78a4e882d943b157a365612055ea922088ca2bff
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
SBMerger: 526756187
Change-Id: Ibe152c3a5f6bde3b32b1349e33175811bc895c38
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
Rename kutf to mali_kutf. Enable mali_kutf and
mali_kutf_clk_rate_trace_test_portal.
Bug: 267758398
Test: insmod
Change-Id: I36fecd89bce4f87d31d452f5a913c95c22513c53
Signed-off-by: Yunju Lee <yunjulee@google.com>
|
|
During an invalid GPU page fault, kbase will try to flush the GPU cache
and disable the faulting address space (AS). There is a small window
between flushing of the GPU L2 cache (MMU resumes) and when the AS is
disabled where existing jobs on the GPU may access memory for that AS,
dirtying the GPU cache.
This is a problem as the kctx->as_nr is marked as KBASEP_AS_NR_INVALID
and thus no cache maintenance will be performed on the AS of the faulty
context when cleaning up the csg_slot and releasing the context.
This patch addresses that issue by:
1. locking the AS via a GPU command
2. flushing the cache
3. disabling the AS
4. unlocking the AS
This ensures that any jobs remaining on the GPU will not be able to
access the memory due to the locked AS. Once the AS is unlocked, any
memory access will fail as the AS is now disabled.
The issue only happens on CSF GPUs. To avoid any issues, the code path
for non-CSF GPUs is left undisturbed.
(cherry picked from commit 566789dffda3dfec00ecf00f9819e7a515fb2c61)
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5071
Bug: 274014055
Change-Id: I2028182878b4f88505cc135a5f53ae4c7e734650
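
The four-step sequence in this commit can be illustrated with a toy model of one address space. Everything here is an assumption made for illustration (`struct as_model`, `job_access`, the two handlers); it only shows why locking around the flush closes the window, and says nothing about the real MMU commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one GPU address space (AS); names are illustrative only. */
struct as_model {
    bool enabled;     /* translations are valid */
    bool locked;      /* MMU lock held: accesses do not land */
    bool cache_dirty; /* the L2 holds dirty lines for this AS */
};

/* A job still on the GPU touches memory; returns true if the access landed. */
bool job_access(struct as_model *as)
{
    assert(as != NULL);
    if (as->locked || !as->enabled)
        return false;
    as->cache_dirty = true;
    return true;
}

void as_lock(struct as_model *as)        { as->locked = true; }
void as_flush_cache(struct as_model *as) { as->cache_dirty = false; }
void as_disable(struct as_model *as)     { as->enabled = false; }
void as_unlock(struct as_model *as)      { as->locked = false; }

/* Old order: a racing access between flush and disable dirties the cache. */
void handle_fault_old(struct as_model *as)
{
    as_flush_cache(as);
    (void)job_access(as); /* window: AS still enabled and unlocked */
    as_disable(as);
}

/* Order from the commit: lock, flush, disable, unlock. */
void handle_fault_new(struct as_model *as)
{
    as_lock(as);
    as_flush_cache(as);
    (void)job_access(as); /* same racing access, now blocked by the lock */
    as_disable(as);
    as_unlock(as);
    (void)job_access(as); /* AS is disabled, so this cannot land either */
}
```

In the model, the old order leaves `cache_dirty` set after the AS is disabled, while the new order leaves the cache clean.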
|
|
This patch addresses the deadlock condition due to a circular locking
dependency between hwaccess_lock and clk_rtm->lock. hwaccess_lock needs
to be taken before clk_rtm->lock to avoid the circular dependency.
Change-Id: I1064dbbac7800282bf3a1ac167c9c476177aefd8
(cherry picked from commit e0dfe9669c3456ada4b860f6ba9859c59ffec9a7)
Bug: 274687461
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5258
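
One way to picture the ordering rule is a small rank checker, loosely in the spirit of lock-order validation. This is a sketch under assumptions: the ranks, `struct lock_tracker` and its functions are hypothetical, and no real locking is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Lower rank must be taken first, mirroring the commit's rule that
 * hwaccess_lock is taken before clk_rtm->lock. Ranks are illustrative. */
enum lock_rank { RANK_HWACCESS = 1, RANK_CLK_RTM = 2 };

#define MAX_HELD 8

struct lock_tracker {
    int held[MAX_HELD]; /* ranks of currently held locks, in acquire order */
    int depth;
};

/* Returns false if taking `rank` now would invert the documented order. */
bool tracker_acquire(struct lock_tracker *t, enum lock_rank rank)
{
    if (t->depth >= MAX_HELD)
        return false;
    if (t->depth > 0 && t->held[t->depth - 1] >= (int)rank)
        return false; /* a higher (or equal) rank is already held */
    t->held[t->depth++] = (int)rank;
    return true;
}

/* Release the most recently acquired lock. */
void tracker_release(struct lock_tracker *t)
{
    assert(t->depth > 0);
    t->depth--;
}
```

Acquiring `RANK_HWACCESS` then `RANK_CLK_RTM` is accepted; trying `RANK_HWACCESS` while `RANK_CLK_RTM` is held is rejected, which is the inversion that produces the circular dependency.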
|
|
If kbase_release is called while jobs are in progress, the driver
will start by calling kbasep_platform_context_term before waiting
for jobs to finish in kbase_context_flush_jobs. When the jobs do
finish, the driver will call kbasep_platform_event_work_end, which
leads to issues since the platform callback has already cleaned
up resources for the kbase_context.
Make sure kbase_context_flush_jobs is called before
kbasep_platform_context_term.
Test: start/stop processes over and over
Bug: 278366794
Change-Id: Iee0297f4b64a3f6b59a5df0c26e46d446257a652
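
The teardown ordering can be modelled as follows. This is an illustrative sketch, not the kbase code: the structures and functions are simplified stand-ins, and a NULL check makes the "callback after cleanup" case observable where the real driver would touch freed resources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-context platform resources. */
struct platform_data {
    int work_end_calls;
};

struct context {
    struct platform_data *pd; /* freed by platform_context_term() */
    int jobs_in_flight;
};

/* Runs when a job finishes; returns -1 if platform data is already gone. */
int platform_event_work_end(struct context *ctx)
{
    if (ctx->pd == NULL)
        return -1;
    ctx->pd->work_end_calls++;
    return 0;
}

/* Drain all jobs; each completion fires the platform callback.
 * Returns the number of callbacks that ran after cleanup. */
int context_flush_jobs(struct context *ctx)
{
    int failures = 0;

    while (ctx->jobs_in_flight > 0) {
        ctx->jobs_in_flight--;
        if (platform_event_work_end(ctx) != 0)
            failures++;
    }
    return failures;
}

void platform_context_term(struct context *ctx)
{
    free(ctx->pd);
    ctx->pd = NULL;
}

/* Release in the order the commit asks for: drain jobs, then terminate. */
int context_release(struct context *ctx)
{
    int failures;

    assert(ctx != NULL);
    failures = context_flush_jobs(ctx);
    platform_context_term(ctx);
    return failures;
}
```

Calling `platform_context_term` before `context_flush_jobs` makes every remaining job completion hit the failure path, which is the situation the commit removes.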
|
|
silent reset on GPU power up
Commands for GPU cache maintenance and TLB invalidation were sent after
acquiring 'hwaccess_lock' and checking if the 'gpu_powered' flag is set.
The combination of lock and the flag ensured that GPU registers remained
accessible whilst the commands were in progress. If the flag was not set
then the GPU power up was not performed and the commands were rightfully
skipped.
The 'gpu_powered' flag is set immediately after the Top-level power up
of GPU is done by the platform specific power_on_callback() and so the
registers can be safely accessed. If the callback returns 1 then a
silent soft-reset of the GPU is performed after setting the flag.
This led to a race between the cache maintenance commands and the soft
reset of GPU, due to which the commands did not complete or got lost
and there was a timeout.
This commit replaces the 'gpu_powered' flag with the 'gpu_ready' flag
as the latter is set after the soft-reset is done and all the in-use
GPU address spaces have been enabled. It is okay to skip the commands
when the flag is false, as L2 cache would be in powered down state.
The page migrate function is also updated to use 'gpu_ready' flag as
that was also affected by the similar race with silent reset in
GPUCORE-35861 and 'kbdev->pm.lock' had to be used.
Change-Id: I4cefe3add2863d7b29f111d437061031b66e7080
(cherry picked from commit e31494f5b7b9e9101aab4bd75fa4dc7d7f47b66a)
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5284
Bug: 281540759
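
A small state model may help show why gating on 'gpu_ready' avoids the race. The flag names come from the commit; `struct gpu_pm`, the `issue_cache_cmd_*` functions and the power-up helpers are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct gpu_pm {
    bool gpu_powered;       /* set right after top-level power-up */
    bool gpu_ready;         /* set only after soft reset and AS enable */
    bool reset_in_progress; /* silent soft-reset currently running */
};

enum cmd_result { CMD_SKIPPED, CMD_DONE, CMD_LOST };

/* Old gating: keyed on gpu_powered, so a command can collide with a reset. */
enum cmd_result issue_cache_cmd_old(const struct gpu_pm *pm)
{
    assert(pm != NULL);
    if (!pm->gpu_powered)
        return CMD_SKIPPED;
    return pm->reset_in_progress ? CMD_LOST : CMD_DONE;
}

/* New gating: keyed on gpu_ready, which is only set once reset is over. */
enum cmd_result issue_cache_cmd_new(const struct gpu_pm *pm)
{
    assert(pm != NULL);
    if (!pm->gpu_ready)
        return CMD_SKIPPED;
    return pm->reset_in_progress ? CMD_LOST : CMD_DONE;
}

/* Power on; optionally start the silent soft-reset (callback returned 1). */
void power_up_begin(struct gpu_pm *pm, bool needs_silent_reset)
{
    pm->gpu_powered = true;
    pm->reset_in_progress = needs_silent_reset;
}

/* Reset finished and in-use address spaces enabled. */
void power_up_finish(struct gpu_pm *pm)
{
    pm->reset_in_progress = false;
    pm->gpu_ready = true;
}
```

During the silent reset the old check lets the command through and it is lost, while the new check skips it; once power-up finishes the new check issues the command normally.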
|
|
Bug: 281607159
SBMerger: 526756187
Change-Id: I15bc929d24d73e636f12cf880126e8192ba7d9cb
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
The GPU internal counter is updated every IPA_CONTROL_TIMER_DEFAULT_VALUE_MS milliseconds.
If a utilization update occurs prematurely and
the counter has not been updated, the same counter
value will be obtained, resulting in a difference
of zero.
To handle this scenario, this change will skip the
utilization update if the counter difference is zero
and the update occurred less than 1.5 times the internal
update period (IPA_CONTROL_TIMER_DEFAULT_VALUE_MS).
Bug: 277649158
Test: boot and trace
Change-Id: I6e3063355f560a2872297fd32d66be8a468cdf79
Signed-off-by: Wei Wang <wvw@google.com>
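
The skip condition can be written as a single predicate. This is a sketch: the function name is hypothetical and `COUNTER_PERIOD_MS` is a placeholder, since the commit does not state the value of IPA_CONTROL_TIMER_DEFAULT_VALUE_MS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for IPA_CONTROL_TIMER_DEFAULT_VALUE_MS; the real value is
 * defined by the driver and is not given in the commit message. */
#define COUNTER_PERIOD_MS 10u

/*
 * Skip the update when the counter did not move and less than 1.5 counter
 * periods have elapsed. The comparison stays in integers:
 * elapsed < 1.5 * period  <=>  2 * elapsed < 3 * period.
 */
bool should_skip_utilization_update(uint64_t counter_diff, uint64_t elapsed_ms)
{
    assert(elapsed_ms <= UINT64_MAX / 2); /* keep 2 * elapsed_ms in range */
    return counter_diff == 0 &&
           2 * elapsed_ms < 3 * (uint64_t)COUNTER_PERIOD_MS;
}
```

With the 10 ms placeholder, a zero difference after 14 ms is skipped, a zero difference after 15 ms is treated as a genuine idle sample, and any non-zero difference is always processed.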
|
|
The WARN() call at the beginning of the function schedule_on_tick()
is incorrect as kbase_gpu_interrupt() might enqueue another
tick_work(2) into the scheduler before the already inflight worker
tick_work(1) sets the tick_timer_active variable to true.
This could result in a condition where the hrtimer still hasn't expired
and tick_work(1) starts executing, resulting in the WARN_ON() being fired.
The timer works asynchronously with the tick_work() and hence this warning
can be removed from here.
Bug 207824944
Change-Id: I873624c76b0de102bbcdd451a8402cb1c096edda
|
|
Aliased regions containing the BASE_MEM_WRITE_ALLOC_PAGES_HANDLE MMU
sink-page were not previously being unmapped correctly, in
particular the PGD entries for these pages. This change addresses
that issue. Further, care is taken to ensure the flush_pa_range
path operates correctly, for applicable GPUs.
Also updated various WARN_ONs to WARN_ONCEs in MMU layer, in places
where these could potentially occur in large numbers, rapidly -
thereby helping to reduce the chances of system stress in future,
as could potentially have been caused by this particular issue.
GPUCORE-36048 Remove SAME_VA flag from regular allocation
This patchset removes the SAME_VA flag from the regular allocation
done in the defect test for GPUCORE-35611. The test was failing on
32-bit systems because there was no way to enforce that the aliased
memory and the regular allocation would fall into the same region,
and thus a later assumption in the test would not hold.
Change-Id: Ie665fb9330a7338b7e148d1c1db13fe3cc98ee5c
(cherry picked from commit 823c7b2de1933ca42cf179862d033d79d1289073)
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/4800
Bug: 260122837
|
|
alloc from kthread
The backing pages for native GPU allocations aren't always allocated in
the ioctl context. A JIT_ALLOC softjob or KCPU command can get processed
in the kernel worker thread. GPU page fault handling is anyways done in
a kernel thread.
Userspace can make Kbase allocate a large number of backing pages from the
kernel thread to cause an out-of-memory situation, which would eventually
lead to a kernel panic as the OoM killer would run out of suitable processes
to kill.
Though Kbase will account for the backing pages and OoM killer will try
to kill the culprit process, the memory already allocated by the process
won't get freed as context termination would remain blocked or won't
kick-in until kernel thread keeps trying to allocate the backing pages.
For the allocation that is done from the context of kernel thread,
OoM killer won't consider the kernel thread for killing and kernel
would keep retrying to allocate physical page as long as the OoM
killer is able to kill processes.
For the memory allocation done from the ioctl context, kernel would
eventually stop retrying when it sees that process has been marked
for killing by the OoM killer.
This commit adds a check for process exit in the page allocation loop.
The check allows kernel thread to swiftly exit the page allocation loop
once OoM killer has initiated the killing of culprit process (for which
kernel thread is trying to allocate pages) thereby unblocking context
termination and freeing of GPU memory already allocated by the process.
This helps in preventing the kernel panic and also limits the number of
innocent processes that get killed.
The use of __GFP_RETRY_MAYFAIL flag didn't help in all the scenarios.
The flag ensures that OoM killer is not invoked directly and kernel
doesn't keep retrying to allocate the page. But when system is running
low on memory, other threads can invoke the OoM killer and the page
allocation request from kthread could continue to get satisfied due to
the killing of other processes and so the kthread may not always timely
exit the page allocation loop.
(cherry picked from commit 3c5c9328a7fc552e61972c1bbff4b56696682d30)
GPUCORE-36402: Fix potential memleak and NULL ptr deref issue in Kbase
The commit 3c5c9328a7fc552e61972c1bbff4b56696682d30 updated Kbase to
check for the process exit in every iteration of the page allocation
loop when the allocation is done from the context of kernel worker
thread. The commit introduced a potential memleak and NULL pointer
dereference issue (which was reported by Coverity).
This commit adds the required fix for the 2 issues and also sets the
task pointer only for the Userspace created contexts and not for the
contexts created by Kbase i.e. privileged context created for the HW
counter dumping and for the WA of HW issue TRYM-3485.
Bug: 275614526
Change-Id: I8107edce09a2cb52d8586fc9f7990a25166f590e
Signed-off-by: Guus Sliepen <gsliepen@google.com>
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5169
(cherry picked from commit 8294169160ebb0d11d7d22b11311ddf887fb0b63)
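
The per-iteration exit check can be sketched in userspace. This is a model under stated assumptions: `struct owner` stands in for the owning process, its `exiting` flag stands in for the kernel reporting a pending kill, `malloc` stands in for the page allocator, and rolling back partial allocations on abort is an illustrative choice rather than a description of what Kbase does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES 4096u

/* Hypothetical stand-in for the process that owns the GPU context. */
struct owner {
    bool exiting;      /* set once the owner has been marked for killing */
    size_t exit_after; /* test knob: flip `exiting` at this page index */
};

/*
 * Fill pages[0..n). Returns n on success. If the owner starts exiting or an
 * allocation fails, every page obtained so far is freed and 0 is returned.
 */
size_t alloc_backing_pages(struct owner *o, void **pages, size_t n)
{
    size_t i;

    assert(o != NULL && pages != NULL);
    for (i = 0; i < n; i++) {
        if (i == o->exit_after)
            o->exiting = true; /* simulate the kill arriving mid-loop */
        if (o->exiting)
            break;             /* checked on every iteration */
        pages[i] = malloc(PAGE_BYTES);
        if (pages[i] == NULL)
            break;
    }
    if (i < n) {
        /* Roll back so nothing leaks and no stale pointers remain. */
        while (i > 0) {
            i--;
            free(pages[i]);
            pages[i] = NULL;
        }
        return 0;
    }
    return n;
}
```

The loop stops at the first iteration after the owner is marked as exiting, so the worker thread no longer keeps requesting pages on behalf of a process that is being killed.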
|
|
This reverts commit 75b4a4ab15df252b112439300203dbc9b6d46922.
Bug: 274002431
Change-Id: I7055294a6615e8ff282b47f822d67ecb709307a3
|
|
Provenance: 48a9c7e25986318c8475bc245de51e7bec2606e8 (ipdelivery/EAC/v_r43p0)
VX504X08X-BU-00000-r43p0-01eac0 - Valhall Android DDK
VX504X08X-BU-60000-r43p0-01eac0 - Valhall Android Document Bundle
VX504X08X-DC-11001-r43p0-01eac0 - Valhall Android DDK Software Errata
VX504X08X-SW-99006-r43p0-01eac0 - Valhall Android Renderscript AOSP parts
Change-Id: I5df1914eba386e0bf507d4951240e1744f666a29
|
|
Provenance: 300534375857cb2963042df7b788b1ab5616c500 (ipdelivery/EAC/v_r42p0)
VX504X08X-BU-00000-r42p0-01eac0 - Valhall Android DDK
VX504X08X-BU-60000-r42p0-01eac0 - Valhall Android Document Bundle
VX504X08X-DC-11001-r42p0-01eac0 - Valhall Android DDK Software Errata
VX504X08X-SW-99006-r42p0-01eac0 - Valhall Android Renderscript AOSP parts
Change-Id: I3b15e01574f03706574a8edaf50dae4ba16e30c0
|
|
Bug: 265007605
Test: build_slider.sh
UMD: http://ag/22336262
Change-Id: Ifc22c6b961860ad7955e974d21c2b7960fa55647
|
|
This reverts commit 04bf4049652e9aa3e952bdc30c560054e1c0f060.
Bug: 274827412
Reason for revert: stability
Change-Id: I923387539eabbf72f51376decf95526f13339656
|
|
This reverts commit d4a9cc691fdde6aae0f5d40ad3d949ab76518e42.
Bug: 274827412
Reason for revert: stability
Change-Id: Id952d2656a642b0f363d579a51843a03e7750c2c
|
|
This reverts commit 04bf4049652e9aa3e952bdc30c560054e1c0f060.
Bug: 274827412
Reason for revert: stability
Change-Id: I530dc9425d9cb52ab88e8211c789def29b7607ac
|
|
This reverts commit d4a9cc691fdde6aae0f5d40ad3d949ab76518e42.
Bug: 274827412
Reason for revert: stability
Change-Id: I929c4e7b11bd5b62a0c14a5b960b32127b26233a
|
|
unmap of tracking page
This commit introduces new checks to ensure that,
like allocations of native memory, JIT memory
allocations are blocked after the unmap of the
tracking page.
Bug: 275615867
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5168/
Change-Id: I32460df4e8898784e75084193e038a912f67b33e
(cherry picked from commit 240d4e9206528a43340c22aa69b124436f9a4e01)
|
|
Userspace can cause a memory leak for physical pages of SAME_VA allocations
through the GROUP_SUSPEND kcpu command.
This commit fixes the memleak issue.
Bug: 275620394
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5167
Change-Id: Iec155e23ea135cf1ea7592f38934dc617cc6b10e
(cherry picked from commit 1f565b867e7bff3b3307db0960fabf028f95d981)
|
|
kbase avoids flushing MMU updates on coherent systems, as these
systems are expected to snoop CPU caches instead.
This presents a problem on GS101/GS201 devices, where GPU->CPU cache
snoop requests do not work as intended when the GPU is in protected mode
(b/192236116) and the GPU ends up seeing stale memory / runs into page
faults.
As a software workaround, always flush MMU updates regardless of
coherency mode, so that the GPU page tables are accurate.
Note: This was initially added in I5473345d and reverted in I2a41a2044.
Bug: 200555454
Change-Id: I51187cd7c042bde42c4fcdf976a9f7f8828155e1
Signed-off-by: Varad Gautam <varadgautam@google.com>
|
|
This adds a trigger_uevent debugfs node that takes the uevent type and
info as write parameter and fires the corresponding uevent.
Bug: 275367216
Bug: 275367223
Test: Combined with userspace patches: b/276704984#comment2
Change-Id: Ic1e069259e5d068a4677c8d1472d74485b8a904c
Signed-off-by: Varad Gautam <varadgautam@google.com>
|
|
Add the following types of GPU uevents:
1. KMD_ERROR: Reports incidents where kbase runs into an error
(includes FW errors).
2. GPU_RESET: Reports failed or successful GPU reset incidents.
Bug: 275367216
Bug: 275367223
Test: Combined with userspace patches: b/276704984#comment2
Change-Id: Ie0d18f96c590cba561e8425eba210136bfef039d
Signed-off-by: Varad Gautam <varadgautam@google.com>
|
|
Add an interface to emit uevents with env GPU_UEVENT_TYPE and
GPU_UEVENT_INFO from kbase. This will be used to report common
GPU failure conditions.
To avoid flooding the userspace with uevents, these are ratelimited
to one uevent per GPU_UEVENT_TYPE per GPU_UEVENT_TIMEOUT_MS.
Bug: 275367216
Bug: 275367223
Test: Combined with userspace patches: b/276704984#comment2
Change-Id: I557df22c87f435aca4d05e0038609e1c9f82de54
Doc: go/pixel-gpu-instability-monitoring
Signed-off-by: Varad Gautam <varadgautam@google.com>
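
The per-type rate limit can be sketched as below. The two event types are the ones named elsewhere in this log (KMD_ERROR and GPU_RESET); `struct uevent_limiter`, `uevent_should_send` and the 1000 ms value are assumptions, since the commit does not give the value of GPU_UEVENT_TIMEOUT_MS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum gpu_uevent_type { UEVENT_KMD_ERROR, UEVENT_GPU_RESET, UEVENT_TYPE_COUNT };

/* Placeholder for GPU_UEVENT_TIMEOUT_MS; the real value is not in the log. */
#define UEVENT_TIMEOUT_MS 1000u

struct uevent_limiter {
    bool sent_before[UEVENT_TYPE_COUNT];
    uint64_t last_sent_ms[UEVENT_TYPE_COUNT];
};

/*
 * Returns true if a uevent of `type` may be emitted at time `now_ms`, and
 * records the emission. Each type is limited independently of the others.
 */
bool uevent_should_send(struct uevent_limiter *l, enum gpu_uevent_type type,
                        uint64_t now_ms)
{
    assert((int)type >= 0 && (int)type < UEVENT_TYPE_COUNT);
    if (l->sent_before[type] &&
        now_ms - l->last_sent_ms[type] < UEVENT_TIMEOUT_MS)
        return false; /* still inside this type's quiet window */
    l->sent_before[type] = true;
    l->last_sent_ms[type] = now_ms;
    return true;
}
```

A second GPU_RESET event inside the window is dropped, while a KMD_ERROR event at the same moment still goes out because its window is tracked separately.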
|
|
The physical address of a GPU bus fault is useful for debugging purposes.
However, the physical address (emitted via 0x%016llX) is also sensitive
information, so it should be cautiously exposed to user space.
Linux kernel provides the control for physical address exposure
via 'kptr_restrict'. To allow this control to work, '%pK' must be used.
Bug: 275623256
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5166
Change-Id: I23171bafc47e96045e42dad533ed28fc8bbcef6b
(cherry picked from commit d35be16de81d9bc55dc0a586d661391e1989d6c0)
|
|
android13-gs-pixel-5.10-udc" into android13-gs-pixel-5.10-udc
|
|
Init call for SLC portion of the GPU context was missing.
Bug: 276392249
Change-Id: I7f9b8a89a463f66845f5da91adca63d30f138c83
Signed-off-by: Jack Diver <diverj@google.com>
|
|
Bug: 254279889
Test: Boots to home
Signed-off-by: Jack Diver <diverj@google.com>
(cherry picked from https://partner-android-review.googlesource.com/q/commit:08be62386b8b087e1979c0396a23847246ca36bb)
Change-Id: I1427019107b67139381390a5a73bf518b99927c8
|
|
kbase_pm_update_active() may cancel an ongoing poweroff before the
poweroff has completed and enqueued a gpu_poweroff_wait_work item,
which when executed would have unblocked any waiters in
kbase_pm_wait_for_poweroff_work_complete().
kbase_pm_update_active() must therefore also call wake_up(poweroff_wait)
after resetting poweroff_wait_in_progress to false, to prevent
kbase_pm_wait_for_poweroff_work_complete() from waiting indefinitely.
This change also modifies the diagnostic patch in
kbase_pm_wait_for_poweroff_work_complete() to avoid triggering a
subsystem coredump if a gpu_poweroff_wait_work item is actually pending.
Bug: 274137481
Test: Stability soak testing
Change-Id: I9009a6eed7aa305ae04179263e308ba4259afc6a
|
|
SBMerger: 516612970
Change-Id: I2e07185ea841a4f0de9998a41ddfbef7d9e6aa8e
Signed-off-by: SecurityBot <android-nexus-securitybot@system.gserviceaccount.com>
|
|
Use mgm_resize_callback to update memory group size.
Add entry point allowing memory group size to be queried.
Bug: 264990406
Test: Boot to home
Test: gfx-bench mh3.1
Change-Id: I80f595724c7418b97e07679719d2b76e4ee7b96f
Signed-off-by: Jack Diver <diverj@google.com>
|
|
Completed atoms are expected to always have a flag indicating they were
submitted.
A warning is present to assert this fact.
Currently, if the flag is not present it will block GPU suspend.
Remove the if to unblock suspend and prevent a kernel lockup.
Bug: 233522199
Change-Id: I541ac835ec36562f7724b35e171d71537e763ed9
Signed-off-by: Jack Diver <diverj@google.com>
|
|
coredump_work isn't guaranteed to happen before reset, which means the
resulting SSCD can contain either of pre-reset or post-reset state.
post-reset state isn't helpful in debugging a GPU hang. Ensure that we
always collect the pre-reset state by flushing the coredump worker
before resetting the GPU.
Bug: 264595878
Test: Raced reset debugfs write with trigger_core_dump sysfs write to
check that the device is stable and coredump happens before reset.
Change-Id: I7a553f8dd156d5dbee2d8008a70545641ed8dbe9
Signed-off-by: Varad Gautam <varadgautam@google.com>
|
|
Add a heuristic to ratelimit SSCD generation for "GPU hang"-type
coredumps. Typically when the GPU hangs, this codepath is hit multiple
times leading to unnecessary SSCD generation per hang (sometimes > 200
coredumps for a single incident).
The heuristic skips SSCD generation depending on:
1. whether there was a "GPU hang" coredump recently within the
GPU_HANG_SSCD_TIMEOUT_MS time window.
2. whether there was an unsuccessful GPU reset, which implies the
system will end up rebooting soon.
Change-Id: I761057aee9c4ff9f32d658c49b99eb162486033b
Bug: 264595878
Signed-off-by: Varad Gautam <varadgautam@google.com>
Test: b/264595878#comment7
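
The two conditions of the heuristic combine into one decision function. This is a sketch: `struct sscd_state` and `should_generate_hang_sscd` are hypothetical names and the 5000 ms window is a placeholder for GPU_HANG_SSCD_TIMEOUT_MS, whose value the commit does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for GPU_HANG_SSCD_TIMEOUT_MS; the real value is not shown. */
#define HANG_SSCD_TIMEOUT_MS 5000u

struct sscd_state {
    bool have_last_hang;        /* a "GPU hang" coredump was taken before */
    uint64_t last_hang_sscd_ms; /* when that coredump was taken */
    bool reset_failed;          /* an unsuccessful GPU reset was observed */
};

/* Returns true if a "GPU hang" coredump should be generated at `now_ms`. */
bool should_generate_hang_sscd(struct sscd_state *s, uint64_t now_ms)
{
    assert(s != NULL);
    if (s->reset_failed)
        return false; /* condition 2: the system is about to reboot anyway */
    if (s->have_last_hang &&
        now_ms - s->last_hang_sscd_ms < HANG_SSCD_TIMEOUT_MS)
        return false; /* condition 1: a recent hang coredump already exists */
    s->have_last_hang = true;
    s->last_hang_sscd_ms = now_ms;
    return true;
}
```

Under this model, repeated hits of the hang codepath during one incident produce a single coredump per window.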
|
|
Kbase upstreaming: Pending
Change-Id: I867d64897785348d499ad4d9a4f4c95f95e8df85
Signed-off-by: Varad Gautam <varadgautam@google.com>
Bug: 264595878
|
|
In the original bug, the protected memory imports via Base
were ignoring the actual size of the import that came back from
the kernel memory import routines. This resulted in errors: when
these imports were freed, the incorrect size was passed, so
only sub-regions of the original mapped range were unmapped and
the GPU and CPU VAs ended up being inconsistent.
A WAR was added to prevent VMA splits temporarily until a fix was provided
for the protected memory size mismatch. With that fix in place, the WAR is
no longer necessary. The WAR now causes failures
when an application calls mprotect(restrictive) on memory already
allocated and mmapped by Vulkan API calls.
Vulkan alloc() invokes the cmem_heap_alloc() function, which in the general
case allocates some extra memory to fulfil the worst-case alignment requirements.
As a result, invoking mprotect on the partial user-provided range always results
in VMA splits.
For further reference, see this article:
https://lwn.net/Articles/182847/
Bug 269535398
This reverts commit 6d1d889156e68493842f5bb18fc9aed74cc57454.
Change-Id: Ic5749fab2613d6495fd3669356697ff40bfafcb7
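
The mprotect behaviour mentioned above can be reproduced on ordinary anonymous memory. This is a generic Linux userspace sketch, not GPU driver code: `protect_one_page` is a hypothetical helper, and it only shows that restricting a sub-range of a mapping is a normal operation that requires the kernel to split the VMA.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `pages` anonymous read-write pages, then make only page `idx`
 * read-only. Returns 0 on success and -1 on failure. The mapping is
 * released before returning.
 */
int protect_one_page(size_t pages, size_t idx)
{
    long psz = sysconf(_SC_PAGESIZE);
    size_t page;
    char *base;
    int rc;

    if (psz <= 0 || idx >= pages)
        return -1;
    page = (size_t)psz;

    base = mmap(NULL, pages * page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Changing protection on a sub-range makes the kernel split the
     * original VMA around that range. */
    rc = mprotect(base + idx * page, page, PROT_READ);

    munmap(base, pages * page);
    return rc;
}
```

A mapping whose driver refuses VMA splits would make the equivalent `mprotect` call fail, which matches the failure mode this revert addresses.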
|
|
*Affects CSF GPUs only, but changes to common code.*
During an invalid GPU page fault, kbase will try to flush the GPU cache
and disable the faulting address space (AS). There is a small window
between flushing of the GPU L2 cache (MMU resumes) and when the AS is
disabled where existing jobs on the GPU may access memory for that AS,
dirtying the GPU cache.
This is a problem as the kctx->as_nr is marked as KBASEP_AS_NR_INVALID
and thus no cache maintenance will be performed on the AS of the faulty
context when cleaning up the csg_slot and releasing the context.
This patch addresses that issue by:
1. locking the AS via a GPU command
2. flushing the cache
3. disabling the AS
4. unlocking the AS
This ensures that any jobs remaining on the GPU will not be able to
access the memory due to the locked AS. Once the AS is unlocked, any
memory access will fail as the AS is now disabled.
Change-Id: I5e02face6ca0fa4526576dd70d0261ea3ee69506
(cherry picked from commit 566789dffda3dfec00ecf00f9819e7a515fb2c61)
Provenance: https://code.ipdelivery.arm.com/c/GPU/mali-ddk/+/5071
Bug: 274014055
|