Tinygrad does not yet output RDNA3 kernels directly. You can either install comgr or use `AMD_LLVM=1` (default) if you have [LLVM@19](https://github.com/tinygrad/tinygrad/blob/e2ed673c946c8f1774d816c75e52a994c2dd8a88/.github/actions/setup-tinygrad/action.yml#L208).
`[gid.x, gid.y, gid.z], [lid.x, lid.y, lid.z]` of the current thread.
#### Section 2: Wave info
`<lane> <instruction hex>`
RDNA3 divides threads into chunks of 32. Each thread is assigned to a "lane" from 0-31.
In Remu, even though all threads run one at a time, each 32 thread chunk (a wave) shares state like SGPR, VGPR, LDS, EXEC mask, etc.
Remu can simulate up to one wave sync instruction.
For more details, see work_group.rs.
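
As an illustration of how threads map onto waves and lanes, here is a minimal Python sketch. It is not Remu's implementation (that lives in work_group.rs). The helper names `split_into_waves` and `wave_and_lane` are hypothetical, and the x-fastest ordering of threads within a workgroup is an assumption made for the example.

```python
WAVE_SIZE = 32  # RDNA3 wave32: 32 threads (lanes) per wave

def split_into_waves(local_size):
    """Chunk every local thread id of a workgroup into waves of up to 32 threads.

    Assumption: threads are ordered with x varying fastest, then y, then z.
    """
    lx, ly, lz = local_size
    threads = [(x, y, z) for z in range(lz) for y in range(ly) for x in range(lx)]
    return [threads[i:i + WAVE_SIZE] for i in range(0, len(threads), WAVE_SIZE)]

def wave_and_lane(lid, local_size):
    """Return (wave index, lane 0-31) for one local thread id under the same ordering."""
    lx, ly, _ = local_size
    x, y, z = lid
    linear = x + y * lx + z * lx * ly
    return linear // WAVE_SIZE, linear % WAVE_SIZE

waves = split_into_waves((8, 8, 1))        # 64 threads in the workgroup
print(len(waves), len(waves[0]))           # 2 waves, 32 lanes each
print(wave_and_lane((0, 4, 0), (8, 8, 1))) # first lane of the second wave
```

Under this assumed ordering, a workgroup with a local size of `[8, 8, 1]` splits into two waves, and each wave would hold its own copy of the shared state (SGPR, VGPR, LDS, EXEC mask).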
Section 2 is printed in either green or gray.
Green = The thread is actively executing the instruction.
Gray = The thread has been "turned off" by the EXEC mask, so it skips execution of some instructions. (Refer to "EXECute Mask" on [page 23](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf#page=23) of the ISA docs for more details.)
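
The green/gray distinction can be pictured as one bit per lane in a 32-bit mask. The following is a minimal Python sketch of that idea, not Remu's code; `active_lanes` and `run_masked` are hypothetical helpers, and the VGPR is modeled as a plain list with one value per lane.

```python
WAVE_SIZE = 32  # one EXEC bit per lane in a wave32

def active_lanes(exec_mask):
    """Lanes whose EXEC bit is set (these would print green)."""
    return [lane for lane in range(WAVE_SIZE) if (exec_mask >> lane) & 1]

def run_masked(exec_mask, vgpr, op):
    """Apply op to each lane's value only where the EXEC bit is set.

    Lanes with a cleared bit (gray) keep their old value.
    """
    return [op(v) if (exec_mask >> lane) & 1 else v for lane, v in enumerate(vgpr)]

print(active_lanes(0x1))                                  # only lane 0 is active
print(run_masked(0b101, [0] * 32, lambda v: v + 1)[:4])   # lanes 0 and 2 are updated
```

With a mask of `0x1`, only lane 0 executes the instruction, which is the situation where a single lane performs a write while the other 31 are skipped.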
To see the colors in action, try running `DEBUG=6 PYTHONPATH="." MOCKGPU=1 AMD=1 python test/test_ops.py TestOps.test_arange_big`. See how only lane 0 writes to global memory: