Readings

Memory models
- A Formal Analysis of the NVIDIA PTX Memory Consistency Model
- RISC-V Unprivileged ISA Manual: Chapter 17 and Appendix A on RVWMO

Memory barriers
- Memory Barriers: a Hardware View for Software Hackers
- Linux Kernel Memory Barriers

Parallel programming
- Rust Atomics and Locks
- Is Parallel Programming Hard, And, If So, What Can You Do About It?

GPGPU ISA and uarch design
- Vortex GPGPU ISA
- Analyzing Modern NVIDIA GPU cores
- Apple G13 GPU Architecture Reference
- MTIA: First Generation Silicon Targeting Meta’s Recommendation Systems
- Meta’s Second Generation AI Chip: Model-Chip Co-Design and Productionization Experiences

GPGPU compiler design
- Convergence and Scalarization for Data-Parallel Architectures

DL compiler design
- Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations

Sanitizers
- AddressSanitizer: A Fast Address Sanity Checker
- ThreadSanitizer - data race detection in practice
- Dynamic Race Detection with LLVM Compiler