Qualifications
Knowledge in one or more of the following areas: memory subsystem design, cache memory, and LPDDR/DDR/HBM/CXL memory.
Knowledge of and experience with common LLM (Large Language Model) workloads.
Proficiency in SystemVerilog, C or C++, and scripting languages such as Python.
Experience with high-level simulators for performance or power estimation is a plus.
Knowledge of server-class GPU/ML architectures is a plus.
Responsibilities
Responsible for developing an analytical model of memory usage for LLM inference and training
Responsible for running performance simulations to extract the workload's memory footprint and bandwidth requirements, and for deriving the energy cost of memory data movement from those results (a minimal illustrative sketch follows this list)
Responsible for identifying memory subsystem capacity or bandwidth bottlenecks and improving performance and energy efficiency
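For illustration only, below is a minimal Python sketch of the kind of analytical model this role involves. The model structure (weights plus KV cache), the dimensions, and the energy-per-byte figure are assumptions made up for the example, not values associated with this position.

def inference_memory_gib(num_layers, d_model, vocab_size, batch, seq_len,
                         bytes_per_param=2, bytes_per_kv=2):
    """Rough inference footprint (GiB): transformer weights plus KV cache."""
    # Weights: ~12 * d_model^2 per layer (attention + 4x MLP expansion),
    # plus embedding and unembedding tables; norms and biases are ignored.
    params = num_layers * 12 * d_model ** 2 + 2 * vocab_size * d_model
    weight_bytes = params * bytes_per_param
    # KV cache: K and V tensors of shape [batch, seq_len, d_model] per layer.
    kv_bytes = 2 * num_layers * batch * seq_len * d_model * bytes_per_kv
    return (weight_bytes + kv_bytes) / 2 ** 30

def data_movement_energy_j(bytes_moved, pj_per_byte=20.0):
    """Energy of moving bytes to/from memory at an assumed flat pJ/byte cost."""
    return bytes_moved * pj_per_byte * 1e-12

if __name__ == "__main__":
    # Hypothetical 7B-class model decoding with batch 8 at 4K context.
    footprint_gib = inference_memory_gib(num_layers=32, d_model=4096,
                                         vocab_size=32000, batch=8, seq_len=4096)
    # Memory-bandwidth-bound approximation: weights and KV cache are each
    # streamed from memory once per decode step.
    bytes_per_step = footprint_gib * 2 ** 30
    energy_j = data_movement_energy_j(bytes_per_step)
    print(f"footprint ~ {footprint_gib:.1f} GiB, "
          f"energy/step ~ {energy_j * 1e3:.0f} mJ (at an assumed 20 pJ/B)")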
Eligibility
Current EE or CS master's or Ph.D. students with a computer architecture background