IME TG Minutes

Guido Araujo edited this page Jul 30, 2024 · 136 revisions

Back to TG

2024/07/22 (Next)

  • Slides (TBD)
  • Video
  • Agenda
  • Discussion
    • Guido presented two propositions proving that (a) the CI is maximized when m = n, and (b) the CI is limited by the number of available vector registers assigned as accumulators. Earl presented an engine model that aims to minimize energy consumption and maximize the CI; the model does not use vector registers and instead proposes a new set of ISA accumulator registers for matrix multiplication operations. Steven pointed out that, since the group is discussing a matrix operation engine, it should also pay attention to other matrix operations that are highly relevant to HPC workloads.
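
The two propositions can be illustrated with a small sketch. It assumes a common outer-product model that the minutes do not spell out: one update step on an m x n accumulator tile loads m + n operand elements and performs m * n multiply-accumulates, so CI = m * n / (m + n). The function names and the `capacity` parameter (total accumulator elements) are hypothetical, and this is not necessarily the exact formulation presented in the meeting.

```python
def computational_intensity(m: int, n: int) -> float:
    """Multiply-accumulates per loaded operand element (assumed CI model)."""
    return (m * n) / (m + n)


def best_shape(capacity: int) -> tuple[int, int]:
    """Exhaustively search tile shapes with m * n <= capacity for the highest CI."""
    best = (1, 1)
    for m in range(1, capacity + 1):
        n = capacity // m  # widest tile that still fits in the accumulators
        if computational_intensity(m, n) > computational_intensity(*best):
            best = (m, n)
    return best


if __name__ == "__main__":
    for capacity in (16, 64, 256):
        m, n = best_shape(capacity)
        print(capacity, (m, n), computational_intensity(m, n))
```

Under this model the search always lands on a square tile when the capacity is a perfect square, and the CI only grows if the accumulator capacity grows, which matches the shape of claims (a) and (b).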

2024/07/15

  • Slides
  • Video (TBD)
  • Agenda
  • Discussion
    • The group discussed the scalability of Computational Intensity (CI). Steven and Greg pointed out that memory subsystem constraints should be considered. Phillip replied that this should not be a concern, given that the TG's focus is the IME ISA definition rather than architecture implementation aspects. Greg countered that implementation aspects such as memory and datapath width would constrain performance and might limit scalability. Erich claimed that, even with the current memory subsystem, a very high CI (on the order of 60) is achievable for very large VLEN values such as 16K. The group agreed that discussing other metrics would be useful. Abel presented updates on Option D.

2024/07/01

2024/06/17

2024/06/03

  • Slides
  • Video
  • Agenda
  • Discussion
    • Jose proposed a variation of Option C (called C*) in which tiles (of size lambda^2) are encoded into vector registers. This allows the microkernel to load as many matrix tiles as the size of the architected vector registers permits. Tile multiplication takes vector-encoded tiles as operands and performs an outer-product multiply-accumulate. Tile loads are controlled by a set of three nested loops, which use a special instruction to determine the number of elements (ml, nl, and kl) to load, zeroing the remaining positions of the vector registers. This approach may enable portability across architecture generations.
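
The loop structure described above can be sketched in plain Python. This is only an illustration under stated assumptions: `LAM`, `set_len`, `load_tile`, and `tile_mac` are hypothetical names, `set_len` stands in for the special instruction mentioned in the minutes, and a flat list of `LAM * LAM` values stands in for a vector register holding one tile. It does not reflect the actual Option C* instruction encoding.

```python
LAM = 4  # hypothetical tile edge; one "vector register" holds LAM * LAM elements


def set_len(remaining, lam=LAM):
    """Stand-in for the special instruction: elements to handle in this step."""
    return min(remaining, lam)


def load_tile(mat, r0, c0, rows, cols, lam=LAM):
    """Encode a rows x cols block into a flat lam*lam register, zero-padded."""
    reg = [0] * (lam * lam)
    for r in range(rows):
        for c in range(cols):
            reg[r * lam + c] = mat[r0 + r][c0 + c]
    return reg


def tile_mac(acc, a, b, lam=LAM):
    """acc += a @ b on flat tiles, computed as a sum of lam outer products."""
    for k in range(lam):
        for i in range(lam):
            a_ik = a[i * lam + k]
            for j in range(lam):
                acc[i * lam + j] += a_ik * b[k * lam + j]


def matmul_tiled(A, B, lam=LAM):
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    i0 = 0
    while i0 < M:
        ml = set_len(M - i0, lam)
        j0 = 0
        while j0 < N:
            nl = set_len(N - j0, lam)
            acc = [0] * (lam * lam)  # accumulator tile
            k0 = 0
            while k0 < K:
                kl = set_len(K - k0, lam)
                a = load_tile(A, i0, k0, ml, kl, lam)
                b = load_tile(B, k0, j0, kl, nl, lam)
                tile_mac(acc, a, b, lam)  # zero padding contributes nothing
                k0 += kl
            for i in range(ml):  # store only the valid ml x nl part
                for j in range(nl):
                    C[i0 + i][j0 + j] = acc[i * lam + j]
            j0 += nl
        i0 += ml
    return C
```

Because edge tiles are zero-padded, the same `tile_mac` works for every tile regardless of the matrix dimensions, and changing `LAM` does not require changing the loops. That is the kind of portability across register sizes the proposal is aiming for.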

2024/05/20

  • Slides
  • Video
  • Agenda
    • Update on TG Chair/Vice-chair selection
    • Update on TG schedule
    • Kick-off on matrix data type and geometry configuration
  • Discussion
    • The group roadmap was revisited and updated.
    • Abel and Greg pointed out that new Matrix CSRs (MCSRs) add new architectural state and do not reuse what is already in RVV. Options would be: (a) adopt larger instructions to enable matrix type/shape encoding; (b) try harder to re-use what is in RVV by, for example, passing type/shape information through scalar registers (Guido's group is working on this).
    • Jose suggested calling it Matrix Operation Shape (ms). Steve suggested generalizing that to any matrix operation. The slides have been updated as suggested. Steve also raised the possibility that, at execution time, the architecture could assign physical registers to logical registers depending on the number of physical vector registers available.

2024/05/06

  • Slides
  • Video
  • Agenda
  • Discussion
    • A schedule for the group's work was proposed and discussed.
    • Workloads for the architecture evaluation were discussed. Greg proposed using MLPerf, and the group agreed to use it as a reference for ML workloads. ConvBench and IBM POWER10 ML model profiling will still be available for those interested. As for HPC, GEMM-based OpenBLIS was discussed, and it was agreed that it should also be a reference.
    • Memory access evaluation: CN.Ke extended the work on burst analysis to caches and discussed the impact on the various architectural options. The presentation also proposed a new matrix transpose instruction.

2024/04/22

  • Slides
  • Video (TBD)
  • Agenda
    • Revisiting what we have achieved so far
    • Definition of gaps, agenda, and working groups

2024/04/08

  • Slides
  • Video
  • Agenda
    • Greg Favor's considerations of uArch aspects of the IME design.

2024/03/25

2024/03/11

  • Slides
  • Video
  • Agenda
    • Chair/Vice-chair selection update
    • Moving forward on qualitative analysis
      • Computational Intensity
      • Locality evaluation
    • NEC + BSC Presentation
      • Matrix Tile Extension: Portable ISA For Vector-Integrated Matrix Unit
      • Erich Focht (NEC) and Marc Casas (BSC)

2024/02/26

2024/02/12

  • Slides
  • Video
  • Agenda
    • Revisiting the IME preliminary options
    • Qualitative vs Quantitative approaches
    • Metrics and workloads

Before the creation of the IME TG, we had a number of meetings at the SIG Vector group that covered material related to the IME architecture. In order to avoid people jumping from one TG to another, we copied those SIG Vector minutes related to IME architecture below.


2023/12/11

  • Video
  • Agenda
    • Update on status of Integrated Matrix Extensions (IME) Task Group proposal.
    • Thoughts on variants for RISC-V matrix extensions
    • Thoughts on sparsity support for RISC-V matrix extensions

2023/11/27

  • Video
  • Agenda
    • Update on status of Integrated Matrix Facility (IMF) Task Group proposal.
    • Thoughts on variants for RISC-V matrix extensions
    • Thoughts on sparsity support for RISC-V matrix extensions

2023/10/30

  • Slides
  • Video
  • Agenda
    • Vote for proposing the creation of the Integrated Matrix Facility (IMF) Task Group
    • Presentation by Abel Bernabeu on Option D for IMF

2023/10/16

  • Slides
  • Video
  • Agenda
    • Presentation from AndesTech on their work on matrix extensions

2023/10/16

  • Slides
  • Video
  • Agenda
    • Continue overview of possible vector-matrix extension approaches - That is, matrix extensions that only use the current architected vector registers to store matrices; time permitting include comparison with attached matrix facility approach

2023/10/11

  • Slides
  • Video
  • Agenda
    • Continue overview of possible vector-matrix extension approaches - That is, matrix extensions that only use the current architected vector registers to store matrices; time permitting include comparison with attached matrix facility approach

2023/10/02

  • Slides
  • Agenda
    • Overview of possible vector-matrix extension approaches - That is, matrix extensions that only use the current architected vector registers to store matrices