Metadata-Version: 2.1
Name: umi-hw
Version: 0.1.2
Summary: Universal Memory Interface (UMI)
Author: Zero ASIC
License: Apache License
                                   Version 2.0, January 2004
                                http://www.apache.org/licenses/
        
           TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
        
           1. Definitions.
        
              "License" shall mean the terms and conditions for use, reproduction,
              and distribution as defined by Sections 1 through 9 of this document.
        
              "Licensor" shall mean the copyright owner or entity authorized by
              the copyright owner that is granting the License.
        
              "Legal Entity" shall mean the union of the acting entity and all
              other entities that control, are controlled by, or are under common
              control with that entity. For the purposes of this definition,
              "control" means (i) the power, direct or indirect, to cause the
              direction or management of such entity, whether by contract or
              otherwise, or (ii) ownership of fifty percent (50%) or more of the
              outstanding shares, or (iii) beneficial ownership of such entity.
        
              "You" (or "Your") shall mean an individual or Legal Entity
              exercising permissions granted by this License.
        
              "Source" form shall mean the preferred form for making modifications,
              including but not limited to software source code, documentation
              source, and configuration files.
        
              "Object" form shall mean any form resulting from mechanical
              transformation or translation of a Source form, including but
              not limited to compiled object code, generated documentation,
              and conversions to other media types.
        
              "Work" shall mean the work of authorship, whether in Source or
              Object form, made available under the License, as indicated by a
              copyright notice that is included in or attached to the work
              (an example is provided in the Appendix below).
        
              "Derivative Works" shall mean any work, whether in Source or Object
              form, that is based on (or derived from) the Work and for which the
              editorial revisions, annotations, elaborations, or other modifications
              represent, as a whole, an original work of authorship. For the purposes
              of this License, Derivative Works shall not include works that remain
              separable from, or merely link (or bind by name) to the interfaces of,
              the Work and Derivative Works thereof.
        
              "Contribution" shall mean any work of authorship, including
              the original version of the Work and any modifications or additions
              to that Work or Derivative Works thereof, that is intentionally
              submitted to Licensor for inclusion in the Work by the copyright owner
              or by an individual or Legal Entity authorized to submit on behalf of
              the copyright owner. For the purposes of this definition, "submitted"
              means any form of electronic, verbal, or written communication sent
              to the Licensor or its representatives, including but not limited to
              communication on electronic mailing lists, source code control systems,
              and issue tracking systems that are managed by, or on behalf of, the
              Licensor for the purpose of discussing and improving the Work, but
              excluding communication that is conspicuously marked or otherwise
              designated in writing by the copyright owner as "Not a Contribution."
        
              "Contributor" shall mean Licensor and any individual or Legal Entity
              on behalf of whom a Contribution has been received by Licensor and
              subsequently incorporated within the Work.
        
           2. Grant of Copyright License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              copyright license to reproduce, prepare Derivative Works of,
              publicly display, publicly perform, sublicense, and distribute the
              Work and such Derivative Works in Source or Object form.
        
           3. Grant of Patent License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              (except as stated in this section) patent license to make, have made,
              use, offer to sell, sell, import, and otherwise transfer the Work,
              where such license applies only to those patent claims licensable
              by such Contributor that are necessarily infringed by their
              Contribution(s) alone or by combination of their Contribution(s)
              with the Work to which such Contribution(s) was submitted. If You
              institute patent litigation against any entity (including a
              cross-claim or counterclaim in a lawsuit) alleging that the Work
              or a Contribution incorporated within the Work constitutes direct
              or contributory patent infringement, then any patent licenses
              granted to You under this License for that Work shall terminate
              as of the date such litigation is filed.
        
           4. Redistribution. You may reproduce and distribute copies of the
              Work or Derivative Works thereof in any medium, with or without
              modifications, and in Source or Object form, provided that You
              meet the following conditions:
        
              (a) You must give any other recipients of the Work or
                  Derivative Works a copy of this License; and
        
              (b) You must cause any modified files to carry prominent notices
                  stating that You changed the files; and
        
              (c) You must retain, in the Source form of any Derivative Works
                  that You distribute, all copyright, patent, trademark, and
                  attribution notices from the Source form of the Work,
                  excluding those notices that do not pertain to any part of
                  the Derivative Works; and
        
              (d) If the Work includes a "NOTICE" text file as part of its
                  distribution, then any Derivative Works that You distribute must
                  include a readable copy of the attribution notices contained
                  within such NOTICE file, excluding those notices that do not
                  pertain to any part of the Derivative Works, in at least one
                  of the following places: within a NOTICE text file distributed
                  as part of the Derivative Works; within the Source form or
                  documentation, if provided along with the Derivative Works; or,
                  within a display generated by the Derivative Works, if and
                  wherever such third-party notices normally appear. The contents
                  of the NOTICE file are for informational purposes only and
                  do not modify the License. You may add Your own attribution
                  notices within Derivative Works that You distribute, alongside
                  or as an addendum to the NOTICE text from the Work, provided
                  that such additional attribution notices cannot be construed
                  as modifying the License.
        
              You may add Your own copyright statement to Your modifications and
              may provide additional or different license terms and conditions
              for use, reproduction, or distribution of Your modifications, or
              for any such Derivative Works as a whole, provided Your use,
              reproduction, and distribution of the Work otherwise complies with
              the conditions stated in this License.
        
           5. Submission of Contributions. Unless You explicitly state otherwise,
              any Contribution intentionally submitted for inclusion in the Work
              by You to the Licensor shall be under the terms and conditions of
              this License, without any additional terms or conditions.
              Notwithstanding the above, nothing herein shall supersede or modify
              the terms of any separate license agreement you may have executed
              with Licensor regarding such Contributions.
        
           6. Trademarks. This License does not grant permission to use the trade
              names, trademarks, service marks, or product names of the Licensor,
              except as required for reasonable and customary use in describing the
              origin of the Work and reproducing the content of the NOTICE file.
        
           7. Disclaimer of Warranty. Unless required by applicable law or
              agreed to in writing, Licensor provides the Work (and each
              Contributor provides its Contributions) on an "AS IS" BASIS,
              WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
              implied, including, without limitation, any warranties or conditions
              of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
              PARTICULAR PURPOSE. You are solely responsible for determining the
              appropriateness of using or redistributing the Work and assume any
              risks associated with Your exercise of permissions under this License.
        
           8. Limitation of Liability. In no event and under no legal theory,
              whether in tort (including negligence), contract, or otherwise,
              unless required by applicable law (such as deliberate and grossly
              negligent acts) or agreed to in writing, shall any Contributor be
              liable to You for damages, including any direct, indirect, special,
              incidental, or consequential damages of any character arising as a
              result of this License or out of the use or inability to use the
              Work (including but not limited to damages for loss of goodwill,
              work stoppage, computer failure or malfunction, or any and all
              other commercial damages or losses), even if such Contributor
              has been advised of the possibility of such damages.
        
           9. Accepting Warranty or Additional Liability. While redistributing
              the Work or Derivative Works thereof, You may choose to offer,
              and charge a fee for, acceptance of support, warranty, indemnity,
              or other liability obligations and/or rights consistent with this
              License. However, in accepting such obligations, You may act only
              on Your own behalf and on Your sole responsibility, not on behalf
              of any other Contributor, and only if You agree to indemnify,
              defend, and hold each Contributor harmless for any liability
              incurred by, or claims asserted against, such Contributor by reason
              of your accepting any such warranty or additional liability.
        
           END OF TERMS AND CONDITIONS
        
           APPENDIX: How to apply the Apache License to your work.
        
              To apply the Apache License to your work, attach the following
              boilerplate notice, with the fields enclosed by brackets "[]"
              replaced with your own identifying information. (Don't include
              the brackets!)  The text should be enclosed in the appropriate
              comment syntax for the file format. We also recommend that a
              file or class name and description of purpose be included on the
              same "printed page" as the copyright notice for easier
              identification within third-party archives.
        
           Copyright 2024 ZeroASIC Corp.
        
           Licensed under the Apache License, Version 2.0 (the "License");
           you may not use this file except in compliance with the License.
           You may obtain a copy of the License at
        
               http://www.apache.org/licenses/LICENSE-2.0
        
           Unless required by applicable law or agreed to in writing, software
           distributed under the License is distributed on an "AS IS" BASIS,
           WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
           See the License for the specific language governing permissions and
           limitations under the License.
        
           ------------------
        
Project-URL: Homepage, https://github.com/zeroasiccorp/umi
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS
Requires-Dist: siliconcompiler >=0.26.0
Requires-Dist: lambdalib <0.2.10,>=0.2.9
Provides-Extra: test
Requires-Dist: switchboard-hw >=0.2.14 ; extra == 'test'
Requires-Dist: flake8 ==7.1.0 ; extra == 'test'

# Universal Memory Interface (UMI)

## 1. Introduction

### 1.1 Design Philosophy

* Make everything addressable
* Keep it simple

### 1.2 Architecture

The Universal Memory Interface (UMI) is a transaction based standard for accessing memory through request-response message exchange patterns. UMI includes five distinct abstraction layers:

* **Protocol**: Protocol/application specific payload (Ethernet, PCIe)
* **Transaction**: Address based request-response messaging
* **Signal**: Latency insensitive signaling (packet, ready, valid)
* **Link**: Communication integrity (flow control, reliability)
* **Physical**: Electrical signaling (electrons, wires, etc.)

![UMI](docs/_images/umi_stack.svg)

### 1.3 Key Features

  * independent request and response channels
  * word sizes up to 1024 bits
  * up to 256 word transfers per transaction
  * atomic transaction support
  * quality of service support
  * protection and security support
  * reserved opcodes for users and future expansion

### 1.4 Key Terms

* **Transaction**: Complete request-response memory operation.
* **Message**: Unidirectional request or response, consisting of a command header, address fields, and an optional data payload.
* **Host**: Initiator of memory requests.
* **Device**: Responder to memory requests.

----
## 2. Protocol UMI (PUMI) Layer

UMI transaction payloads are treated as a series of opaque bytes and can carry arbitrary data, including higher level protocols. The maximum data size available for communication protocol data and headers is 32,768 bytes. The following table illustrates recommended bit packing for a number of common communication standards.

| Protocol  | Payload(UMI DATA) | Header(UMI Data)|UMI Addresses + Command |
|:---------:|:-----------------:|:---------------:|:----------------------:|
| Ethernet  | 64B - 1,518B      |14B              | 20B                    |
| CXL-68    | 64B               |2B               | 20B                    |
| CXL-256   | 254B              |2B               | 20B                    |

----
## 3. Transaction UMI (TUMI) Layer

### 3.1 Theory of Operation

UMI transactions are request-response memory exchanges between Hosts and Devices. Hosts send memory access requests to devices and get responses back.  The figure below illustrates the relationship between hosts, devices, and the interconnect network.

![UMI](docs/_images/tumi_connections.png)

Basic UMI read/write transaction involves the transfer of LEN+1 words of data of width 2^SIZE bytes between a device and a host.

**Summary:**
* UMI transaction type, word size (SIZE), transfer count (LEN), and other options are encoded in a 32bit transaction command header (CMD).
* Device memory access is communicated through a destination address (DA) field.
* The host source address is communicated through the source address (SA) field.
* The destination address indicates the memory address of the first byte in the transaction.
* Memory is accessed in increasing address order starting with DA and ending with DA + (LEN+1)\*(2^SIZE)-1.
* The maximum data field size is 32,768 bytes.

**Hosts:**

* Send read, write memory access request messages
* Validate and execute incoming responses
* Identify egress interface through which to send requests (in case of multiple)

**Devices:**

* Validate and execute incoming memory request messages
* Initiate response messages when required
* Identify egress interface through which to send responses (in case of multiple)

**Constraints:**
* Device and source addresses must be aligned to the native word size.
* Requests with the same HOSTID arrive at the target device in the same order that they left the host.
* Responses with the same HOSTID return to the host in the same order that they left the device.

### 3.2 Message Format

#### 3.2.1 Message Fields

| Term        | Meaning    |
|-------------|------------|
| CMD         | Command (type + options)
| DA          | Destination address of message
| SA          | Source address (where to return a response)
| DATA        | Data payload
| OPCODE      | Command opcode
| SIZE        | Word size
| LEN         | Word transfers per message
| QOS         | Quality of service required
| PROT        | Protection mode
| EX          | Exclusive access indicator
| EOF         | End of frame indicator
| EOM         | End of message indicator
| U           | User defined message bit
| R           | Reserved message bit
| ERR         | Error code
| HOSTID      | Host ID
| DEVID       | Device ID
| MSB         | Most significant bit

#### 3.2.2 Message Byte Order

Request and response messages are packed together in the following order:

|                  |MSB-1:160|159:96|95:32|31:0|
|------------------|:-------:|:----:|:---:|:--:|
| 64b architecture |DATA     |SA    |DA   | CMD|
| 32b architecture |DATA     |DATA  |SA,DA| CMD|

#### 3.2.3 Message Types

The table below documents all UMI message types. CMD[4:0] is the UMI opcode defining the type of message being sent. CMD[31:5] are used for message specific options. Complete functional descriptions of each message can be found in the [Message Description Section](#34-transaction-descriptions).

|Message     |DATA|SA|DA|31:27 |26:25|24:22     |21:20|19:16|15:8 |7:5 |4:0  |
|------------|:--:|--|--|:----:|:---:|:--------:|:---:|-----|-----|----|-----|
|INVALID     |    |  |  |--    |--   |--        |--   |--   |--   |0x0 |0,0x0|
|REQ_RD      |    |Y |Y |HOSTID|U    |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x1|
|REQ_WR      |Y   |Y |Y |HOSTID|U    |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x3|
|REQ_WRPOSTED|Y   |Y |Y |HOSTID|U    |0 ,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x5|
|REQ_RDMA    |    |Y |Y |HOSTID|U    |0 ,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x7|
|REQ_ATOMIC  |Y   |Y |Y |HOSTID|U    |0 ,EOF,EOM|PROT |QOS  |ATYPE|SIZE|R,0x9|
|REQ_USER0   |Y   |Y |Y |HOSTID|U    |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0xB|
|REQ_FUTURE0 |Y   |Y |Y |HOSTID|U    |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0xD|
|REQ_ERROR   |    |Y |Y |HOSTID|U    |U         |U    |U    |U    |0x0 |R,0xF|
|REQ_LINK    |    |  |  |U     |U    |U         |U    |U    |U    |0x1 |R,0xF|
|RESP_RD     |Y   |  |Y |HOSTID|ERR  |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x2|
|RESP_WR     |    |  |Y |HOSTID|ERR  |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x4|
|RESP_USER0  |    |  |Y |HOSTID|ERR  |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x6|
|RESP_USER1  |Y   |  |Y |HOSTID|ERR  |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0x8|
|RESP_FUTURE0|    |  |Y |HOSTID|ERR  |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0xA|
|RESP_FUTURE1|Y   |  |Y |HOSTID|ERR  |EX,EOF,EOM|PROT |QOS  |LEN  |SIZE|R,0xC|
|RESP_LINK   |    |  |  |U     |U    |U         |U    |U    |U    |0x0 |R,0xE|

### 3.3 Message Fields

### 3.3.1 Source Address and Destination Address (SA[63:0], DA[63:0])

The destination address (DA) specifies the target address of a request or response message. For requests, the DA field is the full device address to access. For responses, the DA field returned is a copy of the requester SA field.  The SA field can be a full address (32/64 bits) or a partial routing address and a set of optional [UMI signal layer](#UMI-Signal-layer) controls needed to drive the interconnect network.

Responses do not have the SA field.  At the SUMI level, while the SA bus is always present, its value is undefined in response packets.  Implementations must not depend on the value of the SA bus in response packets.

The table below shows the bit mapping for SA field.

| SA       |63:56 |55:48|47:40|39:32|31:24 |23:16|15:8|7:0  |
|----------|:----:|:---:|:---:|:---:|:----:|:---:|:---|:---:|
| 64b mode |R     | R   | R   | U   | U    | U   |U   |U    |
| 32b mode | --   | --  | --  | --  | R    | U   |U   |U    |

### 3.3.2 Transaction Word Size (SIZE[2:0])

The SIZE field defines the number of bytes in a transaction word. Devices are not required to support all SIZE options. Hosts must only send messages with a SIZE supported by the target device.

|SIZE[2:0] |Bytes per word|
|:--------:|:------------:|
| 0b000    | 1
| 0b001    | 2
| 0b010    | 4
| 0b011    | 8
| 0b100    | 16
| 0b101    | 32
| 0b110    | 64
| 0b111    | 128

### 3.3.3 Transaction Length (LEN[7:0])

The LEN field defines the number of words of size 2^SIZE bytes transferred by a transaction. The number of transfers is equal to LEN + 1, equating to a range of 1-256 transfers per transaction. The current address of transfer number 'i' in a transaction is defined by:

ADDR_i = START_ADDR + (i-1) * 2^SIZE.

### 3.3.4 Protection Mode (PROT[1:0])

The PROT field indicates the protected access level of the transaction, enabling controlled access to memory.

|PROT[Bit] | Value | Function            |
|:--------:|:-----:|---------------------|
| [0]      | 0     | Unprivileged access |
|          | 1     | Privileged access   |
| [1]      | 0     | Secure access       |
|          | 1     | Non-secure access   |

### 3.3.5 Quality of Service (QOS[3:0])

The QOS field controls the quality of service required from the interconnect network. The interpretation of the QOS bits is interconnect network specific.

### 3.3.6 End of Message (EOM)

The EOM bit is reserved for UMI signal layer and is used to track the transfer of the last word in a message.

### 3.3.7 End of Frame (EOF)

The EOF bit can be used to indicate the last message in a sequence of related UMI transactions. Use of the EOF bit at an endpoint is optional and implementation specific.

### 3.3.8 Exclusive Access (EX)

The EX field is used to indicate exclusive access to an address. The function is used to enable atomic load-store exchanges. The sequence of operations is:

1. Host sends a REQ_RD to address A (with EX=1) with SA B
2. Host sends a REQ_WR to address A (with EX=1) with SA B
3. Device:
   1. If address A has NOT been modified by another host (i.e., write with a different SA) since last exclusive read, device performs write to address A and returns ERR = 0b01 in RESP_WR to host.
   2. If address A has been modified by another host since last exclusive read, device returns a ERR = 0b00 in RESP_WR to host and does not perform write to address A.

### 3.3.9 Error Code (ERR[1:0])

The ERR field indicates the error status of a response (RESP_WR, RESP_RD) transaction.

|ERR[1:0]| Meaning                            |
|:------:|------------------------------------|
| 0b00   | OK (no error)                      |
| 0b01   | EXOK (successful exclusive access) |
| 0b10   | DEVERR (device error)              |
| 0b11   | NETERR (network error)             |

DEVERR trigger examples:

* Insufficient privilege level for access
* Write attempted to read-only location
* Unsupported word size
* Access attempt to disabled function

NETERR trigger examples:

* Device address unreachable
* Packet cannot be routed due to data bus width narrowing

### 3.3.9 Atomic Transaction Type (ATYPE[7:0])

The ATYPE field indicates the type of the atomic transaction.

|ATYPE[7:0]| Meaning     |
|:--------:|-------------|
| 0x00     | Atomic add  |
| 0x01     | Atomic and  |
| 0x02     | Atomic or   |
| 0x03     | Atomic xor  |
| 0x04     | Atomic max  |
| 0x05     | Atomic min  |
| 0x06     | Atomic maxu |
| 0x07     | Atomic minu |
| 0x08     | Atomic swap |

### 3.3.10 Host ID (HOSTID[4:0])

The HOSTID field indicates the ID of the host making a transaction request. All transactions with the same ID value must remain in order.

### 3.3.11 User Field (U)

Message bit designated with a U are available for use by application and signal layer implementations. Any undefined user bits shall be set to zero.

### 3.3.12 Reserved Field (R)

Message bit designated with an R are  reserved for future UMI enhancements and shall be set to zero.

## 3.4 Message Descriptions

### 3.4.1 INVALID

INVALID indicates an invalid message. A receiver can choose to ignore the message or to take corrective action.

### 3.4.2 REQ_RD

REQ_RD reads (2^SIZE)*(LEN+1) bytes from device address(DA). The device initiates a RESP_RD message to return data to the host source address (SA).

If at some point in the network REQ_RD is determined to be unroutable (for example, at a network boundary), RESP_RD should be sent back to the SA of the request with ERR=NETERR with no data (DATA=0 at the SUMI level, empty array at the TUMI level).  All other fields in RESP_RD (SIZE, LEN, etc.) should match those in the request.

If REQ_RD cannot be executed by a device for any reason (including an unsupported SIZE), RESP_RD should be sent back to the SA of the request with ERR=DEVERR and no data; all other fields (SIZE, LEN, etc.) should match those in the request.

### 3.4.3 REQ_WR

REQ_WR writes (2^SIZE)*(LEN+1) bytes to destination address(DA). The device then initiates a RESP_WR acknowledgment message to the host source address (SA).

If REQ_WR cannot be transmitted past a certain point in the network due to a narrowing in the data bus width, RESP_WR should be sent back to the SA of the request with ERR=NETERR; all other fields (SIZE, LEN, etc.) should match those in the request.  The same behavior applies when REQ_WR is unroutable.

If REQ_WR cannot be executed by a device for any reason (including an unsupported SIZE), RESP_WR should be sent back to the SA of the request with ERR=DEVERR; all other fields (SIZE, LEN, etc.) should match those in the request.

### 3.4.4 REQ_WRPOSTED

REQ_WRPOSTED performs a unidirectional posted-write of (2^SIZE)*(LEN+1) bytes to destination address (DA).  There is no response message sent by the device back to the host.

If the destination address is reachable and SIZE is supported at the destination and the entire path leading to it, the REQ_WRPOSTED message is guaranteed to complete, otherwise it may fail silently.  This means that REQ_WRPOSTED may be dropped silently if it cannot pass through part of the network due to data bus narrowing, if the transaction is determined to be unroutable at some point along its path (e.g., at a network boundary), or if the request is unsupported by a device.

### 3.4.5 REQ_RDMA

REQ_RDMA reads (2^SIZE)\*(LEN+1) bytes of data from a primary device destination address (DA) along with a source address (SA). The primary device then initiates a REQ_WRPOSTED message to write (2^SIZE)\*(LEN+1) data bytes to the address (SA) in a secondary device. REQ_RDMA requires the complete SA field for addressing and does not support pass through information for the UMI signal layer.

REQ_RDMA may be dropped silently if it is determined to be unroutable, or if the request is unsupported by the primary device.

### 3.4.6 REQ_ATOMIC{ADD,OR,XOR,MAX,MIN,MAXU,MINU,SWAP}

REQ_ATOMIC initiates an atomic read-modify-write memory operation of size (2^SIZE) at destination address (DA). The REQ_ATOMIC sequence involves:

1. Host sending data (DATA), destination address (DA), and source address (SA) to the device,
2. Device reading data address DA
3. Applying a binary operator {ADD,OR,XOR,MAX,MIN,MAXU,MINU,SWAP} between D and the original device data
4. Writing the result back to device address DA
5. Returning the original device data to host address SA with a RESP_RD message.

If REQ_ATOMIC cannot be transmitted past a certain point in the network due to a narrowing in the data bus width, RESP_RD should be sent back to the SA of the request with ERR=NETERR and no data; all other fields (SIZE, LEN, etc.) should match those in the request.  The same behavior applies when REQ_ATOMIC is unroutable.

If REQ_ATOMIC cannot be executed by a device for any reason (including an unsupported SIZE), RESP_RD should be sent back to the SA of the request with ERR=DEVERR and no data; all other fields (SIZE, LEN, etc.) should match those in the request.

### 3.4.7 REQ_ERROR

REQ_ERROR sends a unidirectional message to a device (ERR) to indicate that an error has occurred. The device can choose to ignore the message or to take action. There is no response message sent back to the host from the device.

### 3.4.8 REQ_LINK

RESP_LINK is a reserved CMD only message for link layer non-memory mapped actions such as credit updates, time stamps, and framing. CMD[31-8] are all available as user specified control bits. The message is local to the signal (physical) layer and does not include routing information and does not elicit a response from the receiver.

### 3.4.9 REQ_USER

REQ_USER message types are reserved for non-standardized custom UMI messages.

### 3.4.10 REQ_FUTURE

REQ_FUTURE message types are reserved for future UMI feature enhancements.

### 3.4.11 RESP_RD

RESP_RD returns (2^SIZE)*(LEN+1) bytes of data to the host source address (SA) specified by the REQ_RD message.

If RESP_RD cannot be transmitted past a certain point in the network due to a narrowing in the data bus width, then the transaction should be modified so that ERR=NETERR, and the DATA field should be dropped (DATA=0 at the SUMI level, empty array at the TUMI level).  All other fields (SIZE, LEN, etc.) should be unmodified.

RESP_RD may be dropped silently in the network if it is determined to be unroutable.

### 3.4.12 RESP_WR

RESP_WR returns an acknowledgment to the original source address (SA) specified by the the REQ_WR transaction. The message does not include any DATA.

RESP_WR may be dropped silently in the network if it is determined to be unroutable.

### 3.4.13 RESP_LINK

RESP_LINK is a reserved CMD only transaction for link layer non-memory mapped actions such as credit updates, time stamps, and framing. CMD[31-8] are all available as user specified control bits. The transaction is local to the signal (physical) layer and does not include routing information.

### 3.4.14 RESP_USER

RESP_USER message types are reserved for non-standardized custom UMI messages.

### 3.4.15 RESP_FUTURE

RESP_FUTURE message types are reserved for future UMI feature enhancements.

----
## 4. Signal UMI Layer (SUMI)

### 4.1 Theory of Operation

The UMI signal layer (SUMI) defines the mapping of UMI transactions to a
point-to-point, latency insensitive, parallel, synchronous interface with a [valid ready handshake protocol](#32-handshake-protocol).

![UMI](docs/_images/sumi_connections.png)

The SUMI signaling layer defines a subset of TUMI information to be transmitted as an  atomic packet. The follow table documents the legal set of SUMI packet parameters .

| Field    | Width (bits)       |
|:--------:|--------------------|
| CMD      | 32                 |
| DA       | 32, 64             |
| SA       | 32, 64             |
| DATA     | 64,128,256,512,1024|

The following example illustrates a complete request-response transaction between a host and a device.

![UMIX7](docs/_images/example_rw_xaction.svg)

UMI messages can be split into multiple atomic SUMI packets as long as message ordering and byte ordering is preserved. A SUMI packet is a complete routable mini-message comprised of a CMD, DA, SA, and DATA field, with DA and SA fields updated to reflect the correct byte addresses of the DATA payload. The end of message (EOM) bit indicates the arrival of the last packet in a message.

The following examples illlustrate splitting of UMI read and write messages into shorter SUMI packets.

TUMI read example:

* TUMI_REQ_RD  (SIZE=0, LEN=71, DA=200, SA=100)
* TUMI_RESP_RD (SIZE=0, LEN=71, DA=100, DATA=...)

Potential SUMI packet sequence:

* SUMI_REQ_RD  (SIZE=0, LEN=71, DA=200, SA=100, EOM=1)
* SUMI_RESP_RD (SIZE=0, LEN=12, DA=100, DATA=..., EOM=0)
* SUMI_RESP_RD (SIZE=0, LEN=23, DA=113, DATA=..., EOM=0)
* SUMI_RESP_RD (SIZE=0, LEN=34, DA=137, DATA=..., EOM=1)

TUMI write example:

* TUMI_REQ_WR  (SIZE=0, LEN=71, DA=200, SA=100, DATA...)
* TUMI_RESP_WR (SIZE=0, LEN=71, DA=100)

Potential SUMI packet sequence:

* SUMI_REQ_WR  (SIZE=0, LEN=12, DA=200, SA=100, DATA=..., EOM=0)
* SUMI_REQ_WR  (SIZE=0, LEN=23, DA=213, SA=113, DATA=..., EOM=0)
* SUMI_REQ_WR  (SIZE=0, LEN=34, DA=237, SA=137, DATA=..., EOM=1)
* SUMI_RESP_WR (SIZE=0, LEN=12, DA=100, EOM=0)
* SUMI_RESP_WR (SIZE=0, LEN=23, DA=113, EOM=0)
* SUMI_RESP_WR (SIZE=0, LEN=34, DA=137, EOM=1)

Note that SA and DA increment in the sequence of transactions resulting from a split request.  In a split response, only DA increments in the resulting transactions, because responses don't have the SA field.  Please be aware of this incrementing behavior when storing user information in SA or DA, since incrementing could modify that information.  Formally, bit *n* in an address is safe from modification if the original outbound transaction satisfies:

A\[n-1:0\] + (2^SIZE)*(LEN+1) < 2^n

If A\[n-1:0\]=0, this reduces to the requirement that the number of bytes in the transaction is less than 2^n.  As a simple example, consider A\[1:0\]=0b00, SIZE=0.  Bit A\[2\] is safe from modification if LEN=0, 1, or 2 but not if LEN=3.  If A\[1:0\] is instead 0b10, bit A\[2\] is only safe when LEN=0.

### 4.1.1 Splitting Rules

Generalizing from the example above, this section describes the formal rules for splitting a SUMI packet.

Definitions:
1. The number of split outputs is denoted *N*.
2. A field of the *i*th split output is referred to as FIELD_out\[i\], with 0<=i<=N-1.
3. The notation FIELD_out\[p:q\] means the values FIELD_out\[p\] through (inclusive) FIELD_out\[q\].
4. The notation FIELD_in means the value of FIELD in the SUMI packet being split.

Rules:
1. Splitting is allowed only for REQ_RD, REQ_WR, REQ_WRPOSTED, REQ_RDMA, RESP_RD, RESP_WR, when EX=0.
2. Copy HOSTID, ERR, EOF, PROT, QOS, SIZE, OPCODE, and any USER or RESERVED fields into each split output.
3. LEN_out\[i\] may be different for each split output as long as sum(LEN_out[0:N-1])+N == LEN_in+1.
4. DA_out\[i\] := DA_out\[i-1\] + (2^SIZE)*(LEN_out\[i-1\]+1), 1<=i<=(N-1).  DA_out\[0\] := DA_in.
5. SA_out\[i\] := SA_out\[i-1\] + (2^SIZE)*(LEN_out\[i-1\]+1), 1<=i<=(N-1).  SA_out\[0\] := SA_in.  Applies only to split requests, because responses do not have the SA field.
6. EOM_out\[i\] := EOM_in & (i == (N-1)).

### 4.1.2 Merging Rules

Merging, the inverse of splitting, is also permitted for related SUMI packets.  This may be done to improve packet transmission performance by reducing network bandwidth required.  This may also improve host or device performance: for example, a device may be able to deal with related requests more efficiently if they have been merged together into a single SUMI packet.  Similarly, a host may be able to process merged responses more effectively.  This section describes the formal rules for merging SUMI packets.

Definitions:
1. The number of merge inputs is denoted *N*.
2. A field of the *i*th merge input is referred to as FIELD_in\[i\], with 0<=i<=N-1.
3. The notation FIELD_in\[p:q\] means the values FIELD_in\[p\] through (inclusive) FIELD_in\[q\]
4. The notation FIELD_out means the value of FIELD in the output of a SUMI packet merge.

Rules:
1. Merging is allowed only for REQ_RD, REQ_WR, REQ_WRPOSTED, REQ_RDMA, RESP_RD, RESP_WR, when EX=0.
2. HOSTID, ERR, EOF, PROT, QOS, SIZE, OPCODE, and any USER or RESERVED fields must match in all merge inputs.  These values are copied into the merge output.
3. EOM_in\[i\] must be 0 for 0<=i<=(N-2), that is, it must be zero for all but the last merge input.  EOM_in\[N-1\] may be either 0 or 1.
4. DA_in\[i\] must be equal to DA_in\[i-1\] + (2^SIZE)*(LEN_in\[i-1\]+1), 1<=i<=(N-1).
5. DA_out := DA_in\[0\].
6. SA_in\[i\] must be equal to SA_in\[i-1\] + (2^SIZE)*(LEN_in\[i-1\]+1), 1<=i<=(N-1).  Applies only to merged requests.
7. SA_out := SA_in\[0\].  Applies only to merged requests.
8. LEN_out := sum(LEN_in\[0:N-1\])+N-1.
9. EOM_out := EOM_in\[N-1\].

### 4.2 Handshake Protocol

SUMI adheres to the following ready/valid handshake protocol:

![UMI](docs/_images/ready_valid.svg)

1. A transaction occurs on every rising clock edge in which READY and VALID are both asserted.
2. Once VALID is asserted, it must not be de-asserted until a transaction completes.
3. READY, on the other hand, may be de-asserted before a transaction completes.
4. The assertion of VALID must not depend on the assertion of READY.  In other words, it is not legal for the VALID assertion to wait for the READY assertion.
5. However, it is legal for the READY assertion to be dependent on the VALID assertion (as long as this dependence is not combinational).

The following examples help illustrate the handhsake protocol.

#### LEGAL: VALID asserted before READY

![UMIX1](docs/_images/ok_valid_ready.svg)

#### LEGAL: READY asserted before VALID

![UMIX2](docs/_images/ok_ready_valid.svg)

#### LEGAL: READY and VALID asserted simultaneously

![UMIX3](docs/_images/ok_sametime.svg)

#### LEGAL: READY toggles with no effect

![UMIX4](docs/_images/ok_ready_toggle.svg)

#### LEGAL: VALID asserted for multiple cycles (multiple transactions)

![UMIX6](docs/_images/ok_double_xaction.svg)

#### **ILLEGAL**: VALID de-asserted without waiting for READY

![UMIX5](docs/_images/bad_valid_toggle.svg)

### 4.3 Verilog Standard Interfaces

#### 4.3.1 Host Interface

```verilog
output          uhost_req_valid;
input           uhost_req_ready;
output [CW-1:0] uhost_req_cmd;
output [AW-1:0] uhost_req_dstaddr;
output [AW-1:0] uhost_req_srcaddr;
output [DW-1:0] uhost_req_data;

input           uhost_resp_valid;
output          uhost_resp_ready;
input [CW-1:0]  uhost_resp_cmd;
input [AW-1:0]  uhost_resp_dstaddr;
input [AW-1:0]  uhost_resp_srcaddr;
input [DW-1:0]  uhost_resp_data;
```
#### 4.3.1 Device Interface

```verilog
input           udev_req_valid;
output          udev_req_ready;
input [CW-1:0]  udev_req_cmd;
input [AW-1:0]  udev_req_dstaddr;
input [AW-1:0]  udev_req_srcaddr;
input [DW-1:0]  udev_req_data;

output          udev_resp_valid;
input           udev_resp_ready;
output [CW-1:0] udev_resp_cmd;
output [AW-1:0] udev_resp_dstaddr;
output [AW-1:0] udev_resp_srcaddr;
output [DW-1:0] udev_resp_data;
```

## 5. UMI Link Layer (LUMI)

UMI link layer interface converts the parallel SUMI interface into packetized, framed interface. The packets over LUMI will be sent by sending cmd, dstaddr, srcaddr and data on the same lines.

### 5.1 Signals

The following table provides the LUMI interface signals presented from a device side perspective. All signals are single ended and unidirectional. All unidirectional signals must be deterministically driven at all
times.

| SIGNAL        | DRIVER | DESCRIPTION                           |
| ------------- | ------ | ------------------------------------- |
| nreset        | host   | Asynchronous active low reset         |
| clk           | host   | LUMI clock                            |
| rxctrl[3:0]   | host   | RX link control signals(eg. valid,..) |
| rxstatus[3:0] | device | RX link status signals(optional)      |
| rxdata[N-1:0] | host   | RX link data signals                  |
| txctrl[3:0]   | device | TX link control signals(eg. valid,..) |
| txstatus[3:0] | host   | TX link status signals(optional)      |
| txdata[N-1:0] | device | TX link data signals                  |

LUMI supports data width of 8, 16, 32, 64 and 128 bits.

The following diagram show how a host and device is connected over LUMI.

![Host-Device Connection Diagram. Note that the RX of the device is connected
to the TX of the host (and vice versa).](docs/_images/swizzle_lumi.png)

### 5.2 Signal Description

#### nreset

Asynchronous active low reset. To prevent power up and initialization issues the device 'nreset' pin must be sampled by a synchronizer with asynchronous assert and synchronous deassert logic.
[REF](https://github.com/siliconcompiler/lambdalib/blob/main/lambdalib/auxlib/rtl/la_rsync.v)

#### clk

Data link clock driven by host.

#### txctrl[0]/rxctrl[0]

Valid signal for the Rx (host -> device) or Tx (device -> host) packet. A HIGH value indicates valid data and valid data is transmitted on every cycle with valid high.
Unlike UMI SUMI layer LUMI does not require a ready signal in order to transmit data. The interface uses credit flow control as described in section 5.4 below.
This signal is mandatory in all implementations.

#### txctrl[1]/rxctrl[1]

Optional signal indicating burst traffic. When high this signal indicates that the current packet is continuous to the previous one and therefore does not carry the header. It can only be asserted when the packet is continuous to the previous one and has the same SUMI header.

#### txctrl[2]/rxctrl[2]

Optional forward error correction (fec) signal to handle soft errors in rxdata.

#### txctrl[3]/rxctrl[3]

Optional redundancy "aux" signal to handle manufacturing errors or persistent in the field error of one of the rxdata pins.

#### txstat[3:0]/rxstat[3:0]

Optional status indications.

#### txdata[N-1:0]/rxdata[N-1:0]

LUMI egress/ingress data bus, active high. Supports 8b, 16b, and 64b modes. The data width is identical between the host and device and needs to be negotiated before the link can be used.

### 5.3 Packet format

The LUMI standard requires the host to fully support UMI protocol.

* [Universal Memory Interface (UMI)](https://github.com/zeroasiccorp/umi)

LUMI packet format follows the UMI one and serializes the UMI cmd, dstaddr, srcaddr and data fields into one serial bit stream.

| [511:0] | [63:0]  | [63:0]  | [31:0] |
| ------- | ------- | ------- | ------ |
| data    | srcaddr | dstaddr | cmd    |

LUMI packets are transmitted over the Tx/Rx pins with reduces interface size and are sent LSB first.
The following example shows packet transmission over 64b interface:

| Cycle | 63:32    | 31:0       |
| ----- | -------- | ---------- |
| 1     | A[31:0]  | C[31:0]    |
| 2     | S[31:0]  | A[63:32]   |
| 3     | D[31:0]  | S[63:32]   |
| 4     | D[95:64] | D[63:32]   |
| ...   |          |            |
| 11    | NA       | D[511:480] |

The following features are implemented in order to optimize the link efficiency:

* Command (C), Address (A) and Source Address (S) fields will only be transmitted where they are meaningful, per UMI spec.
  e.g. - data will not be sent on read commands

* Data fields will only be sent up to the packet size, even if SUMI data width is 64B LUMI will only transmit the bytes up to the specific message length.
  The following example shows a 4 byte SUMI packet over LUMI:

  SUMI packet:
  | [511:0]          | [63:0]  | [63:0]  | [31:0] |
  | ---------------- | ------- | ------- | ------ |
  | 60B pad, 4B data | srcaddr | dstaddr | cmd    |

  Where the command is write command, SIZE=0, LEN=3.
  As this command only uses 4 bytes of data it will be transmitted over a 64b LUMI using 3 cycles only. The padding bytes will not be sent.

  | Cycle | 63:32    | 31:0       |
  | ----- | -------- | ---------- |
  | 1     | A[31:0]  | C[31:0]    |
  | 2     | S[31:0]  | A[63:32]   |
  | 3     | D[31:0]  | S[63:32]   |

* Packet burst (optional) - when ctrl[1] pin is being used lumi can merge continuous packets.

### 5.4 Flow control

LUMI is using credit based flow control. The credit init/update messages will be sent over the link using LUMI link-layer commands and are controlled by the receiver side. The transmitter side of each link is responsible for not exceeding published credits. If the transmitter does exceed published credits, subsequent behavior of the receiver is undefined.
Credit update messages are using command only in order to reduce the overhead.

Credit init/update messages will be sent using link-layer UMI command:

| Message       | [31:16] data | [15:12] addr                           | [11:8] LNK CMD    | [7:0] UMI CMD  |
| ------------- | ------------ | -------------------------------------- | ----------------- | -------------- |
| Invalid       | NA           | NA                                     | 0x0 invalid       | link layer CMD |
| credit init   | #credit      | 0x0 - req credit<br/>0x1 - resp credit | 0x1 credit init   | link layer CMD |
| credit update | #credit      | 0x0 - req credit<br/>0x1 - resp credit | 0x2 credit update | link layer CMD |

The credit are in LUMI data width units. One credit represents a single data cycle with valid high.

### 5.5 Credit/link initialization

After reset both sides of the link wake up in non-active state and can only accept credit-init transactions. Once a credit init message is received the transmitter may start sending packets up to the provided credit.

### 5.6 Physical layer mapping

UMI link layer can be transported over several physical layer options.
The following options are supported and their mapping outlined below:

* Zero ASIC Chip Link (CLINK) physical layer
* Bunch of Wires (BoW)
* Advanced Interface Bus (AIB)
* Universal Chiplet Interconnect Express (UCIe)

## Appendix A: UMI Transaction Translation

### A.1 RISC-V

UMI transactions map naturally to RISC-V load store instructions. Extra information fields not provided by the RISC-V ISA (such as as QOS and PRIV) would need to be hard-coded or driven from CSRs.

| RISC-V Instruction   | DATA | SA       | DA | CMD         |
|:--------------------:|------|----------|----|-------------|
| LD RD, offset(RS1)   | --   | addr(RD) | RS1| REQ_RD      |
| SD RD, offset(RS1)   | RD   | addr(RD) | RS1| REQ_WR      |
| AMOADD.D rd,rs2,(rs1)| RD   | addr(RD) | RS1| REQ_ATOMADD |

The address(RD)refers to the ID or source address associated with the RD register in a RISC-V CPU. In a bus based architecture, this would generally be the host-id of the CPU.

### A.2 TileLink

### A.2.1 TileLink Overview

TileLink [[REF 1](#references)] is a chip-scale interconnect standard providing multiple masters (host) with coherent memory-mapped access to memory and other slave (device) devices.

Summary:

* provides a physically addressed, shared-memory system
* provides coherent access for an arbitrary mix of caching or non-caching masters
* has three conformance levels:
  * TL-UL: Uncached simple read/write operations of a single word (TL-UL)
  * TL-UH: Bursting read/write without support for coherent caches
  * TL-C: Complete cache coherency protocol
* has five separate channels
  * Channel A: Request messages sent to an address
  * Channel B: Request messages sent to a cached block (TL-C only)
  * Channel C: Response messages from a cached block (TL-C only)
  * Channel D: Response messages from an address
  * Channel E: Final handshake for cache block transfer (TL-C only)

### A.1.1 TileLink <-> UMI Mapping

This section outlines the recommended mapping between UMI transaction and the TileLink messages. Here, we only explore mapping TL/UH TileLink modes with UMI 64bit addressing and UMI bit mask support up to 128 bits.

| Symbol | Meaning                   | TileLink Name       |
|:------:|---------------------------|---------------------|
| C      | Data is corrupt           | {a,b,c,d,e}_corrupt |
| BMASK  | Mask (2^SIZE)/8 (strobe)  | {a,b,c,d,e}_mask    |
| HOSTID | Source ID                 | {a,b,c,d,e}_source  |

The following table shows the mapping between TileLink and UMI transactions, with TL-UL and TL-UH TileLink support. TL-C conformance is left for future development.

| TileLink Message| UMI Transaction |CMD[26:25]|
|-----------------|-----------------|----------|
| Get             | REQ_RD          | 0b00     |
| AccessAckData   | RESP_WR         | --       |
| PutFullData     | REQ_WR          | 0bC0     |
| PutPartialData  | REQ_WR          | 0bC0     |
| AccessAck       | RESP_WR         | --       |
| ArithmaticData  | REQ_ATOMIC      | 0b00     |
| LogicalData     | REQ_ATOMIC      | 0bC0     |
| Intent          | REQ_USER0       | 0b00     |
| HintAck         | RESP_USER0      | --       |

The TileLink has a single long N bit wide 'size' field, enabling 2^N to transfers per message. This is in contrast to UMI which has two fields: a SIZE field to indicate word size and a LEN field to indicate the number of words to be transferred. The number of bytes transferred by a UMI transaction is (2^SIZE)*(LEN+1).

The pseudo code below demonstrates one way of translating from the TileLink size and the UMI SIZE/LEN fields.

```c
if (tilelink_size<8){
   SIZE = tilelink_size;
   LEN = 0;
} else {
   SIZE = 7;
   LEN  = 2^(tilelink_size-8+1)-1
}
```
The TileLink master id and masking signals are mapped to the UMI SA field as shown in the table below.

| SA       |63:56 |55:48|47:40|39:32|31:24 |23:16|15:8 |7:0  |
|----------|:----:|:---:|:---:|:---:|:----:|:---:|:----|:---:|
| 64b mode |R     | R   | R   | U   | U    | U   |BMASK|BMASK|

The TileLink atomic operations encoded in the param field map to the UMI ATYPE field as follows.

| TileLink param |UMI ATYPE   |
|----------------|:----------:|
| MIN (0)        | ATOMICMIN  |
| MAX (1)        | ATOMICMAX  |
| MINU (2)       | ATOMICMINU |
| MAXU (3)       | ATOMICMAXU |
| XOR(0)         | ATOMICXOR  |
| OR  (1)        | ATOMICOR   |
| AND  (2)       | ATOMICAND  |
| SWAP  (3)      | ATOMICSWAP |

### A.2 AXI4

### A.2.1 AXI4 Overview

AXI is a transaction based memory access protocol with five independent channels:

* Write requests
* Write data
* Write response
* Read request
* Read data

Constraints:

* AXI transactions must not cross 4,096 Byte address boundaries
* The maximum transaction size is 4,096 Bytes

### A.2.2 AXI4 <-> UMI Mapping

The table below maps AXI terminology to UMI terminology.

| AXI             | UMI         |
|-----------------|-------------|
| Manager         | Host        |
| Subordinate     | Device      |
| Transaction     | Transaction |

The table below shows the mapping between the five AXI channels and UMI messages.

| AXI Channel     | UMI Message |
|-----------------|-------------|
| Write request   | REQ_WR      |
| Write data      | REQ_WR      |
| Write response  | RESP_WR     |
| Read request    | REQ_RD      |
| Read data       | RESP_RD     |

The AXI LEN, SIZE, ADDR, DATA, QOS, PROT[1:0], HOSTID, LOCK fields map directly to equivalent UMI CMD fields. See the tables below for mapping of other AXI signals to the SA fields:

 SA        |63:56 |55:48|47:40|39:32 |31:24   |23:16        |15:8|7:0 |
|----------|:----:|:---:|:---:|:----:|:------:|:-----------:|:--:|:--:|
| 64b mode |R     | R   | R   |U     |U,REGION|U,CACHE,BURST|STRB|STRB|
| 32b mode |--    | --  | --  | --   |R       |U,CACHE,BURST|STRB|STRB|

Restrictions:
 * PROT[2] is not supported.(set to 0)
 * Data width limited to 128 bits
 * HOSTID limited to 4 bits
 * REGION only supported in 64bit mode

### A.3 AXI Stream

### A.3.1 AXI Stream Overview

AXI-Stream is a point-to-point protocol, connecting a single Transmitter and a single Receiver.

### A.3.2 AXI Stream <-> UMI Mapping

The mapping between AXI stream and UMI is shown int he following tables.

| AXI             | SUMI signal|
|-----------------|------------|
| tvalid          | valid      |
| tready          | ready      |
| tdata           | DATA       |
| tlast           | EOF        |
| tid             | HOSTID     |
| tuser           | SA         |
| tkeep           | SA         |
| tstrb           | SA         |
| twakeup         | SA

 SA        |63:56 |55:48    |47:40|39:32|31:24 |23:16 |15:8 |7:0  |
|----------|:----:|:-------:|:---:|:---:|:----:|:----:|:---:|:---:|
| 64b mode |U     |U,TWAKEUP|TUSER|TDEST|TKEEP |TKEEP |TSTRB|TSTRB|
| 32b mode |--    | --      | --  | --  |TKEEP |TKEEP |TSTRB|TSTRB|

Restrictions:
 * Data width limited to 128 bits
 * TID limited to 4 bits
 * TDEST, TUSER, TWAKEUP only available in 64bit address mode.

## Appendix B: LUMI mapping to physical layer

The following examples are provided as reference for mapping LUMI over BoW, AIB and UCIe.

### B.1 Bunch of Wires mapping
LUMI over BoW will use BoW physical layer only. BoW physical layer does not have any framing to the data and therefore requires sending LUMI valid signal over a data lane.
The signal mapping is the following:

| BoW signal   | CLINK signal   | Description                          |
| ------------ | -------------- | ------------------------------------ |
| TX Data      | txdata + txvld | Data to transmit over BoW            |
| RX Data      | rxdata + rxvld | Data received over BoW               |
| Core clk     | clk[0]         | CLINK clock to be used as BoW clock  |

Other, optional, signals like FEC and AUX will not be used by LUMI.

### B.2 AIB mapping

AIB uses a simple, no framing data structure. When transporting LUMI over AIB the LUMI interface will connect to the AIB MAC interface.
The signal mapping for AIB MAC is the following:

| AIB signal   | CLINK signal | Description                          |
| ------------ | ------------ | ------------------------------------ |
| data_out     | txdata       | Data to transmit over AIB            |
| data_in      | rxdata       | Data received over AIB               |
| m_ns_fwd_clk | clk[0]       | CLINK clock to be used as AIB clock  |
| m_fw_fwd_clk | ------       | CLINK does not use Rx clock          |
| ns_mac_rdy   | txctrl[0]    | Valid signal for TX data             |
| fs_mac_rdy   | rxctrl[0]    | Valid signal for RX data             |

Other optional AIB Plus signals are not required for LUMI-AIB connection and will not be used.

### B.3 UCIe mapping

LUMI over UCIe will use UCIe Raw Die-to-Die interface (RDI).
The signal mapping for RDI is the following:

| UCIe signal    | CLINK signal | Description                          |
| -------------- | ------------ | ------------------------------------ |
| lclk           | clk[0]       | clock                                |
| lp_irdy        | txctrl[0]    | data ready signal - same as valid    |
| lp_valid       | txctrl[0]    | data valid indication                |
| lp_data        | txdata       | data to be transmitted               |
| lp_retimer_crd | ------       | Not used (for retimer only)          |
| pl_trdy        | ------       | Not used (FC handled at CLINK level) |
| pl_valid       | rxctrl[0]    | data valid from phy                  |
| pl_data        | rxdata       | data from phy                        |
| pl_retimer_crd | ------       | Not used (for retimer only)          |

UCIe also requires implementing other phy control logic to maintain the link. The following signals will be handled by the UCIe<->CLINK bridge and not exposed to the CLINK. They should handled and set before the link is declaered active.

- lp_state_req
- lp_linkerror
- pl_state_sts
- pl_inband_pres
- pl_error
- pl_cerror
- pl_nferror
- pl_trainerror
- pl_phyinrecenter
- pl_stallreq

----
## References

[1] [TileLink Specification (version 1.7)](https://static.dev.sifive.com/docs/tilelink/tilelink-spec-1.7-draft.pdf)

[2] [AMBA4 AXI Protocol Specification (22 February 2013, Version E)](https://developer.arm.com/documentation/ihi0022/e)

[3] [AMBA4 AXI Stream Protocol Specification (09 April 2021, Version A)](https://developer.arm.com/documentation/ihi0051/a)

[4] [AMBA4 APB Protocol Specification (13 April 2010, Version C)](https://developer.arm.com/documentation/ihi0024/c)

## License

[Apache License 2.0](LICENSE)

## Contributing

UMI is an open-source project and welcomes contributions. To find out
how to contribute to the project, see our
[Contributing Guidelines.](./CONTRIBUTING.md)

## Issues / Bugs

We use [GitHub Issues](https://github.com/zeroasiccorp/umi/issues)
for tracking requests and bugs.
