
Getting Started Guide: Open FPGA Stack for Intel Stratix 10

Last updated: May 13, 2024

1.0 Introduction

This document helps users get started in evaluating Open FPGA Stack (OFS) for Stratix® 10 FPGA targeting the Intel® FPGA PAC D5005. After reviewing the document a user shall be able to:

  • Set up a development environment with all OFS ingredients
  • Build and install the OFS Linux Kernel drivers on the host
  • Build and install the Open Programmable Acceleration Engine Software Development Kit (OPAE SDK) on the host
  • Flash an OFS FIM binary onto the Intel® FPGA PAC D5005
  • Verify the functionality of OFS on an Intel® FPGA PAC D5005 board
  • Know where to find additional information on all OFS ingredients

The following flow charts show a high level overview of the initial bring-up process, split into three sequential diagrams.

Diagram 1: Installing the OPAE SDK


Diagram 2: Installing the Linux DFL Drivers


Diagram 3: Bringing up the Intel D5005


1.1 Intended Audience

The information in this document is intended for customers evaluating the Open FPGA Stack for Stratix® 10 FPGA on the Intel PAC D5005. This document covers key topics related to initial setup and development, with links for deeper dives on the topics discussed herein.

1.2 Terminology

| Term | Description |
| --- | --- |
| AER | Advanced Error Reporting, The PCIe AER driver is the extended PCI Express error reporting capability providing more robust error reporting. |
| AFU | Accelerator Functional Unit, Hardware Accelerator implemented in FPGA logic which offloads a computational operation for an application from the CPU to improve performance. Note: An AFU region is the part of the design where an AFU may reside. This AFU may or may not be a partial reconfiguration region. |
| BBB | Basic Building Block, Features within an AFU or part of an FPGA interface that can be reused across designs. These building blocks do not have stringent interface requirements like the FIM's AFU and host interface requires. All BBBs must have a globally unique identifier (GUID). |
| BKC | Best Known Configuration, The exact hardware configuration Intel has optimized and validated the solution against. |
| BMC | Board Management Controller, Acts as the Root of Trust (RoT) on the Intel FPGA PAC platform. Supports features such as power sequence management and board monitoring through on-board sensors. |
| CSR | Command/status registers (CSR) and software interface, OFS uses a defined set of CSRs to expose the functionality of the FPGA to the host software. |
| DFL | Device Feature List, A concept inherited from OFS. The DFL drivers provide support for FPGA devices that are designed to support the Device Feature List. The DFL, which is implemented in RTL, consists of a self-describing data structure in PCI BAR space that allows the DFL driver to automatically load the drivers required for a given FPGA configuration. |
| FIM | FPGA Interface Manager, Provides platform management, functionality, clocks, resets and standard interfaces to host and AFUs. The FIM resides in the static region of the FPGA and contains the FPGA Management Engine (FME) and I/O ring. |
| FME | FPGA Management Engine, Provides a way to manage the platform and enable acceleration functions on the platform. |
| HEM | Host Exerciser Module, Host exercisers are used to exercise and characterize the various host-FPGA interactions, including Memory Mapped Input/Output (MMIO), data transfer from host to FPGA, PR, host to FPGA memory, etc. |
| Intel FPGA PAC D5005 | Intel FPGA Programmable Acceleration Card D5005, A high performance PCI Express (PCIe)-based FPGA acceleration card for data centers. This card is the target platform for the initial OFS release. |
| Intel VT-d | Intel Virtualization Technology for Directed I/O, Extension of the VT-x and VT-I processor virtualization technologies which adds new support for I/O device virtualization. |
| IOCTL | Input/Output Control, System calls used to manipulate underlying device parameters of special files. |
| JTAG | Joint Test Action Group, Refers to the IEEE 1149.1 JTAG standard; Another FPGA configuration methodology. |
| MMIO | Memory Mapped Input/Output, Users may map and access both control registers and system memory buffers with accelerators. |
| OFS | Open FPGA Stack, A modular collection of hardware platform components, open source software, and broad ecosystem support that provides a standard and scalable model for AFU and software developers to optimize and reuse their designs. |
| OPAE SDK | Open Programmable Acceleration Engine Software Development Kit, A collection of libraries and tools to facilitate the development of software applications and accelerators using OPAE. |
| PAC | Programmable Acceleration Card, FPGA-based accelerator card. |
| PIM | Platform Interface Manager, An interface manager that comprises two components: a configurable platform specific interface for board developers and a collection of shims that AFU developers can use to handle clock crossing, response sorting, buffering and different protocols. |
| PR | Partial Reconfiguration, The ability to dynamically reconfigure a portion of an FPGA while the remaining FPGA design continues to function. In the context of Intel FPGA PAC, a PR bitstream refers to an Intel FPGA PAC AFU. Refer to the Partial Reconfiguration support page. |
| RSU | Remote System Update, A Remote System Update operation sends an instruction to the Intel FPGA PAC D5005 device that triggers a power cycle of the card only, forcing reconfiguration. |
| SR-IOV | Single-Root Input-Output Virtualization, Allows the isolation of PCI Express resources for manageability and performance. |
| TB | Testbench, Testbench or Verification Environment is used to check the functional correctness of the Design Under Test (DUT) by generating and driving a predefined input sequence to a design, capturing the design output and comparing it with the expected output. |
| UVM | Universal Verification Methodology, A modular, reusable, and scalable testbench structure via an API framework. |
| VFIO | Virtual Function Input/Output, An IOMMU/device agnostic framework for exposing direct device access to userspace. |

1.3 Reference Documents

1.4 Component Version Summary

The OFS 2024.1-1 Release targeting the Stratix® 10 FPGA is built upon tightly coupled software and firmware versions. Use this section as a general reference for the versions which comprise this release.

The following table highlights the hardware which makes up the Best Known Configuration (BKC) for the OFS 2024.1-1 release.

Table 1-2: Hardware BKC

Component
1 x Intel® FPGA PAC D5005
1 x Supported Server Model
1 x Intel FPGA Download Cable II (Optional, only required if loading images via JTAG)

The following table highlights the versions of the software which comprise the OFS stack. The installation of the user-space OPAE SDK on top of the kernel-space linux-dfl drivers is discussed in subsequent sections of this document.

Table 1-3: Software Version Summary

| Component | Version |
| --- | --- |
| FPGA Platform | Intel® FPGA PAC D5005 |
| OPAE SDK | Tag: 2.12.0-5 |
| Kernel Drivers | Tag: ofs-2024.1-6.1-2 |
| OFS FIM Source Code | Branch: ofs-2024.1-1 |
| Intel Quartus Prime Pro Edition Design Software | 23.4 [Intel® Quartus® Prime Pro Edition Linux] |
| Operating System | RHEL 8.6 |

A download page containing the release and already-compiled FIM binary artifacts that you can use for immediate evaluation on the Intel® FPGA PAC D5005 can be found on the OFS 2024.1-1 official release drop on GitHub.

Note: If you wish to freeze your Red Hat operating system version on RHEL 8.6, refer to the following solution provided in the Red Hat customer portal.

2.0 OFS Stack Architecture Overview for Reference Platform

2.1 Hardware Components

The OFS hardware architecture decomposes all designs into a standard set of modules, interfaces, and capabilities. Although the OFS infrastructure provides a standard set of functionality and capability, the user is responsible for making the customizations to their specific design in compliance with the specifications outlined in the Shell Technical Reference Manual: OFS for Stratix® 10 PCIe Attach FPGAs.

OFS is a blanket term which can be used to collectively refer to all ingredients of the OFS reference design, which includes the core hardware components discussed below and software.

2.1.1 FPGA Interface Manager

The FPGA Interface Manager (FIM) or 'shell' provides platform management functionality, clocks, resets, and interface access to the host and peripheral features on the acceleration platform. The FIM is implemented in a static region of the FPGA device.

The primary components of the FIM reference design are:

  • PCIe Subsystem
  • Transceiver Subsystem
  • Memory Subsystem
  • FPGA Management Engine
  • AFU Peripheral Fabric for AFU accesses to other interface peripherals
  • Board Peripheral Fabric for master to slave CSR accesses from host or AFU
  • Interface to Board Management Controller (BMC)

The FPGA Management Engine (FME) provides management features for the platform and the loading/unloading of accelerators through partial reconfiguration.

For more information on the FIM and its external connections, please refer to the Shell Technical Reference Manual: OFS for Stratix® 10 PCIe Attach FPGAs, and the Intel FPGA Programmable Acceleration Card D5005 Data Sheet. Below is a high-level block diagram of the FIM.

Figure 2-1 FIM Overview


2.1.2 AFU

An AFU is an acceleration workload that interfaces to the FIM. The AFU boundary in this reference design comprises both static and partial reconfiguration (PR) regions. You can decide how you want to partition these two areas or if you want your AFU region to only be a partial reconfiguration region. A port gasket within the design provides all the PR specific modules and logic required for partial reconfiguration. Only one partial reconfiguration region is supported in this design.

Similar to the FME, the port gasket exposes its capabilities to the host software driver through a DFH register placed at the beginning of the port gasket CSR space. In addition, only one PCIe link can access the port register space.

You can compile your design in one of the following ways:

  • Your AFU resides in a partial reconfiguration (PR) region of the FPGA.
  • Your AFU is a part of the static region (SR) and is a compiled flat design.
  • Your AFU contains both static and PR regions.

The AFU provided in this release comprises the following functions:

  • AFU interface handler to verify transactions coming from the AFU region.
  • PF/VF Mux to route transactions to and from corresponding AFU components, including the ST2MM module, PCIe loopback host exerciser (HE-LB), HSSI host exerciser (HE-HSSI), and Memory Host Exerciser (HE-MEM).
  • AXI4 Streaming to Memory Map (ST2MM) Module that routes MMIO CSR accesses to FME and board peripherals.
  • Host exercisers to test PCIe, memory and HSSI interfaces (these can be removed from the AFU region after your FIM design is complete to provide more resource area for workloads).
  • Port gasket and partial reconfiguration support.

For more information on the Platform Interface Manager (PIM) and AFU development and testing, please refer to the [OFS AFU Development Guide].

2.2 OFS Software Overview

2.2.1 Kernel Drivers for OFS

OFS DFL driver software provides the bottom-most API to FPGA platforms. Libraries such as OPAE and frameworks like DPDK are consumers of the APIs provided by OFS. Applications may be built on top of these frameworks and libraries. The OFS software does not cover any out-of-band management interfaces. OFS driver software is designed to be extendable and flexible, and to provide both bare-metal and virtualized functionality. An in-depth look at the various aspects of the driver architecture, such as the API, an explanation of the DFL framework, and instructions on how to port DFL driver patches to other kernel distributions, can be found on the DFL Wiki page.

3.0 Intel FPGA PAC D5005 Card Installation and Server Requirements

Currently OFS for Stratix® 10 FPGA targets the Intel® FPGA PAC D5005. Because the Intel® FPGA PAC D5005 is a production card, you must prepare the card in order to receive a new non-production bitstream. For these instructions, please contact an Intel representative.

Board installation guidelines for this platform are detailed in the Board Installation Guide: OFS for Acceleration Development Platforms.

4.0 OFS DFL Kernel Drivers

4.1 OFS DFL Kernel Driver Installation

All OFS DFL kernel driver code resides in the Linux DFL GitHub repository. This repository is open source and does not require any permissions to access. It includes a snapshot of the latest best-known configuration (BKC) Linux kernel with the OFS driver included in the drivers/fpga/* directory. Downloading, configuration, and compilation will not be discussed in this document. Please refer to the Software Installation Guide: OFS for PCIe Attach FPGAs for guidelines on environment setup and build steps for all OFS stack components.

The DFL driver suite can be automatically installed using a supplied Python 3 installation script. This script ships with a README detailing execution instructions on the OFS 2024.1-1 Release Page.

It is recommended that you boot into your operating system's native 4.18.x kernel before attempting to upgrade to the DFL-enabled 6.1.78 kernel. You may experience issues when moving between two DFL-enabled 6.1.78 kernels.
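The recommendation above can be checked before starting an upgrade. The following is a minimal sketch, not part of the OFS tooling: the helper names are hypothetical, and it assumes the running kernel's release string (as reported by `uname -r`) begins with `4.18.` for the native kernel and with `6.1.78` for the DFL-enabled kernel.

```python
import platform


def is_native_kernel(release: str) -> bool:
    """True if the release string looks like the OS-native 4.18.x kernel."""
    return release.startswith("4.18.")


def is_dfl_kernel(release: str, dfl_prefix: str = "6.1.78") -> bool:
    """True if the release string starts with the DFL kernel version.

    The prefix is an assumption about how the DFL kernel names itself.
    """
    return release.startswith(dfl_prefix)


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    if is_native_kernel(running):
        print(f"{running}: native kernel, safe starting point for the upgrade")
    elif is_dfl_kernel(running):
        print(f"{running}: already DFL-enabled, consider booting 4.18.x first")
    else:
        print(f"{running}: unrecognized kernel, check the BKC before continuing")
```

Running the script on the host before installing the drivers gives a quick indication of which kernel is currently booted.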

This installation process assumes the user has access to an internet connection in order to pull specific GitHub repositories, and to satisfy package dependencies.

5.0 OPAE Software Development Kit

The OPAE SDK software stack sits in user space on top of the OFS kernel drivers. It is a common software infrastructure layer that simplifies and streamlines integration of programmable accelerators such as FPGAs into software applications and environments. OPAE consists of a set of drivers, user-space libraries, and tools to discover, enumerate, share, query, access, manipulate, and reconfigure programmable accelerators. OPAE is designed to support a layered, common programming model across different platforms and devices.

The OPAE SDK source code is contained within a single GitHub repository hosted at the OPAE GitHub. This repository is open source.

5.1 OPAE SDK Installation

This document does not cover the installation process for the OPAE SDK - please refer to the Software Installation Guide: OFS for PCIe Attach FPGAs for official guidelines on software building.

You may choose to use the supplied Python 3 installation script to handle OPAE SDK installation. This script ships with a README detailing execution instructions on the OFS 2024.1-1 Release Page.

This installation process assumes the user has access to an internet connection in order to pull specific GitHub repositories, and to satisfy package dependencies.

5.2 OPAE Tools Overview

The OPAE SDK user-space tools sit upon the kernel-space DFL drivers. In order to use OPAE SDK functionality, you need to have installed both the OPAE SDK and the Linux DFL driver set. You must have at least one D5005 card with the appropriate FIM present in your system. The steps to read and load a new FIM version are discussed in section 7.1 Programming the OFS FIM. After both the OPAE SDK and the DFL kernel-space drivers have been installed and the FIM has been upgraded, you may proceed to test the OPAE commands discussed below.

This section covers basic functionality of the commonly used OPAE tools and their expected results. These steps may also be used to verify that all OFS software installation has been completed successfully. A complete overview of the OPAE tools can be found on the OPAE GitHub and in your cloned GitHub repo at <your path>/opae-sdk/doc/src/fpga_tools. More commands are listed than are defined in the list below - most of these are called by other tools and do not need to be called directly themselves.

5.2.1 fpgasupdate

The fpgasupdate tool updates the Intel Max10 BMC image and firmware, root entry hash, and FPGA Static Region (SR) and user image (PR). The fpgasupdate tool only accepts images that have been formatted using PACSign. If a root entry hash has been programmed onto the board, then the image must also be signed using the correct keys. Please refer to the [Security User Guide: Intel Open FPGA Stack] for information on creating signed images and on programming and managing the root entry hash.

All Intel FPGA PAC cards ship with a factory image and a user-programmed image for the FIM and for the BMC firmware and RTL.

Table 5-1: fpgasupdate Overview

Synopsis:

fpgasupdate [--log-level=<level>] file [bdf]

Description: The fpgasupdate command implements a secure firmware update.

| Command args (optional) | Description |
| --- | --- |
| --log-level | Specifies the log level, which controls how much information is printed to your terminal. The following seven levels are available: state, ioctl, debug, info, warning, error, critical. Setting --log-level=state provides the most verbose output. Setting --log-level=ioctl provides the second most information, and so on. The default level is info. |
| file | Specifies the secure update firmware file to be programmed. This file may program a static region (SR), programmable region (PR), root entry hash, key cancellation, or other device-specific firmware. |
| bdf | The PCIe address of the PAC to program. bdf is of the form [ssss:]bb:dd.f, corresponding to PCIe segment, bus, device, function. The segment is optional. If you do not specify a segment, the segment defaults to 0000. If the system has only one PAC, you can omit the bdf and let fpgasupdate determine the address automatically. |
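To illustrate the bdf argument described above, here is a minimal Python sketch that validates an address and applies the default segment of 0000. It is not taken from fpgasupdate itself; the function name `normalize_bdf` is hypothetical, and the pattern assumes the dotted `ssss:bb:dd.f` form that lspci and fpgainfo print elsewhere in this guide.

```python
import re

# Optional 4-hex-digit segment, then bus:device.function.
_BDF_RE = re.compile(
    r"^(?:(?P<segment>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<device>[0-9a-fA-F]{2})\."
    r"(?P<function>[0-7])$"
)


def normalize_bdf(text: str) -> str:
    """Return the address as lowercase ssss:bb:dd.f, defaulting the segment to 0000."""
    match = _BDF_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCIe address: {text!r}")
    segment = match.group("segment") or "0000"  # segment is optional
    bus, device, function = match.group("bus", "device", "function")
    return f"{segment}:{bus}:{device}.{function}".lower()
```

A script that wraps fpgasupdate could pass user input through such a helper first, so that a malformed address is rejected before any update is attempted.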

5.2.2 fpgainfo

Synopsis:

   fpgainfo [-h] [-S <segment>] [-B <bus>] [-D <device>] [-F <function>] [PCI_ADDR]
            {errors,power,temp,fme,port,bmc,mac,phy,security}

Description: Displays FPGA information derived from sysfs files. The command argument is one of the following: errors, power, temp, fme, port, bmc, mac, phy, or security. Some commands may also have other arguments or options that control their behavior.

For systems with multiple FPGA devices, you can specify the BDF to limit the output to the FPGA resource with the corresponding PCIe configuration. If not specified, information displays for all resources for the given command.

| Command args (optional) | Description |
| --- | --- |
| --help, -h | Prints help information and exits. |
| --version, -v | Prints version information and exits. |
| -S, --segment | PCIe segment number of resource. |
| -B, --bus | PCIe bus number of resource. |
| -D, --device | PCIe device number of resource. |
| -F, --function | PCIe function number of resource. |
| errors {fme, port, all} --clear, -c | The first argument to the errors command specifies the resource type to display in human readable format. The second, optional argument clears errors for the given FPGA resource. |
| power | Provides total power in watts that the FPGA hardware consumes. |
| temp | Provides FPGA temperature values in degrees Celsius. |
| port | Provides information about the port. |
| fme | Provides information about the FME. |
| bmc | Provides BMC sensors information. |
| mac | Provides information about MAC ROM connected to FPGA. |
| security | Provides information about the security keys, hashes, and flash count, if available. |

Note: Your Bitstream Id and Pr Interface Id may not match the examples below.

The following examples walk through sample outputs generated by fpgainfo.

$ sudo fpgainfo fme

Open FPGA Stack Platform
Board Management Controller, MAX10 NIOS FW version: 2.0.8
Board Management Controller, MAX10 Build version: 2.0.8
//****** FME ******//
Object Id                        : 0xF000000
PCIe s:b:d.f                     : 0000:3B:00.0
Vendor Id                        : 0x8086
Device Id                        : 0xBCCE
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
Ports Num                        : 01
Bitstream Id                     : 288511860124977321
Bitstream Version                : 4.0.1
Pr Interface Id                  : a195b6f7-cf23-5a2b-8ef9-1161e184ec4e
Boot Page                        : user

$ sudo fpgainfo bmc

Open FPGA Stack Platform
Board Management Controller, MAX10 NIOS FW version: 2.0.8
Board Management Controller, MAX10 Build version: 2.0.8
//****** BMC SENSORS ******//
Object Id                        : 0xF000000
PCIe s:b:d.f                     : 0000:3B:00.0
Vendor Id                        : 0x8086
Device Id                        : 0xBCCE
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
Ports Num                        : 01
Bitstream Id                     : 288511860124977321
Bitstream Version                : 4.0.1
Pr Interface Id                  : a195b6f7-cf23-5a2b-8ef9-1161e184ec4e
( 1) VCCERAM Voltage                                    : 0.90 Volts
( 2) VCCT Temperature                                   : 29.00 Celsius
( 3) 12v Backplane Voltage                              : 12.17 Volts
( 4) VCCERAM Current                                    : 0.18 Amps
( 5) FPGA Transceiver Temperature                       : 36.50 Celsius
( 6) QSFP1 Supply Voltage                               : 0.00 Volts
( 7) 3.3v Temperature                                   : 29.00 Celsius
( 8) 12v Backplane Current                              : 2.28 Amps
( 9) RDIMM3 Temperature                                 : 25.50 Celsius
(10) VCCR Voltage                                       : 1.12 Volts
(11) Board Inlet Air Temperature                        : 24.50 Celsius
(12) 1.8v Temperature                                   : 27.50 Celsius
(13) 12v AUX Voltage                                    : 12.14 Volts
(14) VCCR Current                                       : 0.55 Amps
(15) RDIMM0 Temperature                                 : 24.50 Celsius
(16) FPGA Core Voltage                                  : 0.88 Volts
(17) VCCERAM Temperature                                : 27.50 Celsius
(18) 12v AUX Current                                    : 1.19 Amps
(19) QSFP0 Temperature                                  : N/A
(20) VCCT Voltage                                       : 1.12 Volts
(21) FPGA Core Current                                  : 11.60 Amps
(22) FPGA Core Temperature                              : 42.50 Celsius
(23) 12v Backplane Temperature                          : 24.00 Celsius
(24) VCCT Current                                       : 0.14 Amps
(25) RDIMM1 Temperature                                 : 24.00 Celsius
(26) 3.3v Voltage                                       : 3.30 Volts
(27) VCCR Temperature                                   : 33.50 Celsius
(28) 1.8v Voltage                                       : 1.80 Volts
(29) 3.3v Current                                       : 0.32 Amps
(30) Board Exhaust Air Temperature                      : 26.00 Celsius
(31) 12v AUX Temperature                                : 25.00 Celsius
(32) QSFP0 Supply Voltage                               : 0.00 Volts
(33) QSFP1 Temperature                                  : N/A
(34) 1.8v Current                                       : 0.54 Amps
(35) RDIMM2 Temperature                                 : 26.00 Celsius
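
When these reports need to be checked from a script, the `name : value` layout shown above is straightforward to parse. The following Python sketch is illustrative only: `parse_fpgainfo` is a hypothetical helper, and it assumes the output format matches the samples in this section, which may change between OPAE SDK versions.

```python
import re

# A field is "<name><spaces>:<space><value>"; names may themselves contain ':'
# (for example "PCIe s:b:d.f"), so the separator must be surrounded by whitespace.
_FIELD_RE = re.compile(r"^(?P<key>.+?)\s+:\s+(?P<value>.*)$")
# Sensor names are prefixed with an index such as "( 1)" or "(35)".
_SENSOR_RE = re.compile(r"^\(\s*\d+\)\s*(?P<name>.+)$")


def parse_fpgainfo(text):
    """Split fpgainfo text into (fields, sensors).

    fields maps names to strings; sensors maps names to (value, unit),
    or to None when the reading is not numeric (for example "N/A").
    Banner lines without a spaced ' : ' separator are skipped.
    """
    fields, sensors = {}, {}
    for line in text.splitlines():
        match = _FIELD_RE.match(line.strip())
        if match is None:
            continue
        key, value = match.group("key"), match.group("value").strip()
        sensor = _SENSOR_RE.match(key)
        if sensor is None:
            fields[key] = value
            continue
        reading, _, unit = value.partition(" ")
        try:
            sensors[sensor.group("name")] = (float(reading), unit)
        except ValueError:
            sensors[sensor.group("name")] = None
    return fields, sensors
```

The text to parse could be captured with `subprocess.run(["sudo", "fpgainfo", "bmc"], capture_output=True, text=True).stdout` on a host where the tools are installed.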

5.2.3 rsu

The rsu command performs a Remote System Update operation on a device, given its PCIe address. An rsu operation sends an instruction to the device to trigger a power cycle of the card only. This forces reconfiguration from flash for either the BMC or the FPGA.

The Intel FPGA PAC contains a region of flash in which the user may store their FIM image. After an image has been programmed with fpgasupdate, the user may choose to perform an rsu to load the new image onto the device.

Note: The D5005 platform only supports storing and configuring a single user image from flash for the FPGA. It does not include support for the user1/user2 partitions as shown in other OFS related acceleration boards.

rsu Overview

Synopsis

rsu [-h] [-d] {bmc,bmcimg,retimer,sdm,fpgadefault} [PCIE_ADDR]
rsu bmc --page=(user) [PCIE_ADDR]
rsu retimer [PCIE_ADDR]
rsu sdm [PCIE_ADDR]

Description: Performs a Remote System Update (RSU) operation on a PAC device, given its PCIe address. An RSU operation sends an instruction to the device to trigger a power cycle of the card only. This forces reconfiguration from flash for the BMC, Retimer, or SDM (on devices that support these), or for the FPGA.

Note: As a result of using the rsu command, the host rescans the PCI bus and may assign a different Bus/Device/Function (B/D/F) value than the originally assigned value.

5.2.4 PACSign

PACSign is an OPAE utility which allows users to insert authentication markers into bitstreams targeted for the platform. All binary images must be signed using PACSign before fpgasupdate can use them for an update. Assuming no Root Entry Hash (REH) has been programmed on the device, the following examples demonstrate how to prepend the required secure authentication data and specify which region of flash to update. More information, including charts detailing the different certification types and their required options, is available in python/pacsign/PACSign.md in the OPAE SDK repository on GitHub.

Table 5-4: PACSign Overview

Synopsis:

PACSign [-h] {FIM,SR,SR_TEST,BBS,BMC,BMC_FW,BMC_FACTORY,AFU,PR,PR_TEST,GBS,FACTORY,PXE,THERM_SR,THERM_PR} ...

PACSign <CMD> [-h] -t {UPDATE,CANCEL,RK_256,RK_384} -H HSM_MANAGER [-C HSM_CONFIG] [-s SLOT_NUM] [-r ROOT_KEY] [-k CODE_SIGNING_KEY] [-d CSK_ID] [-R ROOT_BITSTREAM] [-S] [-i INPUT_FILE] [-o OUTPUT_FILE] [-b BITSTREAM_VERSION] [-y] [-v]

Description: The PACSign utility inserts authentication markers into bitstreams.

| Command args (optional) | Description |
| --- | --- |
| (required) -t, --cert_type TYPE | The following operations are supported: UPDATE, CANCEL, RK_256, RK_384. |
| (required) -H, --HSM_manager MODULE | The module name for a module that interfaces to an HSM. PACSign includes both the openssl_manager and pkcs11_manager to handle keys and signing operations. |
| -C, --HSM_config CONFIG | The argument to this option is passed verbatim to the specified HSM. For pkcs11_manager, this option specifies a JSON file describing the PKCS #11 capable HSM’s parameters. |
| -r, --root_key KEY_ID | The key identifier that the HSM uses to identify the root key to be used for the selected operation. |
| -k, --code_signing_key KEY_ID | The key identifier that the HSM uses to identify the code signing key to be used for the selected operation. |
| -d, --csk_id CSK_NUM | Only used for type CANCEL. Specifies the key number of the code signing key to cancel. |
| -s, --slot_num | For bitstream types with multiple slots (i.e. multiple ST regions), this option specifies which slot this bitstream acts upon. |
| -b, --bitstream_version VERSION | User-formatted version information. This can be any string up to 32 bytes in length. |
| -S, --SHA384 | Specifies that PACSign is to use 384-bit crypto. The default is 256-bit. |
| -R, --ROOT_BITSTREAM ROOT_BITSTREAM | Valid when verifying bitstreams. The verification step ensures that the generated bitstream can be loaded on a board with the specified root entry hash programmed. |
| -i, --input_file FILE | Only to be used for UPDATE operations. Specifies the file name containing data to be signed. |
| -o, --output_file FILE | Specifies the file name for the signed output bitstream. |
| -y, --yes | Silently answer all queries from PACSign in the affirmative. |
| -v, --verbose | Can be specified multiple times to increase the verbosity of PACSign. Specifying it once enables non-fatal warnings; twice enables progress information; three or more occurrences enable very verbose debugging information. |
| -h | Prints help information and exits. |
| {FIM, SR, SR_TEST, BBS, BMC, BMC_FW, BMC_FACTORY, AFU, PR, PR_TEST, GBS, FACTORY, PXE, THERM_SR, THERM_PR} | Bitstream type identifier. |

PACSign can be run on images that have previously been signed. It will overwrite any existing authentication data.

The following example will create an unsigned SR image from an existing signed SR binary update image.

PACSign SR -t UPDATE -s 0 -H openssl_manager -i d5005_page1_unsigned.bin -o new_image.bin
#output
No root key specified.  Generate unsigned bitstream? Y = yes, N = no: y
No CSK specified.  Generate unsigned bitstream? Y = yes, N = no: y
No root entry hash bitstream specified.  Verification will not be done.  Continue? Y = yes, N = no: y
2022-07-20 10:13:54,954 - PACSign.log - WARNING - Bitstream is already signed - removing signature blocks
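
For automation, the same invocation can be assembled as an argument list and handed to a process launcher. The sketch below is a hedged illustration: `pacsign_unsigned_sr_cmd` is a hypothetical helper, and it only uses options listed in the table above (including `-y`, which answers the three prompts shown in the sample output in the affirmative).

```python
import shlex


def pacsign_unsigned_sr_cmd(input_file, output_file, assume_yes=False):
    """Build the argv list for the unsigned SR example shown above.

    No root key or CSK is passed, so PACSign asks whether to generate an
    unsigned bitstream; assume_yes appends -y to answer those queries.
    """
    cmd = [
        "PACSign", "SR",
        "-t", "UPDATE",
        "-s", "0",
        "-H", "openssl_manager",
        "-i", input_file,
        "-o", output_file,
    ]
    if assume_yes:
        cmd.append("-y")
    return cmd


if __name__ == "__main__":
    argv = pacsign_unsigned_sr_cmd("d5005_page1_unsigned.bin", "new_image.bin", assume_yes=True)
    # Print the command instead of running it; use subprocess.run(argv, check=True)
    # on a host where the OPAE SDK is installed.
    print(shlex.join(argv))
```

Passing a list rather than a single shell string avoids quoting problems when file names contain spaces.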

5.2.5 bitstreaminfo

Displays the authentication information contained in each file provided on the command line. This includes any JSON header strings, authentication header block information, and a small portion of the payload. The binary is installed by default at /usr/bin/bitstreaminfo.

5.2.6 hssi

The hssi application provides a means of interacting with the 10G and 100G HSSI AFUs. In both 10G and 100G operating modes, the application initializes the AFU and completes the desired transfer as described by the mode-specific options. Only the hssi_10g mode is currently supported. An example of this command's output can be found in section 5.2.9 Running the Host Exerciser Modules. The binary is installed by default at /usr/bin/hssi.

5.2.7 opae.io

opae.io is an interactive Python environment packaged on top of libopaevfio.so, which provides user space access to PCIe devices via the vfio-pci driver. The main feature of opae.io is its built-in Python command interpreter, along with some Python bindings that provide a means to access Configuration and Status Registers (CSRs) that reside on the PCIe device. opae.io has two operating modes: command line mode and interactive mode. An example of this command's output can be found in section 5.2.9 Running the Host Exerciser Modules. The binary is installed by default at /usr/bin/opae.io.

5.2.8 host_exerciser

The host exerciser is used to exercise and characterize the various host-FPGA interactions, e.g. MMIO, data transfer from host to FPGA, PR, and host to FPGA memory. An example of this command's output can be found in section 5.2.9 Running the Host Exerciser Modules. The binary is installed by default at /usr/bin/host_exerciser. For more information, refer to the Host Exerciser documentation.

5.2.9 Running the Host Exerciser Modules

The reference FIM and unchanged compilations contain Host Exerciser Modules (HEMs). These are used to exercise and characterize the various host-FPGA interactions, including Memory Mapped Input/Output (MMIO), data transfer from host to FPGA, PR, host to FPGA memory, etc.

Note: Before continuing, if huge pages are not set refer to section 4.2, step 7.

There are three HEMs present in the OFS FIM: HE-LPBK, HE-HSSI, and HE-MEM. These exercisers are tied to three different VFs that must be enabled before they can be used. Enable the VF for each HEM using the steps below:

1. Determine the BDF of the Intel® FPGA PAC D5005 card.

The PCIe BDF address is initially determined when the server powers on. The user can determine the addresses of all Intel® FPGA PAC D5005 boards using lspci:

lspci -d :bcce

3b:00.0 Processing accelerators: Intel Corporation Device bcce (rev 01)
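
On hosts with several cards, the lspci listing can be turned into a list of full addresses for later steps. This is a small sketch rather than an OPAE facility: `parse_lspci` is a hypothetical helper, and it assumes that an address printed without a segment belongs to segment 0000, which is how lspci abbreviates addresses when no other segment is in use.

```python
def parse_lspci(text):
    """Return (address, description) pairs from lspci output.

    Addresses printed as bb:dd.f are expanded to 0000:bb:dd.f (assumed
    default segment); addresses that already carry a segment are kept.
    """
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        address, _, description = line.partition(" ")
        if address.count(":") == 1:
            address = "0000:" + address
        devices.append((address.lower(), description))
    return devices
```

The first element of each pair is in the form used by the commands in the following steps, for example 0000:3b:00.0.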

Note: Before continuing, if you updated your OFS installation, please also update your PAC FIM to run HEMs.

2. Enable three VFs.

In this example, the BDF address is 0000:3b:00.0. With this information the user can now enable three VFs with the following:

sudo pci_device 0000:3b:00.0 vf 3

3. Verify that all three VFs have been created.

lspci -s 3b:00

3b:00.0 Processing accelerators: Intel Corporation Device bcce (rev 01)
3b:00.1 Processing accelerators: Intel Corporation Device bccf (rev 01)
3b:00.2 Processing accelerators: Intel Corporation Device bccf (rev 01)
3b:00.3 Processing accelerators: Intel Corporation Device bccf (rev 01)
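
The check in this step can also be scripted. The sketch below uses hypothetical helper names and assumes, as in the listing above, that the VFs appear as the consecutive function numbers following the PF on the same bus and device; other platforms may number VFs differently.

```python
def expected_vf_addresses(pf_bdf, num_vfs):
    """List the VF addresses expected after enabling num_vfs on pf_bdf.

    Assumption: VFs take the function numbers directly after the PF.
    """
    prefix, _, function = pf_bdf.rpartition(".")
    first = int(function) + 1
    if num_vfs < 0 or first + num_vfs - 1 > 7:
        raise ValueError("function numbers are limited to 0-7 in this sketch")
    return [f"{prefix}.{first + i}" for i in range(num_vfs)]


def _short(address):
    """Drop a leading segment so '0000:3b:00.1' and '3b:00.1' compare equal."""
    return address.split(":", 1)[1] if address.count(":") == 2 else address


def missing_vfs(lspci_output, pf_bdf, num_vfs):
    """Return the expected VF addresses that do not appear in lspci output."""
    seen = {
        _short(line.split()[0].lower())
        for line in lspci_output.splitlines()
        if line.strip()
    }
    return [
        vf for vf in expected_vf_addresses(pf_bdf, num_vfs)
        if _short(vf.lower()) not in seen
    ]
```

An empty list from `missing_vfs` means every expected VF was found in the listing.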

4. Bind the three VFs to the vfio-pci driver.

sudo opae.io init -d 0000:3b:00.1 $USER

opae.io 0.2.5
Unbinding (0x8086,0xbccf) at 0000:3b:00.1 from dfl-pci
Binding (0x8086,0xbccf) at 0000:3b:00.1 to vfio-pci
iommu group for (0x8086,0xbccf) at 0000:3b:00.1 is 142
Assigning /dev/vfio/142 to $USER:$USER
Changing permissions for /dev/vfio/142 to rw-rw----


sudo opae.io init -d 0000:3b:00.2 $USER

opae.io 0.2.5
Unbinding (0x8086,0xbccf) at 0000:3b:00.2 from dfl-pci
Binding (0x8086,0xbccf) at 0000:3b:00.2 to vfio-pci
iommu group for (0x8086,0xbccf) at 0000:3b:00.2 is 143
Assigning /dev/vfio/143 to $USER:$USER
Changing permissions for /dev/vfio/143 to rw-rw----


sudo opae.io init -d 0000:3b:00.3 $USER

opae.io 0.2.5
Unbinding (0x8086,0xbccf) at 0000:3b:00.3 from dfl-pci
Binding (0x8086,0xbccf) at 0000:3b:00.3 to vfio-pci
iommu group for (0x8086,0xbccf) at 0000:3b:00.3 is 144
Assigning /dev/vfio/144 to $USER:$USER
Changing permissions for /dev/vfio/144 to rw-rw----

5. Check that the accelerators are present using fpgainfo. Note that your port configuration may differ from the output below.

$ sudo fpgainfo port

//****** PORT ******//
Object Id                        : 0xF000000
PCIe s:b:d.f                     : 0000:3B:00.0
Vendor Id                        : 0x8086
Device Id                        : 0xBCCE
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
//****** PORT ******//
Object Id                        : 0x603B000000000000
PCIe s:b:d.f                     : 0000:3B:00.3
Vendor Id                        : 0x8086
Device Id                        : 0xBCCF
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
Accelerator GUID                 : 823c334c-98bf-11ea-bb37-0242ac130002
//****** PORT ******//
Object Id                        : 0x403B000000000000
PCIe s:b:d.f                     : 0000:3B:00.2
Vendor Id                        : 0x8086
Device Id                        : 0xBCCF
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
Accelerator GUID                 : 8568ab4e-6ba5-4616-bb65-2a578330a8eb
//****** PORT ******//
Object Id                        : 0x203B000000000000
PCIe s:b:d.f                     : 0000:3B:00.1
Vendor Id                        : 0x8086
Device Id                        : 0xBCCF
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
Accelerator GUID                 : 56e203e9-864f-49a7-b94b-12284c31e02b

Table 5-5 VF to HEM Mappings

VF BDF HEM
BBBB:DD.1 HE-LB
BBBB:DD.2 HE-MEM
BBBB:DD.3 HE-HSSI
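
The mapping in Table 5-5 can also be applied programmatically to the fpgainfo port output from step 5. The following is a minimal Python sketch, not part of the OPAE SDK. The function name hem_for_ports and the parsing approach are illustrative assumptions; it relies only on the "PCIe s:b:d.f" lines seen in the sample output.

```python
import re

# Function-number to HEM mapping, taken from Table 5-5.
VF_TO_HEM = {1: "HE-LB", 2: "HE-MEM", 3: "HE-HSSI"}

# Matches lines such as "PCIe s:b:d.f                     : 0000:3B:00.3".
SBDF_RE = re.compile(
    r"PCIe s:b:d\.f\s*:\s*([0-9A-Fa-f]{4}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2})\.([0-7])"
)


def hem_for_ports(fpgainfo_text):
    """Return {sbdf: HEM name} for every VF found in `fpgainfo port` output."""
    result = {}
    for match in SBDF_RE.finditer(fpgainfo_text):
        function = int(match.group(2))
        if function in VF_TO_HEM:  # function 0 has no entry in Table 5-5
            sbdf = f"{match.group(1)}.{function}".lower()
            result[sbdf] = VF_TO_HEM[function]
    return result


# Address lines copied from the sample output in step 5.
sample = """\
PCIe s:b:d.f                     : 0000:3B:00.0
PCIe s:b:d.f                     : 0000:3B:00.3
PCIe s:b:d.f                     : 0000:3B:00.2
PCIe s:b:d.f                     : 0000:3B:00.1
"""
for sbdf, hem in sorted(hem_for_ports(sample).items()):
    print(sbdf, hem)
```

On a live system, the text passed to hem_for_ports would be the captured output of sudo fpgainfo port.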

HE-MEM / HE-LB

HE-LB generates traffic to exercise the path from the AFU to the host at full bandwidth. HE-MEM exercises the DDR interface: data read from the host is written to DDR, and the same data is read back from DDR before being sent to the host. HE-MEM uses external DDR memory (i.e., EMIF) to store data, and communicates with the EMIF memory controller through a customized version of the AVMM interface. Both exercisers rely on the user-space tool host_exerciser. The following commands are supported by the HE-LB/HE-MEM OPAE driver program. They may need to be run with sudo privileges, depending on your server configuration.

Basic operations:

$ sudo host_exerciser lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1025
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5342
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.067 GB/s
    Test lpbk(1): PASS

$ sudo host_exerciser --mode lpbk lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1025
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5358
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.058 GB/s
    Test lpbk(1): PASS

$ sudo host_exerciser --mode write lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 0
    Host Exerciser numWrites: 1025
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 2592
    Total number of Reads sent: 0
    Total number of Writes sent: 1024
    Bandwidth: 6.321 GB/s
    Test lpbk(1): PASS

$ sudo host_exerciser --mode trput lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 512
    Host Exerciser numWrites: 513
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 3384
    Total number of Reads sent: 512
    Total number of Writes sent: 512
    Bandwidth: 4.842 GB/s
    Test lpbk(1): PASS
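
The bandwidth figures reported in these sample runs are consistent with a simple calculation from the clock count. The Python sketch below reproduces that arithmetic under two assumptions inferred from the output rather than taken from the host_exerciser source: cachelines are 64 bytes, and 1024 cachelines are counted per run. The helper name bandwidth_gbps is hypothetical.

```python
def bandwidth_gbps(num_clocks, cachelines=1024, afu_clock_mhz=250, cacheline_bytes=64):
    """Estimate bandwidth in GB/s from an AFU clock count.

    Assumes 64-byte cachelines; this is an inference from the sample
    output, not a documented host_exerciser formula.
    """
    seconds = num_clocks / (afu_clock_mhz * 1e6)
    return cachelines * cacheline_bytes / seconds / 1e9


# Clock counts taken from the lpbk, write, and trput sample runs.
for clocks in (5342, 2592, 3384):
    print(f"{clocks} clocks -> {bandwidth_gbps(clocks):.3f} GB/s")
```

A calculation like this can serve as a quick sanity check when comparing runs taken at different AFU clock frequencies.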

The number of cachelines per request can be set to 1, 2, or 4 with the --cls option. The user may replace --mode lpbk with read, write, or trput. The target lpbk can be replaced with mem:

$ sudo host_exerciser --mode lpbk --cls cl_1 lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1025
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5475
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 2.993 GB/s
    Test lpbk(1): PASS


$ sudo host_exerciser --mode lpbk --cls cl_2 lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1025
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5356
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.059 GB/s
    Test lpbk(1): PASS


$ sudo host_exerciser --mode lpbk --cls cl_4 lpbk

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1025
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 4481
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.656 GB/s
    Test lpbk(1): PASS
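
To cover the combinations of mode, cacheline count, and target, the command lines can be generated rather than typed by hand. The following is a minimal Python sketch that uses only the --mode and --cls values and the lpbk and mem targets shown in this section. The helper name sweep_commands is hypothetical, the commands are printed rather than executed, and whether every combination is supported by a given FIM is not confirmed here.

```python
import itertools

MODES = ("lpbk", "read", "write", "trput")
CACHELINES = ("cl_1", "cl_2", "cl_4")
TARGETS = ("lpbk", "mem")


def sweep_commands():
    """Yield one host_exerciser argument list per target/mode/cls combination."""
    for target, mode, cls in itertools.product(TARGETS, MODES, CACHELINES):
        yield ["sudo", "host_exerciser", "--mode", mode, "--cls", cls, target]


# Print the commands for review instead of running them.
for argv in sweep_commands():
    print(" ".join(argv))
```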

Interrupt tests (only valid for the mem target):

$ sudo host_exerciser --interrupt 0 mem

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
Using Interrupts
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1026
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5140
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.188 GB/s
    Test mem(1): PASS

$ sudo host_exerciser --interrupt 1 mem

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
Using Interrupts
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1026
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5079
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.226 GB/s
    Test mem(1): PASS


$ sudo host_exerciser --interrupt 2 mem

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
Using Interrupts
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1026
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 5525
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.439 GB/s
    Test mem(1): PASS


$ sudo host_exerciser --interrupt 3 mem

    starting test run, count of 1
API version: 1
AFU clock: 250 MHz
Allocate SRC Buffer
Allocate DST Buffer
Allocate DSM Buffer
Using Interrupts
    Host Exerciser Performance Counter:
    Host Exerciser numReads: 1024
    Host Exerciser numWrites: 1026
    Host Exerciser numPendReads: 0
    Host Exerciser numPendWrites: 0
    Host Exerciser numPendEmifReads: 0
    Host Exerciser numPendEmifWrites: 0
    Number of clocks: 4735
    Total number of Reads sent: 1024
    Total number of Writes sent: 1024
    Bandwidth: 3.460 GB/s
    Test mem(1): PASS

HE-HSSI

HE-HSSI is responsible for handling client-side Ethernet traffic. It wraps the 10G Ethernet AFU and includes a 10G traffic generator and checker. The user-space tool hssi exports a control interface to the packet generator logic of the HE-HSSI AFU. Context-sensitive information is given by the hssi --help command. Help for the 10G-specific test is given by hssi hssi_10g --help. Example usage:

$ sudo hssi --pci-address 3b:00.3 hssi_10g --eth-ifc s10hssi0 --eth-loopback on --he-loopback=off  --num-packets 100

10G loopback test
  port: 0
  eth_loopback: on
  he_loopback: off
  num_packets: 100
  packet_length: 64
  src_address: 11:22:33:44:55:66
    (bits):  0x665544332211
  dest_address: 77:88:99:aa:bb:cc
    (bits): 0xccbbaa998877
  random_length: fixed
  random_payload: incremental
  rnd_seed0: 5eed0000
  rnd_seed1: 5eed0001
  rnd_seed2: 25eed
  eth: s10hssi0
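
In the sample output, each MAC address is followed by a (bits) value that contains the same six bytes in reverse order. The Python sketch below shows that conversion as inferred from the output; the function name mac_to_bits is hypothetical and is not part of the hssi tool.

```python
def mac_to_bits(mac):
    """Return the byte-reversed hex form of a colon-separated MAC address."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"expected six octets, got {mac!r}")
    return "0x" + "".join(reversed(octets)).lower()


# Source and destination addresses from the sample output.
print(mac_to_bits("11:22:33:44:55:66"))
print(mac_to_bits("77:88:99:aa:bb:cc"))
```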

6.0 Compiling OFS FIM

Pre-compiled FIM binaries are available on the OFS 2024.1-1 release page. To compile the OFS FIM for the Intel® FPGA PAC D5005 yourself, use one of the following options:

1) Compile OFS FIM manually - The developer guide provides the steps to compile the FIM and generate binaries. Refer to Shell Technical Reference Manual: OFS for Stratix® 10 PCIe Attach FPGAs.

2) Compile OFS FIM using evaluation script - The script guides you through the steps required for compilation by selecting options from a menu. Refer to Automated Evaluation User Guide: OFS for Stratix® 10 PCIe Attach FPGAs.

7.0 Programming the OFS FIM and BMC

Instructions for the compilation and simulation of the OFS FIM have moved to the Shell Developer Guide: OFS for Stratix® 10 PCIe Attach FPGAs.

7.1 Programming the OFS FIM

In order to program the OFS FIM, both the OPAE SDK and DFL drivers need to be installed on the host system. Please complete the steps in sections 4.0 OFS DFL Kernel Drivers and 5.0 OPAE Software Development Kit. The OFS FIM version can be identified using the OPAE tool fpgainfo. A sample output of this command is included below.

$ sudo fpgainfo fme

Intel FPGA Programmable Acceleration Card D5005
Board Management Controller, MAX10 NIOS FW version: 2.0.8
Board Management Controller, MAX10 Build version: 2.0.8
//****** FME ******//
Object Id                        : 0xF000000
PCIe s:b:d.f                     : 0000:3B:00.0
Vendor Id                        : 0x8086
Device Id                        : 0xBCCE
SubVendor Id                     : 0x8086
SubDevice Id                     : 0x138D
Socket Id                        : 0x00
Ports Num                        : 01
Bitstream Id                     : 288511860124977321
Bitstream Version                : 4.0.1
Pr Interface Id                  : a195b6f7-cf23-5a2b-8ef9-1161e184ec4e
Boot Page                        : user

Use the value under PR Interface ID to identify the FIM that has been loaded. Refer to the table below for a list of previous FIM releases:

Table 7-1 Previous FIM Releases

PR Release PR Interface ID
2023.2 edad864c-99d6-5831-ab67-62bfd81ec654
2022.2 (tag 1.3.0) bf531bcf-a896-5171-ab31-601a4ab754b6
2022.1 Beta (tag: 1.2.0-beta) 2fae83fc-8568-53aa-9157-8f75e9c0ba92
OFS 2.1 Beta (tag: 1.1.0-beta) 99160d37e42a 3f8b586f-c275-594c-92e2-d9f2c23e94d1
OFS 1.0 (tag: ofs-1.0.0) b5f6a71e-daec-59c3-a43a-85567b51fd3f
Intel Acceleration Stack for Intel® FPGA PAC D5005 2.0.1 9346116d-a52d-5ca8-b06a-a9a389ef7c8d

If the user's card does not report a PR Interface ID which matches the above table, then a new FIM will need to be programmed.
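
The comparison against Table 7-1 can be automated. The following is a minimal Python sketch, assuming the output of sudo fpgainfo fme has been captured as text. The function name identify_fim is hypothetical, and the dictionary simply copies the IDs from Table 7-1.

```python
import re

# PR Interface IDs copied from Table 7-1.
PREVIOUS_FIMS = {
    "edad864c-99d6-5831-ab67-62bfd81ec654": "2023.2",
    "bf531bcf-a896-5171-ab31-601a4ab754b6": "2022.2 (tag 1.3.0)",
    "2fae83fc-8568-53aa-9157-8f75e9c0ba92": "2022.1 Beta (tag: 1.2.0-beta)",
    "3f8b586f-c275-594c-92e2-d9f2c23e94d1": "OFS 2.1 Beta (tag: 1.1.0-beta)",
    "b5f6a71e-daec-59c3-a43a-85567b51fd3f": "OFS 1.0 (tag: ofs-1.0.0)",
    "9346116d-a52d-5ca8-b06a-a9a389ef7c8d": "Intel Acceleration Stack 2.0.1",
}

# Matches the "Pr Interface Id" line printed by fpgainfo fme.
PR_ID_RE = re.compile(r"Pr Interface Id\s*:\s*([0-9a-f-]{36})", re.IGNORECASE)


def identify_fim(fpgainfo_text):
    """Return (pr_interface_id, release); release is None if not in Table 7-1."""
    match = PR_ID_RE.search(fpgainfo_text)
    if match is None:
        raise ValueError("no 'Pr Interface Id' line found")
    pr_id = match.group(1).lower()
    return pr_id, PREVIOUS_FIMS.get(pr_id)


# Line copied from the sample fpgainfo fme output above.
sample = "Pr Interface Id                  : a195b6f7-cf23-5a2b-8ef9-1161e184ec4e\n"
pr_id, release = identify_fim(sample)
print(pr_id, "->", release or "not listed in Table 7-1")
```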

7.1.1 Programming the FIM

1. Download the file d5005_page1_unsigned.bin from the OFS 2024.1-1 release page.

2. Run PACSign to create an unsigned image with an added header for use by fpgasupdate.

$ PACSign SR -y -v -t UPDATE -s 0 -H openssl_manager -i d5005_page1_unsigned.bin -o d5005_PACsigned_unsigned.bin

3. Run fpgasupdate to load the image into the user location of the Intel® FPGA PAC D5005 FPGA flash. Note: use the sudo fpgainfo fme command to find the PCIe address of your card.

$ sudo fpgasupdate d5005_PACsigned_unsigned.bin 3B:00.0

4. Run the RSU command.

$ sudo rsu bmcimg 0000:3B:00.0
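
Steps 2 through 4 can be collected into a small script so that the file names and PCIe address are entered once. The following is a minimal Python sketch that only assembles and prints the same three command lines shown above. The function name fim_update_commands is hypothetical, and nothing is executed.

```python
import shlex


def fim_update_commands(unsigned_image, signed_image, pcie_address):
    """Return the command lines from steps 2-4, in order, as argument lists."""
    return [
        ["PACSign", "SR", "-y", "-v", "-t", "UPDATE", "-s", "0",
         "-H", "openssl_manager", "-i", unsigned_image, "-o", signed_image],
        ["sudo", "fpgasupdate", signed_image, pcie_address],
        ["sudo", "rsu", "bmcimg", f"0000:{pcie_address}"],
    ]


# Dry run: print the commands for review instead of running them.
for argv in fim_update_commands(
    "d5005_page1_unsigned.bin", "d5005_PACsigned_unsigned.bin", "3B:00.0"
):
    print(shlex.join(argv))
```

To run the sequence for real, each argument list could be passed to subprocess.run(argv, check=True) so that a failing step stops the sequence.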

7.2 Programming the BMC

1. Download the intel-fpga-bmc images. (To download OFS Stratix 10 BMC binaries, contact an Intel Technical Sales Representative.)

2. The file unsigned_bmc_fw.bin uses the new binary format. This bitstream is programmed with remote system update (RSU) and must first be processed with the PACSign tool.

3. Run PACSign to create an unsigned image with an added header for use by fpgasupdate.

$ PACSign BMC -y -v -t UPDATE -s 0 -H openssl_manager -i unsigned_bmc_fw.bin -o PACsigned_unsigned_bmc_fw.bin

2022-04-22 03:07:05,626 - PACSign.log - INFO - OpenSSL version "OpenSSL 1.1.1k  FIPS 25 Mar 2021" matches "1.1.1"
2022-04-22 03:07:05,648 - PACSign.log - INFO - Bitstream not previously signed
2022-04-22 03:07:05,648 - PACSign.log - INFO - platform value is '688128'
2022-04-22 03:07:05,745 - PACSign.log - INFO - Starting Block 0 creation
2022-04-22 03:07:05,745 - PACSign.log - INFO - Calculating SHA256
2022-04-22 03:07:05,747 - PACSign.log - INFO - Calculating SHA384
2022-04-22 03:07:05,749 - PACSign.log - INFO - Done with Block 0
2022-04-22 03:07:05,749 - PACSign.log - INFO - Starting Root Entry creation
2022-04-22 03:07:05,749 - PACSign.log - INFO - Calculating Root Entry SHA
2022-04-22 03:07:05,749 - PACSign.log - INFO - Starting Code Signing Key Entry creation
2022-04-22 03:07:05,749 - PACSign.log - INFO - Calculating Code Signing Key Entry SHA
2022-04-22 03:07:05,749 - PACSign.log - INFO - Code Signing Key Entry done
2022-04-22 03:07:05,749 - PACSign.log - INFO - Starting Block 0 Entry creation
2022-04-22 03:07:05,749 - PACSign.log - INFO - Calculating Block 0 Entry SHA
2022-04-22 03:07:05,749 - PACSign.log - INFO - Block 0 Entry done
2022-04-22 03:07:05,749 - PACSign.log - INFO - Starting Block 1 creation
2022-04-22 03:07:05,750 - PACSign.log - INFO - Block 1 done
2022-04-22 03:07:05,757 - PACSign.log - INFO - Writing blocks to file
2022-04-22 03:07:05,758 - PACSign.log - INFO - Processing of file 'PACsigned_unsigned_bmc_fw.bin' complete

4. Run fpgasupdate to perform an upgrade of the BMC.

$ sudo fpgasupdate PACsigned_unsigned_bmc_fw.bin 3B:00.0

[2022-04-22 03:08:34.15] [WARNING ] Update starting. Please do not interrupt.
[2022-04-22 03:08:34.15] [INFO    ] updating from file pacsign_unsigned_bmc_fw.bin with size 819968
[2022-04-22 03:08:34.15] [INFO    ] waiting for idle
[2022-04-22 03:08:34.15] [INFO    ] preparing image file
[2022-04-22 03:09:02.18] [INFO    ] writing image file
(100%) [████████████████████] [819968/819968 bytes][Elapsed Time: 0:00:13.01]
[2022-04-22 03:09:15.20] [INFO    ] programming image file
(100%) [████████████████████][Elapsed Time: 0:00:29.03]
[2022-04-22 03:09:44.24] [INFO    ] update of 0000:3B:00.0 complete
[2022-04-22 03:09:44.24] [INFO    ] Secure update OK
[2022-04-22 03:09:44.24] [INFO    ] Total time: 0:01:10.08

Notices & Disclaimers

Intel® technologies may require enabled hardware, software or service activation. No product or component can be absolutely secure. Performance varies by use, configuration and other factors. Your costs and results may vary. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document, with the sole exception that you may publish an unmodified copy. You may create software implementations based on this document and in compliance with the foregoing that are intended to execute on the Intel product(s) referenced in this document. No rights are granted to create modifications or derivatives of this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. You are responsible for safety of the overall system, including compliance with applicable safety-related requirements or standards. © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission of the Khronos Group™.