
SECURITY IP

White Paper

Hardware Security
for AI Accelerators

© Rambus Inc.
Introduction
The rapid growth of artificial intelligence and machine learning (AI/ML) in applications spanning
virtually every industry is driving the development of dedicated accelerator hardware, and its
broad deployment across data centers and the network edge. The virtuous cycle of greater AI/ML
processing power enabling new applications which then spur demand for more processing power
is in full swing.

The accelerating value creation of AI/ML combined with its wider deployment raises the
motivation for and the risk of attacks. Protecting AI/ML assets, be they hardware, software
or data, is increasingly mission critical. This white paper will discuss the threats, fundamental
security techniques, and use models that illustrate safeguarding these assets.

AI Assets
For our use cases, we’ll be examining both edge and data center (server) devices. An edge device
may contain an ML accelerator for local inference. Alternatively, it may be a simple edge device
which collects inputs and transmits this data to the cloud for inferencing. Our example server
device has an accelerator card with a dedicated ML accelerator chip. This could be performing
training, or inference in the cloud as a counterpart to the simple edge device.

Across these devices, the AI assets needing protection start with the AI accelerator. Attackers
could attempt to tamper with the accelerator hardware to deny its usage or bypass its security.
This is particularly true in the case of a smart edge device that’s outside the hardened data
center environment. In addition, both edge and server ML accelerators can be subject to
attempts to tamper with their firmware, again, to deny usage or bypass security.

Training data is another important asset. An attacker can tamper with training data to distort
the resulting model. This is called an AI poisoning attack. In addition, an attacker could attempt
to steal the training data. For our use cases, we assume that training occurs only in the data
center.

The inference model is another one of our assets. An attacker could modify or replace the
inference model to induce incorrect behavior, or they could attempt to steal the inference model
itself. An attacker could also tamper with the input data to produce misclassifications. Attackers
could steal the input data, which could have privacy implications. Finally, the inference results
themselves require protection as they drive action in the physical world which if tampered with
could cause property loss, injury or loss of life.



Threat Model
The threats to our AI-powered devices are many and growing. First, let’s consider the threat
model for edge devices. We assume attackers will be able to gain physical access to these devices
deployed in the field. They are able to handle, control, and disassemble edge devices and
examine all their components. If they can do that, they can read the device’s flash memory and
potentially alter its contents.

Attackers can run malicious firmware on the CPU and can read the contents of SRAM. Attackers
can monitor, intercept, and change network traffic, and they can mount physical attacks on the
device such as power analysis, EM analysis, and fault injection.

The threat model for servers and their ML accelerator cards differs from that of edge devices
in that servers are in a hardened data center environment. For our data center use cases, we
assume attackers won’t have physical access to the hardware. However, potential threats are still
numerous and complex.

Attackers can subvert the host CPU hypervisor and access any process or memory region. They
can read the flash in both the host and the accelerator, as well as the contents of SSDs. Beyond
reading, attackers could attempt to change the contents of flash or SSDs. Attackers could
run malicious software on the host CPU and malicious firmware on the accelerator CPU. Further,
attackers can read SRAM and DRAM contents on the host and the accelerator. They can monitor,
intercept, and change network and bus traffic.

Security Measures
In our use cases, we’ll be employing a number of security techniques to safeguard AI/ML assets.
These are described in the paragraphs below:

Encryption: Unencrypted data (plaintext) is converted to encrypted data (ciphertext) using an
encryption algorithm. The algorithm transforms the plaintext under a secret key, typically with a
block cipher whose key space is so large that brute-force search is practically infeasible. Changing
the key changes the ciphertext produced by the encryption algorithm, and knowledge of the key
is needed to decrypt the ciphertext back to plaintext. AES (Advanced Encryption Standard) is an
example of an encryption algorithm.
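
As an illustration, the following is a minimal sketch of AES encryption and decryption using the AES-GCM mode of Python's cryptography package; the key, nonce, and message shown are placeholder values, not material from any real deployment.

# Minimal sketch: AES-256-GCM encryption and decryption with Python's
# "cryptography" package. Key, nonce, and message are placeholders.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)    # secret key shared by both parties
aesgcm = AESGCM(key)

nonce = os.urandom(12)                       # must be unique for every message
plaintext = b"example training batch"
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Only a holder of the key can recover the plaintext; a wrong key or a
# modified ciphertext raises cryptography.exceptions.InvalidTag.
assert aesgcm.decrypt(nonce, ciphertext, None) == plaintext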

Hashing: A hash function creates a fixed-length “fingerprint” for any set of data (message). Quick
to compute, a hash function is deterministic, so the same input produces the same output hash
value. Comparing the hash of a message with an earlier hash can determine if the message is
unaltered. Matching hashes confirm the integrity of the message. Properties of an ideal hash
function are that it is infeasible to find two data sets that produce the same fingerprint, and it is
infeasible to determine the message from the hash. SHA-2 and SHA-3 (Secure Hash Algorithm) are
examples of hash functions.
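
As a brief illustration, the standard Python hashlib module computes SHA-2 and SHA-3 fingerprints directly; the message here is a placeholder.

# Minimal sketch: fixed-length fingerprints with SHA-2 and SHA-3 from
# Python's standard hashlib module. The message content is a placeholder.
import hashlib

message = b"inference model, version 1"

sha2_digest = hashlib.sha256(message).hexdigest()     # SHA-2 family, 256-bit
sha3_digest = hashlib.sha3_256(message).hexdigest()   # SHA-3 family, 256-bit

# The same input always yields the same digest, so comparing against an
# earlier recorded value detects any alteration of the message.
assert hashlib.sha256(message).hexdigest() == sha2_digest
assert hashlib.sha256(message + b"!").hexdigest() != sha2_digest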

Signing: In symmetric-key authentication, the signer and verifier share a secret key that is used to
cryptographically create a signature. The signature is sent with the message. Upon receipt, the
verifier recomputes the signature using the key; if it matches the one sent, the authenticity of the
message is confirmed. HMAC (keyed-hash message authentication code) is one of the most
commonly used signing algorithms.
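
A minimal sketch of this symmetric signing scheme, using HMAC-SHA-256 from Python's standard library with a placeholder shared key, looks as follows.

# Minimal sketch: symmetric signing and verification with HMAC-SHA-256 from
# Python's standard library. The shared key is a placeholder value.
import hashlib
import hmac

shared_key = b"pre-provisioned secret"        # known to signer and verifier
message = b"firmware image metadata"

# Signer computes a signature (tag) and sends it along with the message.
signature = hmac.new(shared_key, message, hashlib.sha256).digest()

# Verifier recomputes the signature with the same key and compares in
# constant time; a match confirms the message is authentic and unaltered.
expected = hmac.new(shared_key, message, hashlib.sha256).digest()
assert hmac.compare_digest(signature, expected)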

System Monitoring: A set of processes which check the operation of system hardware and
software against a set of normal parameters. Operation outside the norm signals the system may
be under attack.



Hardware Root of Trust
The key enabling security solution across all of our use cases is a hardware root of trust (HRT).
The HRT is a digital core, also called soft IP, that can be incorporated into the design of a chip
such as an ML accelerator. Depending on the application, the HRT could range from a lightweight
state machine, to a programmable secure co-processor. For example, a high-volume AI-powered
edge device might use the state machine root of trust, whereas a server ML accelerator could
require the programmable secure co-processor HRT.

The HRT is the foundation for security. It contains the keys and other secure data needed for
authentication and encryption. A baseline function of the root of trust is verifying that the system
starts up with authentic, untampered boot code. Ideally, the HRT should be purpose-built for
security, with complexity minimized so it can be hardened against attack. It siloes cryptographic
operations away from main processing, so the main processor(s) in the ML accelerator can be
optimized for performance while the HRT is kept simple for security.

Further, it should offer a full feature set for executing complex algorithms and cryptographic
protocols. Using a layered security model, it should provide the robust security of hardware with
the flexibility of software. And it should include strong anti-tamper features to guard against
attacks such as differential power analysis and fault injection.

Use Cases
Protecting ML Accelerator Availability: Firmware

Asset: ML accelerator
Threat: Attacker attempts to tamper with accelerator firmware to deny or disrupt usage or bypass security
Solution: Secure boot and firmware protection

In our first use case, we guard against an attacker attempting to tamper with the ML accelerator
firmware to deny or disrupt usage or bypass security. The solution entails secure boot and
firmware protection.

[Figure: Accelerator card ASIC containing a CPU, an ML accelerator, and a hardware root of trust, with DRAM, flash, and a PCIe interface; the numbered labels correspond to the steps below.]


The HRT embedded in the accelerator ASIC has its own robust secure boot functionality. Once it
boots securely, it can ensure that other CPUs in the system boot securely as well. Our solution
operates as follows:
1. The HRT holds the accelerator CPU in reset.
2. The HRT boots itself securely.
3. It reads the signed hash of the boot code from flash and verifies the signature.
4. It computes the hash of the boot image and checks that it matches the signed value.
5. If the signed value is correct, the HRT will release the reset and allow the CPU to boot.

The HRT can monitor firmware updates in a similar manner and can also provide rollback
protection using one-time programmable (OTP) memory. A minimal sketch of this verification
flow follows.
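
The sketch below illustrates this boot verification flow, assuming the boot-image hash is signed with an Ed25519 key whose public half is held by the HRT and that a minimum-version counter lives in OTP memory; the function and parameter names are illustrative, not a real HRT interface.

# Minimal sketch of the secure-boot check, assuming an Ed25519-signed boot
# image hash and an OTP-backed rollback counter. Names are illustrative only.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def boot_image_is_valid(boot_image: bytes, signed_hash: bytes, signature: bytes,
                        hrt_public_key: Ed25519PublicKey,
                        image_version: int, otp_min_version: int) -> bool:
    # Step 3: verify the signature over the hash stored alongside the image.
    try:
        hrt_public_key.verify(signature, signed_hash)
    except InvalidSignature:
        return False
    # Step 4: recompute the hash of the boot image and compare with the
    # signed value.
    if hashlib.sha256(boot_image).digest() != signed_hash:
        return False
    # Rollback protection: refuse images older than the OTP version counter.
    if image_version < otp_min_version:
        return False
    # Step 5: only now would the HRT release the accelerator CPU from reset.
    return True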

Protecting ML Accelerator Availability: Hardware

Asset: ML accelerator
Threat: Attacker attempts to tamper with accelerator hardware to deny or disrupt usage or bypass security
Solution: System monitoring

Now let’s look at protecting the ML accelerator hardware from a tamper attack aiming to deny
usage or bypass security. The solution in this case is system monitoring.

[Figure: Edge device SoC containing a CPU, an ML accelerator, and a hardware root of trust, with SRAM, flash memory, sensors, actuators, a user interface, and a network connection; the numbered labels correspond to the steps below.]


Here we’ll look at the edge device case, since edge devices are deployed outside hardened data
centers and are more vulnerable to physical attacks. Operationally, system monitoring works as
follows:

1. The HRT can monitor system status and memory contents and detect tampering activity.
The HRT can also detect attacks like fault injection. It can monitor test and debug
logic, hardware configuration, and other hardware status in the SoC using dedicated
connections into the ML accelerator logic.
2. It can also monitor the ML accelerator operation, ensuring it’s operating only when
expected.
3. It can periodically hash known SRAM state to detect tampering.
4. It can periodically hash invariant flash data to ensure it is not changing (a sketch of this
periodic check follows this list).
5. Internal logic in the HRT can detect physical attacks like fault injection.
6. The HRT also monitors network traffic, looking for anomalous traffic that might indicate
an attack or a compromised software stack.
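
The periodic hash checks of steps 3 and 4 can be sketched as follows; read_region and the region names are hypothetical stand-ins for the HRT's access to SRAM and flash.

# Minimal sketch of periodic integrity checks over memory regions that should
# be invariant. read_region() and the region names are hypothetical stand-ins
# for the HRT's access to SRAM and flash.
import hashlib
import time

def snapshot(read_region, region_names):
    # Record a baseline fingerprint for each region.
    return {name: hashlib.sha256(read_region(name)).hexdigest()
            for name in region_names}

def monitor(read_region, region_names, baseline, interval_s=10.0):
    while True:
        for name in region_names:
            digest = hashlib.sha256(read_region(name)).hexdigest()
            if digest != baseline[name]:
                raise RuntimeError(f"possible tampering detected in {name}")
        time.sleep(interval_s)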

Protecting Training Data

Asset: Training data
Threat: Attacker attempts to tamper with training data to distort the resulting model
Solution: Sign training data and authenticate before use

Asset: Training data
Threat: Attacker attempts to steal training data
Solution: Encrypt training data when not in use

In this use case, we provide protection against both tampering and theft of training data.
The solution employs signing and authenticating training data before its use. Training data is
encrypted when not in use to protect it from theft.

[Figure: Accelerator card ASIC with a CPU, an ML accelerator, and a hardware root of trust, plus DRAM and flash, connected over PCIe to a host with a CPU, DRAM, an SSD, and a network interface; the numbered labels correspond to the steps below.]


Training data should come from a trusted source, and be signed and encrypted. Protection of
training data works as follows:

1. The signed, encrypted training data is stored in the SSD on the host. Even if the host is
compromised, decrypted training data is never present on the host, so it is protected
from theft.
2. The signed, encrypted training data is sent over the PCIe interface to the accelerator
card and stored in DRAM.
3. The HRT decrypts the training data and hashes the encrypted data.
4. It then verifies the signature and compares hashes.
5. If the hashes match, the verified training data is sent to the ML accelerator (a minimal
sketch of this flow follows the list).
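
Below is a minimal sketch of this flow, assuming an encrypt-then-sign layout in which the HMAC signature covers the encrypted data (so the HRT hashes the encrypted data, as in step 3); the keys are placeholders shared between the training-data source and the HRT.

# Minimal sketch of preparing and verifying signed, encrypted training data,
# assuming the HMAC signature is computed over the ciphertext (so the HRT can
# hash the encrypted data as in step 3). Keys are placeholder values.
import hashlib
import hmac
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

enc_key = AESGCM.generate_key(bit_length=256)   # shared with the HRT
mac_key = os.urandom(32)                        # shared with the HRT

def protect_training_data(plaintext: bytes):
    # At the trusted source: encrypt, then sign the ciphertext.
    nonce = os.urandom(12)
    ciphertext = AESGCM(enc_key).encrypt(nonce, plaintext, None)
    signature = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, signature          # stored on the host SSD

def hrt_release_training_data(nonce, ciphertext, signature) -> bytes:
    # In the HRT: recompute the signature over the encrypted data, compare,
    # and decrypt only if it matches.
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("training data failed authentication")
    return AESGCM(enc_key).decrypt(nonce, ciphertext, None)  # to the accelerator

One benefit of signing the ciphertext is that the HRT can reject tampered data before spending any effort on decryption.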

Protecting Inference Models

Asset: Inference model
Threat: Attacker attempts to modify or replace the inference model to induce incorrect behavior
Solution: Sign inference model and authenticate before use

Asset: Inference model
Threat: Attacker attempts to steal the trained inference model
Solution: Encrypt models when not in use

In this use case we use similar techniques to protect the inference model from tampering,
replacement or theft. Once training is complete, the inference model should be signed and
encrypted.

[Figure: Edge device SoC containing a CPU, an ML accelerator, and a hardware root of trust, with SRAM, flash memory, sensors, actuators, a user interface, and a network connection; the numbered labels correspond to the steps below.]


This use model applies to the edge device and works as follows:
1. The signed and encrypted inference model is stored in flash. Even if the flash is read by
an attacker, they can’t access the inference model.
2. The HRT reads the inference model from flash, decrypts it and hashes the decrypted
data.
3. The HRT verifies the signature and compares hashes.
4. If the hashes match, the inference model is loaded into the accelerator. The decrypted
inference model isn’t kept in memory unless it’s needed; when not in use, it can be
flushed out of memory so it’s not accessible to an attacker (see the sketch after this list).
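
The sketch below illustrates the HRT-side load and flush, assuming the signature over the model hash has already been verified as in the secure-boot sketch; load_into_accelerator and the key handling are hypothetical placeholders.

# Minimal sketch of loading a verified inference model and flushing the
# decrypted copy afterwards. load_into_accelerator() and the key material are
# hypothetical; the signature over signed_hash is assumed already verified.
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def load_model(flash_blob: dict, enc_key: bytes, signed_hash: bytes,
               load_into_accelerator) -> None:
    # Step 2: decrypt the model read from flash and hash the decrypted data.
    plaintext = bytearray(
        AESGCM(enc_key).decrypt(flash_blob["nonce"], flash_blob["ciphertext"], None))
    try:
        # Step 3: compare the computed hash with the signed value.
        if hashlib.sha256(plaintext).digest() != signed_hash:
            raise ValueError("inference model failed verification")
        # Step 4: load into the accelerator only after verification succeeds.
        load_into_accelerator(bytes(plaintext))
    finally:
        # Flush the decrypted model so it is not left in memory when not in use.
        for i in range(len(plaintext)):
            plaintext[i] = 0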

Protecting Input Data Integrity

Asset: Input data
Threat: Attacker attempts to tamper with input data to produce misclassification
Solution: Authenticate communication with the source of input data

Integrity of input data must also be protected. This can be done by authenticating
communication with the source of input data.

[Figure: Simple edge device SoC (CPU, hardware root of trust, SRAM, flash memory, sensors, actuators, user interface) communicating over the network with a host (CPU, DRAM, SSD), which bridges over PCIe to an accelerator card ASIC containing a CPU, an ML accelerator, and a hardware root of trust with DRAM and flash; the numbered labels correspond to the steps below.]

Here we examine a case of a simple edge device that doesn’t have an ML accelerator. Instead it
relies on inference being done on a server in the cloud.
1. In this case, an HRT in the edge device and an HRT in the accelerator can mutually
authenticate and provide a secure communication channel to protect the input data
integrity.
2. The host communicates with the edge device over the network interface and bridges
the connection over PCIe, enabling the two HRTs to communicate.
3. A mutual authentication protocol such as MACsec using pre-provisioned keys and IDs
ensures the edge device is legitimate as is the server. All input data going from the edge
device to the accelerator passes through the secure channel with data encryption.
4. Data from sensors on the edge device is subjected to an integrity check to ensure that
the input data is not tampered with when transmitted to the AI accelerator (a simplified
sketch of the mutual authentication follows this list).
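
The sketch below illustrates mutual authentication with a pre-provisioned symmetric key as a simplified challenge-response exchange; it stands in for, but is not, the MACsec protocol itself, and the device IDs and key are placeholders.

# Minimal sketch: mutual authentication with a pre-provisioned symmetric key,
# shown as a simplified challenge-response exchange. This illustrates the idea
# only and is not the MACsec protocol; device IDs and keys are placeholders.
import hashlib
import hmac
import os

psk = os.urandom(32)    # pre-provisioned key shared by edge HRT and server HRT

def respond(challenge: bytes, responder_id: bytes) -> bytes:
    # Prove knowledge of the pre-shared key, bound to the responder's identity.
    return hmac.new(psk, responder_id + challenge, hashlib.sha256).digest()

def verify(challenge: bytes, responder_id: bytes, response: bytes) -> bool:
    expected = hmac.new(psk, responder_id + challenge, hashlib.sha256).digest()
    return hmac.compare_digest(response, expected)

# Each side challenges the other; only after both checks pass is the secure,
# encrypted channel used to carry input data to the accelerator.
edge_challenge, server_challenge = os.urandom(16), os.urandom(16)
assert verify(edge_challenge, b"server-hrt", respond(edge_challenge, b"server-hrt"))
assert verify(server_challenge, b"edge-hrt", respond(server_challenge, b"edge-hrt"))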



Protecting Input Data Confidentiality

Asset: Input data
Threat: Attacker attempts to steal input data, which may have privacy implications
Solution: Encrypt user data in transit to the accelerator

Since input data can carry confidential information, it too must be safeguarded to protect the
privacy of users. A single host may be handling multiple workloads, and compromising one
application could give an attacker access to private data on another workload if not protected.
The solution is to encrypt input data in transit to the accelerator.

[Figure: Accelerator card ASIC with a CPU, an ML accelerator, and a hardware root of trust, plus DRAM and flash, connected over PCIe to a host with a CPU, DRAM, an SSD, and a network interface; the numbered labels correspond to the steps below.]

1. Data communicated over the network interface can be encrypted using keys mutually
established between the sending device and HRT in the accelerator.
2. Each workload can have a different key, managed by the HRT.
3. The encrypted input data is sent from the host to the accelerator. The host never sees
the decrypted input data, so a compromise of the host does not compromise user
privacy.
4. Based on the workload ID supplied by the host, the HRT derives the correct key and
decrypts the data.
5. The decrypted input data is then used by the accelerator for inference (a sketch of the
key derivation follows this list).
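
A minimal sketch of the per-workload key handling in steps 2 and 4, using HKDF and AES-GCM from Python's cryptography package with placeholder keys and workload IDs, follows.

# Minimal sketch of per-workload key derivation and input-data decryption in
# the HRT, using HKDF and AES-GCM. The master key and workload IDs are
# placeholders; a real HRT would hold the master key in secure storage.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

master_key = os.urandom(32)     # device secret known only to the HRT

def workload_key(workload_id: bytes) -> bytes:
    # Each workload ID yields a different key, so compromising one workload
    # does not expose another workload's input data.
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"input-data:" + workload_id).derive(master_key)

def decrypt_input(workload_id: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    return AESGCM(workload_key(workload_id)).decrypt(nonce, ciphertext, None)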



Protecting Inference Results

Asset: Inference results
Threat: Attacker attempts to tamper with inference results
Solution: Authenticate and secure communication between accelerators and other system components

In the final use case, we protect inference results from tampering. Here we’ll use authentication
and a secure communication channel between the edge device and the server when inference is
performed.

[Figure: Edge device SoC (CPU, hardware root of trust, SRAM, flash memory, sensors, actuators, user interface) communicating over the network with a host (CPU, DRAM, SSD), which bridges over PCIe to an accelerator card ASIC containing a CPU, an ML accelerator, and a hardware root of trust with DRAM and flash; the numbered labels correspond to the steps below.]

This works similarly to the case for protecting input data integrity:


1. The HRT in the edge device and an HRT in the accelerator mutually authenticate and
provide a secure communication channel.
2. The host communicates with the edge device over the network interface and bridges
the connection over PCIe, enabling the two HRTs to communicate.
3. A mutual authentication protocol such as MACsec using pre-provisioned keys and IDs
ensures the edge device is legitimate as is the server. Inference results from the ML
accelerator pass through the secure channel with data encryption.
4. On the edge device, the inference results are subjected to an integrity check to ensure
that they were not tampered with in transit from the ML accelerator.



Rambus Solutions
Rambus offers a broad portfolio of solutions to secure data at rest (when it is stored or processed
on a device) and data in motion (when it is communicated between devices). Two featured
products that support the AI/ML use cases presented are the CryptoManager Root of Trust
secure co-processor and the 800G MACsec Protocol Engine described below.

CryptoManager Root of Trust

The CryptoManager Root of Trust RT-630 is a fully programmable hardware security core
offering security-by-design for AI/ML applications. It protects against a wide range of hardware
and software attacks through state-of-the-art anti-tamper and security techniques. It is built on a
custom 32-bit siloed and layered secure co-processor, along with dedicated secure memories.

800G MACsec Protocol Engine

The 800G MACsec Protocol Engine supports aggregate bandwidth of 100 to 800 Gbps over as
many as 64 channels. It provides line-rate operation, so there’s no sacrifice in performance to
achieve robust Layer 2 MACsec security between networked devices. Supporting all IEEE MACsec
standards, it has options for Cisco extensions and IPsec ESP AES-GCM protocol.

For more information on the complete portfolio of Rambus security solutions, please visit
rambus.com/security

© Rambus Inc. • rambus.com
