Friday, November 11, 2011

Exciting Crypto Advances with the T4+ processor and Oracle Solaris 11

I'm sure you all heard about the T4 launch in September, announcing the latest and greatest in the SPARC hardware line. These systems add a number of new features, but I'm going to focus on the ones that are related to cryptography.

UPDATE 4/2016: Everything in this document additionally applies to Oracle Solaris 11.1, 11.2, and 11.3, and all of the Oracle SPARC chips we've released since T4! This includes our latest launch of Oracle SPARC M7/T7. While the underlying crypto instructions have been very stable we, of course, have continued to tune performance and tweak mode support.  Since 11.2 we have additionally supported Camelia, which is also optimized by Oracle SPARC T4 and newer platforms! I've updated the document throughout to note T4+.

The Cryptographic Framework feature of Oracle Solaris was first included with Oracle Solaris 10.
Our focus was always to provide highly optimized algorithms to the rest of Oracle Solaris, so that the entire operating system could take advantage of the best cryptographic performance available.

At that time of the initial release of Oracle Solaris 10, there were no standard CPUs with cryptographic cores, but as the SPARC T series chips were developed, we always made sure to have a driver plugged into the Cryptographic Framework that would give the Cryptographic Framework consumers access to these devices.

But things have changed with T4+. These chip sets have made crypto a part of the core instruction set, accessible via nonprivileged instructions. That means, there are no drivers required to enable hardware assistance for cryptographic operations. Applications just access these instructions just like any other basic CPU instruction. That's right, crypto is now just a basic service provided by the CPU.

What does this mean? Well, before, in order for an application to access hardware crypto on a T3 system, the stack would look something like this: application -> libpkcs11 -> pkcs11_kernel -> IOCTL interface -> n2cp (7D) -> hypervisor -> crypto unit.

Now the stack will look more like this: application -> libpkcs11 -> pkcs11_softtoken -> CPU.

The one notable exception for this is the hardware random number generator (HW RNG), which still is only directly accessible via hyper-privileged registers through the n2rng driver. You can access this via /dev/random and /dev/urandom, as well as through the Cryptographic Framework's libpkcs11. See random(7D), n2rng(7D), and libpkcs11(3LIB) for more details.

With all of these changes, we're able to even more highly optimize the performance of cryptography on Oracle Solaris 11 and newer.

Algorithms Included

A primary goal of the Cryptographic Framework is to provide Oracle Solaris with highly optimized algorithms, and we made no exception for this release.

In Oracle Solaris 10 Update 10 (08/11), AES, DES, DES3, MD5, SHA1, SHA2 (SHA256, SHA384, SHA512), RSA, and DSA are all accelerated by T4+ crypto instructions for all supported modes of operation. To access these via libpkcs11 (3LIB), you'd use the standard PKCS#11 mechanisms listed below [1].

If you additionally download patch 147159 for Oracle Solaris 10 Update 10, you'll get further optimizations for AES-ECB, AES-CBC, AES-CTR, AES-CFB128, and MD5, SHA1, and SHA2.

In Oracle Solaris 11, we have all of those optimizations, plus optimizations for DES and 3DES, as well as optimizations and support for AES-CCM and AES-GCM.

To access these optimizations on Solaris 11, you need change nothing. We've made all of the code changes necessary in the Cryptographic Framework for you. Your applications that use the Cryptographic Framework (see Consumers section below for many examples), will take advantage of our optimizations and the T4 hardware right out of the box.

OpenSSL engine

UPDATE 4/2016: The OpenSSL T4 engine no longer exists, since our friends at OpenSSL have inlined all of the T4+ instructions into the main source tree! Thank you! Misaki wrote up a great blog describing this.

In Oracle Solaris 11 on a T4 system, you'll notice a new OpenSSL engine called t4. The t4 engine allows OpenSSL to access the optimized T4 crypto instructions directly, without needing to go through PKCS#11. The t4 engine is on by default, if the processor below supports those instructions. Nothing for you to do.

If you're still running Oracle Solaris 10 Update 10, you'll still need to set up your application to go through the pkcs11 engine, and make sure you apply patch 147707.

For example, if you're using Apache Web Server on Oracle Solaris 10 Update 10, or on Oracle Solaris 11 (in order to get the RSA accelerations) you'll need to set this line in your ssl.conf:
SSLCryptoDevice pkcs11

Consumers and Performance

The consumers of the Cryptographic Framework includes: ZFS, IPsec, IKE, kerberos (user and kernel), libsasl, KSSL (in Kernel SSL), OpenSSL, SSH, Java JCE, libsnmp, lofi(7D), and the Oracle DB (11.2.0.3). As well as anything that accesses libpkcs11(3LIB).

Just a note about the Java, T4 and newer processors are treated the same way as on T2, T3 and Intel - you need to go through the Java JCE provider.  UPDATE 4/2016: Java has started taking advantage of SPARC T4+ crypto acceleration directly. Currently in JDK8u40, Java accelerates generic AES, SHA1 and SHA2.  Keeping up-to-date on JDK8 patches will provide the best out-of-the-box performance.

And the Oracle Database? Uses our optimized T4 functions right out of the box (v 11.2.0.3 and newer).

Do you want to see just how much our performance optimizations get you on T4? Click on any of the hyperlinked consumers above to see their specific performance gains on T4, or navigate on over to BestPerf to see the latest and greatest numbers.


With the exception of the extra steps required on Oracle Solaris 10 Update 10 for OpenSSL to obtain access to the optimized functions that use the T4+ instructions, there is nothing for the administrator to do to get access to this acceleration. It simply works right out of the box.

How do I know if I'm using this?

Accessing these instructions does not require a driver, so there are no kstats to indicate how often any of these instructions are being used. At this time, it is not possible to obtain data from the Operating System regarding execution counts for nonprivileged cryptographic instructions.

UPDATE 4/2016: There is a hardware counter, but it also includes a bunch of floating point operations as well. Dan Anderson wrote a blog about detection that has been updated since we removed the OpenSSL T4 engine (in favor of simpler inlined instructions).

[1] PKCS#11 mechanisms used for accessing T4+ crypto instructions via libpkcs11 (3LIB) in Oracle Solaris 10 Update 10 and Oracle Solaris 11:

CKM_DES_CBC, CKM_DES_CBC_PAD, CKM_DES_ECB, CKM_DES_KEY_GEN, CKM_DES_MAC_GENERAL, CKM_DES_MAC, CKM_DES3_CBC, CKM_DES3_CBC_PAD, CKM_DES3_ECB, CKM_DES2_KEY_GEN, CKM_DES3_KEY_GEN, CKM_AES_CBC, CKM_AES_CBC_PAD, CKM_AES_ECB, CKM_AES_KEY_GEN, CKM_BLOWFISH_CBC, CKM_BLOWFISH_KEY_GEN, CKM_SHA_1, CKM_SHA_1_HMAC, CKM_SHA_1_HMAC_GENERAL, CKM_SHA256, CKM_SHA256_HMAC, CKM_SHA256_HMAC_GENERAL, CKM_SHA384, CKM_SHA384_HMAC, CKM_SHA384_HMAC_GENERAL, CKM_SHA512, CKM_SHA512_HMAC, CKM_SHA512_HMAC_GENERAL, CKM_SSL3_SHA1_MAC, CKM_MD5, CKM_MD5_HMAC, CKM_MD5_HMAC_GENERAL, CKM_SSL3_MD5_MAC, CKM_RC4, CKM_RC4_KEY_GEN, CKM_DSA, CKM_DSA_SHA1, CKM_DSA_KEY_PAIR_GEN, CKM_RSA_PKCS, CKM_RSA_PKCS_KEY_PAIR_GEN, CKM_RSA_X_509, CKM_MD5_RSA_PKCS, CKM_SHA1_RSA_PKCS, CKM_SHA256_RSA_PKCS, CKM_SHA384_RSA_PKCS, CKM_SHA512_RSA_PKCS, CKM_DH_PKCS_KEY_PAIR_GEN, CKM_DH_PKCS_DERIVE, CKM_MD5_KEY_DERIVATION, CKM_SHA1_KEY_DERIVATION, CKM_SHA256_KEY_DERIVATION, CKM_SHA384_KEY_DERIVATION, CKM_SHA512_KEY_DERIVATION, CKM_PBE_SHA1_RC4_128, CKM_PKCS5_PBKD2, CKM_SSL3_PRE_MASTER_KEY_GEN, CKM_TLS_PRE_MASTER_KEY_GEN, CKM_SSL3_MASTER_KEY_DERIVE, CKM_TLS_MASTER_KEY_DERIVE, CKM_SSL3_MASTER_KEY_DERIVE_DH, CKM_TLS_MASTER_KEY_DERIVE_DH, CKM_SSL3_KEY_AND_MAC_DERIVE, CKM_TLS_KEY_AND_MAC_DERIVE, CKM_TLS_PRF.

UPDATE 4/2016: As of Oracle Solaris 11.2, we also include the following hardware assisted mechanisms:  CKM_CAMELLIA_CBC, CKM_CAMELLIA_CBC_PAD, CKM_CAMELLIA_CTR, CKM_CAMELLIA_ECB, CKM_CAMELLIA_KEY_GEN.

This post syndicated from Thoughts on security, beer, theater and biking!

1 comment:

  1. Useful, concise information on T4 Crypto--thanks.
    Here's my blog on the OpenSSL T4 engine:
    http://blogs.oracle.com/DanX/entry/sparc_t4_openssl_engine

    ReplyDelete