Comprehensive Server Certification

Summary

Starting in April 2018 with the initial certification efforts for Ubuntu Server 18.04 LTS, the requirements for Ubuntu Server Certification were changed to a model that increases coverage to include storage devices, network devices, NVMe and other Vendor Approved Options. This is an overview of the changes to help you better understand the requirements for Server Certification.

Questions and Answers

What changed?

Starting with 18.04 LTS, Ubuntu Server Certification is no longer based on a per-config/SKU model. Server Certification now requires certification testing of most Vendor Approved Options for any given server model. This means most storage options, most network options, and other Vendor Approved Options will need to be tested at least once per release on at least one server in the server line.

When did this change occur?

This change became effective for all Ubuntu Server Certifications starting with 18.04 LTS.

What is the process?

Canonical will work with you to design a testing matrix that will minimize the amount of testing required while maximizing the coverage of server models and Vendor Approved Options. Status and HCL posting will be incremental, so we will list each certified server and that server’s Vendor Approved Options. For transparency, any untested option will be listed as unsupported by Ubuntu, and as options are tested their status will change from unsupported to supported for that Ubuntu Server LTS release.

To build this matrix, we will require a list of Vendor Approved Options for each server model to be certified, including vendor part numbers (or FRU numbers if used) and a marketing name for each option (e.g. Intel i350 4-Port 1Gb PCIe Network device). A spreadsheet for each model is preferred. However, many vendors provide this data via PDF configuration guides or static web pages, and those are also acceptable.

Sample Test Plan

You have four models with the following options:

Server1: RAID_A, RAID_B, NIC_1, NIC_2, NIC_3, HBA_1
Server2: RAID_A, RAID_C, NIC_2, NIC_3, NIC_4
Server3: RAID_B, NIC_1, NIC_2, HBA_1
Server4: NO RAID, NIC_3, NIC_4, NIC_5, HBA_1

So potentially you could test like this:

Server1 (RAID_A, NIC_1, NIC_2, NIC_3, HBA_1)
Server2 (RAID_C, NIC_4) *since RAID_A, NIC_2 and NIC_3 are already covered
Server3 (RAID_B) *since NIC_1, NIC_2 and HBA_1 are already covered
Server4 (NIC_5) *since NIC_3, NIC_4 and HBA_1 are already covered

Plus, later if you introduce in Q3:

Server5: RAID_B, RAID_C, NIC_1, NIC_3, NIC_5, HBA_1, HBA_2

You would only need to test:

Server5 (HBA_2) *since all other orderable options have been previously tested

Ultimately, while we ask that you test the orderable options, the above is only a rough sample test plan, and you can test the options in whatever order works best for you.

Will we have to test every single option?

Not for all classes of devices. We do not, for example, require testing of different SATA or SAS HDD/SSD models. Conversely, if a server has options for both Xeon and Intel Core CPUs, then one of each will need to be tested. For network controllers, using a NIC offered in 4, 2, or 1 port variants as an example, only the 4 port variant would need to be tested, and that result would validate all three variants across the server line.

CPUs

Changes in CPU, as always, require separate certification as that change defines a separate system. So if, for example, a server model is sold with Skylake CPUs and later refreshed with Cascade Lake CPUs, the Skylake and Cascade Lake versions are treated as completely separate systems and must be certified separately.

Hard Drives

For traditional drives (HDD and SSD), you do not need to test every specific HDD or SSD model offered. We do suggest, however, you put more than one in each SUT for JBOD setups.

NVMe

One NVMe model from each manufacturer must be tested (e.g. Samsung, Intel, Micron). NVMes generally “just work” these days, but we periodically find cases where a new NVMe is not supported by current drivers, and we need to identify those.

RAID Cards

For RAID cards, you will not need to test models that are identical save for a cache battery. So if you have RAID A and RAID B, which are the same card but one is sold with a Cache Backup Battery, you will only need to test one of them. However, we’ll need to know which is which.

NICs

For NICs, you must test them at maximum advertised speed. For cards offered in multiple port configurations (e.g. the same 1Gb NIC available in 4, 2 and 1 port configs) you only need to test the biggest card, the one with the most ports.

Likewise, the same chipset does not need to be tested separately on each card type within the same system form factor, such as a 10Gb chipset offered on both a 2 port SFP+ PCIe card and an OCP Mezzanine Card. However, when the form factor of the server changes, those devices DO need to be tested separately. For example, a 10Gb NIC controller in PCIe and OCP Mezzanine form factors on a rack server must be tested separately from the same 10Gb controller on a daughter card that plugs into a Blade and then talks across an internal chassis fabric.
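As a rough illustration of the NIC rules above, the following Python sketch picks one card to test per controller and system form factor, preferring the card with the most ports. The part numbers, controller names and the Nic record are hypothetical and exist only for this example. The sketch does not cover the requirement to test at maximum advertised speed.

```python
from collections import namedtuple

# Hypothetical record describing one orderable NIC option.
Nic = namedtuple("Nic", "part_number controller system_form_factor ports")


def nics_to_test(nics):
    """Pick one NIC per (controller, system form factor), most ports first.

    Cards sharing a controller within one system form factor are treated
    as covered by a single test. On a tie in port count, the card listed
    first is kept.
    """
    chosen = {}
    for nic in nics:
        key = (nic.controller, nic.system_form_factor)
        if key not in chosen or nic.ports > chosen[key].ports:
            chosen[key] = nic
    return [nic.part_number for nic in chosen.values()]


# Hypothetical catalog: one 1Gb controller in three port counts, and one
# 10Gb controller on PCIe and OCP cards (rack) and a daughter card (blade).
catalog = [
    Nic("NIC-1G-1P", "ctrl-1g", "rack", 1),
    Nic("NIC-1G-2P", "ctrl-1g", "rack", 2),
    Nic("NIC-1G-4P", "ctrl-1g", "rack", 4),
    Nic("NIC-10G-PCIE-2P", "ctrl-10g", "rack", 2),
    Nic("NIC-10G-OCP-2P", "ctrl-10g", "rack", 2),
    Nic("NIC-10G-BLADE-2P", "ctrl-10g", "blade", 2),
]

print(nics_to_test(catalog))
# ['NIC-1G-4P', 'NIC-10G-PCIE-2P', 'NIC-10G-BLADE-2P']
```

Here the 4 port 1Gb card stands in for its 2 and 1 port siblings, the PCIe card stands in for the OCP card on the rack server, and the blade daughter card is kept as a separate test because the system form factor differs.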

HBAs

HBAs will follow the same rules as NICs.

Other Devices

Any other device not already listed must be tested.  However, if you feel there is a reason it is unnecessary, please contact your Partner Engineer and we’ll work with you on a solution.

How will this affect Certifications?

Failure to test Vendor Approved Options may lead to delays in publishing additional server certifications and possibly revocation of existing certifications in extreme circumstances. We require this testing to validate that the various options our mutual customers can order work correctly with Ubuntu.

Why is this being implemented?

The previous process left gaps in test coverage in areas where issues are commonly reported to support teams at both Canonical and our OEM partners.  Increasing this test coverage will allow us to find these issues before they become problems for our mutual customers, reducing support costs and increasing customer satisfaction. This is a direct result of multiple issues in the field where a server model had been certified, but a customer was then sold an untested optional component that failed to work when deployment was attempted.

Further questions?

Please feel free to reach out to the Hardware Certification team any time for more information or if you have any questions at all.

We can be reached here: server-certification@canonical.com

Additionally, you may contact your Partner Engineer directly, or the team in general here: tpp@canonical.com

Finally, if you are not already subscribed to the hwcert-announce mailing list, you should consider subscribing. ‘hwcert-announce’ is a low-traffic announcement list for Canonical Hardware Certification programmes and will give you the latest info on updated test tools, policy changes, release-specific information and so forth.