NVMe
NVM Express (NVMe) devices are flash-based storage devices connected to the system via the PCI Express (PCIe) bus. They are among the fastest storage devices available on the market, faster than Solid State Drives (SSDs) connected over the SATA bus.
Installation
Kernel
NVM Express block device support (CONFIG_BLK_DEV_NVME) must be activated to gain NVMe device support:
Device Drivers --->
    NVME Support --->
        <*> NVM Express block device
Search for CONFIG_BLK_DEV_NVME to find this item.
Devices will show up under /dev/nvme*.
The options below are the defaults on other GNU/Linux distributions:
<*> NVM Express block device
[*] NVMe multipath support
[*] NVMe hardware monitoring
<M> NVM Express over Fabrics FC host driver
<M> NVM Express over Fabrics TCP host driver
<M> NVMe Target support
[*] NVMe Target Passthrough support
<M> NVMe loopback device support
<M> NVMe over Fabrics FC target driver
< > NVMe over Fabrics FC Transport Loopback Test driver (NEW)
<M> NVMe over Fabrics TCP target support
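After rebuilding the kernel and rebooting, it can be verified that the driver is active. The following is one way to check; note that /proc/config.gz is only available when CONFIG_IKCONFIG_PROC is enabled:
root #
zgrep BLK_DEV_NVME /proc/config.gz
CONFIG_BLK_DEV_NVME=y
root #
dmesg | grep -i nvme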
Emerge
User space tools are available via:
root #
emerge --ask sys-apps/nvme-cli
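Once installed, nvme-cli can be used to list the NVMe controllers and namespaces detected on the system, and to read a device's SMART health log (the device path below is an example):
root #
nvme list
root #
nvme smart-log /dev/nvme0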
Configuration
Partitioning and formatting can be performed the same way as for any other block device.
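As an illustrative sketch, assuming a single drive at /dev/nvme0n1 that may be wiped, a GPT label with one ext4 partition could be created as follows (see Identifying the device below for the naming scheme; adjust the device, partition layout, and filesystem as needed):
root #
parted --script /dev/nvme0n1 mklabel gpt mkpart primary ext4 1MiB 100%
root #
mkfs.ext4 /dev/nvme0n1p1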
Identifying the device
There are minor differences in the naming scheme for devices and partitions when compared to SATA devices.
NVMe partitions generally show a p before the partition number. NVMe devices also include namespace support, using an n before the namespace number. Therefore the first partition of the first namespace on the first device will be at the following location: /dev/nvme0n1p1. The device name is nvme0, in namespace 1, with partition 1.
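For example, on a system with a single NVMe drive carrying one partition, listing the device nodes would show something similar to the following (illustrative output; nvme0 is the controller's character device, nvme0n1 the namespace block device, and nvme0n1p1 the partition):
user $
ls /dev/nvme*
/dev/nvme0  /dev/nvme0n1  /dev/nvme0n1p1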
Usage
I/O testing
The hdparm utility can be used to measure the raw read speed of an NVMe device. Passing the -t option instructs hdparm to perform timings of device reads, -T performs timings of cache reads, and --direct bypasses the page cache, causing reads to go directly from the drive into hdparm's buffers in raw mode:
root #
hdparm -tT --direct /dev/nvme0n1
Performance and maintenance
Since NVMe devices are based on the same flash memory technology as common SSDs, the same performance and longevity considerations apply. For details, consult the SSD article.
Kernel I/O scheduler
Thanks to the very fast random access times provided by NVMe devices, it is recommended to use the simplest kernel I/O scheduling strategy available[1][2]. Recent kernels name this strategy none.
The performance impact of the I/O scheduler can vary between workloads, so always benchmark the particular workload in order to achieve optimal performance.
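As a sketch, a read-only random-read benchmark with fio (sys-block/fio) can be repeated with each available scheduler to compare them; the device path and parameters below are only examples:
root #
fio --name=randread --filename=/dev/nvme0n1 --readonly --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --direct=1 --runtime=60 --time_based --group_reporting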
The name of the currently used I/O scheduler can be obtained from sysfs. For example, a /dev/nvme0n1 device using the none scheduler would look as follows:
user $
cat /sys/block/nvme0n1/queue/scheduler
none
When multiple I/O schedulers are available in the kernel, the active one is shown in brackets:
user $
cat /sys/block/nvme0n1/queue/scheduler
[none] mq-deadline kyber bfq
It is possible to change the scheduler by writing the name of the desired scheduler to the sysfs file:
root #
echo "none" > /sys/block/nvme0n1/queue/scheduler
Since a value written to sysfs does not persist across reboots, this can also be achieved automatically by a udev rule:
# Set scheduler for NVMe devices
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
See also
- SSD — provides guidelines for basic maintenance, such as enabling discard/trim support, for SSDs (Solid State Drives) on Linux.
External resources
- https://medium.com/@metebalci/a-quick-tour-of-nvm-express-nvme-3da2246ce4ef - An excellent article describing the differences in recent disk drive technology, but focusing on NVMe.
- https://wiki.archlinux.org/index.php/NVMe
- Switching Scheduler — The Linux Kernel documentation
References
- ↑ Kernel/Reference/IOSchedulers - Ubuntu Wiki, Ubuntu Wiki. Retrieved on June 6, 2021
- ↑ Linux 5.6 I/O Scheduler Benchmarks: None, Kyber, BFQ, MQ-Deadline, Phoronix. Retrieved on June 6, 2021