ZFS uses what's known as the "adaptive replacement cache" (almost always just called "the ARC") to hold both metadata and filesystem data in fast storage, which can dramatically speed up read operations for cached objects. When you start using ZFS, this all happens behind the scenes in memory: Solaris or OpenIndiana dedicates a chunk of RAM to the ARC, and it reduces the size of the ARC when memory pressure demands it. Ben Rockwood wrote a good introduction to the ARC and a tool you can use to examine its state, so if you're interested in more details be sure to check that out.

Now, in addition to the ARC that sits in RAM, ZFS also has a facility to use a level-two adaptive replacement cache ("l2arc") on other "fast" storage. You can, for example, attach a fast SSD to a pool as l2arc, and ZFS will start using it as a secondary cache. It won't be as fast as RAM, of course, but it's still potentially much faster than spinning rust, especially for random I/O.

In most cases you can just ignore the ARC and happily reap the benefits of faster reads from cache. Adding more RAM or assigning additional l2arc devices lets the ARC cache more, which is a nice bonus for sure, but it's not the end of the world if you run out of cache.

There is, however, a critical exception to this: the ARC becomes absolutely vital when dedupe is enabled. Before you even think about turning dedupe on, you'd better start thinking about the size of your ARC. When you turn on dedupe, you add a massive chunk of metadata known as the dedupe table ("DDT") into the equation. The dedupe table is where the magic happens: ZFS uses it to identify duplicated blocks, and any write with dedupe enabled requires a lookup in this table first.

The reason you need to be thinking about the ARC when the DDT is in play is this: the DDT is stored in the ARC. If your ARC can't fit the entire DDT, then every single time you try to write or read data, ZFS will have to retrieve pieces of the DDT from spinning rust. The nature of the table makes this even more of a disaster, since it's a whole lot of small, random I/O, which is something normal hard drives are very bad at. So heed this warning: if you turn on dedupe without enough RAM to cache the DDT in the ARC, your write speeds can decrease by an order of magnitude (or more).

So, how much memory do you need in order to use dedupe effectively? The common answer on mailing lists and IRC is "as much as you can afford," and in practice that's probably the best advice you'll get. There are calculations you can make based on data retrieved from undocumented commands, but as a starting point you should count on at least 1 GB of ARC per 1 TB of data.
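If you want something more concrete than that rule of thumb, here is a rough sketch of the kind of calculation those undocumented commands make possible. The tool in question is zdb; the pool name 'tank' is a placeholder, and the ~320 bytes per in-core DDT entry is a commonly quoted estimate rather than a guarantee, so treat the result as a ballpark figure only.

```
# Simulate dedupe on an existing pool: prints a DDT histogram and a
# projected dedup ratio without changing anything (can take a while):
zdb -S tank

# On a pool that already has dedupe enabled, show the real DDT stats,
# including the number of entries and their in-core size:
zdb -DD tank

# Back-of-the-envelope math: at roughly 320 bytes per in-core DDT entry,
# 1 TB of unique data stored as 128 KB blocks is about 8.4 million
# entries, or roughly 2.5 GB of ARC for the DDT alone.
```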
One frustrating aspect of this scenario is that it's very difficult to see what your DDT is really doing. ARC data in general is not exposed to the user by the tools that ship with OI or Solaris; indeed, you have to use third-party tools such as arc_summary, arcstat, or sysstat to see what's going on at all.

One thing you can do to potentially save yourself a world of pain is to make sure you have SSDs for l2arc. We really want the DDT in RAM, but having it on SSD will keep the system from being completely useless if memory is exhausted, so it's a great idea to have SSD l2arc devices assigned to any pool that you want to dedupe. Unlike the in-memory ARC, we do have some visibility into the l2arc, provided directly by the zpool iostat utility: zpool iostat -v

When I added l2arc to my system and turned on dedupe, I paid very close attention to my cache usage. There are two tunables at the ZFS dataset level which determine what ends up in the ARC: 'primarycache' and 'secondarycache'. 'primarycache' is RAM, and 'secondarycache' is l2arc. The possible values for these options are 'all', 'none', and 'metadata', so you can selectively decide whether you want caching on the different layers of ARC for any given dataset.
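To make those moving parts concrete, here is a minimal sketch; the pool ('tank'), dataset ('tank/backups'), and device ('c2t0d0') names are placeholders, not a recommendation for any particular layout.

```
# Attach an SSD to the pool as an l2arc (cache) device:
zpool add tank cache c2t0d0

# Watch how much of the cache device is actually in use:
zpool iostat -v tank

# Decide per dataset what each cache layer should hold. For example,
# cache everything in RAM but only metadata on the l2arc for a dataset
# whose file data isn't worth caching twice:
zfs set primarycache=all tank/backups
zfs set secondarycache=metadata tank/backups

# Verify the settings:
zfs get primarycache,secondarycache tank/backups
```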