FreeNAS Successfully Recovered From Failed Drive
One of the major reasons to set up a FreeNAS machine with a ZFS volume is redundancy: network storage data should stay available and everything should still be OK after a hard drive inevitably fails. But before theory could be put into practice, we had to wait for a drive to actually fail. In that sense, today is our lucky day.
A drive failed to respond to some commands and returned errors in the wee hours of the morning. Although it appears to have since recovered and is functioning, we shouldn't be comforted by the momentary blip - it is almost certainly a sign of things to come if we continue to use this drive. So we'll replace it instead of waiting for a catastrophic failure.
The instructions for replacing a failing drive are covered in the FreeNAS manual. Following the procedure, we took the drive offline in the Storage/Volume Status screen.
Then we went to the Storage/View Disks screen to retrieve the drive's identifying serial number. Comparing this serial number against the one printed on each drive's physical label ensures that we remove the correct drive from the computer.
Since this FreeNAS machine does not have hot swap capability, it then had to be shut down for the actual drive replacement. Once the machine restarted, we went back into Storage/Volume Status and selected "Replace" (the button next to the "Offline" button we clicked earlier). If there's any existing data on the replacement drive, FreeNAS will double-check to make sure it's OK for the replacement drive to be overwritten.
And after that... we wait for the data from the remaining good drive to be replicated to the newly installed replacement drive.
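For those who prefer a shell to the web interface, `zpool status` reports resilver progress in its `scan:` section. As a rough sketch, the Python below pulls the percentage out of that text. The sample output, the pool name `tank`, and the helper name `resilver_percent` are all illustrative assumptions, and the exact wording of `zpool status` differs between ZFS versions, so the pattern may need adjusting.

```python
import re
from typing import Optional

# Illustrative sample only (hypothetical pool "tank");
# real `zpool status` output varies between ZFS versions.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Sat Mar  3 09:12:44 2018
        1.20T scanned out of 2.00T at 120M/s, 1h56m to go
        614G resilvered, 60.00% done
"""


def resilver_percent(status_text: str) -> Optional[float]:
    """Return the resilver percentage, or None if no resilver is running."""
    if "resilver in progress" not in status_text:
        return None
    # Assumes the progress line ends in something like "60.00% done".
    match = re.search(r"(\d+(?:\.\d+)?)% done", status_text)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    print(resilver_percent(SAMPLE_STATUS))
```

On a real machine the text would come from running `zpool status` rather than from a hard-coded string.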
This procedure will take several hours, and that time is technically a window of vulnerability: if the remaining good drive fails before the copy finishes, we'll lose data. To guard against this, ZFS allows even deeper redundancy by using more than two hard drives, such as a three-way mirror. In the case of this server, the data is not critical enough to warrant such protection and we'll just cross our fingers that the remaining drive does not fail during the recovery process.
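To get a feel for how long that window lasts, a back-of-the-envelope estimate is the amount of data to copy divided by the sustained copy rate. ZFS resilvers only the space actually in use rather than the whole disk, so the used size is what matters. The sketch below is just that arithmetic; the function name and the 2 TB and 100 MB/s figures are made-up assumptions, not measurements from this server, and real resilver speed is rarely constant.

```python
def resilver_hours(used_bytes: float, bytes_per_second: float) -> float:
    """Rough resilver duration in hours: data to copy / sustained rate."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return used_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    # Hypothetical figures: 2 TB of used space copied at 100 MB/s.
    used = 2 * 10**12
    rate = 100 * 10**6
    print(f"Estimated resilver time: {resilver_hours(used, rate):.1f} hours")
```

With those assumed numbers the estimate comes out to a little over five and a half hours, consistent with a "several hours" window.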