Problem

After some issues with the system state I reverted to a previous state snapshot.

Since then, systemctl is-system-running has reported a degraded state (I show it in my prompt).

The only failed service is zsys-commit.service and the status is:

● zsys-commit.service - Mark current ZSYS boot as successful
    Loaded: loaded (/lib/systemd/system/zsys-commit.service; enabled; vendor preset: enabled)
    Active: failed (Result: exit-code) since Sun 2022-01-23 23:21:41 EST; 10h ago
  Main PID: 12287 (code=exited, status=1/FAILURE)

Jan 23 23:21:40 hostname systemd[1]: Starting Mark current ZSYS boot as successful...
Jan 23 23:21:41 hostname zsysctl[12287]: level=error msg="couldn't commit: couldn't promote dataset "rpool/ROOT/ubuntu_ssfirw": couldn't promote "rpool/ROOT/ubuntu_ssfirw": not a cloned filesystem"
Jan 23 23:21:41 hostname systemd[1]: zsys-commit.service: Main process exited, code=exited, status=1/FAILURE
Jan 23 23:21:41 hostname systemd[1]: zsys-commit.service: Failed with result 'exit-code'.
Jan 23 23:21:41 hostname systemd[1]: Failed to start Mark current ZSYS boot as successful.

Questions

  1. How does zsys determine whether a dataset is cloned?
    • Follow-up: can I modify that?
  2. What would be the best approach to clean up all zsys states and keep only the current one (with everything aligned, including the boot menu)? See Update #1 at the bottom.

More details:

Some digging revealed that the command the service is running is:

/sbin/zsysctl boot commit

This is the output for sudo /sbin/zsysctl boot commit -vvv:

DEBUG /zsys.Zsys/CommitBoot() call logged as [79ef457a:5d32ce55] 
DEBUG Check if grpc request peer is authorized     
DEBUG Authorized as being administrator            
INFO Commit current boot state                    
INFO Committing boot for "rpool/ROOT/ubuntu_ssfirw" 
INFO Tag current user dataset: "rpool/USERDATA/szkolnik_vvk5gq" 
DEBUG ZFS: trying to set "com.ubuntu.zsys:bootfs-datasets"="rpool/ROOT/ubuntu_1s4qqj,rpool/ROOT/ubuntu_ssfirw" on "rpool/USERDATA/szkolnik_vvk5gq" 
INFO Tag current user dataset: "rpool/USERDATA/root_vvk5gq" 
DEBUG ZFS: trying to set "com.ubuntu.zsys:bootfs-datasets"="rpool/ROOT/ubuntu_1s4qqj,rpool/ROOT/ubuntu_ssfirw" on "rpool/USERDATA/root_vvk5gq" 
INFO set current time to "1643036037"             
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "bpool/BOOT/ubuntu_ssfirw" 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw" 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/srv" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/srv" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/usr" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/usr" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/usr/local" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/usr/local" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/games" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/games" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/lib" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/lib" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/log" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/log" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/mail" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/mail" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/snap" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/snap" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/spool" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/spool" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/www" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/www" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/lib/AccountsService" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/lib/AccountsService" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/lib/NetworkManager" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/lib/NetworkManager" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/lib/apt" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/lib/apt" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/ROOT/ubuntu_ssfirw/var/lib/dpkg" 
DEBUG ZFS: can't set property "com.ubuntu.zsys:last-used"="1643036037" for "rpool/ROOT/ubuntu_ssfirw/var/lib/dpkg" as not a local property ("inherited") 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/USERDATA/szkolnik_vvk5gq" 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-used"="1643036037" on "rpool/USERDATA/root_vvk5gq" 
INFO Set latest booted kernel to "vmlinuz-5.13.0-27-generic" 
DEBUG ZFS: trying to set "com.ubuntu.zsys:last-booted-kernel"="vmlinuz-5.13.0-27-generic" on "rpool/ROOT/ubuntu_ssfirw" 
INFO Promoting user datasets                      
INFO Promoting system datasets                    
INFO Promoting dataset: "bpool/BOOT/ubuntu_ssfirw" 
DEBUG ZFS: trying to promote "bpool/BOOT/ubuntu_ssfirw" 
DEBUG Trying to promote "bpool/BOOT/ubuntu_ssfirw" 
DEBUG ZFS: an error occurred: couldn't promote "bpool/BOOT/ubuntu_ssfirw": not a cloned filesystem 
DEBUG ZFS: Cancelling nested transaction           
DEBUG ZFS: ending transaction                      
DEBUG ZFS: reverting all in progress zfs transactions 
DEBUG ZFS: transaction done                        
DEBUG ZFS: ending transaction                      
DEBUG ZFS: ending transaction                      
DEBUG ZFS: transaction done                        
DEBUG ZFS: reverting all in progress zfs transactions 
DEBUG ZFS: transaction done                        
ERROR couldn't commit: couldn't promote dataset "bpool/BOOT/ubuntu_ssfirw": couldn't promote "bpool/BOOT/ubuntu_ssfirw": not a cloned filesystem 
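
Regarding question 1: as far as I understand, ZFS itself marks a clone through the read-only origin property, which names the snapshot the dataset was cloned from (a non-clone shows -). Whether zsys consults exactly this property is an assumption on my part. A minimal sketch for checking it by hand, where is_clone_origin and describe_dataset are hypothetical helper names:

```shell
#!/bin/sh
# Classify a ZFS "origin" value: clones carry "<dataset>@<snapshot>",
# non-clones show "-" (or an empty value).
is_clone_origin() {
    case "$1" in
        ""|"-") return 1 ;;   # no origin recorded: not a clone
        *@*)    return 0 ;;   # looks like dataset@snapshot: a clone
        *)      return 1 ;;   # unexpected value: treat as not a clone
    esac
}

# Query zfs for a dataset's origin and describe the result.
# Needs the zfs CLI; nothing here runs until the function is called.
describe_dataset() {
    origin=$(zfs get -H -o value origin "$1") || return 1
    if is_clone_origin "$origin"; then
        echo "$1: clone of $origin"
    else
        echo "$1: not a clone"
    fi
}
```

For example, describe_dataset bpool/BOOT/ubuntu_ssfirw would show what ZFS reports for the dataset that fails in the log above.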

Update #1

I found this answer and based on that came up with the following code:

# List all bpool/BOOT states, from newest created to oldest
zfs list -r -t snapshot -S creation -o name,used,referenced,creation bpool/BOOT | sed '6 i --------------------------------------------------------------------------------'

I wanted to preserve the last 4 states, so I wrote the following code, which removes all but the 4 most recent states, working from newest to oldest:

zfs list -r -t snapshot -S creation -Ho name bpool/BOOT | tail -n+5 | sed 's/.*@\(autozsys_\)\?//' | sudo xargs -i sh -c "echo removing {}...; zsysctl state remove {} --system --force || exit 255"
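
To see which state IDs such a pipeline would hand to zsysctl before anything is removed, the name-to-ID step can be isolated and run dry. This is only a sketch: states_to_remove is a hypothetical helper name, not part of zsys, and it assumes the input is a list of snapshot names sorted newest first.

```shell
#!/bin/sh
# Read snapshot names (newest first) on stdin, skip the first $1 of them,
# and print the state ID of each remaining snapshot: everything after the
# last "@", minus an optional "autozsys_" prefix.
states_to_remove() {
    keep=$1
    tail -n +"$((keep + 1))" | sed 's/.*@//; s/^autozsys_//'
}
```

Feeding it the output of zfs list -r -t snapshot -S creation -Ho name bpool/BOOT with an argument of 4 prints the IDs without touching any state.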

This got stuck, however, because one state refuses to be removed:

ERROR couldn't remove system state kxxwbr: Couldn't remove state rpool/ROOT/ubuntu_ssfirw: Couldn't destroy rpool/ROOT/ubuntu_ssfirw: couldn't destroy "rpool/ROOT/ubuntu_ssfirw" and its children: stop destroying dataset on "rpool/ROOT/ubuntu_ssfirw", cannot destroy child: stop destroying dataset on "rpool/ROOT/ubuntu_ssfirw/usr", cannot destroy child: cannot destroy dataset "rpool/ROOT/ubuntu_ssfirw/usr/local": dataset is busy

For some reason, it is trying to destroy the active dataset.
Thinking I'm obviously not dealing with this correctly, I tried reversing the order, deleting from the oldest to newest (except the 4 most recent ones):

zfs list -r -t snapshot -s creation -Ho name bpool/BOOT | head -n-4 | sed 's/.*@\(autozsys_\)\?//' | sudo xargs -i sh -c "echo removing {}...; zsysctl state remove {} --force --system || exit 255"

This errored out at the same state, kxxwbr, with the same error.

So I still need help with this.

Lockszmith

1 Answer

Answering my own question, as this seems to be an issue with ZFS itself (maybe even specific to ZFS on Linux) and nothing to do with zsys.

Continuing my attempt to clean up snapshots manually, I was eventually left with a single state, currently named kxxwbr.

This kxxwbr state is associated with a dataset named 1s4qqj, which is the basis of a cloned dataset named ssfirw.

The following code will output the state clearly:

zfs list -o name,origin -S creation | grep -v '\W\-$'

outputs the following

NAME                                             ORIGIN
rpool/ROOT/ubuntu_ssfirw                         rpool/ROOT/ubuntu_1s4qqj@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/srv                     rpool/ROOT/ubuntu_1s4qqj/srv@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/usr                     rpool/ROOT/ubuntu_1s4qqj/usr@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/usr/local               rpool/ROOT/ubuntu_1s4qqj/usr/local@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var                     rpool/ROOT/ubuntu_1s4qqj/var@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/games               rpool/ROOT/ubuntu_1s4qqj/var/games@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/lib                 rpool/ROOT/ubuntu_1s4qqj/var/lib@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/lib/AccountsService rpool/ROOT/ubuntu_1s4qqj/var/lib/AccountsService@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/lib/NetworkManager  rpool/ROOT/ubuntu_1s4qqj/var/lib/NetworkManager@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/lib/apt             rpool/ROOT/ubuntu_1s4qqj/var/lib/apt@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/lib/dpkg            rpool/ROOT/ubuntu_1s4qqj/var/lib/dpkg@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/log                 rpool/ROOT/ubuntu_1s4qqj/var/log@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/mail                rpool/ROOT/ubuntu_1s4qqj/var/mail@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/snap                rpool/ROOT/ubuntu_1s4qqj/var/snap@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/spool               rpool/ROOT/ubuntu_1s4qqj/var/spool@autozsys_kxxwbr
rpool/ROOT/ubuntu_ssfirw/var/www                 rpool/ROOT/ubuntu_1s4qqj/var/www@autozsys_kxxwbr
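
The grep above matches on the trailing dash of the formatted output. A variant that filters on the origin column itself may be a bit more robust; this is a minimal sketch that assumes tab-separated input as produced by zfs list -H, and list_clones is a hypothetical helper name:

```shell
#!/bin/sh
# Read "name<TAB>origin" lines (as printed by `zfs list -H -o name,origin`)
# and keep only datasets whose origin is set, i.e. the clones.
list_clones() {
    awk -F '\t' '$2 != "-" && $2 != "" { print $1 "\t" $2 }'
}
```

Usage would be zfs list -H -o name,origin -S creation | list_clones.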

When I try to promote any of the datasets, I get the following:

> sudo zfs promote rpool/ROOT/ubuntu_ssfirw
cannot promote 'rpool/ROOT/ubuntu_ssfirw': not a cloned filesystem

Searching Google turns up a lot of reports of this issue, but none of them (as of yet) offered a solution.

Conclusion and a possible fix

It's a BUG! (obviously, at this point)

This is my +1 in the bug thread:

https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1947568/comments/5

I'm using the 3rd party PPA as described in the bug:
* THIS IS RISKY, so I take no responsibility for this, just reporting on what I've done.

# upgrade everything and cleanup before dealing with 3rd party content
sudo apt upgrade --yes
sudo apt autoremove --yes

Add the 3rd party PPA for the ZFS tools:

sudo add-apt-repository ppa:jonathonf/zfs &&
sudo apt update &&
sudo apt upgrade --yes

You may get the following message in apt's output:

The following packages have been kept back:

zfs-initramfs zfs-zed zfsutils-linux

If that is the case, explicitly update these packages by running:

sudo apt install zfs-initramfs zfs-zed zfsutils-linux

At this point, promote should work properly.

And indeed, I used the following code to promote everything:

zfs list -r -Ho name,origin -S creation rpool/ROOT/ubuntu_ssfirw | grep -v '\W\-$' | sed 's/\t.*$//' | xargs -i sh -c "echo promoting {}; zfs promote {} || exit 255"
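
For anyone who would rather preview the promotions first, the same idea can be wrapped with a dry-run mode. This is a sketch under assumptions, not what I actually ran: promote_clones is a hypothetical helper, it expects tab-separated name and origin columns on stdin, and it only calls zfs promote when its argument is run.

```shell
#!/bin/sh
# Read "name<TAB>origin" lines on stdin. With "run" as the first argument,
# promote every dataset that has an origin; otherwise only print what
# would be promoted.
promote_clones() {
    mode=$1
    while IFS="$(printf '\t')" read -r name origin; do
        # Skip datasets without an origin: they are not clones.
        [ -n "$origin" ] && [ "$origin" != "-" ] || continue
        if [ "$mode" = "run" ]; then
            zfs promote "$name" || return 1
        else
            echo "would promote $name (origin $origin)"
        fi
    done
}
```

A preview would be zfs list -r -H -o name,origin -S creation rpool/ROOT/ubuntu_ssfirw | promote_clones dry; passing run instead performs the promotions and needs a root shell, since sudo cannot call a shell function directly.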

I went ahead and removed all snapshots (just because it was easier at this point) and ran the garbage collector (sudo zsysctl service gc); after that I was able to restart zsys-commit.service.

Hopefully this will continue to work from this point forward, and someone might find this handy themselves.

Lockszmith
  • Thanks a lot for the write-up. This truly helped me recover from this bug. For information for others, I didn't delete the snapshots because I wasn't comfortable doing that, but I still got back to the state I expected to see in the end. – Zertrin Feb 27 '23 at 12:23
  • Thanks for the feedback, your confirmation gave me the confidence to mark the question as answered. Glad this information helped. – Lockszmith Mar 01 '23 at 16:57
  • Somehow, however, someone didn't accept my proposed edit, where I added a missing apt in sudo upgrade... Would be nice to include it for completeness. – Zertrin Mar 05 '23 at 08:30
  • I think the edit was rejected because of the other changes - I missed that it was about a syntax error caused by the missing apt command. I've added it. – Lockszmith Mar 05 '23 at 17:36