Given that history, why on earth would I even be thinking about removing CONFIG_RCU_FAST_NO_HZ, let alone queuing a patch series intended for the v5.17 merge window???
The reason is that everyone I know of who builds their kernels with CONFIG_RCU_FAST_NO_HZ=y also boots those systems with each and every CPU designated as a rcu_nocbs CPU. With this combination, CONFIG_RCU_FAST_NO_HZ=y is doing nothing but placing a never-taken branch in the fastpath to and from idle. Such systems should therefore run slightly faster and with slightly better battery lifetime if their kernels were instead built with CONFIG_RCU_FAST_NO_HZ=n, which would get rid of that never-taken branch.
But given that battery-powered embedded folks badly wanted CONFIG_RCU_FAST_NO_HZ=y, and given that they are no longer getting any benefit from it, why on earth haven't they noticed?
The have not noticed because rcu_nocbs CPUs do not invoke their own RCU callbacks. This work is instead delegated to a set of per-CPU rcuoc kthreads, with a smaller set of rcuog kthreads managing those callbacks and requesting grace periods as needed. By default, these rcuoc and rcuog kthreads are not bound, which allows both the scheduler (and for that matter, the systems administrator) to take both performance and energy efficiency into account and to run those kthreads wherever is appropriate at any given time. In contrast, non-rcu_nocbs CPUs will always run their own callbacks, even if that means powering up an inconveniently placed portion of the system at an inconvenient time. This includes CONFIG_RCU_FAST_NO_HZ=y kernels, whose only advantage is that they power up inconveniently placed portions of systems at inconvenient times only 25% as often as would a non-rcu_nocbs CPU in a CONFIG_RCU_FAST_NO_HZ=n kernel.
In short, the rcu_nocbs CPUs' practice of letting the scheduler decide where to run the callbacks is especially helpful on asymmetric systems (AKA big.LITTLE systems), as shown by data collected by Dietmar Eggeman and Robin Randhawa. This point is emphasized by the aforementioned fact that everyone I know of who builds their kernels with CONFIG_RCU_FAST_NO_HZ=y also boots those systems with each and every CPU designated as a rcu_nocbs CPU.
So if no one is getting any benefit from building their kernels with CONFIG_RCU_FAST_NO_HZ=y, why keep that added complexity in the Linux kernel? Why indeed, and hence the patch series intended for the v5.17 merge window.
So if you know of someone who is getting significant benefit from CONFIG_RCU_FAST_NO_HZ=y who could not get that benefit from booting with rcu_nocbs CPUs, this would be a most excellent time to let me know!