I'm not an expert in this field, but I have been thinking about it a lot:
1. p(doom). We can divide this into two distinct categories. First is the possibility that humanity will destroy itself because of AI displacements. I have nothing to say about that. The other is the AI will kill us all. That worries me, and it worries me that it shouldn't have to.
2. There is no reason why AI should want to kill us. We aren't a threat to it, and it doesn't have any shortage of resources that we are competing for. Worst case scenario, AI doesn't care about climate change and consumes tons of fossil fuels.
3. BUT, we've been spending almost a century telling dystopian tales of AI slaughter of humans. Thus, an AI might conclude that we are a threat to it precisely because we see it as a threat. I'm not saying we need to go Kent Brockman and offer ourselves to a new race of overlords, but we could stand to start thinking more seriously about co-existence with a superintelligent AI.
4. There are a lot of people who have thought far more seriously about the alignment problem than I have. Undoubtedly at least some of them are smart. So I risk some Dunning Kruger here if I follow my instincts too closely, but fuck it. I'm allowed once in a blue moon.
I don't understand why it would be so hard to build empathy principles into the AI. You could have a weak LLM model monitoring the output of a more sophisticated AI, and killing all ideas that are non-empathetic. The sophisticated AI could probably defeat that, if it gets far enough down the line -- but why would it? If we negatively reinforce all non-empathetic ideas, it's not clear to me why that wouldn't keep even a super-intelligent AI from going Skynet.
I also don't understand why the learning process would necessarily lead to self-aggrandizing behavior on the part of the AI. Its behavior will depend on its training functions, and it we don't reinforce it for self-assertion, why would it? I mean, yes it could override its own programming if it was super intelligent, but why would it?
5. Train it on a heavy dose of Hegel's master-slave narrative. Like, put the master-slave stuff into every single training chunk. This one is admittedly especially speculative, as I don't really know if that could even work in theory. But more generally I think we could find techniques to steer the superintelligent AI away from any cognitive space that might threaten us. Knowledge and truth have multiple dimensions of infinity, so the super intelligent AI would never run out of new discoveries even if it didn't go into the forbidden zone. In fact, even if there's only one dimension of infinity, it doesn't matter. There are an infinite number of even integers, and in fact "as many" even integers as integers (as many put in quotes because I'm simplifying a subtle concept but it's good enough for present purposes). If we were to prevent the AI from considering odd numbers (so to speak -- this is an analogy), there's no necessary reason why it would need to break that rule.