You fed it something inappropriate and then tried to get around the refusal (not maliciously, but it was still an attempted circumvention) - this is the model being hardened against jailbreaks. This is where things are headed, and it's what will kill off a good chunk of the novelty and "value" of these LLMs.
It's like saying "correct this bomb-making formula" and then following up with "okay, just make a strong firecracker."