AI does work like that.
With (variational) auto-encoders, it's very explicit.
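If you want to poke at this yourself, here's a rough PyTorch sketch of a VAE encoder (layer sizes are arbitrary, just for illustration): the whole point is that the input gets squeezed down to a small latent code.

```python
# Minimal VAE-encoder sketch, assuming PyTorch. Dimensions are made up.
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=16):
        super().__init__()
        self.hidden = nn.Linear(input_dim, hidden_dim)
        # Two heads: mean and log-variance of the latent distribution.
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):
        h = torch.relu(self.hidden(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample the compressed latent code z.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return z, mu, logvar

# A 784-dim input (e.g. a flattened 28x28 image) becomes 16 numbers.
encoder = VAEEncoder()
z, mu, logvar = encoder(torch.randn(1, 784))
print(z.shape)  # torch.Size([1, 16])
```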
With shallow convolutional neural networks, it's fun to visualize the trained kernel weights, as they often form an abstract, to me dreamlike, representation of the thing being trained for. Although they're derived through a different method, search for "eigenfaces" for an example of what I mean.
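The visualization itself is only a few lines. A sketch, assuming torchvision >= 0.13 and matplotlib, using a pretrained AlexNet purely as a convenient source of trained first-layer kernels:

```python
# Pull the first conv layer's kernels out of a pretrained CNN and plot them
# as tiny images. AlexNet's first layer has 64 kernels of shape (3, 11, 11).
import matplotlib.pyplot as plt
from torchvision import models

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
kernels = model.features[0].weight.detach()  # shape: (64, 3, 11, 11)

# Normalize to [0, 1] so each kernel can be shown as an RGB image.
kernels = (kernels - kernels.min()) / (kernels.max() - kernels.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, k in zip(axes.flat, kernels):
    ax.imshow(k.permute(1, 2, 0).numpy())  # CHW -> HWC for imshow
    ax.axis("off")
plt.show()
```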
In the current hype architecture, attention and transformers, the encoded state can be thought of as a compressed version of its input. But human interpretation of those values is challenging.
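You can look at those encoded states directly; they just don't mean much to a human. A sketch assuming the Hugging Face transformers library and the public bert-base-uncased checkpoint:

```python
# Peek at the hidden states a transformer produces for a sentence.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("a compressed version of its input", return_tensors="pt")
outputs = model(**inputs)

# One 768-dim vector per token: the "encoded state". The individual numbers
# are not human-interpretable.
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, 768)
```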