First, applicant argues that the mark is not merely descriptive because consumers will not immediately
understand what the underlying wording "generative pre-trained transformer" means. The trademark
examining attorney is not convinced. The previously and presently attached Internet evidence
demonstrates the extensive and pervasive use in applicant's software industry of the acronym "GPT" in
connection with software that features similar AI technology with ask and answer functions based on
pre-trained data sets; the fact that consumers may not know the underlying words of the acronym does
not alter the fact that relevant purchasers are adapted to recognizing that the term "GPT" is commonly
used in connection with software to identify a particular type of software that features this AI ask and
answer technology. Accordingly, this argument is not persuasive.
I don't know enough to know whether or not that's true. My understanding was that Google invented the transformer architecture with their paper "Attention Is All You Need." A lot of LLMs, if not most, use a transformer architecture, though you're probably right that a lot of them are based on the open source models OpenAI made available. The "generative" part is just descriptive of the model generating outputs (as opposed to classification and the like), and "pre-trained" just refers to the training process.
But again I'm a dummy so you very well may be right.
The attention paper from Google introduced transformers; OpenAI introduced generative pretraining as a technique that allows transformers to achieve very good performance on downstream tasks with very little additional fine-tuning. This paper and the subsequent release of the pretrained GPT models directly led to the LLM boom.
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf