I suppose most sensible people already know that ChatGPT is not the answer for medical diagnosis.
Prompts were input to the GPT-3.5-turbo-0301 model via the ChatGPT (OpenAI) interface.
If the researcher wanted to investigate whether an LLM is helpful, they should have built a model on GPT-4/3.5 specifically using cancer treatment plans and tested it thoroughly, in addition to entering prompts into the model that is available from OpenAI.
Seems like the flawed interviewing process is to blame.