9
submitted 3 days ago* (last edited 3 days ago) by roadrunner_ex@lemmy.ca to c/sql@programming.dev

Putting aside any opinions on performance, I've been trying to test a notion about whether a couple queries would output the same data (ordering doesn't matter).

SELECT *
FROM articles
WHERE (
  last_updated >= %s
  OR id IN (1, 2, 3)
  )
  AND created_at IS NOT NULL
SELECT *
FROM articles
WHERE last_updated >= %s
  AND created_at IS NOT NULL
UNION
SELECT *
FROM articles
WHERE id IN (1, 2, 3)
  AND created_at IS NOT NULL

I think they're equivalent, but I can't prove it to myself.

Edit: Aye, looking at the replies, I'm becoming aware that I left out a couple key assumptions I've made. Assuming:

a) id is a PRIMARY KEY (or otherwise UNIQUE)

b) I mean equivalent insofar as "the rows returned will contain equivalent data same (though maybe ordered differently)"

you are viewing a single comment's thread
view the rest of the comments
[-] Deebster@programming.dev 1 points 3 days ago

It's equivalent because UNION removes duplicates; the behaviour you're describing happens with UNION ALL. Since both queries are article.*, both halves will have the same columns and the dedupe will be successful.

UNION is less efficient because of this deduplication, but it's the default since that's what most people want. If that matters then you'd be correct that a JOIN version will be more efficient (possibly depending on indexes present and sql engine).

this post was submitted on 10 Aug 2025
9 points (100.0% liked)

SQL

1034 readers
1 users here now

Related Fediverse communities:

Icon base by Delapouite under CC BY 3.0 with modifications to add a gradient

founded 2 years ago
MODERATORS