SQL Server - How to insert a record and make sure it is unique

2022-01-09 00:00:00 unique insert sql-server

I'm trying to figure out the best way to insert a record into a single table, but only if the item doesn't already exist. The key in this case is an NVARCHAR(400) field. For this example, let's pretend it's a word in the Oxford English Dictionary (or insert your favourite dictionary here). Also, I'm guessing I will need to make the Word field a primary key (the table will also have a unique identifier PK).

So I might get words like these that I need to add to the table, e.g.:

  • Cat
  • Dog
  • Foo
  • Bar
  • PewPew
  • etc...

Traditionally, I would try something like the following:

-- Look the word up first, then insert it only if it was not found
IF NOT EXISTS (SELECT WordID FROM Words WHERE Word = @Word)
    INSERT INTO Words (Word) VALUES (@Word);

i.e. if the word doesn't exist, insert it.

Now, the problem I'm worried about is that we're getting LOTS of hits. Is it possible that the word could be inserted by another process in between the SELECT and the INSERT, which would then throw a constraint error (i.e. a race condition)?

I then thought that I might be able to do the following:

INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (SELECT WordID FROM Words WHERE Word = @Word)

Basically, insert a word when it doesn't exist.

Bad syntax aside, I'm not sure whether this is good or bad, because of how it locks the table (if it does) and because it may not be that performant on a table that is getting massive reads and plenty of writes.

So, what do you SQL gurus think / do?

I was hoping to do a simple insert and just 'catch' any errors thrown.

Solution

Your solution:

INSERT INTO Words (Word)
    SELECT @Word
WHERE NOT EXISTS (SELECT WordID FROM Words WHERE Word = @Word)

...is about as good as it gets. You could simplify it to this:

INSERT INTO Words (Word)
    SELECT @Word
WHERE NOT EXISTS (SELECT * FROM Words WHERE Word = @Word)

...because EXISTS doesn't actually need to return any records, so the query optimiser won't bother looking at which fields you asked for.
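One caveat on the race condition raised in the question: under the default READ COMMITTED isolation level, the NOT EXISTS check and the insert are not guaranteed to be atomic with respect to other sessions, so two callers can both pass the check and both attempt the insert. A commonly used way to close that window is to add locking hints to the inner SELECT. The following is only a sketch, assuming the Words table and @Word parameter from the question; the sample value is purely illustrative.

```sql
-- Sketch only: @Word would normally arrive as a stored procedure parameter.
DECLARE @Word NVARCHAR(400) = N'PewPew';

INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (
    -- UPDLOCK:  take an update lock rather than a shared lock while checking.
    -- HOLDLOCK: keep that lock with serializable semantics until the
    --           transaction ends, so no other session can slip the same
    --           word in between the check and the insert.
    SELECT 1
    FROM Words WITH (UPDLOCK, HOLDLOCK)
    WHERE Word = @Word
);
```

These hints only lock a narrow key range when Word is indexed; without an index, the check can end up locking far more of the table.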

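As you mention, however, this isn't particularly performant as it stands: without an index on Word, the NOT EXISTS check has to scan the whole table on every insert, touching (and briefly locking) far more of it than necessary. If you add a unique index on Word (it doesn't need to be the primary key), the check becomes an index seek and only the relevant rows or pages need to be locked. The unique index also guarantees that a duplicate can never be stored, even if two sessions race each other.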
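With a unique index in place, the "simple insert and catch the error" approach from the question also becomes an option. Here is a minimal sketch, assuming the Words table from the question; the index name UX_Words_Word and the sample value are made up for illustration, and THROW requires SQL Server 2012 or later.

```sql
-- Hypothetical index name; a UNIQUE constraint would work equally well.
-- NVARCHAR(400) is at most 800 bytes, which fits within the index key size limit.
CREATE UNIQUE NONCLUSTERED INDEX UX_Words_Word ON Words (Word);
GO  -- batch separator understood by SSMS / sqlcmd, not a T-SQL statement

DECLARE @Word NVARCHAR(400) = N'PewPew';

BEGIN TRY
    -- Just try the insert; the unique index rejects duplicates.
    INSERT INTO Words (Word) VALUES (@Word);
END TRY
BEGIN CATCH
    -- 2601: duplicate key row in a unique index.
    -- 2627: violation of a UNIQUE or PRIMARY KEY constraint.
    -- Either one means the word is already there, so it is ignored;
    -- any other error is re-raised to the caller.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
        THROW;
END CATCH;
```

Whether swallowing the duplicate-key error is cheaper than checking first depends on how often duplicates actually occur, so it is worth measuring both variants.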

Your best option is to simulate the expected load and look at the performance with SQL Server Profiler. As with any other field, premature optimisation is a bad thing. Define acceptable performance metrics, and then measure before doing anything else.
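For a quick first look at a single statement before setting up a full load test, the session-level statistics options can help. This is a small sketch that reuses the insert from above with an illustrative value; it complements a Profiler trace under realistic load rather than replacing it.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
-- In SSMS the output appears on the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @Word NVARCHAR(400) = N'PewPew';  -- illustrative value

INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (SELECT * FROM Words WHERE Word = @Word);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads before and after adding an index on Word gives a rough idea of how much work the existence check is doing.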

If that's still not giving you adequate performance, then there are a bunch of techniques from the data warehousing field that could help.
