How to select every row where a column value is not distinct


154

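I need to run a select statement that returns all rows where the value of a column is not distinct (e.g. EmailAddress).

For example, if the table looks like this: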

CustomerName     EmailAddress
Aaron            aaron@gmail.com
Christy          aaron@gmail.com
Jason            jason@gmail.com
Eric             eric@gmail.com
John             aaron@gmail.com

I need the query to return:

Aaron            aaron@gmail.com
Christy          aaron@gmail.com
John             aaron@gmail.com

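I have read many posts and tried different queries, all to no avail. The query that I believe should work is below. Can someone suggest an alternative or tell me what is wrong with my query?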

select EmailAddress, CustomerName from Customers
group by EmailAddress, CustomerName
having COUNT(distinct(EmailAddress)) > 1

Answers:


263

This is significantly faster than the EXISTS approach:

SELECT [EmailAddress], [CustomerName] FROM [Customers] WHERE [EmailAddress] IN
  (SELECT [EmailAddress] FROM [Customers] GROUP BY [EmailAddress] HAVING COUNT(*) > 1)
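
As a quick check of the query above, here is a minimal sketch that runs it against the sample table from the question. It uses Python's built-in sqlite3 module with an in-memory database, which is an assumption made for illustration (the thread does not say which engine is in use), and the bracket quoting is dropped and an ORDER BY is added so the result is deterministic.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, EmailAddress TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Aaron", "aaron@gmail.com"),
        ("Christy", "aaron@gmail.com"),
        ("Jason", "jason@gmail.com"),
        ("Eric", "eric@gmail.com"),
        ("John", "aaron@gmail.com"),
    ],
)

# The inner query finds addresses used more than once;
# the outer query returns every row carrying one of those addresses.
rows = conn.execute(
    """
    SELECT EmailAddress, CustomerName FROM Customers
    WHERE EmailAddress IN
      (SELECT EmailAddress FROM Customers
       GROUP BY EmailAddress HAVING COUNT(*) > 1)
    ORDER BY CustomerName
    """
).fetchall()

for email, name in rows:
    print(name, email)
```

The subquery does not reference the outer row, so it only needs to be evaluated once; how a given engine actually plans it may differ.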

1
Hey, I know this answer is 7 years old, but if you are still around, would you mind explaining how it works? It solved my problem too!

4
Using HAVING here instead of a second SELECT...WHERE makes this a single query, whereas the second option executes that second SELECT...WHERE call many times. See more here: stackoverflow.com/q/9253244/550975
Serj Sagan

I get the infamous "[EmailAddress] must appear in the GROUP BY clause or be used in an aggregate function" error. Is revising sql_mode the only fix?
Volodymyr Bobyr

[EmailAddress] IS in the GROUP BY clause
Serj Sagan

51

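What is wrong with your query is that you are grouping by both email and name. That forms one group for each unique combination of email and name, so

aaron and aaron@gmail.com
christy and aaron@gmail.com
john and aaron@gmail.com

are treated as 3 different groups rather than all belonging to a single group.

Please use the following query: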

select emailaddress,customername from customers where emailaddress in
(select emailaddress from customers group by emailaddress having count(*) > 1)
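
To make the grouping problem concrete, the sketch below runs the question's original query against the sample table and then lists the groups it actually forms. It assumes SQLite through Python's built-in sqlite3 module purely for illustration. Note also that within any group keyed on EmailAddress, COUNT(DISTINCT EmailAddress) can never exceed 1, so the HAVING condition cannot be satisfied.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, EmailAddress TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Aaron", "aaron@gmail.com"),
        ("Christy", "aaron@gmail.com"),
        ("Jason", "jason@gmail.com"),
        ("Eric", "eric@gmail.com"),
        ("John", "aaron@gmail.com"),
    ],
)

# The query from the question: grouping by both columns.
original = conn.execute(
    """
    SELECT EmailAddress, CustomerName FROM Customers
    GROUP BY EmailAddress, CustomerName
    HAVING COUNT(DISTINCT EmailAddress) > 1
    """
).fetchall()

# Same grouping, but showing the size of each group.
groups = conn.execute(
    """
    SELECT EmailAddress, CustomerName, COUNT(*) FROM Customers
    GROUP BY EmailAddress, CustomerName
    """
).fetchall()

print("original query rows:", original)
for email, name, size in groups:
    print(email, name, size)
```

Every (email, name) pair ends up in its own group of size one, which is why the original query returns nothing.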

21
I like that you also explained what was wrong with the original query, unlike the accepted answer.

12

How about

SELECT EmailAddress, CustomerName FROM Customers a
WHERE Exists ( SELECT emailAddress FROM customers c WHERE a.customerName != c.customerName AND a.EmailAddress = c.EmailAddress)
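
A minimal sketch of the EXISTS variant on the question's sample data, assuming SQLite via Python's sqlite3 module for illustration. The columns are qualified with the alias `a` and an ORDER BY is added so the output order is stable; neither changes which rows qualify.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, EmailAddress TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Aaron", "aaron@gmail.com"),
        ("Christy", "aaron@gmail.com"),
        ("Jason", "jason@gmail.com"),
        ("Eric", "eric@gmail.com"),
        ("John", "aaron@gmail.com"),
    ],
)

# A row qualifies when some other customer (different name)
# shares its email address.
exists_rows = conn.execute(
    """
    SELECT a.EmailAddress, a.CustomerName FROM Customers a
    WHERE EXISTS (
        SELECT 1 FROM Customers c
        WHERE a.CustomerName != c.CustomerName
          AND a.EmailAddress = c.EmailAddress
    )
    ORDER BY a.CustomerName
    """
).fetchall()

for email, name in exists_rows:
    print(name, email)
```

Because the comparison is on CustomerName, two rows with the same name and the same email would not match each other; the sample data has no such rows.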

11
select CustomerName,count(1) from Customers group by CustomerName having count(1) > 1

Minor enhancement, to show the count as "dups": `select CustomerName, count(1) as dups from Customers group by CustomerName having count(1) > 1`
DynamicDan

8

Just for fun, here is another way:

;with counts as (
    select CustomerName, EmailAddress,
      count(*) over (partition by EmailAddress) as num
    from Customers
)
select CustomerName, EmailAddress
from counts
where num > 1
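
The window-function version can be tried the same way. This sketch assumes SQLite 3.25 or later (the first release with window functions) as bundled with Python's sqlite3 module; the leading semicolon is dropped and an ORDER BY is added for a stable result.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
# Assumption: the linked SQLite library is version 3.25+ (window functions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, EmailAddress TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Aaron", "aaron@gmail.com"),
        ("Christy", "aaron@gmail.com"),
        ("Jason", "jason@gmail.com"),
        ("Eric", "eric@gmail.com"),
        ("John", "aaron@gmail.com"),
    ],
)

# The CTE tags every row with the number of rows sharing its email;
# the outer query keeps rows whose email occurs more than once.
cte_rows = conn.execute(
    """
    WITH counts AS (
        SELECT CustomerName, EmailAddress,
               COUNT(*) OVER (PARTITION BY EmailAddress) AS num
        FROM Customers
    )
    SELECT CustomerName, EmailAddress
    FROM counts
    WHERE num > 1
    ORDER BY CustomerName
    """
).fetchall()

for name, email in cte_rows:
    print(name, email)
```

Unlike the IN and EXISTS variants, the Customers table is named only once here.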

1
+1 for the CTE version. We shouldn't repeat ourselves in code, so why repeat ourselves in SQL if we no longer have to?
yzorg '16

1
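I use _count for the count column (rather than num). I consistently use an underscore prefix when a column name would otherwise collide with an SQL keyword, e.g. _default, _type, _sum.
yzorg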

4

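Using a subquery in the where condition can increase query time when the number of records is huge.

I would suggest using an inner join instead as a better option for this problem.

Considering the same table, this could give the result: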

SELECT DISTINCT a.EmailAddress, a.CustomerName FROM Customers as a
Inner Join Customers as b on a.CustomerName <> b.CustomerName and a.EmailAddress = b.EmailAddress

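For still better results, I would suggest using CustomerID or any other unique field of your table in the join condition, since CustomerName itself can be duplicated.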
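
The sketch below illustrates why a unique key matters in the self-join. It assumes a hypothetical CustomerID column and two hypothetical "Sam" rows that are not in the question's table, run on SQLite via Python's sqlite3 module. DISTINCT is used because a row that matches several partners would otherwise be returned once per partner.

```python
import sqlite3

# Hypothetical schema: CustomerID is assumed here, it is not in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    "CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, EmailAddress TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (1, "Aaron", "aaron@gmail.com"),
        (2, "Christy", "aaron@gmail.com"),
        (3, "Jason", "jason@gmail.com"),
        # Hypothetical rows: same name AND same email.
        (4, "Sam", "sam@example.com"),
        (5, "Sam", "sam@example.com"),
    ],
)

QUERY = """
    SELECT DISTINCT a.CustomerID FROM Customers AS a
    INNER JOIN Customers AS b
      ON a.{key} <> b.{key} AND a.EmailAddress = b.EmailAddress
    ORDER BY a.CustomerID
"""

# Joining on the name misses the two "Sam" rows; joining on the ID does not.
by_name = [r[0] for r in conn.execute(QUERY.format(key="CustomerName"))]
by_id = [r[0] for r in conn.execute(QUERY.format(key="CustomerID"))]

print("compared by name:", by_name)
print("compared by id:  ", by_id)
```

Comparing on CustomerName finds only IDs 1 and 2, while comparing on CustomerID also finds the two same-named rows that share an address.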


-2

Well, a slight change finds the non-duplicated rows instead.

SELECT EmailAddress, CustomerName FROM Customers WHERE EmailAddress NOT IN
(SELECT EmailAddress FROM Customers GROUP BY EmailAddress HAVING COUNT(*) > 1)
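
A minimal sketch of the NOT IN variant on the question's sample data, again assuming SQLite via Python's sqlite3 module for illustration, with an ORDER BY added for a stable result.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, EmailAddress TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Aaron", "aaron@gmail.com"),
        ("Christy", "aaron@gmail.com"),
        ("Jason", "jason@gmail.com"),
        ("Eric", "eric@gmail.com"),
        ("John", "aaron@gmail.com"),
    ],
)

# Keep only rows whose email is NOT among the addresses used more than once.
unique_rows = conn.execute(
    """
    SELECT EmailAddress, CustomerName FROM Customers
    WHERE EmailAddress NOT IN
      (SELECT EmailAddress FROM Customers
       GROUP BY EmailAddress HAVING COUNT(*) > 1)
    ORDER BY CustomerName
    """
).fetchall()

for email, name in unique_rows:
    print(name, email)
```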
Licensed under cc by-sa 3.0 with attribution required.