22 November 2011

Entity Framework: duplicate rows in resultset from a view

After including a view I have in my database in an Entity Framework data model, I noticed that for some strange reason, the result was showing duplicates. When a set should have for instance 10 different rows, it might contain only 4 different rows, some of which were duplicated so it was still a total of 10 rows.
The view itself showed the correct results, but when called from code using the Entity Framework, it once again showed the wrong results.
After some testing I found out that the rows that were duplicated, were rows that shared certain values with the rows that were replaced by it. After checking this in my Entity Model, I soon found at that the Entity Model has a strange way of handling rows with equal primary key values.

I'll try and explain my findings:

If multiple rows share the same value in the primary column(s), only the first of these seems to be retrieved from the database and copied into these other rows. For instance, if a set has 2 columns, with the first being marked as the primary column, and the correct result should be:

A, 1
A, 2
B, 4
B, 6
B, 7
C, 5


The actual result in code with Entity Framework would become:


A, 1
A, 1
B, 4
B, 4
B, 4
C, 5


When including a view in your Entity Model, the model seems to simply use the first not-nullable columns as primary key (as all columns used in the primary key should be non-nullable).
To fix the problem, make sure that the primary key columns are chosen correctly. If you cannot create the correct primary key because of null-values or simply not having set of columns that differs for each row, try adding a column to your view that always contains a unique value for each row. After you add it, make sure it is set as the primary key in your Entity Model.


If I would set both rows to be the Primary Columns in the above example, I would get the desired result.



The following solution was given by Lee Boozer (thank you, Lee):

Add .AsNoTracking() to your Linq query. When you do this, EF returns the data as is, completely ignoring key definitions.
For instance: 
context.vwItems.AsNoTracking().Where( ....

10 comments:

heritage said...

Thanks mate! I met the same issue and when I add PK to the view, it works now :)

Kamal Saini said...

Super Duper Blog Post...

I am struggling with same issue. But this article helps me a lot.


What i did is add a OW_NUMBER() OVER (ORDER BY R.ID ASC) AS ID for a view.


Then in the EF Model i set this as an Entity Key and thats it....


(I dont know how to set as a Primary key...)

Michael Jepson said...

Hi Kamal,

That's one way of adding a column with a unique value for each row. I prefer adding a PK column of one of the tables in the view, since nothing has to be generated. But if that's not possible, your solution is a reliable option.

I think entity calls the primary key the "entity key", but it's basically the same thing.

Regards,
Michael

kavitha said...

Can you tell me how to add the PK column to view.
see this link :

http://stackoverflow.com/questions/13099663/not-printing-duplicate-values-using-view-in-entity-model/13099702#13099702

This is my view..

CREATE view [dbo].[SalesDetailView]
as

Select datename(yyyy,SH.CreatedDateTime)AS Year,datename(mm,SH.CreatedDateTime) AS Month,
SH.CreatedDateTime as Date,SH.TransactionName as Type,SH.SalesHeaderID as No,Customer.CustomerName as Customer,CustomerGroup.CustomerGroupName as Customer_Group,
SH.Reference as Memo,Item.ItemName as Item ,SD.LineDescription as Item_Description,
Item.ItemType as Item_Type,Item.UOM,ItemGroup.ItemGroupName as Item_Group,
cast (SD.Quantity as int)as Quantity, cast(SD.Amount as money)as Amount,SD.Price,
SD.Discount,SH.ExchangeRate as Exchange_Rate,Currency.CurrencyDescription as Currency,
SD.ClassID as Class_ID,SD.SalesTaxID as SalesTax_ID,
SalesTaxGroup.SalesTaxGroupName as Tax_Group,
Employee.EmployeeName as Salesperson,ShippingMethod.ShippingMethodName as Shipping_Method,
PaymentTerm.PaymentTermName as Payment_Term,
PaymentMethod.PaymentMethodName as Payment_Method

from

SalesHeader SH left Outer join SalesDetail SD on SH.SalesHeaderID=SD.SalesHeaderID
left outer join Item on SD.ItemID=Item.ItemID
Left Outer join ItemGroup on Item.ItemGroupId =ItemGroup.ItemGroupID,Customer
Left outer join CustomerGroup on Customer.CustomerGroupId=CustomerGroup.CustomerGroupID
Left Outer join Employee on Customer.EmployeeID=Employee.EmployeeID
Left outer join Currency on Customer.CurrencyID=Currency.CurrencyID
Left Outer join SalesTaxGroup on Customer.SalesTaxGroupID=SalesTaxGroup.SalesTaxGroupID
Left Outer join PaymentTerm on Customer.PaymentTermID=PaymentTerm.PaymentTermID
Left Outer join ShippingMethod on Customer.ShippingMethodID=ShippingMethod.ShippingMethodID
Left Outer join PaymentMethod on Customer.PaymentMethodID=PaymentMethod.PaymentMethodID
where SH.CustomerID=Customer.CustomerID
and(SH.TransactionName<>'SalesOrder'
and Sh.TransactionName<>'Quote')

GO

Michael Jepson said...

Hi Kavitha,

It depends, what does make a unique identifier for a row in this view? Depending on how all your tables are related, I think you need a combination of an ID for the sale and and ID for each line in the sale. So I think combining SalesHeaderID and ItemID would be a good unique identifier, but it all depends on your data model, really.

Just try to set these as a combined Entity Key and see if it solves your problem.

Regards,
Michael

Anonymous said...

You're my hero of the day.

Anonymous said...

Genius, I'm grateful!

Anonymous said...

Why not try the "AsNoTracking()" method on the ObjectSet?

Lee Boozer said...

If you don't actually need unique records (for update purposes), by far the simplest solution is to add .AsNoTracking() to your Linq query. When you do this, EF returns the data as is, completely ignoring key definitions.

Shaun Chin said...

What you really want is:

ISNULL(ROW_NUMBER() OVER (ORDER BY ColumnName), - 1) AS PrimaryKeyId

to get the EF Model to treat it as a primary key. I'm not sure, but it's best to put this column first, so that EF can "guess" that it's the primary key.