29/12/2015

Report from the battlefield #2 - amount of data matters a lot

Home

In the next post from the Report from the battlefield series I'll wrote about a serious mistake that is quite common according to my experience. I'm thinking about a situation when a developer assumes that all data from a database can be processed on the client side. I'll give you 2 examples that I encountered during my reviews.

Case 1

A developer was asked to implement the paging functionality. He created a single page Web application. It looked nice and the paging was working correctly at first glance. I decided to check how it was implemented under the hood. I examined a web service that was used by the application and I was surprised. Why? I didn't find a web method responsible for returning pages. The next step was to dig into a java script code. Unfortunately, I discovered that the paging was implemented only on the client side i.e. the application initially downloaded all data from a database (via Web Service).

Case 2

In the another project the paging was implemented correctly on the server side but a developer made a more subtle mistake. An application had a shopping cart function. Of course it was possible to add and remove products to and from a cart. To do so a web service used by the application had a method GetCart. This method was responsible for retrieving a current content of a cart from a database.

However, it was strange that this method returned only identifiers of products. What's more there was no GetProductDetails web method. It made me curious how the application displays products details to users only knowing its identifiers. It turned out that at the initiation the application was reading details of all products from a database. Having all products on the client side it was easy to find details of a product based on its identifier.

Summary

In both cases applications were fast enough because of a small amount of data. In the case of real-life databases they will not. I think that we should always be prepared for the worst case. Especially, when we participate in a recruitment and we want to show ourselves from the best side. An evaluator shouldn't guess whether we know something or not.

24/12/2015

Merry Christmas!

Home


Source: own resources, Authors: Agnieszka and Michał Komorowscy


Giving wishes in a foreign language can be challenging so my wishes will be simple but very sincere. I wish you a Merry, Peaceful Christmas and an Amazing 2016. Let it be at least as good as the past year.

Best wishes,
Michał Komorowski

24/11/2015

Report from the battlefield #1 - EF and DTOs

Home

Some time ago, I started doing code reviews of various projects for the recruitment company. It is an interesting experience and I'm learning a lot by this occasion. I also observed that some mistakes are repeated by different authors. Other are not so common but are not obvious. So I came up with the idea to start a new series of posts under the title "Report from the battlefield". In this series I'll describe my observations and findings from my reviews.

Let's start. Recently, I reviewed a project created with AngularJS + ASP.NET Web API + Entity Framework. The code was neither very good nor very bad. However, I noticed that the author decided to use a class generated from the EDMX model as DTO (Data Transfer Object). The reasoning behind this decision was simple - this class had all properties required on the client side so why not to use it. Well there are a few reasons why it is not a good idea.
  • With dedicated DTOs it is less possible that changes on the server side will affect the client side.
  • With dedicated DTOs we can easily control what will be send to the client side and in what format.
  • With dedicated DTOs the server side model can be completely different from the client side model.
  • By exposing EF classes to the client side we effectively expose the database model to the client side!
You may agree with my points or not. So, I'll give you a practical example what could happen if we use EF classes as DTOs. Let's assume that there is EDMX model with 3 types of entities:
  • Customer with Orders navigation property.
  • Orders with Customer and Products navigation properties.
  • Products with Orders navigation property.
Now we want to read only 1 customer from a database, serialize it to JSON and send the result to the client side. What could go wrong? Well, because of the navigation properties the JSON serializer that is used by ASP.NET Web API will read from the database and convert to JSON the whole graph of customers, orders and products! To be more specific, I saw 0.5 MB response which should have a few kilobytes for a very small database (it contained small dozens of records in all tables)! I can bet that in the case of a production database a response would have hundreds of megabytes.

15/11/2015

Interview Questions for Programmers by MK #6

Home

Question #6
What is the arithmetic overflow and how is it handled in .NET?

Answer #6
It is a situation when the result of an arithmetic operation exceeds (is outside of) the range of a given numeric type. For example the maximum value for byte type in .NET is 255. So in the following example, an operation a+b will cause an overflow:
byte a = 255;
byte b = 20;
byte c = a + b;
The final result depends on the used numeric types:
  • For integer types either OverflowException will be thrown or the result will be trimmed/cropped (the default behaviour). It depends on the compiler configuration and usage of checked / unchecked keywords.
  • For floating point types OverflowException will never be thrown. Instead the overflow will lead either to the positive or the negative infinity.
  • For decimal type OverflowException will be always thrown.
var b = byte.MaxValue;
//The result will be zero because:
//b = 255 = 1111 1111 
//b++ = 256 = 1 0000 0000
//The result has 9 bits so the result will be trimmed to 8 bits what gives 0000 0000
b++; 
         
checked
{
 b = byte.MaxValue;
 //Exception will be thrown 
 b++; 
}

var f = float.MaxValue;
//The result will be float.PositiveInfinity
f *= 2;  

decimal d = decimal.MaxValue;
//Exception will be thrown
d++; 

22/10/2015

TransactionScope and multi-threading

Home

It's my third post about TransactionScope. This time I'll write about using it with multi-threading. Let's start with the following code:
using (var t = new TransactionScope())
{
   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}

private static void UpdateDatabase()
{
   using (var c = new SqlConnection(connectionString))
   {
      c.Open();

      WriteDebugInfo();

      new SqlCommand(updateCommand, c).ExecuteNonQuery();
   }
}

private static void WriteDebugInfo()
{
   Console.WriteLine("Thread= {0}, LocalIdentifier = {1}, DistributedIdentifier = {2}",
      Thread.CurrentThread.ManagedThreadId,
      Transaction.Current?.TransactionInformation.LocalIdentifier,
      Transaction.Current?.TransactionInformation.DistributedIdentifier);
}
It seems simple but it doesn't work. The problem is that a connection that is created in UpdateDatabase method will not participate in any transaction. We can also observe that WriteDebugInfo will write empty transaction identifiers to the console. It happens because in order to read an ambient transaction (the transaction the code is executed in) TransactionScope uses Transaction.Current property which is thread static (i.e. specific for a thread).

To overcome this issue we have two possibilities. The first one is to use DependentTransacion. However, I'll not show how to do it because since .NET 4.5.1 there is a better way - TransactionScopeAsyncFlowOption enum. Let's try.
using (var t = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
   ...
}
Unfortunately, there is a big chance that this time we will get TransactionException with the message The operation is not valid for the state of the transaction. in the line with ExecuteNonQuery. The simplified stack trace is:

at System.Transactions.TransactionStatePSPEOperation.get_Status(InternalTransaction tx)
at System.Transactions.TransactionInformation.get_Status()
...
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Sandbox.Program.UpdateDatabase(Object o)

I read a lot about this but nobody was able to explain why it happens. I also looked into the source code of TransactionStatePSPEOperation class. It was instructive because I learned what is PSPE - Promotable Single Phase Enlistment. However, it also didn't give me an exact answer.

So, I played a little bit with the code and I noticed that the problem occurs when:
  • One thread tries to run ExecuteNonQuery.
  • Another thread waits for the opening of the connection.
However, when both connections were already opened then the exception wasn't thrown.

At this point it is worth reminding one thing - when there are 2 or more connections opened in a transaction scope at the same time then a transaction is promoted to a distributed one. I'm not 100% sure but I think that the problem occurs because it is not allowed to use a connection which participates in a transaction which is in the middle of promotion to the distributed one. So, the solution is to assure that a transaction will be distributed from the beginning. Here is a fixed code with a magic line (I found it here):

using (var t = new TransactionScope())
{
   //The magic line that makes a transaction distributed
   TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);

   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}
Nonetheless, the more I think about this the more convinced I'm that using TransactionScope with multi-threading is asking for problems.

13/10/2015

How not to use TransactionScope. Another WTF!

Home

This time I will write again about TransactionScope. It is a very useful class and seems to be extremely easy in use. In majority of cases it is true. However, there are also some pitfalls lurking for developers. Especially for these who don't like to waste time for reading MSDN documentation if not really needed i.e. probably vast majority of us ;)

Some time ago, I was analysing more or less the following code:
using(var t = new TransactionScope())
{
   var c = ConnectionProvider.ProvideConnection();
   //Use a connection to update a database
   //...
   t.Complete();
}
ConnectionProvider is a class that hides details of managing connections to a database. There was also a bug in the code responsible for updating a database which caused exceptions. I fixed it and I run tests again. This time an exception was not thrown but something was wrong because a database contained unexpected data. It looked like the transaction was not rollbacked!

Firstly, I though that it is some kind of magic. However, as usual in this kind of cases it wasn't. I digged into ConnectionProvider and I found out that this class was performing some kind of pooling and a connection wasn't opening every time. It was a big problem because connections opened outside a transaction scope do not participate in a transaction. The solution of this problem is to explicitly enlist a connection in an existing transaction scope with the EnlistTransaction method.

It is also worth highlighting that the described problem won't occur if ConnectionProvider doesn't try to implement polling on its own. In general we don't have to do it because .NET do it for us. The problem will also not occur if using statement is used to close a connection returned by a provider.

09/09/2015

TransactionScope + Ninject + a small mistake = WTF

Home

Sometimes one stupid mistake can cost a lot of time. A few days ago my application (AngularJS + ASP.NET Web API) started crashing because of the following error:

MSDTC on server 'XXX' is unavailable

It was strange. I wasn't aware of any distributed transactions in my application. To be honest, I was using TransacionScope but I was sure that there was no reason to promote a lightweight transaction into a distributed one. To make things more strange the error wasn't reported every time. When I tried to update data for the first time everything was ok. However, the second attempt (and every next) was failing.

It took me some time to examine all recent changes but finally I found a problem. It was quite tricky so I decided to write about it. Let's start with the fact that I use Ninject as a dependency injection container. Among others Ninject allow us to control a lifetime of objects (instances). Particularly, in the case of web applications, we can use:
  • InRequestScope method - it tells Ninject that one object of a particular type should be created for each individual request.
  • InSingletonScope method - it tells Ninject that one object of a particular type should be created for all requests.
For example:
kernel.Bind(x => x
   .FromAssembliesMatching("test.dll")
   .SelectAllClasses().InheritedFrom(typeof(IInterfaace))
   .BindAllInterfaces()
   .Configure(z => z.InSingletonScope()));
The problem was that accidentally I mixed InSingletonScope and InRequestScope. For example, let's assume that each request requires objects of two classes A and B. Objects of type A are within the request scope and objects of type B are within the singleton scope.

Both objects perform updates/inserts/deletes and are used inside TransacionScope. For the first request it is not a problem. Both objects are initialized within the same request and use the same database connection. It means that a lightweight transaction is used.

However, for the second (and every next) request an object of type B is re-used whereas a new object of type A is created. Object of type B was initiated in the previous request and it uses a different connection than the one used by an object of type A. It means that a distributed transaction will be used in this case.

To sum up:
  • DI containers give a great power but with the power comes great responsibility.
  • Be careful when using objects of a different scope together. Especially when these objects require data access.
  • Be careful when using multiple connections inside TransacionScope. In the case of MSSQL 2005 in this situation a distributed transaction will be always used. In the case of MSSQL 2008 or newer it is possible to use more than one connection inside TransacionScope without automatic promotion. However, if and only if these connections are not opened at the same time.
  • TransactionScope automatically escalating to MSDTC on some machines? is a great source of knowledge about TransacionScope and about the process of promoting lightweight transactions into distributed ones.

21/08/2015

Do you know OUTPUT clause?

Home

Today, I'll write about using OUTPUT clause together with INSERT statements. It seems to be that it is not a very well known syntax. However, it is especially useful when use Identity columns to generate keys. Let's start with a simple table:
CREATE TABLE dbo.Main
( 
 Id int Identity (1,1) PRIMARY KEY,
 Code varchar(10),
 UpperCode AS Upper(Code)
);
The old fashioned approach to retrieve a value of Identity column for a new row is to use SCOPE_IDENTITY(). For example:
INSERT INTO dbo.Main (Code) VALUES ('aaa');
SELECT SCOPE_IDENTITY();
With OUTPUT clause it will look in the following way:
DECLARE @InsertedIdentity TABLE(Id int);
INSERT INTO dbo.Main (Code) OUTPUT INSERTED.Id INTO @InsertedIdentity VALUES ('aaa')
SELECT TOP(1) * FROM @InsertedIdentity
You can say wait a minute. If I want to use OUTPUT I have to declare a table variable first and then use SELECT. It is more complex than just using SCOPE_IDENTITY().

Well, the first benefit is that with OUTPUT clause we can read values from many columns, including these that are computed (as it was shown above). However, the real power of OUTPUT clause can be observed if we want to insert many rows into a table:
DECLARE @ToBeInserted TABLE(Code varchar(10), Name varchar(100));

INSERT INTO @ToBeInserted
VALUES ('aaa','1111111111'), ('ddd','2222222222'), ('ccc','3333333333');

DECLARE @Inserted TABLE(Id int, Code varchar(10), UpperCode varchar(10));

INSERT INTO dbo.Main (Code)
OUTPUT INSERTED.Id, INSERTED.Code, INSERTED.UpperCode INTO @Inserted
SELECT Code
FROM @ToBeInserted;

SELECT * FROM @Inserted;
Without OUTPUT we would have to write a nasty loop!

Here is one more example. Let's assume that we have an additional table that references dbo.Main.
CREATE TABLE dbo.Child
( 
 MainId int,
 Name varchar(100),
 CONSTRAINT [FK_Child_Main] FOREIGN KEY(MainId)REFERENCES dbo.Main (Id)
);
We want to insert a few rows into dbo.Main and then related rows to dbo.Child. It is quite easy if we use OUTPUT clause.
INSERT INTO dbo.Child (MainId, Name)
SELECT i.Id, tbi.Name
FROM @ToBeInserted tbi
 JOIN @Inserted i ON i.Code = tbi.Code;
Extremely useful thing that you must know!

At the end it is worth mentioning that OUTPUT clause can be also used together with UPDATE, DELETE or MERGE statements.

02/08/2015

Oracle VM VirtualBox and Windows 8.1

Home

In my day to day work I use a 64 bit version of Windows 8.1 Pro N. I needed a virtualization software so I decided to use a free Oracle VM Virtual Box. Everything was ok up to the moment when I wanted to install a x64 version of an operating system on a fresh virtual machine. To my surprise VirtualBox reported the following error:

VT-x/AMD-V hardware acceleration has been enabled, but is not operational. Your 64-bit guest will fail to detect a 64-bit CPU and will not be able to boot.

Please ensure that you have enabled VT-x/AMD-V properly in the BIOS of your host computer.


After some time I noticed that VirtualBox stopped showing 64 bit versions in the Version list. Well, it was actually good because I couldn't use a64 bit virtual machines anyway ;) But, I still didn't know why it happened.

I checked BIOS settings and it seemed ok. I searched Internet for the answer but everyone were recommended to verify BIOS configuration what I've already done. I needed a new VM quickly so at that point I installed a x86 version of Windows.

A few days later my colleague Przemek suggested that the problem may be in the conflict between Hyper-V and VirtualBox and that I should disable Hyper-V. It was strange because I've never installed Hyper-V. However, I checked and I discovered that Hyper-V features were enabled on my computer. It seems to me that they are installed by default with the operating system.



The solution was easy. I pressed Win+S and typed Turn windows feature on or off. Then, I cleared a box next to Hyper-V and restarted computer. After that I was able to install a x64 version of a operating system on a virtual machine.

To sum up, if:
  • Your host system is a x64 version of Windows 8.1.
  • Virtualization is enabled in BIOS.
  • You use VirtualBox.
And you cannot install x64 operating system on a virtual machine then try to disable Hyper-V.