22/10/2015

TransactionScope and multi-threading

Home

It's my third post about TransactionScope. This time I'll write about using it with multi-threading. Let's start with the following code:
using (var t = new TransactionScope())
{
   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}

private static void UpdateDatabase()
{
   using (var c = new SqlConnection(connectionString))
   {
      c.Open();

      WriteDebugInfo();

      new SqlCommand(updateCommand, c).ExecuteNonQuery();
   }
}

private static void WriteDebugInfo()
{
   Console.WriteLine("Thread= {0}, LocalIdentifier = {1}, DistributedIdentifier = {2}",
      Thread.CurrentThread.ManagedThreadId,
      Transaction.Current?.TransactionInformation.LocalIdentifier,
      Transaction.Current?.TransactionInformation.DistributedIdentifier);
}
It seems simple but it doesn't work. The problem is that a connection that is created in UpdateDatabase method will not participate in any transaction. We can also observe that WriteDebugInfo will write empty transaction identifiers to the console. It happens because in order to read an ambient transaction (the transaction the code is executed in) TransactionScope uses Transaction.Current property which is thread static (i.e. specific for a thread).

To overcome this issue we have two possibilities. The first one is to use DependentTransacion. However, I'll not show how to do it because since .NET 4.5.1 there is a better way - TransactionScopeAsyncFlowOption enum. Let's try.
using (var t = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
   ...
}
Unfortunately, there is a big chance that this time we will get TransactionException with the message The operation is not valid for the state of the transaction. in the line with ExecuteNonQuery. The simplified stack trace is:

at System.Transactions.TransactionStatePSPEOperation.get_Status(InternalTransaction tx)
at System.Transactions.TransactionInformation.get_Status()
...
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Sandbox.Program.UpdateDatabase(Object o)

I read a lot about this but nobody was able to explain why it happens. I also looked into the source code of TransactionStatePSPEOperation class. It was instructive because I learned what is PSPE - Promotable Single Phase Enlistment. However, it also didn't give me an exact answer.

So, I played a little bit with the code and I noticed that the problem occurs when:
  • One thread tries to run ExecuteNonQuery.
  • Another thread waits for the opening of the connection.
However, when both connections were already opened then the exception wasn't thrown.

At this point it is worth reminding one thing - when there are 2 or more connections opened in a transaction scope at the same time then a transaction is promoted to a distributed one. I'm not 100% sure but I think that the problem occurs because it is not allowed to use a connection which participates in a transaction which is in the middle of promotion to the distributed one. So, the solution is to assure that a transaction will be distributed from the beginning. Here is a fixed code with a magic line (I found it here):

using (var t = new TransactionScope())
{
   //The magic line that makes a transaction distributed
   TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);

   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}
Nonetheless, the more I think about this the more convinced I'm that using TransactionScope with multi-threading is asking for problems.

13/10/2015

How not to use TransactionScope. Another WTF!

Home

This time I will write again about TransactionScope. It is a very useful class and seems to be extremely easy in use. In majority of cases it is true. However, there are also some pitfalls lurking for developers. Especially for these who don't like to waste time for reading MSDN documentation if not really needed i.e. probably vast majority of us ;)

Some time ago, I was analysing more or less the following code:
using(var t = new TransactionScope())
{
   var c = ConnectionProvider.ProvideConnection();
   //Use a connection to update a database
   //...
   t.Complete();
}
ConnectionProvider is a class that hides details of managing connections to a database. There was also a bug in the code responsible for updating a database which caused exceptions. I fixed it and I run tests again. This time an exception was not thrown but something was wrong because a database contained unexpected data. It looked like the transaction was not rollbacked!

Firstly, I though that it is some kind of magic. However, as usual in this kind of cases it wasn't. I digged into ConnectionProvider and I found out that this class was performing some kind of pooling and a connection wasn't opening every time. It was a big problem because connections opened outside a transaction scope do not participate in a transaction. The solution of this problem is to explicitly enlist a connection in an existing transaction scope with the EnlistTransaction method.

It is also worth highlighting that the described problem won't occur if ConnectionProvider doesn't try to implement polling on its own. In general we don't have to do it because .NET do it for us. The problem will also not occur if using statement is used to close a connection returned by a provider.

09/09/2015

TransactionScope + Ninject + a small mistake = WTF

Home

Sometimes one stupid mistake can cost a lot of time. A few days ago my application (AngularJS + ASP.NET Web API) started crashing because of the following error:

MSDTC on server 'XXX' is unavailable

It was strange. I wasn't aware of any distributed transactions in my application. To be honest, I was using TransacionScope but I was sure that there was no reason to promote a lightweight transaction into a distributed one. To make things more strange the error wasn't reported every time. When I tried to update data for the first time everything was ok. However, the second attempt (and every next) was failing.

It took me some time to examine all recent changes but finally I found a problem. It was quite tricky so I decided to write about it. Let's start with the fact that I use Ninject as a dependency injection container. Among others Ninject allow us to control a lifetime of objects (instances). Particularly, in the case of web applications, we can use:
  • InRequestScope method - it tells Ninject that one object of a particular type should be created for each individual request.
  • InSingletonScope method - it tells Ninject that one object of a particular type should be created for all requests.
For example:
kernel.Bind(x => x
   .FromAssembliesMatching("test.dll")
   .SelectAllClasses().InheritedFrom(typeof(IInterfaace))
   .BindAllInterfaces()
   .Configure(z => z.InSingletonScope()));
The problem was that accidentally I mixed InSingletonScope and InRequestScope. For example, let's assume that each request requires objects of two classes A and B. Objects of type A are within the request scope and objects of type B are within the singleton scope.

Both objects perform updates/inserts/deletes and are used inside TransacionScope. For the first request it is not a problem. Both objects are initialized within the same request and use the same database connection. It means that a lightweight transaction is used.

However, for the second (and every next) request an object of type B is re-used whereas a new object of type A is created. Object of type B was initiated in the previous request and it uses a different connection than the one used by an object of type A. It means that a distributed transaction will be used in this case.

To sum up:
  • DI containers give a great power but with the power comes great responsibility.
  • Be careful when using objects of a different scope together. Especially when these objects require data access.
  • Be careful when using multiple connections inside TransacionScope. In the case of MSSQL 2005 in this situation a distributed transaction will be always used. In the case of MSSQL 2008 or newer it is possible to use more than one connection inside TransacionScope without automatic promotion. However, if and only if these connections are not opened at the same time.
  • TransactionScope automatically escalating to MSDTC on some machines? is a great source of knowledge about TransacionScope and about the process of promoting lightweight transactions into distributed ones.

21/08/2015

Do you know OUTPUT clause?

Home

Today, I'll write about using OUTPUT clause together with INSERT statements. It seems to be that it is not a very well known syntax. However, it is especially useful when use Identity columns to generate keys. Let's start with a simple table:
CREATE TABLE dbo.Main
( 
 Id int Identity (1,1) PRIMARY KEY,
 Code varchar(10),
 UpperCode AS Upper(Code)
);
The old fashioned approach to retrieve a value of Identity column for a new row is to use SCOPE_IDENTITY(). For example:
INSERT INTO dbo.Main (Code) VALUES ('aaa');
SELECT SCOPE_IDENTITY();
With OUTPUT clause it will look in the following way:
DECLARE @InsertedIdentity TABLE(Id int);
INSERT INTO dbo.Main (Code) OUTPUT INSERTED.Id INTO @InsertedIdentity VALUES ('aaa')
SELECT TOP(1) * FROM @InsertedIdentity
You can say wait a minute. If I want to use OUTPUT I have to declare a table variable first and then use SELECT. It is more complex than just using SCOPE_IDENTITY().

Well, the first benefit is that with OUTPUT clause we can read values from many columns, including these that are computed (as it was shown above). However, the real power of OUTPUT clause can be observed if we want to insert many rows into a table:
DECLARE @ToBeInserted TABLE(Code varchar(10), Name varchar(100));

INSERT INTO @ToBeInserted
VALUES ('aaa','1111111111'), ('ddd','2222222222'), ('ccc','3333333333');

DECLARE @Inserted TABLE(Id int, Code varchar(10), UpperCode varchar(10));

INSERT INTO dbo.Main (Code)
OUTPUT INSERTED.Id, INSERTED.Code, INSERTED.UpperCode INTO @Inserted
SELECT Code
FROM @ToBeInserted;

SELECT * FROM @Inserted;
Without OUTPUT we would have to write a nasty loop!

Here is one more example. Let's assume that we have an additional table that references dbo.Main.
CREATE TABLE dbo.Child
( 
 MainId int,
 Name varchar(100),
 CONSTRAINT [FK_Child_Main] FOREIGN KEY(MainId)REFERENCES dbo.Main (Id)
);
We want to insert a few rows into dbo.Main and then related rows to dbo.Child. It is quite easy if we use OUTPUT clause.
INSERT INTO dbo.Child (MainId, Name)
SELECT i.Id, tbi.Name
FROM @ToBeInserted tbi
 JOIN @Inserted i ON i.Code = tbi.Code;
Extremely useful thing that you must know!

At the end it is worth mentioning that OUTPUT clause can be also used together with UPDATE, DELETE or MERGE statements.

02/08/2015

Oracle VM VirtualBox and Windows 8.1

Home

In my day to day work I use a 64 bit version of Windows 8.1 Pro N. I needed a virtualization software so I decided to use a free Oracle VM Virtual Box. Everything was ok up to the moment when I wanted to install a x64 version of an operating system on a fresh virtual machine. To my surprise VirtualBox reported the following error:

VT-x/AMD-V hardware acceleration has been enabled, but is not operational. Your 64-bit guest will fail to detect a 64-bit CPU and will not be able to boot.

Please ensure that you have enabled VT-x/AMD-V properly in the BIOS of your host computer.


After some time I noticed that VirtualBox stopped showing 64 bit versions in the Version list. Well, it was actually good because I couldn't use a64 bit virtual machines anyway ;) But, I still didn't know why it happened.

I checked BIOS settings and it seemed ok. I searched Internet for the answer but everyone were recommended to verify BIOS configuration what I've already done. I needed a new VM quickly so at that point I installed a x86 version of Windows.

A few days later my colleague Przemek suggested that the problem may be in the conflict between Hyper-V and VirtualBox and that I should disable Hyper-V. It was strange because I've never installed Hyper-V. However, I checked and I discovered that Hyper-V features were enabled on my computer. It seems to me that they are installed by default with the operating system.



The solution was easy. I pressed Win+S and typed Turn windows feature on or off. Then, I cleared a box next to Hyper-V and restarted computer. After that I was able to install a x64 version of a operating system on a virtual machine.

To sum up, if:
  • Your host system is a x64 version of Windows 8.1.
  • Virtualization is enabled in BIOS.
  • You use VirtualBox.
And you cannot install x64 operating system on a virtual machine then try to disable Hyper-V.

27/07/2015

A hint how to use TaskCompletionSource<T>

Home

Some time ago I wrote about using TaskCompletionSource<T> class in order to take advantage of async/await keywords. In that post I included the following code:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
Now, Let's assume that we want to provide a possibility to cancel a task returned from ProcessFileAsync method. We can do something like that:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Value.Register(tcs.SetCanceled);

   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
I used CancellationToken.Register method in order to register a callback that will be executed when a token is canceled. This callback is responsible for notifying TaskCompletionSource<T> that underlying task should be cancelled.

You may say that it is not enough because this code doesn't inform DropNetClient that an action should be cancelled. You are right. However, according to my knowledge DropNet API doesn't provide such a possibility.

It leads to the situation when a task is cancelled but DropNetClient continues processing and finnaly TaskCompletionSource.SetResult method will be executed. This will cause ObjectDisposedException because this method cannot be executed for a disposed task. What can we do in this case?

The first solution is to check if a task is cancelled before calling SetResult method. However, it can still happen that a task will be cancelled after this check but before calling SetResult method.

My proposition is to use methods from TaskCompletionSource.Try* family. They don't throw exceptions for disposed tasks.
public async Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Value.Register(tcs.SetCanceled);

   client.GetFileAsync(path, response => tcs.TrySetResult(new MemoryStream(response.RawBytes)), tcs.TrySetException);
   return tcs.Task;
}
I'm aware that it is not a perfect solution because it actually does not cancel processing. However, without modifying DropNet code it is not possible. It the case of my application it is an acceptable solution but it is not a rule.

16/07/2015

Interview Questions for Programmers by MK #5

Home

Question #5
Here you have a very simple implementation of Template method pattern.
public abstract class BaseAlgorithm
{
   protected SomeObject Resource { get; set; }
   //Other resources

   public void Start()
   {
      // Configure
      Resource = new SomeObject();
      //...
      try
      {
         InnerStart();
      }
      finally
      {
         // Clean up
         Resource.Dispose();
         Resource= null;               
         //...
      }
   }

   protected abstract void InnerStart();
}

public class Algorithm1: BaseAlgorithm
{
   protected override void InnerStart()
   {
      //Do something with allocated resources
   }  
}
At some point someone decided to create a new class Algorithm2 derived from BaseAlgorithm. The difference between the new class and the previous one is that Algorithm2 starts an asynchronous operation. A programmer decided to use async/await keywords to handle this scenario. What do you think about this approach? What could possibly go wrong?
public class Algorithm2: BaseAlgorithm
{
   protected async override void InnerStart()
   {
      var task = DoAsyncCalculations();
      await task;

      //Do something with allocated resources
   }

   private Task DoAsyncCalculations()
   {
      //Let's simulate asynchronous operation
      return Task.Factory.StartNew(() => Thread.Sleep(1000));
   }
}
Answer #5
I think that the developer who created Algorithm2 doesn't understand well how async/await keywords work. The main problem is that finally block inside Start method will be executed before DoAsyncCalculations method will end calculations. In other words resources will be disposed in the middle of calculations and this will cause an exception. Sequence of events will be as follows:
  • Start method begins.
  • SomeObject is created.
  • InnerStart method begins.
  • InnerStart method starts an asynchronous operation and uses await to suspend its progress.
  • This causes that control returns to Start method.
  • Start method cleanups resources.
  • When the asynchronous operation is finished InnerStart method continues processing. It tries to use resources, that have been already disposed, what leads to an exception.
It is also not recommended to have async void methods (except event handlers). If an async method doesn't return a task it cannot be awaited. It is also easier to handle exceptions if an async method returns a task. For details see also this article.

To fix a problem BaseAlgorithm must be aware of asynchronous nature of calculations. For example InnerStart method can return a task which will be awaited inside try block. However, it also means that synchronous version of InnerStart method in Algorithm1 will have to be changed. It may not be acceptable. Generally, providing asynchronous wrappers for synchronous methods is debatable and should be carefully considered.

In this case, I'll consider to have separated implementations of Template method pattern for synchronous and asynchronous algorithms.

12/07/2015

Interview Questions for Programmers by MK #4

Home

Question #4
You have to implement a very fast, time critical, network communication between nodes of a distributed system. You decided to use good old sockets. However, you haven't decided yet whether to use TCP or UDP protocol. Which one would you choose and why?

Answer #4
If a speed is the only one important factor, I'd choose UDP. UDP is faster than TCP because it has a smaller overhead. In comparison to TCP it is a connection-less, unreliable protocol that doesn't provide features like retransmission, acknowledgment or ordering of messages.

However, it also means that usage of UDP might be more difficult and will require additional coding in some cases. For example, if we have to assure that sent messages have been delivered. In this case, I'd certainly use TCP.

Finally, there is one more thing in favor of UDP. It provides broadcasting and multicasting. So, if it is required I'd also use UDP instead of TCP.

What every blogger should do if using someone else's code #3

Home

Recently, I've written a post about integration with Dropbox. This one is also about integration but this time with Skydrive. I spent some time on investigating how to implement this feature in a desktop application. I was even considering to write what I need from the scratch on my own.

Fortunately, it turned out that someone did it for me. I found a nice piece of code prepared by Microsoft. It includes sample applications showing how to do everything step by step. You can download it from github.

06/07/2015

A practical example of using TaskCompletionSource<T>

Home

Recently I've found a question about real life scenarios for using rather unknown TaskCompletionSource<T> class. I started thinking where I would use it and very quickly I found a good practical example.

I have a pet project LanguageTrainer that helps me in learning words in foreign languages. Some time ago I added Dropbox support to it. It allows me to export/import list of words to/from Dropbox. I developed it in synchronous way. Now I prefer an asynchronous approach and I want to take advantages of async/await keywords.

The problem is that DropNet library, that makes communication with Dropbox easy, doesn't use async/await. It has asynchronous API but it is callback based. The really easy solution here is to use TaskCompletionSource<T>. Here is an example (simplified). Let's start with the original code that downloads a given file from Dropbox.
public void ProcessFile(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   // ...
   var bytes = client.GetFile(path)
   //Process bytes
}
The version that uses DropNet asynchronous API looks in the following way:
public void ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   client.GetFileAsync(path, 
      response => 
      {
         var bytes = response.RawBytes;
         //Process bytes
      }, 
      ex => 
      {
         //Handle exception
      });
}
And finally the asynchronous version with async/await looks in the following way:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
...
var bytes = await ProcessFileAsync(key, secret, path);
//Process bytes
The method ProcessFileAsync is marked as async and returns a task so it can be awaited. Easy. isn't it? A few lines of code and you can use async/await with other types of asynchronous APIs.

02/07/2015

Careercon Warszawa 2015 - surveys

Home

I'd like to share with you the results of 2 surveys regarding my last presentation. The first one was prepared by organizers of the conference. The second one was created by me on SurveyMonkey portal. I placed a link to it on each slide in the lower  left corner. It was visible through the whole presentation so anyone with a smartphone could have filled it. The survey was short and consisted of 3 obligatory and 2 voluntary questions.

Let's start with the official survey of the conference. According to it there were ~50 people on my presentation. ~30 said that it was very good, ~18 that it was good and ~2 that it was week. In average it gives 4,56. These are very good results for me, thanks!



My survey looked in the following way:
  • Question 1 - What is your overall assessment of my presentation on a scale of 1 to 5 (5 - the highest evaluation)?
  • Question 2 - How do you assess the substantive content of my presentation on a scale of 1 to 5 (5 - the highest evaluation)?
  • Question 3 - How would you rate my way of presenting on a scale of 1 to 5 (5 - the highest evaluation)?
  • Question 4 - Name one thing that you remember from my presentation.
  • Question 5 - This is the place for all sorts of comments, applications and complaints. Each comment will be valuable for me.
I received 8 answers for these questions. Much less than for the previous survey but all the answers were extremely valuable to me. Thanks! Here are details:


12345Average
Question 11434,25
Question 2444,50
Question 3354,63

I also received a couple of comments that will allow me to be a better speaker in the future:
  • The demo should have been a little bit longer.
  • The sample code used during a demo could be more complex.
  • I should have used a microphone.
  • I should have spoken slower.
  • Not all terms used by me were understandable for a layman.
It was nice to read that my presentation was useful and that I showed interesting tools.

As a summary I have to say that these results are a boost of energy for me and an encouragement to be a speaker more often.

27/06/2015

Careercon Warszawa 2015 - already after the presentation

Home

A couple of hours ago I returned from the conference Carrercon Warszawa 2015 where I had a presentation about historical debuggers. I'm generally happy with my speech. There were dozens of people in the room. I didn't forget to tell about anything important, I wasn't stressed too much and my presentation took as much time as I planned. As always I see a room for improvement but it was a good decision to take part in the conference as a speaker.

In the next post I want to write much more about Careercon Warszawa 2015 and my presentation. Now, I'd like to ask everybody who attended my presentation and who are reading this post for one thing. The thing which is very important to me. I'll be happy if you fill the following short survey:

Click to start a survey

Any feedback will be extremely valuable for me.

22/06/2015

CareerCon Warszawa 2015

Home

On Saturday (27-06) I'll have a presentation about historical debuggers at the conference "CareerCon Warszawa". I'll tell what they are and why you should use them. I'll also explain how IntelliTrace historical debugger for .NET works under the hood. If you have time I'll be glad to see you! The presentation will be in Polish.

Conference site
Fanpage

For now that's all. After the conference you can expect a post about how it looked like.

15/06/2015

Interview Questions for Programmers by MK #3

Home

Question #3
You found the following code and were asked to refactor it if needed:
var sb = new StringBuilder();
sb.AppendLine("<root>")
sb.AppendLine(String.Format("   <node id="{0}"/>", 1));
sb.AppendLine(String.Format("   <node id="{0}"/>", 2));
sb.AppendLine(String.Format("   <node id="{0}"/>", 3));
//Many, many lines of code
sb.AppendLine("</root>");
What would you do and why?

Answer #3
It is not the best idea to create XML documents using string concatenation because it is error prone. Besides created documents are not validated in any way. In .NET we have a few possibilities to refactor this code.

I recommend to use XmlWriter in this case because we want to create a new document and we do not want to edit an existing one. However, if we also want to modify existing XML documents, the good choice will be XDocument or XmlDocument class.

In the case of small XML documents (when performance is not critical) it might be a good idea to use XDocument or XmlDocument even if we don't want to edit existing documents. Especially XDocument can be simpler in use than XmlWriter.

Comments #3
I remember that when I wanted to create an XML document in C# for the first time I did it by using string concatenation. The code worked ok and I was surprised that it didn't pass a review :)

12/06/2015

What do you know about low-level programming?

Home

Have you ever written anything in a low-level assembly language? The last time, I was in touch with low-level programming, was during my studies. I wrote something in x86 and MIPS assembly languages. It was not easy but I liked it and I think that every good developer should know basics of programming at low-level. Why I'm writing about that?

Recently, I've found a very good online game known as microcorruption which reminded me good old times. The goal od this game is to open a lock by exploiting bugs in the source code. In order to do so you have to use a debugger of MSP430 assembly language.


At the beginning, initial tasks seems to be very easy e.g. a password can be hardcoded in the source code. However, if you haven't worked with any low-level language for many years even so simple task can be challenging. Besides, every next task is more and more difficult.

microcorruption is a great game if you want to remind yourself things like registers, a calling convention, a stack, low-level addressing and many others.

I started playing and I cannot stop! I encourage you to try.