Showing posts with label multithreading. Show all posts

15/03/2017

Report from the battlefield #9 - async/await + MARS

This post from the Report from the battlefield series is about my own mistake. It is related to async/await and MARS, i.e. Multiple Active Result Sets. async/await makes asynchronous programming easier. MARS is a feature of MSSQL that allows more than one pending request to be open on a single connection at the same time. For example, it is useful if we have 2 nested loops, where the external loop iterates through one result set and the internal one through another. OK, but you probably wonder what MARS has in common with async/await.

A few days ago my application started failing with an InvalidOperationException whose message said that the connection does not support MultipleActiveResultSets. I had used MARS in the past, so I simply enabled it in the connection string by setting the MultipleActiveResultSets attribute to true.

However, later I realized that my application should not require MARS at all so I started digging into what was wrong. It turned out that the problem was related to my silly mistake in using async/await. Let's look at the following simplified version of the problematic code. We have a trivial Main method:
static void Main()
{
   Start().GetAwaiter().GetResult();
}
Start is an async method responsible for opening a connection to DB and executing other async methods:
private static async Task Start()
{
   using (var c = new SqlConnection(ConnectionString))
   {
      c.Open();

      await Func1(c);
      await Func2(c);
      await Func3(c);
   }
}
Func1, Func2 and Func3 are responsible for reading data and processing it. For simplicity, in our case they all do the same thing:
private static async Task Func1(SqlConnection c) => await ReadData(c);
private static async Task Func2(SqlConnection c) => ReadData(c);
private static async Task Func3(SqlConnection c) => await ReadData(c);
And here is the ReadData method. It's also simple. It simply reads data from a table:
private static async Task ReadData(SqlConnection c)
{
   var cmd = c.CreateCommand();

   cmd.CommandText = "SELECT * FROM dbo.Fun";

   using (var reader = await cmd.ExecuteReaderAsync())
   {
      while (await reader.ReadAsync())
      {
         // Process data
      }
   }
}
If you run this code, the aforementioned InvalidOperationException will be thrown in the line with ExecuteReaderAsync. The question is why? In this short code it is rather easy to spot that await is missing before ReadData in the Func2 method. But do you know why it is a problem? If not, don't worry, it's a little bit tricky.

Here is an explanation. Without await the simplified flow is as follows:
  • ...
  • Start executes Func2.
  • Func2 executes ReadData.
  • ReadData executes ExecuteReaderAsync.
  • ReadData awaits the result of ExecuteReaderAsync.
  • Control returns to the caller, i.e. Func2.
  • However, Func2 does not await the task returned by ReadData, so it returns an already completed task to the Start method.
  • From the point of view of Start, processing of Func2 is finished, so it executes Func3.
  • Func3 executes ReadData.
  • The previous call to ReadData may still be in progress.
  • This means that ReadData calls ExecuteReaderAsync while another result set is still being processed.
  • The exception is thrown.
Adding the missing await fixes the problem. Thanks to that, the task returned from Func2 will not complete until the call to ReadData is over, so Start will not execute Func3 prematurely. The final, well-known conclusion is:

Always async/await all the way down.
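
The flow described above can be reproduced without a database. The following is only a minimal sketch: FakeConnection is a hypothetical stand-in that allows a single active read at a time (it is not SqlConnection), and BrokenStep plays the role of Func2 without await.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a connection without MARS:
// it allows only one active read at a time.
public sealed class FakeConnection
{
    private bool _busy;

    public async Task ReadDataAsync()
    {
        if (_busy)
            throw new InvalidOperationException("Another result set is still open.");

        _busy = true;
        try
        {
            await Task.Delay(50); // simulates waiting for the database
        }
        finally
        {
            _busy = false;
        }
    }
}

public static class MissingAwaitDemo
{
    // Behaves like an async method that forgets to await: the read is started,
    // its task is dropped and the caller receives an already completed task.
    private static Task BrokenStep(FakeConnection c)
    {
        _ = c.ReadDataAsync();
        return Task.CompletedTask;
    }

    private static async Task FixedStep(FakeConnection c) => await c.ReadDataAsync();

    // Returns true if both steps finished, false if the second read overlapped the first.
    public static async Task<bool> RunAsync(bool useBrokenStep)
    {
        var c = new FakeConnection();
        try
        {
            await (useBrokenStep ? BrokenStep(c) : FixedStep(c));
            await FixedStep(c); // plays the role of Func3
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync(useBrokenStep: true));  // False
        Console.WriteLine(await RunAsync(useBrokenStep: false)); // True
    }
}
```

With the broken step, the second read starts while the first one is still waiting, which is the same overlap that made the real connection complain about MARS.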


*The picture at the beginning of the post comes from my own resources and shows a laurel forest on La Gomera.

27/07/2015

A hint how to use TaskCompletionSource<T>

Some time ago I wrote about using TaskCompletionSource<T> class in order to take advantage of async/await keywords. In that post I included the following code:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return await tcs.Task; // an async method must await the task instead of returning it directly
}
Now, let's assume that we want to make it possible to cancel a task returned from the ProcessFileAsync method. We can do something like this:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Register(tcs.SetCanceled);

   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return await tcs.Task;
}
I used the CancellationToken.Register method to register a callback that will be executed when the token is cancelled. This callback is responsible for notifying TaskCompletionSource<T> that the underlying task should be cancelled.

You may say that it is not enough because this code doesn't inform DropNetClient that an action should be cancelled. You are right. However, according to my knowledge DropNet API doesn't provide such a possibility.

This leads to a situation where the task is cancelled but DropNetClient continues processing, and finally the TaskCompletionSource.SetResult method is executed. This will cause an InvalidOperationException, because SetResult cannot be called on a task that is already in a final state (here: Canceled). What can we do in this case?

The first solution is to check if a task is cancelled before calling SetResult method. However, it can still happen that a task will be cancelled after this check but before calling SetResult method.

My proposition is to use methods from the TaskCompletionSource.Try* family. They return false instead of throwing an exception when the task has already been completed.
public async Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Register(() => tcs.TrySetCanceled()); // Try* here as well: the result may already have been set

   client.GetFileAsync(path, response => tcs.TrySetResult(new MemoryStream(response.RawBytes)), tcs.TrySetException);
   return await tcs.Task;
}
I'm aware that it is not a perfect solution because it does not actually cancel processing. However, without modifying the DropNet code that is not possible. In the case of my application it is an acceptable solution, but this will not hold for every application.
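
To see the difference between SetResult and TrySetResult without DropNet, here is a small sketch that uses only TaskCompletionSource<T> and CancellationTokenSource. The class and method names (TrySetDemo, CreateCancelled and so on) are made up for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TrySetDemo
{
    // Creates a TaskCompletionSource whose task has already been cancelled,
    // as would happen if the token fired before the callback-based API answered.
    private static TaskCompletionSource<int> CreateCancelled()
    {
        var tcs = new TaskCompletionSource<int>();
        using (var cts = new CancellationTokenSource())
        {
            cts.Token.Register(() => tcs.TrySetCanceled());
            cts.Cancel(); // registered callbacks run synchronously here
        }
        return tcs;
    }

    // Returns the name of the exception type thrown by SetResult, or "none".
    public static string SetResultAfterCancel()
    {
        var tcs = CreateCancelled();
        try
        {
            tcs.SetResult(42);
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    // Returns what TrySetResult reports for an already cancelled task.
    public static bool TrySetResultAfterCancel()
    {
        return CreateCancelled().TrySetResult(42);
    }

    public static void Main()
    {
        Console.WriteLine(SetResultAfterCancel());    // InvalidOperationException
        Console.WriteLine(TrySetResultAfterCancel()); // False
    }
}
```

The late result is simply ignored by TrySetResult, which is the behaviour I want when the caller has already given up on the task.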

16/07/2015

Interview Questions for Programmers by MK #5

Question #5
Here you have a very simple implementation of the Template Method pattern.
public abstract class BaseAlgorithm
{
   protected SomeObject Resource { get; set; }
   //Other resources

   public void Start()
   {
      // Configure
      Resource = new SomeObject();
      //...
      try
      {
         InnerStart();
      }
      finally
      {
         // Clean up
         Resource.Dispose();
         Resource = null;
         //...
      }
   }

   protected abstract void InnerStart();
}

public class Algorithm1: BaseAlgorithm
{
   protected override void InnerStart()
   {
      //Do something with allocated resources
   }  
}
At some point someone decided to create a new class Algorithm2 derived from BaseAlgorithm. The difference between the new class and the previous one is that Algorithm2 starts an asynchronous operation. A programmer decided to use async/await keywords to handle this scenario. What do you think about this approach? What could possibly go wrong?
public class Algorithm2: BaseAlgorithm
{
   protected async override void InnerStart()
   {
      var task = DoAsyncCalculations();
      await task;

      //Do something with allocated resources
   }

   private Task DoAsyncCalculations()
   {
      //Let's simulate asynchronous operation
      return Task.Factory.StartNew(() => Thread.Sleep(1000));
   }
}
Answer #5
I think that the developer who created Algorithm2 doesn't fully understand how the async/await keywords work. The main problem is that the finally block inside the Start method will be executed before DoAsyncCalculations finishes. In other words, resources will be disposed in the middle of the calculations, and this will cause an exception. The sequence of events is as follows:
  • Start method begins.
  • SomeObject is created.
  • InnerStart method begins.
  • InnerStart method starts an asynchronous operation and uses await to suspend its execution.
  • As a result, control returns to the Start method.
  • Start method cleans up resources.
  • When the asynchronous operation is finished, InnerStart method continues processing. It tries to use resources that have already been disposed, which leads to an exception.
It is also not recommended to have async void methods (except event handlers). If an async method doesn't return a task it cannot be awaited. It is also easier to handle exceptions if an async method returns a task. For details see also this article.

To fix the problem, BaseAlgorithm must be aware of the asynchronous nature of the calculations. For example, the InnerStart method can return a task which will be awaited inside the try block. However, it also means that the synchronous version of InnerStart in Algorithm1 will have to be changed, which may not be acceptable. Generally, providing asynchronous wrappers for synchronous methods is debatable and should be carefully considered.

In this case, I would consider having separate implementations of the Template Method pattern for synchronous and asynchronous algorithms.
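
A separate asynchronous variant could look roughly like the sketch below. The names BaseAlgorithmAsync, StartAsync, InnerStartAsync and Algorithm2Async are my assumptions for this example, and SomeObject is reduced to a trivial disposable so that the code is self-contained.

```csharp
using System;
using System.Threading.Tasks;

// Trivial disposable used only for this illustration.
public sealed class SomeObject : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public abstract class BaseAlgorithmAsync
{
    protected SomeObject Resource { get; private set; }

    public async Task StartAsync()
    {
        // Configure
        Resource = new SomeObject();
        try
        {
            // The finally block runs only after the derived class has really finished.
            await InnerStartAsync();
        }
        finally
        {
            // Clean up
            Resource.Dispose();
            Resource = null;
        }
    }

    protected abstract Task InnerStartAsync();
}

public sealed class Algorithm2Async : BaseAlgorithmAsync
{
    // Records whether the resource was still usable after the asynchronous part.
    public bool ResourceWasAlive { get; private set; }

    protected override async Task InnerStartAsync()
    {
        await Task.Delay(100); // simulates the asynchronous calculations
        ResourceWasAlive = Resource != null && !Resource.IsDisposed;
    }
}

public static class TemplateDemo
{
    public static async Task Main()
    {
        var algorithm = new Algorithm2Async();
        await algorithm.StartAsync();
        Console.WriteLine(algorithm.ResourceWasAlive); // True
    }
}
```

Because InnerStartAsync returns a task instead of being async void, the caller of StartAsync can also observe any exception thrown by the derived class.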

06/07/2015

A practical example of using TaskCompletionSource<T>

Recently I found a question about real-life scenarios for using the rather little-known TaskCompletionSource<T> class. I started thinking about where I would use it, and very quickly I found a good practical example.

I have a pet project, LanguageTrainer, that helps me learn words in foreign languages. Some time ago I added Dropbox support to it. It allows me to export/import lists of words to/from Dropbox. I developed it in a synchronous way. Now I prefer an asynchronous approach and I want to take advantage of the async/await keywords.

The problem is that the DropNet library, which makes communication with Dropbox easy, doesn't use async/await. It has an asynchronous API, but it is callback based. The really easy solution here is to use TaskCompletionSource<T>. Here is a simplified example. Let's start with the original code that downloads a given file from Dropbox.
public void ProcessFile(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   // ...
   var bytes = client.GetFile(path);
   //Process bytes
}
The version that uses the DropNet asynchronous API looks like this:
public void ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   client.GetFileAsync(path, 
      response => 
      {
         var bytes = response.RawBytes;
         //Process bytes
      }, 
      ex => 
      {
         //Handle exception
      });
}
And finally, the asynchronous version with async/await looks like this:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return await tcs.Task; // an async method must await the task instead of returning it directly
}
...
var stream = await ProcessFileAsync(key, secret, path);
//Process the stream
The method ProcessFileAsync is marked as async and returns a task, so it can be awaited. Easy, isn't it? A few lines of code and you can use async/await with other types of asynchronous APIs.
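
The same idea can be generalized to any API that reports success and failure through two callbacks. The following is only a sketch under my own assumptions: CallbackAdapter.FromCallbacks is a helper invented for this example, and LegacyApi.GetNumberAsync is a hypothetical callback-based method standing in for something like DropNet.

```csharp
using System;
using System.Threading.Tasks;

public static class CallbackAdapter
{
    // Wraps an operation that reports its outcome via success/error callbacks in a Task<T>.
    public static Task<T> FromCallbacks<T>(Action<Action<T>, Action<Exception>> start)
    {
        var tcs = new TaskCompletionSource<T>();
        try
        {
            start(result => tcs.TrySetResult(result), error => tcs.TrySetException(error));
        }
        catch (Exception ex)
        {
            // The operation failed before it could invoke any callback.
            tcs.TrySetException(ex);
        }
        return tcs.Task;
    }
}

// Hypothetical callback-based API used only for this illustration.
public static class LegacyApi
{
    public static void GetNumberAsync(bool fail, Action<int> onSuccess, Action<Exception> onError)
    {
        Task.Run(() =>
        {
            if (fail) onError(new InvalidOperationException("Download failed."));
            else onSuccess(42);
        });
    }
}

public static class AdapterDemo
{
    public static async Task Main()
    {
        int value = await CallbackAdapter.FromCallbacks<int>(
            (ok, err) => LegacyApi.GetNumberAsync(false, ok, err));
        Console.WriteLine(value); // 42
    }
}
```

The helper itself is not marked as async. It just hands out tcs.Task, and the callbacks complete that task later.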