29/12/2015

Report from the battlefield #2 - amount of data matters a lot

Home

In the next post from the Report from the battlefield series I'll wrote about a serious mistake that is quite common according to my experience. I'm thinking about a situation when a developer assumes that all data from a database can be processed on the client side. I'll give you 2 examples that I encountered during my reviews.

Case 1

A developer was asked to implement the paging functionality. He created a single page Web application. It looked nice and the paging was working correctly at first glance. I decided to check how it was implemented under the hood. I examined a web service that was used by the application and I was surprised. Why? I didn't find a web method responsible for returning pages. The next step was to dig into a java script code. Unfortunately, I discovered that the paging was implemented only on the client side i.e. the application initially downloaded all data from a database (via Web Service).

Case 2

In the another project the paging was implemented correctly on the server side but a developer made a more subtle mistake. An application had a shopping cart function. Of course it was possible to add and remove products to and from a cart. To do so a web service used by the application had a method GetCart. This method was responsible for retrieving a current content of a cart from a database.

However, it was strange that this method returned only identifiers of products. What's more there was no GetProductDetails web method. It made me curious how the application displays products details to users only knowing its identifiers. It turned out that at the initiation the application was reading details of all products from a database. Having all products on the client side it was easy to find details of a product based on its identifier.

Summary

In both cases applications were fast enough because of a small amount of data. In the case of real-life databases they will not. I think that we should always be prepared for the worst case. Especially, when we participate in a recruitment and we want to show ourselves from the best side. An evaluator shouldn't guess whether we know something or not.

24/12/2015

Merry Christmas!

Home


Source: own resources, Authors: Agnieszka and Michał Komorowscy


Giving wishes in a foreign language can be challenging so my wishes will be simple but very sincere. I wish you a Merry, Peaceful Christmas and an Amazing 2016. Let it be at least as good as the past year.

Best wishes,
Michał Komorowski

24/11/2015

Report from the battlefield #1 - EF and DTOs

Home

Some time ago, I started doing code reviews of various projects for the recruitment company. It is an interesting experience and I'm learning a lot by this occasion. I also observed that some mistakes are repeated by different authors. Other are not so common but are not obvious. So I came up with the idea to start a new series of posts under the title "Report from the battlefield". In this series I'll describe my observations and findings from my reviews.

Let's start. Recently, I reviewed a project created with AngularJS + ASP.NET Web API + Entity Framework. The code was neither very good nor very bad. However, I noticed that the author decided to use a class generated from the EDMX model as DTO (Data Transfer Object). The reasoning behind this decision was simple - this class had all properties required on the client side so why not to use it. Well there are a few reasons why it is not a good idea.
  • With dedicated DTOs it is less possible that changes on the server side will affect the client side.
  • With dedicated DTOs we can easily control what will be send to the client side and in what format.
  • With dedicated DTOs the server side model can be completely different from the client side model.
  • By exposing EF classes to the client side we effectively expose the database model to the client side!
You may agree with my points or not. So, I'll give you a practical example what could happen if we use EF classes as DTOs. Let's assume that there is EDMX model with 3 types of entities:
  • Customer with Orders navigation property.
  • Orders with Customer and Products navigation properties.
  • Products with Orders navigation property.
Now we want to read only 1 customer from a database, serialize it to JSON and send the result to the client side. What could go wrong? Well, because of the navigation properties the JSON serializer that is used by ASP.NET Web API will read from the database and convert to JSON the whole graph of customers, orders and products! To be more specific, I saw 0.5 MB response which should have a few kilobytes for a very small database (it contained small dozens of records in all tables)! I can bet that in the case of a production database a response would have hundreds of megabytes.

15/11/2015

Interview Questions for Programmers by MK #6

Home

Question #6
What is the arithmetic overflow and how is it handled in .NET?

Answer #6
It is a situation when the result of an arithmetic operation exceeds (is outside of) the range of a given numeric type. For example the maximum value for byte type in .NET is 255. So in the following example, an operation a+b will cause an overflow:
byte a = 255;
byte b = 20;
byte c = a + b;
The final result depends on the used numeric types:
  • For integer types either OverflowException will be thrown or the result will be trimmed/cropped (the default behaviour). It depends on the compiler configuration and usage of checked / unchecked keywords.
  • For floating point types OverflowException will never be thrown. Instead the overflow will lead either to the positive or the negative infinity.
  • For decimal type OverflowException will be always thrown.
var b = byte.MaxValue;
//The result will be zero because:
//b = 255 = 1111 1111 
//b++ = 256 = 1 0000 0000
//The result has 9 bits so the result will be trimmed to 8 bits what gives 0000 0000
b++; 
         
checked
{
 b = byte.MaxValue;
 //Exception will be thrown 
 b++; 
}

var f = float.MaxValue;
//The result will be float.PositiveInfinity
f *= 2;  

decimal d = decimal.MaxValue;
//Exception will be thrown
d++; 

22/10/2015

TransactionScope and multi-threading

Home

It's my third post about TransactionScope. This time I'll write about using it with multi-threading. Let's start with the following code:
using (var t = new TransactionScope())
{
   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}

private static void UpdateDatabase()
{
   using (var c = new SqlConnection(connectionString))
   {
      c.Open();

      WriteDebugInfo();

      new SqlCommand(updateCommand, c).ExecuteNonQuery();
   }
}

private static void WriteDebugInfo()
{
   Console.WriteLine("Thread= {0}, LocalIdentifier = {1}, DistributedIdentifier = {2}",
      Thread.CurrentThread.ManagedThreadId,
      Transaction.Current?.TransactionInformation.LocalIdentifier,
      Transaction.Current?.TransactionInformation.DistributedIdentifier);
}
It seems simple but it doesn't work. The problem is that a connection that is created in UpdateDatabase method will not participate in any transaction. We can also observe that WriteDebugInfo will write empty transaction identifiers to the console. It happens because in order to read an ambient transaction (the transaction the code is executed in) TransactionScope uses Transaction.Current property which is thread static (i.e. specific for a thread).

To overcome this issue we have two possibilities. The first one is to use DependentTransacion. However, I'll not show how to do it because since .NET 4.5.1 there is a better way - TransactionScopeAsyncFlowOption enum. Let's try.
using (var t = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
   ...
}
Unfortunately, there is a big chance that this time we will get TransactionException with the message The operation is not valid for the state of the transaction. in the line with ExecuteNonQuery. The simplified stack trace is:

at System.Transactions.TransactionStatePSPEOperation.get_Status(InternalTransaction tx)
at System.Transactions.TransactionInformation.get_Status()
...
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Sandbox.Program.UpdateDatabase(Object o)

I read a lot about this but nobody was able to explain why it happens. I also looked into the source code of TransactionStatePSPEOperation class. It was instructive because I learned what is PSPE - Promotable Single Phase Enlistment. However, it also didn't give me an exact answer.

So, I played a little bit with the code and I noticed that the problem occurs when:
  • One thread tries to run ExecuteNonQuery.
  • Another thread waits for the opening of the connection.
However, when both connections were already opened then the exception wasn't thrown.

At this point it is worth reminding one thing - when there are 2 or more connections opened in a transaction scope at the same time then a transaction is promoted to a distributed one. I'm not 100% sure but I think that the problem occurs because it is not allowed to use a connection which participates in a transaction which is in the middle of promotion to the distributed one. So, the solution is to assure that a transaction will be distributed from the beginning. Here is a fixed code with a magic line (I found it here):

using (var t = new TransactionScope())
{
   //The magic line that makes a transaction distributed
   TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);

   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}
Nonetheless, the more I think about this the more convinced I'm that using TransactionScope with multi-threading is asking for problems.

13/10/2015

How not to use TransactionScope. Another WTF!

Home

This time I will write again about TransactionScope. It is a very useful class and seems to be extremely easy in use. In majority of cases it is true. However, there are also some pitfalls lurking for developers. Especially for these who don't like to waste time for reading MSDN documentation if not really needed i.e. probably vast majority of us ;)

Some time ago, I was analysing more or less the following code:
using(var t = new TransactionScope())
{
   var c = ConnectionProvider.ProvideConnection();
   //Use a connection to update a database
   //...
   t.Complete();
}
ConnectionProvider is a class that hides details of managing connections to a database. There was also a bug in the code responsible for updating a database which caused exceptions. I fixed it and I run tests again. This time an exception was not thrown but something was wrong because a database contained unexpected data. It looked like the transaction was not rollbacked!

Firstly, I though that it is some kind of magic. However, as usual in this kind of cases it wasn't. I digged into ConnectionProvider and I found out that this class was performing some kind of pooling and a connection wasn't opening every time. It was a big problem because connections opened outside a transaction scope do not participate in a transaction. The solution of this problem is to explicitly enlist a connection in an existing transaction scope with the EnlistTransaction method.

It is also worth highlighting that the described problem won't occur if ConnectionProvider doesn't try to implement polling on its own. In general we don't have to do it because .NET do it for us. The problem will also not occur if using statement is used to close a connection returned by a provider.

09/09/2015

TransactionScope + Ninject + a small mistake = WTF

Home

Sometimes one stupid mistake can cost a lot of time. A few days ago my application (AngularJS + ASP.NET Web API) started crashing because of the following error:

MSDTC on server 'XXX' is unavailable

It was strange. I wasn't aware of any distributed transactions in my application. To be honest, I was using TransacionScope but I was sure that there was no reason to promote a lightweight transaction into a distributed one. To make things more strange the error wasn't reported every time. When I tried to update data for the first time everything was ok. However, the second attempt (and every next) was failing.

It took me some time to examine all recent changes but finally I found a problem. It was quite tricky so I decided to write about it. Let's start with the fact that I use Ninject as a dependency injection container. Among others Ninject allow us to control a lifetime of objects (instances). Particularly, in the case of web applications, we can use:
  • InRequestScope method - it tells Ninject that one object of a particular type should be created for each individual request.
  • InSingletonScope method - it tells Ninject that one object of a particular type should be created for all requests.
For example:
kernel.Bind(x => x
   .FromAssembliesMatching("test.dll")
   .SelectAllClasses().InheritedFrom(typeof(IInterfaace))
   .BindAllInterfaces()
   .Configure(z => z.InSingletonScope()));
The problem was that accidentally I mixed InSingletonScope and InRequestScope. For example, let's assume that each request requires objects of two classes A and B. Objects of type A are within the request scope and objects of type B are within the singleton scope.

Both objects perform updates/inserts/deletes and are used inside TransacionScope. For the first request it is not a problem. Both objects are initialized within the same request and use the same database connection. It means that a lightweight transaction is used.

However, for the second (and every next) request an object of type B is re-used whereas a new object of type A is created. Object of type B was initiated in the previous request and it uses a different connection than the one used by an object of type A. It means that a distributed transaction will be used in this case.

To sum up:
  • DI containers give a great power but with the power comes great responsibility.
  • Be careful when using objects of a different scope together. Especially when these objects require data access.
  • Be careful when using multiple connections inside TransacionScope. In the case of MSSQL 2005 in this situation a distributed transaction will be always used. In the case of MSSQL 2008 or newer it is possible to use more than one connection inside TransacionScope without automatic promotion. However, if and only if these connections are not opened at the same time.
  • TransactionScope automatically escalating to MSDTC on some machines? is a great source of knowledge about TransacionScope and about the process of promoting lightweight transactions into distributed ones.

21/08/2015

Do you know OUTPUT clause?

Home

Today, I'll write about using OUTPUT clause together with INSERT statements. It seems to be that it is not a very well known syntax. However, it is especially useful when use Identity columns to generate keys. Let's start with a simple table:
CREATE TABLE dbo.Main
( 
 Id int Identity (1,1) PRIMARY KEY,
 Code varchar(10),
 UpperCode AS Upper(Code)
);
The old fashioned approach to retrieve a value of Identity column for a new row is to use SCOPE_IDENTITY(). For example:
INSERT INTO dbo.Main (Code) VALUES ('aaa');
SELECT SCOPE_IDENTITY();
With OUTPUT clause it will look in the following way:
DECLARE @InsertedIdentity TABLE(Id int);
INSERT INTO dbo.Main (Code) OUTPUT INSERTED.Id INTO @InsertedIdentity VALUES ('aaa')
SELECT TOP(1) * FROM @InsertedIdentity
You can say wait a minute. If I want to use OUTPUT I have to declare a table variable first and then use SELECT. It is more complex than just using SCOPE_IDENTITY().

Well, the first benefit is that with OUTPUT clause we can read values from many columns, including these that are computed (as it was shown above). However, the real power of OUTPUT clause can be observed if we want to insert many rows into a table:
DECLARE @ToBeInserted TABLE(Code varchar(10), Name varchar(100));

INSERT INTO @ToBeInserted
VALUES ('aaa','1111111111'), ('ddd','2222222222'), ('ccc','3333333333');

DECLARE @Inserted TABLE(Id int, Code varchar(10), UpperCode varchar(10));

INSERT INTO dbo.Main (Code)
OUTPUT INSERTED.Id, INSERTED.Code, INSERTED.UpperCode INTO @Inserted
SELECT Code
FROM @ToBeInserted;

SELECT * FROM @Inserted;
Without OUTPUT we would have to write a nasty loop!

Here is one more example. Let's assume that we have an additional table that references dbo.Main.
CREATE TABLE dbo.Child
( 
 MainId int,
 Name varchar(100),
 CONSTRAINT [FK_Child_Main] FOREIGN KEY(MainId)REFERENCES dbo.Main (Id)
);
We want to insert a few rows into dbo.Main and then related rows to dbo.Child. It is quite easy if we use OUTPUT clause.
INSERT INTO dbo.Child (MainId, Name)
SELECT i.Id, tbi.Name
FROM @ToBeInserted tbi
 JOIN @Inserted i ON i.Code = tbi.Code;
Extremely useful thing that you must know!

At the end it is worth mentioning that OUTPUT clause can be also used together with UPDATE, DELETE or MERGE statements.

02/08/2015

Oracle VM VirtualBox and Windows 8.1

Home

In my day to day work I use a 64 bit version of Windows 8.1 Pro N. I needed a virtualization software so I decided to use a free Oracle VM Virtual Box. Everything was ok up to the moment when I wanted to install a x64 version of an operating system on a fresh virtual machine. To my surprise VirtualBox reported the following error:

VT-x/AMD-V hardware acceleration has been enabled, but is not operational. Your 64-bit guest will fail to detect a 64-bit CPU and will not be able to boot.

Please ensure that you have enabled VT-x/AMD-V properly in the BIOS of your host computer.


After some time I noticed that VirtualBox stopped showing 64 bit versions in the Version list. Well, it was actually good because I couldn't use a64 bit virtual machines anyway ;) But, I still didn't know why it happened.

I checked BIOS settings and it seemed ok. I searched Internet for the answer but everyone were recommended to verify BIOS configuration what I've already done. I needed a new VM quickly so at that point I installed a x86 version of Windows.

A few days later my colleague Przemek suggested that the problem may be in the conflict between Hyper-V and VirtualBox and that I should disable Hyper-V. It was strange because I've never installed Hyper-V. However, I checked and I discovered that Hyper-V features were enabled on my computer. It seems to me that they are installed by default with the operating system.



The solution was easy. I pressed Win+S and typed Turn windows feature on or off. Then, I cleared a box next to Hyper-V and restarted computer. After that I was able to install a x64 version of a operating system on a virtual machine.

To sum up, if:
  • Your host system is a x64 version of Windows 8.1.
  • Virtualization is enabled in BIOS.
  • You use VirtualBox.
And you cannot install x64 operating system on a virtual machine then try to disable Hyper-V.

27/07/2015

A hint how to use TaskCompletionSource<T>

Home

Some time ago I wrote about using TaskCompletionSource<T> class in order to take advantage of async/await keywords. In that post I included the following code:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
Now, Let's assume that we want to provide a possibility to cancel a task returned from ProcessFileAsync method. We can do something like that:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Value.Register(tcs.SetCanceled);

   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
I used CancellationToken.Register method in order to register a callback that will be executed when a token is canceled. This callback is responsible for notifying TaskCompletionSource<T> that underlying task should be cancelled.

You may say that it is not enough because this code doesn't inform DropNetClient that an action should be cancelled. You are right. However, according to my knowledge DropNet API doesn't provide such a possibility.

It leads to the situation when a task is cancelled but DropNetClient continues processing and finnaly TaskCompletionSource.SetResult method will be executed. This will cause ObjectDisposedException because this method cannot be executed for a disposed task. What can we do in this case?

The first solution is to check if a task is cancelled before calling SetResult method. However, it can still happen that a task will be cancelled after this check but before calling SetResult method.

My proposition is to use methods from TaskCompletionSource.Try* family. They don't throw exceptions for disposed tasks.
public async Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Value.Register(tcs.SetCanceled);

   client.GetFileAsync(path, response => tcs.TrySetResult(new MemoryStream(response.RawBytes)), tcs.TrySetException);
   return tcs.Task;
}
I'm aware that it is not a perfect solution because it actually does not cancel processing. However, without modifying DropNet code it is not possible. It the case of my application it is an acceptable solution but it is not a rule.

16/07/2015

Interview Questions for Programmers by MK #5

Home

Question #5
Here you have a very simple implementation of Template method pattern.
public abstract class BaseAlgorithm
{
   protected SomeObject Resource { get; set; }
   //Other resources

   public void Start()
   {
      // Configure
      Resource = new SomeObject();
      //...
      try
      {
         InnerStart();
      }
      finally
      {
         // Clean up
         Resource.Dispose();
         Resource= null;               
         //...
      }
   }

   protected abstract void InnerStart();
}

public class Algorithm1: BaseAlgorithm
{
   protected override void InnerStart()
   {
      //Do something with allocated resources
   }  
}
At some point someone decided to create a new class Algorithm2 derived from BaseAlgorithm. The difference between the new class and the previous one is that Algorithm2 starts an asynchronous operation. A programmer decided to use async/await keywords to handle this scenario. What do you think about this approach? What could possibly go wrong?
public class Algorithm2: BaseAlgorithm
{
   protected async override void InnerStart()
   {
      var task = DoAsyncCalculations();
      await task;

      //Do something with allocated resources
   }

   private Task DoAsyncCalculations()
   {
      //Let's simulate asynchronous operation
      return Task.Factory.StartNew(() => Thread.Sleep(1000));
   }
}
Answer #5
I think that the developer who created Algorithm2 doesn't understand well how async/await keywords work. The main problem is that finally block inside Start method will be executed before DoAsyncCalculations method will end calculations. In other words resources will be disposed in the middle of calculations and this will cause an exception. Sequence of events will be as follows:
  • Start method begins.
  • SomeObject is created.
  • InnerStart method begins.
  • InnerStart method starts an asynchronous operation and uses await to suspend its progress.
  • This causes that control returns to Start method.
  • Start method cleanups resources.
  • When the asynchronous operation is finished InnerStart method continues processing. It tries to use resources, that have been already disposed, what leads to an exception.
It is also not recommended to have async void methods (except event handlers). If an async method doesn't return a task it cannot be awaited. It is also easier to handle exceptions if an async method returns a task. For details see also this article.

To fix a problem BaseAlgorithm must be aware of asynchronous nature of calculations. For example InnerStart method can return a task which will be awaited inside try block. However, it also means that synchronous version of InnerStart method in Algorithm1 will have to be changed. It may not be acceptable. Generally, providing asynchronous wrappers for synchronous methods is debatable and should be carefully considered.

In this case, I'll consider to have separated implementations of Template method pattern for synchronous and asynchronous algorithms.

12/07/2015

Interview Questions for Programmers by MK #4

Home

Question #4
You have to implement a very fast, time critical, network communication between nodes of a distributed system. You decided to use good old sockets. However, you haven't decided yet whether to use TCP or UDP protocol. Which one would you choose and why?

Answer #4
If a speed is the only one important factor, I'd choose UDP. UDP is faster than TCP because it has a smaller overhead. In comparison to TCP it is a connection-less, unreliable protocol that doesn't provide features like retransmission, acknowledgment or ordering of messages.

However, it also means that usage of UDP might be more difficult and will require additional coding in some cases. For example, if we have to assure that sent messages have been delivered. In this case, I'd certainly use TCP.

Finally, there is one more thing in favor of UDP. It provides broadcasting and multicasting. So, if it is required I'd also use UDP instead of TCP.

What every blogger should do if using someone else's code #3

Home

Recently, I've written a post about integration with Dropbox. This one is also about integration but this time with Skydrive. I spent some time on investigating how to implement this feature in a desktop application. I was even considering to write what I need from the scratch on my own.

Fortunately, it turned out that someone did it for me. I found a nice piece of code prepared by Microsoft. It includes sample applications showing how to do everything step by step. You can download it from github.

06/07/2015

A practical example of using TaskCompletionSource<T>

Home

Recently I've found a question about real life scenarios for using rather unknown TaskCompletionSource<T> class. I started thinking where I would use it and very quickly I found a good practical example.

I have a pet project LanguageTrainer that helps me in learning words in foreign languages. Some time ago I added Dropbox support to it. It allows me to export/import list of words to/from Dropbox. I developed it in synchronous way. Now I prefer an asynchronous approach and I want to take advantages of async/await keywords.

The problem is that DropNet library, that makes communication with Dropbox easy, doesn't use async/await. It has asynchronous API but it is callback based. The really easy solution here is to use TaskCompletionSource<T>. Here is an example (simplified). Let's start with the original code that downloads a given file from Dropbox.
public void ProcessFile(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   // ...
   var bytes = client.GetFile(path)
   //Process bytes
}
The version that uses DropNet asynchronous API looks in the following way:
public void ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   client.GetFileAsync(path, 
      response => 
      {
         var bytes = response.RawBytes;
         //Process bytes
      }, 
      ex => 
      {
         //Handle exception
      });
}
And finally the asynchronous version with async/await looks in the following way:
public async Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
...
var bytes = await ProcessFileAsync(key, secret, path);
//Process bytes
The method ProcessFileAsync is marked as async and returns a task so it can be awaited. Easy. isn't it? A few lines of code and you can use async/await with other types of asynchronous APIs.

02/07/2015

Careercon Warszawa 2015 - surveys

Home

I'd like to share with you the results of 2 surveys regarding my last presentation. The first one was prepared by organizers of the conference. The second one was created by me on SurveyMonkey portal. I placed a link to it on each slide in the lower  left corner. It was visible through the whole presentation so anyone with a smartphone could have filled it. The survey was short and consisted of 3 obligatory and 2 voluntary questions.

Let's start with the official survey of the conference. According to it there were ~50 people on my presentation. ~30 said that it was very good, ~18 that it was good and ~2 that it was week. In average it gives 4,56. These are very good results for me, thanks!



My survey looked in the following way:
  • Question 1 - What is your overall assessment of my presentation on a scale of 1 to 5 (5 - the highest evaluation)?
  • Question 2 - How do you assess the substantive content of my presentation on a scale of 1 to 5 (5 - the highest evaluation)?
  • Question 3 - How would you rate my way of presenting on a scale of 1 to 5 (5 - the highest evaluation)?
  • Question 4 - Name one thing that you remember from my presentation.
  • Question 5 - This is the place for all sorts of comments, applications and complaints. Each comment will be valuable for me.
I received 8 answers for these questions. Much less than for the previous survey but all the answers were extremely valuable to me. Thanks! Here are details:


12345Average
Question 11434,25
Question 2444,50
Question 3354,63

I also received a couple of comments that will allow me to be a better speaker in the future:
  • The demo should have been a little bit longer.
  • The sample code used during a demo could be more complex.
  • I should have used a microphone.
  • I should have spoken slower.
  • Not all terms used by me were understandable for a layman.
It was nice to read that my presentation was useful and that I showed interesting tools.

As a summary I have to say that these results are a boost of energy for me and an encouragement to be a speaker more often.

27/06/2015

Careercon Warszawa 2015 - already after the presentation

Home

A couple of hours ago I returned from the conference Carrercon Warszawa 2015 where I had a presentation about historical debuggers. I'm generally happy with my speech. There were dozens of people in the room. I didn't forget to tell about anything important, I wasn't stressed too much and my presentation took as much time as I planned. As always I see a room for improvement but it was a good decision to take part in the conference as a speaker.

In the next post I want to write much more about Careercon Warszawa 2015 and my presentation. Now, I'd like to ask everybody who attended my presentation and who are reading this post for one thing. The thing which is very important to me. I'll be happy if you fill the following short survey:

Click to start a survey

Any feedback will be extremely valuable for me.

22/06/2015

CareerCon Warszawa 2015

Home

On Saturday (27-06) I'll have a presentation about historical debuggers at the conference "CareerCon Warszawa". I'll tell what they are and why you should use them. I'll also explain how IntelliTrace historical debugger for .NET works under the hood. If you have time I'll be glad to see you! The presentation will be in Polish.

Conference site
Fanpage

For now that's all. After the conference you can expect a post about how it looked like.

15/06/2015

Interview Questions for Programmers by MK #3

Home

Question #3
You found the following code and were asked to refactor it if needed:
var sb = new StringBuilder();
sb.AppendLine("<root>")
sb.AppendLine(String.Format("   <node id="{0}"/>", 1));
sb.AppendLine(String.Format("   <node id="{0}"/>", 2));
sb.AppendLine(String.Format("   <node id="{0}"/>", 3));
//Many, many lines of code
sb.AppendLine("</root>");
What would you do and why?

Answer #3
It is not the best idea to create XML documents using string concatenation because it is error prone. Besides created documents are not validated in any way. In .NET we have a few possibilities to refactor this code.

I recommend to use XmlWriter in this case because we want to create a new document and we do not want to edit an existing one. However, if we also want to modify existing XML documents, the good choice will be XDocument or XmlDocument class.

In the case of small XML documents (when performance is not critical) it might be a good idea to use XDocument or XmlDocument even if we don't want to edit existing documents. Especially XDocument can be simpler in use than XmlWriter.

Comments #3
I remember that when I wanted to create an XML document in C# for the first time I did it by using string concatenation. The code worked ok and I was surprised that it didn't pass a review :)

12/06/2015

What do you know about low-level programming?

Home

Have you ever written anything in a low-level assembly language? The last time, I was in touch with low-level programming, was during my studies. I wrote something in x86 and MIPS assembly languages. It was not easy but I liked it and I think that every good developer should know basics of programming at low-level. Why I'm writing about that?

Recently, I've found a very good online game known as microcorruption which reminded me good old times. The goal od this game is to open a lock by exploiting bugs in the source code. In order to do so you have to use a debugger of MSP430 assembly language.


At the beginning, initial tasks seems to be very easy e.g. a password can be hardcoded in the source code. However, if you haven't worked with any low-level language for many years even so simple task can be challenging. Besides, every next task is more and more difficult.

microcorruption is a great game if you want to remind yourself things like registers, a calling convention, a stack, low-level addressing and many others.

I started playing and I cannot stop! I encourage you to try.

04/06/2015

I'm ashamed that I knew so little about...

Home

This post will not be related to programming. I want to write about something that I've read about recently. I think that it's extremely interesting. Besides it is important for me. I'm talking about Lvov school of mathematics, a group of brilliant Polish mathematicians who worked in Lvov before World War II and had a great impact on contemporary mathematic. You may be surprised that even famous people like John von Neumann visited Lvov in order to talk with them.

Have you heard about them? If you come from Poland there is a chance that you heard although it is not very well know topic. And it is a pity because Polish people should be proud because of their achievements. If you are not from Poland there is a bigger chance that you heard about people like Stefan Banach, Stanisław Ulam or Hugo Steinhaus. Just to mention 3 mathematicians who were a part of Lvov school of mathematics.

These were very special people. Stefan Banach established very important part of mathematic known as functional analysis, was the author of many theorems (e.g. Banach space), has his own planetoid 16856. He had written his PhD thesis within 6 months and after that he needed only 7 years to become a professor. I'm pretty sure that every mathematician knows his name.

Stanisław Ulam had took part in Project Manhattan and then worked on the hydrogen bomb. In the 40s wrote one of the first (if not the first) program playing chess. He also proposed Monte Carlo method. When Kennedy became a president in 1960 Ulam as an advisor was asked which important project should be started. He suggested an expedition to the moon what Kennedy approved!

Hugo Steinhaus did so many things that I don't know what to choose. He "discovered" Stefan Banach so without him Lvov school of mathematics could have never been created. He invented introwizor, ancestor of modern computed tomography, which was patented in many countries in Europe and in USA. One of his books Mathematical Snapshots, that was originally published in 1938, is still available on Amazon! He also worked on game theory. You can say that many people did it. However, Steinhaus had done so 20 years before someone used this term.

I'll stop now because I could write and write about them. Instead I'll cite 2 short anecdotes that show that these were really extraordinary people (based on Genialni. Lwowska szkoła matematyczna by Mariusz Urbanek, unfortunately available only in Polish).

Stefan Banach has never finished his studies, he didn't like bureaucracy, formalisms and official titles. Because of that he had a problem with his PhD. It wasn't important for him. He wanted to focus on mathematics. His friends decided to cheat him a little bit and one day they told him that some important people frrm the capital have a few questions and only he can help. He didn't have any problems to answer all these questions, but he didn't know that it was his examination for the degree of doctor ;) Thanks to this small fraud he received PhD title.

In Lvov there was a restaurant "Szkocka" ("Scotch") and mathematicians like to spend there a lot of time of course at talking about mathematics. Noise and bustle didn't bother them. They also had a habit to write down proofs and theorems on the table with a pencil. The problem was that on the next day tables were cleaned and all the work was lost. To solve a problem the owner of the restaurant was asked to set this table aside and not to clean it until everything will be transferred to paper. This was a task of students.

I hope that I convinced you that you should at least know what is Lvov school of mathematics (especially if you come from Poland or if you are mathematician). Personally, I'm ashamed that I knew so little about it before.

01/06/2015

Ray Tracing a Black Hole in C#

Home

A friend of mine Mikołaj Barwicki has published very interesting article about visualisation of black holes on codeproject. So far he received a grade 5 from 43 readers. It is a great result! If you interested in ray tracing, black holes, numerical analysis or parallel computing it is an article for you.


21/05/2015

What every blogger should do if using someone else's code #2

Home

This time I'd like to write about WPFLocalizationExtension library which makes localization of WPF applications easy. I've been using it for 4 years in my projects and it simply works. To be honest I've just realized that the version I'm using is considerably outdated. However, for all this time I haven't encountered problems that would make me to download a newer version of WPFLocalizationExtension.

I think that it is a quite good recommendation. So, if you work with WPF and you need to localize your application I encourage to give a chance to WPFLocalizationExtension.

15/05/2015

How to solve Transportation problem in Excel?

Home

I think that most of you have heard about Transportation problem. We have N factories and M mines/producers. Each factory needs and each mine can provide particular amount of resources. We have to transport these resources from mines to factories. What is obvious it costs money and this cost depends on the distance between factories and mines. We have to find such an allocation that will minimize this cost.

In order to solve this problem we can use linear programming and one of the most popular algorithms are simplex or stepping stone algorithm. However, today I will not write directly about them but I will show how to solve this problem in Excel. Yes, I'm talking about good old Excel. Surprised?

Excel has an Add-in called Solver which will do a job for us. I'll explain how to do it using a simple example with 3 factories and 3 mines. Here is a table that shows costs of transport between mines and factories. For example, if we want to move 10 units from Mine 1 to Factory 1 then a cost will be 10 *c11.

Transportation CostFactory 1Factory 2Factory 3
Mine 1c11c12c13
Mine 2c21c22c23
Mine 3c31c32c33

We also need another table with supplies and demands. Below is an example. The numbers is the first column shows how many resources each mine can provide and the numbers in the the first row shows how many resources are needed by each factory.

The last row and the last column show sums of allocated resources in each row and in each column. These columns are needed to easily configure Solver. In this example some resources have been already allocated and we need to optimally allocate remaining ones i.e. x12, x13....

Supply\Demand1505050Allocation sums
for mines
4010x12x1310
110x21x22x230
100x31x322020
Allocation sums
for factories
10020

We also we have to define limitations and a cost function. The first limitation is that found allocations should be non negative i.e.

x12, x13 ... >= 0

Besides we want to allocate all resources available in mines and each factory should receive required amount of resources i.e.

40 = 10 + x12 + x13
110 = x21 + x22 + x23
100 = x31 + x32 + 20
150 = 10 + x21 + x31
50 = x12 + x22 + x32
50 = x31 + x32 + 20

Because we have a column and a row with allocation sums it will be very easy to enter these allocations into Solver. It is also worth saying that in general these limitations can be different, for example we can have more resources than needed. Of course, in this case formulas above would be also different.

A cost function is also easy. We want to minimise the following sum which is equal to total cost of moving resources from mines to factories:

c11 * 10 + c12 * x12 + c13 * x13 + ....

Now we have everything to solve a problem in Excel. Firstly we have to enable Solver. To do so open Excel options, select Add-ins. Then find Solver on the list and confirm with OK (this procedure can vary in different versions of Excel).

I've already prepared a spreadsheet with all required equations and data for you. You can download it here (you have to download this file locally and do not use online Excel application). To run Solver go to Data tab and select Solver in Analysis category. Then select Solve button and all missing allocations will be populated. Easy, isn't it? Now, a few words about using Solver.

Here is a screenshot with Solver Parameters. A cell in a red circle contains a cost formula. This formula will be minimized (see a green rectangle). Yellow rectangle contains cells that will be modified by an algorithm and finally blue rectangle contains six formulas explained in the previous post.


The next screenshot shows additional options of Solver. You can display this window by pressing Options button in Solver Parameters window. I want to point 2 selected options. Assume Linear Model tells Solver that it deals with linear programming problem and Assume Non-Negative tells Solver that we are interested only in non-negative results.


As you can see much more options are available. I encourage you to experiment with them and also with different costs, limitations, number of mines/factories and problems.

06/05/2015

Interview Questions for Programmers by MK #2

Home

I prepared this question a long time ago in order to check knowledge of basic good practices.

Question #2
You have the following method which is responsible for updating a client in a database:
public void UpdateClient(
   int id,
   string name,
   string secondname,
   string surname,
   string salutation,
   int age,
   string sex,
   string country,
   string city,
   string province,
   string address,
   string postalCode,
   string phoneNumber,
   string faxNumber,
   string country2,
   string city2,
   string province2,
   string address2,
   string postalCode2,
   string phoneNumber2,
   string faxNumber2,
   ...)
{
   //...
}
Do you think that this code requires any refactoring? If yes, give your proposition. A database access technology doesn't matter in this question.

Answer #2
The basic problem with this method is that it has so many parameters. It is an error prone, difficult to maintain and use approach. I suggest to change the signature of this method in the following way:
public void UpdateClient(Client client)
{
   //...
}
Where Client is a class that models clients. It can look in the following way:
public class Client
{
   public int Id { get; set; }
   public string Name { get; set; }
   public string Secondname { get; set; }
   public string Surname { get; set; }
   public string Salutation { get; set; }
   public int Age { get; set; }
   public string Sex { get; set; }
   public Address MainAddress{ get; set; }
   public Address AdditionalAdddress { get; set; }
   /* More properties */
}
Address class contains details (country, city..) of addresses.

Comments #2
You may also write much more e.g.:
  • It may be good to introduce enums for properties like 'Sex' which can take values only from the strictly limited range.
  • UpdateClient method should inform a caller about the result of an update operation e.g. by returning a code.
However, the most important thing is to say that UpdateClient method shouldn't have so many parameters. Personally, if I see a code as above I immediately want to reduce the number of parameters. This question seemed and still seems to be very easy, however not all candidates were able to answer it. Maybe it should be more accurate. For example, I should have stressed that a candidate should focus ONLY on available code. What do you think?

27/04/2015

Interview Questions for Programmers by MK #1

Home

Do you know series of posts titled Interview Question of the Week on a SQL Authority blog? If not or if you don't know this blog at all you have to catch up. I learned a lot of from this series so I decided to start publishing something similar but to focus more on .NET and programming.

This is a first post from series which I called Interview Questions for Programmers by MK and in which I'm going to publish questions that I'd ask if I were a recruiter. Of course they are somehow based on my experience as a participant of many interviews.

Question #1
What is a meaning of using statement in the code below? What would you do if using keyword did not exist?
using(var file = File.OpenWrite(path))
{
   //...
}
Answer #1
In this example using statement is used to properly release resources (to call Dispose method) that are owned by an object of a class that implements IDisposable interface. It is a syntactic sugar and could be replaced by using try/finally block in the following way:
var file = File.OpenWrite(path);
try
{
   //...
}
finally
{
   if(file != null)
      file.Dispose();
}

23/04/2015

How to build predicates dynamically using expression trees

Home

I'm working at the application which finds so called execution patterns in logs recorded by IntelliTrace historical debugger. An execution pattern is a sequence of methods calls that is widely used in the application and it is a kind of automatically generated documentation. The part of the algorithm is filtering of found patterns based on criteria like the length of a pattern or the number of different methods in a pattern.

At the beginning I used only 2 criteria so it was easy to handle all possible combinations of them i.e. use the first criterion, use the second criterion, use both and used none. Then I added 3rd criterion and I thought that for 3 criteria I still don't need a generic mechanism. However, shortly it turned out that I want to handle 5 criteria what gives 32 of possible combinations. This time I did it once and for all.

I decided to use expression trees to dynamically build an expression that verifies any combination of criteria. The code is quite simple. Firstly we need an enum for all criteria.
[Flags]
public enum Crieria : byte
{
    None = 0,
    CriterionOne = 1,
    CriterionTwo = 2,
    All = CriterionOne | CriterionTwo
}
We also need a class that will represent patterns.
public class Pattern
{
    public int FieldOne { get; set; }
    public int FieldTwo { get; set; }
}
Now we can write a code that will dynamically build needed expressions. I assumed that every criterion has a corresponding static method that knows how to check if a current pattern fulfils it or not. The final expression produced by CreateExpression method will be of the following form pattern => predicate1(pattern) && predicate2(pattern) && predicate3(pattern)....
public static class FilterBuilder
{
    public static Func<Pattern, bool> CreateExpression(Crieria filteringMode)
    {
        var param = Expression.Parameter(typeof(Pattern));

        var subExpressions = new List<MethodCallExpression>();

        if ((filteringMode & Crieria.CriterionOne) != 0)
            subExpressions.Add(Expression.Call(typeof(FilterBuilder), nameof(CriterionOnePredicate), null, param));

        if ((filteringMode & Crieria.CriterionTwo) != 0)
            subExpressions.Add(Expression.Call(typeof(FilterBuilder), nameof(CriterionTwoPredicate), null, param));

        //Other criteria...

        if (subExpressions.Count == 0)
            return p => true;

        Expression finalExpression = subExpressions[0];
        for (var i = 1; i < subExpressions.Count; ++i)
            finalExpression = Expression.And(finalExpression, subExpressions[i]);

        return Expression.Lambda<Func<Pattern, bool>>(finalExpression, param).Compile();
    }

    public static bool CriterionOnePredicate(Pattern p)
    {
        return p.FieldOne > 0;
    }

    public static bool CriterionTwoPredicate(Pattern p)
    {
        return p.FieldTwo < 0;
    }
}
The code can be made even more generic but I'll leave it as an exercise. When I finished this code I started to worry about performance. It is critical for me because my application needs to process large amount of patterns efficiently. I made the following simple test in which dynamically generated and static functions are executed 1 million times.
var iterations = 1000000;

var predicate = FilterBuilder.CreateExpression(Crieria.All);
MeasureIt<Pattern>((p) => predicate(p), new Pattern(), iterations);

predicate = FilterBuilder.CreateExpression(Crieria.CriterionOne);
MeasureIt<Pattern>((p) => predicate(p), new Pattern(), iterations);

MeasureIt<Pattern>((p) =>
{
   FilterBuilder.CriterionOnePredicate(p);
   FilterBuilder.CriterionTwoPredicate(p);
}, new Pattern(), iterations );

MeasureIt<Pattern>((p) => FilterBuilder.CriterionOnePredicate(p), new Pattern(), iterations);
In order to measure time of calculations I used MeasureIt method from my earlier post and I received the following results:
Total time: 54
Total time: 27
Total time: 18
Total time: 12
Dynamically generated predicates are 2-3 times slower than static ones. However, we are still talking here about dozens of milliseconds in order to make 1 million calls. For me it is acceptable.