16/09/2016

.NET Developer Days 2016 - Workshops



In my previous post about .NET Developer Days 2016 I wrote generally about the conference and about the presentations I'd like to see. This time I want to drop a few lines about the pre-conference workshops (sessions). They will take place just a day before the actual conference (GoldenFloor, Millenium Plaza – Warsaw, Al. Jerozolimskie 123 a) and you can choose from:
The links above will lead you to the description of each session. However, I have a surprise for you. I contacted the experts who will conduct the workshops and asked them a few questions. Here is the additional information that I got. You won't find it anywhere else.

31/08/2016

AjaxExtensions.BeginForm doesn't work. Really?



Source: own resources, Authors: Agnieszka and Michał Komorowscy

The goal of using Ajax is to communicate with the server asynchronously without reloading the entire page. Specifically, AjaxExtensions.BeginForm can be used to update a selected part of a web page. It is relatively easy to use but can also be troublesome, especially when we try to apply it in an application which wasn't using Ajax earlier. I decided to write this short technical post because recently I came across the following issue a few times:

AjaxExtensions.BeginForm redirects a user to a new page instead of refreshing a fragment of a current one.

This problem has an easy explanation. Under the hood AjaxExtensions.BeginForm uses a JavaScript library called Microsoft jQuery Unobtrusive Ajax. The issue is that this library is not installed by default when we create a new project. It's easy to forget about it.
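For reference, a typical usage of the helper looks more or less like this (a minimal sketch; the controller, action and element names are made up):

@using (Ajax.BeginForm("Search", "Products", new AjaxOptions
{
    HttpMethod = "POST",
    UpdateTargetId = "searchResults",       // id of the element that should be refreshed
    InsertionMode = InsertionMode.Replace   // replace its content with the returned partial view
}))
{
    @Html.TextBox("query")
    <input type="submit" value="Search" />
}

<div id="searchResults"></div>

Without jquery.unobtrusive-ajax.js loaded on the page, the data-ajax-* attributes generated by this helper are simply ignored and the form does a normal full post-back - which is exactly the redirect described above.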

If you have the described problem:
  • Check if the packages.config file contains the Microsoft.jQuery.Unobtrusive.Ajax package.
  • Check if the jquery.unobtrusive-ajax.js file is referenced in the HTML, e.g.: <script src="/scripts/jquery.unobtrusive-ajax.js"></script>
  • If you use bundles, check if jquery.unobtrusive-ajax.js was included in a bundle, e.g.:
    public static void RegisterBundles(BundleCollection bundles)
    {
       ...
       var js = new ScriptBundle("~/bundles/MyBundle").Include("~/Scripts/jquery.unobtrusive-ajax.js");
       ...
    }
  • Besides, check if a bundle with jquery.unobtrusive-ajax.js is rendered properly e.g.:
    @Scripts.Render("~/bundles/MyBundle")

23/08/2016

.NET Developer Days 2016 are coming



.NET Developer Days 2016 is the third edition of the biggest conference in Central and Eastern Europe dedicated to the .NET platform. I didn't participate in the previous editions but this time I will. Why? Well, I read a few quite good reviews of the former editions. Besides, recently a friend of mine told me that he was going to go there, which is also a good recommendation.

Funnily enough, when I was about to buy tickets, the organizers of the conference asked me to write about it. So yes, it is a sponsored text, but I wouldn't have written it if I didn't want to go there anyway. Let's start with a few facts about .NET Developer Days 2016:
  • What: 3 tracks with 24 presentations.
  • Where: 
    • Conference: EXPO XXI Exhibition Center – Warsaw, Prądzyńskiego 12/14
    • Workshops: GoldenFloor, Millenium Plaza – Warsaw, Al. Jerozolimskie 123 a
  • When: 19th-21st October 2016. The workshops will take place on October 19th, and the conference will start one day later. The organizers also plan a party at the end of day one. I think that it'll be a good occasion for networking.
  • Language: 100% English
I'm still thinking about which presentations to choose but I have a few solid candidates. I remember Jon Skeet from the Dev Day conference. He gave a really good presentation, so he is my number one. This time he will open the conference and then will talk about Abusing C# and Immutability. I also saw a few presentations delivered by Tomasz Kopacz in the past. As far as I remember he was always a mine of information. His presentations were advanced and demanding but you could learn a lot from them.

I've also heard a lot of good things about Maciej Aniserowicz, so his presentation about CQRS is also on my list. I don't know the other speakers but there are many other promising topics to choose from. For example, I'd like to listen to Alex Mang, who will talk about containers, or Adam Granicz, who will give a presentation about functional programming, or... Actually, I already see some potential conflicts in my personal agenda, so as you can see the choice is not easy. I encourage you to see the full agenda on the conference site. If you want to buy a ticket, do it sooner rather than later because the price goes up every 2 months.

See you there!

19/08/2016

My struggles with GitHub, Subtrees and NuGet



Source: own resources, Authors: Agnieszka and Michał Komorowscy

Some time ago I decided to publish my projects on GitHub. This decision had very positive repercussions because it mobilized me to do more refactoring and to clean up my solutions. Additionally, I switched completely to managing external references via NuGet. Earlier, I had kept some binaries in a dedicated directory on my computer. It took me some time but it was worth doing.

I also had to solve the following problem. I have a solution called Common. It is a collection of libraries, utilities, algorithms, helpers etc. that are used in my other projects. Before the migration to GitHub, after a build, all Common binaries were copied to a well known location, i.e. N:\bin. Thanks to that all other projects could reference them from this location. It works. However, if someone wants to download my projects from GitHub, he or she will need to create a mapped drive N manually. I didn't like it.

The next step was to switch from absolute references to binaries to relative references to projects. For example, let's consider a library MK.Utilities and a project LanguageTrainer that uses it. Initially LanguageTrainer was referencing:

N:\bin\MK.Utilities.dll

After migration this reference was changed to:

..\..\Common\MK.Utilities\MK.Utilities.csproj

Much better, isn't it? Still, it is not perfect. This relative path will work only if the folders with the Common and LanguageTrainer solutions are in the same place on the disk. Besides, in order to compile the LanguageTrainer solution, the Common solution must be built first. What's more, Common and LanguageTrainer are two separate repositories which have to be downloaded individually. My goal was to be able to download any repository/solution and then be able to compile it without any further steps.

I started reading about possible solutions and I found information about git submodules and git subtrees. There is so much about them on the Internet that I will not repeat others. For example, see this post. In the end I decided to use subtrees. Simplifying, a subtree is a copy of one repository inside another one. Returning to my earlier example, by using subtrees I got a copy of the Common repository/solution in the LanguageTrainer repository/solution. Then I could change the reference to MK.Utilities as follows:

..\Common\MK.Utilities\MK.Utilities.csproj

Besides, I needed to add the MK.Utilities project to the LanguageTrainer.sln solution so that all projects could be built at the same time. Finally, my LanguageTrainer repository/solution looks as follows. On the left side you can see GitHub and on the right side Visual Studio:


If needed, I can refresh the copy of the Common repository/solution inside LanguageTrainer at any time. Why did I use subtrees? Well, they work for me and are extremely easy to create via Source Tree ;) By the way, I like git but I hate all these complex commands. Source Tree solves this problem and allows me to use git via a friendly GUI. I strongly recommend it.
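For completeness, this is more or less what Source Tree does under the hood (a sketch; the repository URL and branch name are only examples):

# add the Common repository as a subtree under the Common folder
git subtree add --prefix=Common https://github.com/example/Common.git master --squash

# later, refresh this copy with the latest changes from Common
git subtree pull --prefix=Common https://github.com/example/Common.git master --squash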

At the end I had to solve one more problem with NuGet. Let's return again to my example and let's assume that the LanguageTrainer solution is located here:

C:\LanguageTrainer

In that case, by default, NuGet packages will be located here:

C:\LanguageTrainer\packages

However, we also have a subtree:

C:\LanguageTrainer\Common

And the projects from the subtree expect that their packages will be here:

C:\LanguageTrainer\Common\packages

Of course they won't be there, so the Common projects will not compile. To overcome this problem I had to manually update the csproj files and replace ..\packages with $(SolutionDir)packages. For example:

..\packages\structuremap.3.1.6.186\lib\net40\StructureMap.dll

Was changed as follows:

$(SolutionDir)packages\structuremap.3.1.6.186\lib\net40\StructureMap.dll
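In practice it means editing the HintPath of each affected reference, something along these lines (a sketch of the relevant csproj fragment):

<Reference Include="StructureMap">
  <!-- before: <HintPath>..\packages\structuremap.3.1.6.186\lib\net40\StructureMap.dll</HintPath> -->
  <HintPath>$(SolutionDir)packages\structuremap.3.1.6.186\lib\net40\StructureMap.dll</HintPath>
</Reference>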

I hope that this post will help you in your struggles with GitHub, Subtrees and NuGet.

06/08/2016

Why did I do my PhD?



Source: own resources, Authors: Agnieszka and Michał Komorowscy

This is my second post about the longest project I've ever participated in, i.e. doing a PhD. I decided that at the beginning I'll write about why I actually started my Ph.D. studies and what I think now about my motivations.

I remember quite well the moment in 2009 when I made my decision to do a Ph.D. It was driven mainly by a few things. Firstly, at that point I was a newly minted graduate of Warsaw University of Technology and I had very good memories of my studies, of being a student... and I wanted to continue that. Secondly, I wanted to do something different from what I was doing professionally for money, i.e. typical applications for business.

Thirdly, I associated the Ph.D. title with a kind of prestige that would allow me to distinguish myself in the future. Fourthly, half a year earlier my wife and a few of my friends had also started Ph.D. studies. Don't get me wrong. I wasn't jealous, I didn't feel worse or anything. But taking into account what I've written earlier, it was an additional motivation for me.

What do I think about these motivations now? I'd say that they are neither good nor bad; they simply are. However, I think that I missed a few important things in 2009. Do you agree with me that my way of thinking was somehow romantic? Now, I know that it was. I assumed that I would be working professionally for money and I would be doing a Ph.D. to "do something different". I didn't think much about my future scientific career, doing a habilitation... I also didn't think much about what I actually wanted to achieve during my studies. Though thanks to that I had an occasion to play with technologies like Azul Systems or Agilent N2X before focusing on historical debuggers :)

Briefly summarising, my Ph.D. studies were a little bit like a hobby. And as with any hobby, on the one hand it gives you fun and satisfaction, but on the other hand it doesn't necessarily lead to anything and can be easily set aside or abandoned. Many times I had a moment of doubt or wanted to say stop.

Would I make the same decision if I could go back in time? Definitely yes, but I'd consider it much more carefully. I'd think through the area of my research so that it would have better prospects and be more valuable on the market. Probably, for financial reasons, I'd split my time between the Ph.D. and a professional job. However, I'd try to find a job somehow related to my studies, where a Ph.D. could be potentially beneficial. Currently, in the vast majority of job offers (that I receive) it isn't. I'd also consider doing a Ph.D. abroad (outside of Poland), where funding is better, so that I'd be able to focus on science. Thanks to that my results would also be better.

You must be both romantic and pragmatic when doing Ph.D.

24/07/2016

Report from the battlefield #5 - Logging can kill performance


Public Domain, https://commons.wikimedia.org/w/index.php?curid=48390
Source: own resources, Authors: Agnieszka and Michał Komorowscy

So far in the Report from the battlefield series I have written about my experiences as an expert working for a recruitment company. This time I'll write about a bug that I found in production. It was all about performance. The problem was that in the new version of an application one operation slowed down about 6 times. Initially, I suspected that the amount of data had simply increased considerably or that there were some network problems. Fortunately, I easily reproduced the issue on my dev machine. Reproducing a problem is half the battle. Though performance problems are usually difficult to analyse, so I was ready for a long investigation.

I started stepping through the code with a debugger just to see what was going on. Everything seemed to be ok until... One of the final operations was to log into a file what was retrieved from a database. What's important, the log level was set to Trace, so even a large amount of data shouldn't matter in production. Why? Because in production, precisely for performance reasons, the logger should be configured not to log everything to a file. In other words, it should ignore messages with the log level Trace or Debug. However, after I had pressed F10 (Step Over), I had to wait a few seconds until the logging ended. BINGO!

My first thought was that someone had configured the logger in the wrong way in production. A typical PEBKAC problem. To verify my hypothesis I changed the configuration of the logger and executed the problematic operation. Unfortunately, the problem occurred again. Another look at the code and I knew what was wrong. Do you already know?

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

The problem was that for a large amount of data the application required a few seconds just to create/prepare a message for the logger. To make things worse, this message was created regardless of whether it was later used by the logger or not. During development it may be acceptable but not in production! There are 2 potential solutions to this problem. The details depend on the logging framework:
  • The first approach is to simply check the logging level before creating/preparing a message, e.g.:
    if(Logger.LogLevel == LogLevel.Trace) 
    {
        /* Prepare and log a message */
    }
    
  • The second approach is to use deferred execution, for example lambdas, e.g.:
    Logger.Trace(() => /* Prepare a message */)
    If a logger supports this syntax, a lambda will be executed if and only if it is required (see the sketch below).
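Here is a minimal sketch of both approaches. It is not any particular logging framework, just an illustration of the pattern; real frameworks expose similar IsTraceEnabled flags and, some of them, delegate-based overloads:

using System;

public enum LogLevel { Trace, Debug, Info, Warn, Error }

public class Logger
{
    // Minimum level that is actually written; in production it is typically Info or higher.
    public LogLevel MinLevel { get; set; } = LogLevel.Info;

    public bool IsTraceEnabled => MinLevel <= LogLevel.Trace;

    // Approach 1: the caller checks IsTraceEnabled before building an expensive message.
    public void Trace(string message)
    {
        if (IsTraceEnabled) Console.WriteLine(message);
    }

    // Approach 2: deferred execution - the factory runs only when Trace is actually enabled.
    public void Trace(Func<string> messageFactory)
    {
        if (IsTraceEnabled) Console.WriteLine(messageFactory());
    }
}

class Demo
{
    static void Main()
    {
        var logger = new Logger(); // MinLevel = Info, so Trace is disabled

        // The expensive string below is NOT built, because the lambda is never invoked.
        logger.Trace(() => "Data: " + string.Join(";", new int[1000000]));

        // Approach 1, written by hand at the call site:
        if (logger.IsTraceEnabled)
            logger.Trace("Data: " + string.Join(";", new int[1000000]));
    }
}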

17/07/2016

The longest project


Source: own resources, Authors: Agnieszka and Michał Komorowscy

I haven't been blogging for 4 months and it's the longest break I've ever had. Why? Was I sick? Did I have no ideas what to write about? Did I have no time? Did I have too much work? Fortunately, none of that. The reason is completely different and probably surprising: I finished the longest project of my life.
  • The project that I started in 2009.
  • The project that for all these years was somewhere in my mind.
  • The project that I wanted to abandon over a dozen times.
  • The project that took hundreds or thousands of hours.
  • The project that allowed me to learn a lot.
  • The project that I would have done in a different way if I had had the chance.
  • The project of which I'm extremely proud.
  • The project after which I simply had to rest.
What could it be? The answer is a PhD in Computer Science. On 12 April 2016 I defended my doctoral dissertation, written under the supervision of Professor Janusz Sosnowski, entitled:

Methods of analysis of information systems based on logs of historical debuggers

Even now I remember how relieved and happy I felt then :)

In my work I focused on the problem of storing and analysing data collected by historical / reversible debuggers. I performed a detailed analysis of what could be and what should be improved when it comes to working with them. As a result, I proposed new models of representation of execution traces and I implemented tools that facilitate working with data recorded by historical debuggers. I also performed experiments showing the advantages of my ideas. It was a really, really huge job.

Now you may want to ask some questions:
  • Was it worth it?
  • Why did you do so?
  • Did you work professionally at the same time?
  • How did you split your time between PhD studies and work? Is it possible at all?
  • What did you actually gain?
  • How to start PhD studies?
  • How much could I earn at the university?
  • Would you continue your scientific career?
  • Why didn't you write about the PhD earlier?
  • And many, many more.
I plan a series of posts about doing a PhD in computer science. Many topics will be specific to Poland but many will be general. I want to do that for two reasons. Firstly, it'll be a form of therapy for me :) I simply want/need to write about something that was so important to me for such a long time. Secondly, I think that there are not so many blogs/articles about doing a PhD, so it should simply be useful for others.

If you have any specific questions just let me know.

18/03/2016

Two things I learned about HTML and CSS


I've never worked a lot with CSS. However, from time to time I do something with it, for example in order to check out new possibilities. Recently, I read about CSS transformations and I decided to give them a try. To begin with, I wanted to achieve a very simple effect, i.e. a red square with blue and green diagonal lines. It sounds simple and it is simple, but there are traps in this exercise. I decided to write this post because it took me a moment to figure out what was wrong. It was also difficult to find a solution on the Internet. Maybe because it is so obvious ;)

My idea was to use 3 div elements: one for the square and 2 for the diagonal lines. I also wanted to use transformations in order to rotate the divs so that they look like diagonals. My first attempt looks as follows. Do you know what is wrong? There are 2 main problems here.

Scroll down if you want to see a correct solution:
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.


I changed two things, one in html and one in CSS:
  • The first problem was that I used div as a self-closing tag. It is not allowed. Browsers treat <div /> as <div>. It is quite difficult to spot.
  • The second problem was in the greenLine style. It was not enough to rotate the green div by 45 degrees. First we need to translate it in this way: translate(100px,-141px) rotate(45deg). It might be surprising because in the case of the blue div the rotation was enough. However, we have to remember that the green div is not located in the origin of the coordinate system but just below the blue div. The blue div looks like a thin line but its height is set to 141 pixels.

14/03/2016

Report from the battlefield #4 - Do not waste my and your time


The Report from the battlefield series is based on my experience as a reviewer. The idea is simple. In order to evaluate programming skills, a candidate is asked to write a simple project. To do so he/she needs to invest some amount of time (roughly speaking, a few hours). Taking this into account, I assume that he/she must be interested in finding a new job. Otherwise he/she wouldn't spend his/her private time writing a project which isn't exactly exciting. So I'm all the more surprised that some people don't care about the first impression.

Here are some examples showing what I'm talking about:
  • A connection string used by the application referred to some server that of course wasn't available to me.
  • The database used by the application didn't contain any sample data.
  • I had to manually create a database used by the application. There was a script but it didn't work without fixes.
  • The application crashed immediately when started.
  • ...
It's a waste of time from my perspective. It is true that all these problems can be fixed quickly but they require additional effort from me. Believe me, it is extremely annoying. Instead of doing an actual review, someone forces me to fix bugs. What's worst, these bugs could have been avoided easily with a little more effort.

Please remember, the first impression is important. It'll be appreciated if a reviewer can run your application just by pressing F5 in Visual Studio (or in another IDE). You can test it in a straightforward way. Before submitting a project for review, copy it to another machine and try to run it there. It should work without any additional actions.

Currently, if a project cannot be run without problems I don't do a review. However, I have a soft heart and I give a candidate one chance to fix them. Do you think that it's a good approach? I have my doubts because an employer probably wouldn't do so.

27/02/2016

Tips & Tricks: How to tell VS to modify variables in the runtime for us?


Today, I'd like to share with you a simple but useful trick. Imagine that you are debugging an application and you find a place with the following very simple code:
            
var flag = ReadConfiguration();
if (flag)
{
   //...
}
else
{
   //...
}
The problem is that the flag variable is set to false but you need to check what would happen if it were set to true. Of course you can easily change the value of this variable in Visual Studio. But what would you do if this kind of code is executed dozens or hundreds of times and every time the flag variable must be set to true? One solution is to modify the configuration, another might be to change the source code. However, all these things require an additional action. It would be much better to tell Visual Studio to do it for us. How? In order to achieve the desired effect we can utilize breakpoints and custom actions. I'll show how to do it in Visual Studio 2015.

Firstly, put a breakpoint in the line with if.


Right click the breakpoint and from the context menu select Actions... Then in the text box enter {flag = true}. You can even use IntelliSense here. At the end click the Close button.


And that's all. Now, if you run the application under debugger control, the flag variable will be set to true whenever the line with the breakpoint is executed. What's more, this trick also works with other types of variables and you can also execute methods in this way, e.g.:


At the end I want to say 2 things. Custom actions are usually used to write diagnostic messages to the Output window. This trick works because in order to write a message Visual Studio has to execute some code and this code can have side effects. Besides, you can also use this trick in older versions of Visual Studio. The only difference is that from the context menu you need to select the When Hit... option.

21/02/2016

My list of online editors


Online editors (testers, debuggers) are awesome if we want to quickly test some code. They are also very useful for checking our solution when we want to post an answer on Stack Overflow. Here is my collection of various online editors that I have encountered, though personally I use only some of them.

I'm publishing it because it can be helpful for others and because I'd like to have this list easily accessible on the Internet. Of course this list is not complete and there are many other editors. If you know something interesting, let me know and I will add it here.


Online Editor           | Language / Technology                          | Share function        | Collaborate function
yUML                    | UML                                            | Yes                   | -
draw.io                 | Diagrams                                       | Yes (via Google docs) | Yes
moqups                  | UI mockups                                     | Yes                   | Yes
ideone                  | C#, Java, Haskell, C++, Ada and many others    | Yes                   | -
SQL Fiddle              | SQL (MySQL, Oracle, PostgreSQL, SQLite, MSSQL) | Yes                   | -
regular expressions 101 | Regular expressions                            | Yes                   | -
.NET Fiddle             | C#, VB.NET, F#                                 | Yes                   | Yes
C# Pad                  | C#                                             | -                     | -
D3.js                   | D3.js JavaScript library                       | -                     | -
CodePen                 | HTML + CSS + JS                                | Yes                   | -
jsfiddle                | HTML + CSS + JS                                | Yes                   | Yes
JS Bin                  | HTML + CSS + JS                                | Yes                   | Soon
CSS Deck                | HTML + CSS + JS                                | Yes                   | Yes
Liveweave               | HTML + CSS + JS                                | Yes                   | Yes
Plunker                 | HTML + CSS + JS                                | Yes                   | Yes
cpp.sh                  | C++                                            | Yes                   | -

By the share function I mean the possibility to create a permanent link to our code. The collaborate function allows a group of developers to write code together.

16/02/2016

Interview Questions for Programmers by MK #7


Question #7
You have the following code that uses Entity Framework to retrieve data from the Northwind database. Firstly it finds customers that are from London and then processes their orders. All data model classes were generated with the Code First from Database approach. Unfortunately, this code contains a bug that can lead to performance problems. Identify this problem and propose a fix.
using (var ctx = new NorthwindContext())
{
   var londoners = ctx.Customers.Where(e => e.City == "London");
   foreach (var londoner in londoners)
   {
      foreach (var o in londoner.Orders)
      {
         foreach (var d in o.OrderDetails)
         {
            //....
         }
      }
   }               
}
Answer #7
This code is a classic example of the N+1 selects problem, where too many queries are sent to a database. The first query will be sent to the database in order to find customers from London. Then, for each customer, another query will be sent to read orders. Finally, for each order, another query will be sent to retrieve the details of a given order. Instead, all data could be retrieved by sending only one query. To do so we need to tell EF that we want to read customers together with their orders and order details. It can be achieved with the Include method.
using (var ctx = new NorthwindContext())
{
   var londoners = ctx.Customers.Include("Orders").Include("Orders.Order_Details").Where(e => e.City == "London");
   ...
}
A quick test shows that originally 53 queries were sent to the database and after the fix only 1.
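As a side note, if the project uses Entity Framework 5/6, the same fix can be written with the strongly typed Include extension method (a sketch, assuming the navigation property names from the code above):

using System.Data.Entity; // brings the lambda-based Include extension method

var londoners = ctx.Customers
   .Include(c => c.Orders.Select(o => o.Order_Details))
   .Where(c => c.City == "London");

It does exactly the same thing as the string-based version, but typos in property names are caught at compile time.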

07/02/2016

Report from the battlefield #3 - IEnumerable vs IQueryable


Some time ago I was reviewing a data access layer that was based on Entity Framework. I found code which immediately attracted my attention. The simplified version is shown below.
public IList<Product> GetAll()
{
   return ctx.Products.Select(p => new Product() { ... }).ToList(); 
}
...
var numberOfProducts = GetAll().Count();
The GetAll method is pretty simple because it just reads products from a database. The result returned from this method is used to count the number of products in the database. Although it is simple, it contains a serious bug. The problem is that it uses the ToList method to return a list of products. This means that ALL products must be retrieved from the database in order to return them in the form of a list. In other words, there is no deferred execution here.

If we work with a local database and the number of products is small, it shouldn't be a problem. However, this kind of code might lead to performance problems that are difficult to analyse, for example if our application uses a remote database and/or there are thousands of products. The desired behaviour is that products are counted by the database engine. So let's try to make a fix:
public IEnumerable<Product> GetAll()
{
   return ctx.Products.Select(p => new Product() { ... });
}
...
var numberOfProducts = GetAll().Count();
Now it looks much better, doesn't it? GetAll doesn't use ToList and returns the IEnumerable interface. Unfortunately, this solution is far from perfect. In comparison to the first version, the only difference is the moment when all products are retrieved from the database. This time it will happen when the Count method is executed. Why? Before I explain, let's see the correct solution:
public IQueryable<Product> GetAll()
{
   return ctx.Products.Select(p => new Product() { ... });
}
...
var numberOfProducts = GetAll().Count();
This time I used IQueryable instead of IEnumerable. This small change is crucial. It means that no products are read from the database. Entity Framework "sees" that we only want to count the number of products and an appropriate query is sent to the database. In other words, LINQ To Entities is used.

The situation is completely different when we work with IEnumerable. In order to understand the difference we have to realise one thing. The Count method for IEnumerable is something different than the Count method for IQueryable. With IEnumerable we use LINQ To Objects, and LINQ To Objects operates on objects in memory; it cannot communicate with a database. That is why all products must be read from the database if we want to count them.

Now someone inquiring may say that for virtual methods it shouldn't matter whether we have variables of type IEnumerable or of type IQueryable if these variables point to the same object. After all, C# is an object oriented language that supports polymorphism etc. Well, it is all true but only for virtual methods, and Count is not a virtual method. It is an extension method, and extension methods don't support polymorphism.
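To see it, it is enough to check which Count the compiler actually picks (a simplified sketch; ctx is the same EF context as above):

IQueryable<Product> asQueryable = ctx.Products;
IEnumerable<Product> asEnumerable = asQueryable;   // the very same object, only the static type differs

// The extension method is chosen at compile time, based on the static type of the variable:
var a = asQueryable.Count();    // Queryable.Count  -> EF translates it into SELECT COUNT(*)
var b = asEnumerable.Count();   // Enumerable.Count -> reads all products into memory and counts them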

05/02/2016

Sandbox Database Manager


My colleague Tomasz Moska published a very nice tool that makes management of development MSSQL sandbox databases very easy. It is called Sandbox Database Manager and you can download it here or from GitHub.

Why is it worth recommending? Try to imagine a situation like this. A tester found a bug in the application. In order to reproduce it you need a copy of his database from a system test environment. With Sandbox Database Manager you can make a copy of this database and restore it on a selected server with just a few clicks. Another click or two and you have a snapshot created. Thanks to that you are able to revert the database to its original state at any time. Now let's assume that this database contains hundreds of tables and you don't know all of them. To investigate a problem you want to run the application and see which tables (probably dozens of them) will be updated and how. Sandbox Database Manager also supports this scenario because it'll allow you to track data changes at the column level.

These are only a few features of Sandbox Database Manager. It can do much more, for example run the same query against many databases or compare data between two databases. I can guarantee that Sandbox Database Manager is a really, really helpful tool because I use it in my day to day work. I recommend it without any hesitation. What's best, you can use it completely for free!

29/12/2015

Report from the battlefield #2 - amount of data matters a lot


In the next post from the Report from the battlefield series I'll write about a serious mistake that, in my experience, is quite common. I'm thinking about the situation when a developer assumes that all data from a database can be processed on the client side. I'll give you 2 examples that I encountered during my reviews.

Case 1

A developer was asked to implement paging functionality. He created a single page web application. It looked nice and the paging was working correctly at first glance. I decided to check how it was implemented under the hood. I examined a web service that was used by the application and I was surprised. Why? I didn't find a web method responsible for returning pages. The next step was to dig into the JavaScript code. Unfortunately, I discovered that the paging was implemented only on the client side, i.e. the application initially downloaded all data from the database (via the web service).
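Just for the record, a correct server-side implementation boils down to a web method that takes the page parameters and applies them inside the query, more or less like this (a minimal sketch; the DTO, the result type and the page size are made up):

// Somewhere in a Web API controller; the page is built by the database, not by the client.
public PagedResult<ProductDto> GetPage(int pageNumber, int pageSize = 20)
{
   var query = ctx.Products.OrderBy(p => p.Id);   // a stable order is required before paging

   return new PagedResult<ProductDto>
   {
      TotalCount = query.Count(),
      Items = query.Skip((pageNumber - 1) * pageSize)
                   .Take(pageSize)
                   .Select(p => new ProductDto { Id = p.Id, Name = p.Name })
                   .ToList()
   };
}

Skip and Take are translated into SQL, so only one page of rows ever leaves the database.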

Case 2

In another project the paging was implemented correctly on the server side but the developer made a more subtle mistake. The application had a shopping cart function. Of course it was possible to add and remove products to and from a cart. To do so, a web service used by the application had a GetCart method. This method was responsible for retrieving the current content of a cart from a database.

However, it was strange that this method returned only the identifiers of products. What's more, there was no GetProductDetails web method. It made me curious how the application displays product details to users knowing only their identifiers. It turned out that at initialization the application read the details of all products from the database. Having all products on the client side, it was easy to find the details of a product based on its identifier.

Summary

In both cases the applications were fast enough because of the small amount of data. In the case of real-life databases they would not be. I think that we should always be prepared for the worst case. Especially when we participate in a recruitment process and we want to show ourselves from the best side. An evaluator shouldn't have to guess whether we know something or not.

24/12/2015

Merry Christmas!



Source: own resources, Authors: Agnieszka and Michał Komorowscy


Giving wishes in a foreign language can be challenging so my wishes will be simple but very sincere. I wish you a Merry, Peaceful Christmas and an Amazing 2016. Let it be at least as good as the past year.

Best wishes,
Michał Komorowski

24/11/2015

Report from the battlefield #1 - EF and DTOs


Some time ago, I started doing code reviews of various projects for a recruitment company. It is an interesting experience and I'm learning a lot from it. I have also observed that some mistakes are repeated by different authors. Others are not so common but are not obvious. So I came up with the idea to start a new series of posts under the title "Report from the battlefield". In this series I'll describe my observations and findings from my reviews.

Let's start. Recently, I reviewed a project created with AngularJS + ASP.NET Web API + Entity Framework. The code was neither very good nor very bad. However, I noticed that the author decided to use a class generated from the EDMX model as a DTO (Data Transfer Object). The reasoning behind this decision was simple - this class had all the properties required on the client side, so why not use it. Well, there are a few reasons why it is not a good idea.
  • With dedicated DTOs it is less likely that changes on the server side will affect the client side.
  • With dedicated DTOs we can easily control what will be sent to the client side and in what format.
  • With dedicated DTOs the server side model can be completely different from the client side model.
  • By exposing EF classes to the client side we effectively expose the database model to the client side!
You may agree with my points or not. So, I'll give you a practical example of what could happen if we use EF classes as DTOs. Let's assume that there is an EDMX model with 3 types of entities:
  • Customer with Orders navigation property.
  • Orders with Customer and Products navigation properties.
  • Products with Orders navigation property.
Now we want to read only 1 customer from the database, serialize it to JSON and send the result to the client side. What could go wrong? Well, because of the navigation properties, the JSON serializer that is used by ASP.NET Web API will read from the database and convert to JSON the whole graph of customers, orders and products! To be more specific, I saw a 0.5 MB response which should have been a few kilobytes, for a very small database (it contained a few dozen records in all tables)! I can bet that in the case of a production database the response would have hundreds of megabytes.
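With a dedicated DTO and an explicit projection we cut this graph off exactly where we want (a sketch; the property set is of course just an example):

public class CustomerDto
{
   public int Id { get; set; }
   public string Name { get; set; }
   public int NumberOfOrders { get; set; }
}

// Only these three values travel from the database and then to the client,
// no matter how many navigation properties the EF entity has.
var dto = ctx.Customers
   .Where(c => c.Id == id)
   .Select(c => new CustomerDto
   {
      Id = c.Id,
      Name = c.Name,
      NumberOfOrders = c.Orders.Count()
   })
   .Single();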

15/11/2015

Interview Questions for Programmers by MK #6


Question #6
What is arithmetic overflow and how is it handled in .NET?

Answer #6
It is a situation when the result of an arithmetic operation exceeds (is outside of) the range of a given numeric type. For example, the maximum value for the byte type in .NET is 255. So in the following example, the operation a + b will cause an overflow:
byte a = 255;
byte b = 20;
byte c = (byte)(a + b); //the explicit cast is required to compile; in an unchecked context it truncates 275 to 19
The final result depends on the used numeric types:
  • For integer types, either an OverflowException will be thrown or the result will be trimmed/cropped (the default behaviour). It depends on the compiler configuration and the usage of the checked / unchecked keywords.
  • For floating point types an OverflowException will never be thrown. Instead, the overflow will lead either to positive or negative infinity.
  • For the decimal type an OverflowException will always be thrown.
var b = byte.MaxValue;
//The result will be zero because:
//b = 255 = 1111 1111 
//b++ = 256 = 1 0000 0000
//The result has 9 bits so the result will be trimmed to 8 bits what gives 0000 0000
b++; 
         
checked
{
 b = byte.MaxValue;
 //Exception will be thrown 
 b++; 
}

var f = float.MaxValue;
//The result will be float.PositiveInfinity
f *= 2;  

decimal d = decimal.MaxValue;
//Exception will be thrown
d++; 

22/10/2015

TransactionScope and multi-threading


It's my third post about TransactionScope. This time I'll write about using it with multi-threading. Let's start with the following code:
using (var t = new TransactionScope())
{
   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}

private static void UpdateDatabase()
{
   using (var c = new SqlConnection(connectionString))
   {
      c.Open();

      WriteDebugInfo();

      new SqlCommand(updateCommand, c).ExecuteNonQuery();
   }
}

private static void WriteDebugInfo()
{
   Console.WriteLine("Thread= {0}, LocalIdentifier = {1}, DistributedIdentifier = {2}",
      Thread.CurrentThread.ManagedThreadId,
      Transaction.Current?.TransactionInformation.LocalIdentifier,
      Transaction.Current?.TransactionInformation.DistributedIdentifier);
}
It seems simple but it doesn't work. The problem is that a connection created in the UpdateDatabase method will not participate in any transaction. We can also observe that WriteDebugInfo will write empty transaction identifiers to the console. It happens because, in order to read an ambient transaction (the transaction the code is executed in), TransactionScope uses the Transaction.Current property, which is thread static (i.e. specific to a thread).

To overcome this issue we have two possibilities. The first one is to use DependentTransaction. However, I'll not show how to do it because since .NET 4.5.1 there is a better way - the TransactionScopeAsyncFlowOption enum. Let's try.
using (var t = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
   ...
}
Unfortunately, there is a big chance that this time we will get a TransactionException with the message The operation is not valid for the state of the transaction. in the line with ExecuteNonQuery. The simplified stack trace is:

at System.Transactions.TransactionStatePSPEOperation.get_Status(InternalTransaction tx)
at System.Transactions.TransactionInformation.get_Status()
...
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Sandbox.Program.UpdateDatabase(Object o)

I read a lot about this but nobody was able to explain why it happens. I also looked into the source code of the TransactionStatePSPEOperation class. It was instructive because I learned what PSPE - Promotable Single Phase Enlistment - is. However, it also didn't give me an exact answer.

So, I played a little bit with the code and I noticed that the problem occurs when:
  • One thread tries to run ExecuteNonQuery.
  • Another thread waits for the opening of the connection.
However, when both connections were already open, the exception wasn't thrown.

At this point it is worth recalling one thing - when there are 2 or more connections opened in a transaction scope at the same time, the transaction is promoted to a distributed one. I'm not 100% sure, but I think that the problem occurs because it is not allowed to use a connection which participates in a transaction that is in the middle of promotion to a distributed one. So, the solution is to ensure that the transaction will be distributed from the beginning. Here is the fixed code with a magic line (I found it here):

using (var t = new TransactionScope())
{
   //The magic line that makes a transaction distributed
   TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);

   var t1 = Task.Factory.StartNew(UpdateDatabase);
   var t2 = Task.Factory.StartNew(UpdateDatabase);
   Task.WaitAll(t1, t2);
   t.Complete();
}
Nonetheless, the more I think about this, the more convinced I am that using TransactionScope with multi-threading is asking for problems.

13/10/2015

How not to use TransactionScope. Another WTF!


This time I will write again about TransactionScope. It is a very useful class and seems to be extremely easy to use. In the majority of cases that is true. However, there are also some pitfalls lurking for developers. Especially for those who don't like to waste time reading MSDN documentation if it's not really needed, i.e. probably the vast majority of us ;)

Some time ago, I was analysing more or less the following code:
using(var t = new TransactionScope())
{
   var c = ConnectionProvider.ProvideConnection();
   //Use a connection to update a database
   //...
   t.Complete();
}
ConnectionProvider is a class that hides the details of managing connections to a database. There was also a bug in the code responsible for updating the database which caused exceptions. I fixed it and ran the tests again. This time an exception was not thrown but something was wrong because the database contained unexpected data. It looked like the transaction was not rolled back!

Firstly, I thought that it was some kind of magic. However, as usual in this kind of case, it wasn't. I dug into ConnectionProvider and I found out that this class was performing some kind of pooling and a connection wasn't opened every time. It was a big problem because connections opened outside a transaction scope do not participate in a transaction. The solution to this problem is to explicitly enlist a connection in the existing transaction scope with the EnlistTransaction method.
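The fix boils down to something like this (a sketch; the null check matters because the provider may also be used outside of any transaction scope):

var c = ConnectionProvider.ProvideConnection();

// A pooled connection that was opened earlier, outside the scope, has to be enlisted by hand.
if (Transaction.Current != null)
{
   c.EnlistTransaction(Transaction.Current);
}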

It is also worth highlighting that the described problem won't occur if ConnectionProvider doesn't try to implement pooling on its own. In general we don't have to do it because .NET does it for us. The problem will also not occur if a using statement is used to close a connection returned by the provider.

09/09/2015

TransactionScope + Ninject + a small mistake = WTF


Sometimes one stupid mistake can cost a lot of time. A few days ago my application (AngularJS + ASP.NET Web API) started crashing because of the following error:

MSDTC on server 'XXX' is unavailable

It was strange. I wasn't aware of any distributed transactions in my application. To be honest, I was using TransactionScope but I was sure that there was no reason to promote a lightweight transaction into a distributed one. To make things stranger, the error wasn't reported every time. When I tried to update data for the first time everything was ok. However, the second attempt (and every next one) failed.

It took me some time to examine all the recent changes but finally I found the problem. It was quite tricky so I decided to write about it. Let's start with the fact that I use Ninject as a dependency injection container. Among other things, Ninject allows us to control the lifetime of objects (instances). Particularly, in the case of web applications, we can use:
  • InRequestScope method - it tells Ninject that one object of a particular type should be created for each individual request.
  • InSingletonScope method - it tells Ninject that one object of a particular type should be created for all requests.
For example:
kernel.Bind(x => x
   .FromAssembliesMatching("test.dll")
   .SelectAllClasses().InheritedFrom(typeof(IInterface))
   .BindAllInterfaces()
   .Configure(z => z.InSingletonScope()));
The problem was that I accidentally mixed InSingletonScope and InRequestScope. For example, let's assume that each request requires objects of two classes A and B. Objects of type A are within the request scope and objects of type B are within the singleton scope.

Both objects perform updates/inserts/deletes and are used inside TransactionScope. For the first request it is not a problem. Both objects are initialized within the same request and use the same database connection. It means that a lightweight transaction is used.

However, for the second (and every next) request, an object of type B is re-used whereas a new object of type A is created. The object of type B was initialized in the previous request and it uses a different connection than the one used by the object of type A. It means that a distributed transaction will be used in this case.

To sum up:
  • DI containers give great power but with great power comes great responsibility.
  • Be careful when using objects of different scopes together. Especially when these objects require data access.
  • Be careful when using multiple connections inside TransactionScope. In the case of MSSQL 2005, in this situation a distributed transaction will always be used. In the case of MSSQL 2008 or newer it is possible to use more than one connection inside TransactionScope without automatic promotion, but only if these connections are not opened at the same time.
  • TransactionScope automatically escalating to MSDTC on some machines? is a great source of knowledge about TransactionScope and about the process of promoting lightweight transactions into distributed ones.

21/08/2015

Do you know OUTPUT clause?


Today, I'll write about using the OUTPUT clause together with INSERT statements. It seems that it is not a very well known syntax. However, it is especially useful when we use Identity columns to generate keys. Let's start with a simple table:
CREATE TABLE dbo.Main
( 
 Id int Identity (1,1) PRIMARY KEY,
 Code varchar(10),
 UpperCode AS Upper(Code)
);
The old fashioned approach to retrieving the value of an Identity column for a new row is to use SCOPE_IDENTITY(). For example:
INSERT INTO dbo.Main (Code) VALUES ('aaa');
SELECT SCOPE_IDENTITY();
With the OUTPUT clause it will look the following way:
DECLARE @InsertedIdentity TABLE(Id int);
INSERT INTO dbo.Main (Code) OUTPUT INSERTED.Id INTO @InsertedIdentity VALUES ('aaa')
SELECT TOP(1) * FROM @InsertedIdentity
You may say: wait a minute. If I want to use OUTPUT I have to declare a table variable first and then use SELECT. It is more complex than just using SCOPE_IDENTITY().

Well, the first benefit is that with the OUTPUT clause we can read values from many columns, including those that are computed (as was shown above). However, the real power of the OUTPUT clause can be observed if we want to insert many rows into a table:
DECLARE @ToBeInserted TABLE(Code varchar(10), Name varchar(100));

INSERT INTO @ToBeInserted
VALUES ('aaa','1111111111'), ('ddd','2222222222'), ('ccc','3333333333');

DECLARE @Inserted TABLE(Id int, Code varchar(10), UpperCode varchar(10));

INSERT INTO dbo.Main (Code)
OUTPUT INSERTED.Id, INSERTED.Code, INSERTED.UpperCode INTO @Inserted
SELECT Code
FROM @ToBeInserted;

SELECT * FROM @Inserted;
Without OUTPUT we would have to write a nasty loop!

Here is one more example. Let's assume that we have an additional table that references dbo.Main.
CREATE TABLE dbo.Child
( 
 MainId int,
 Name varchar(100),
 CONSTRAINT [FK_Child_Main] FOREIGN KEY(MainId)REFERENCES dbo.Main (Id)
);
We want to insert a few rows into dbo.Main and then related rows into dbo.Child. It is quite easy if we use the OUTPUT clause.
INSERT INTO dbo.Child (MainId, Name)
SELECT i.Id, tbi.Name
FROM @ToBeInserted tbi
 JOIN @Inserted i ON i.Code = tbi.Code;
Extremely useful thing that you must know!

At the end it is worth mentioning that the OUTPUT clause can also be used together with UPDATE, DELETE or MERGE statements.
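For example, capturing the rows removed by a DELETE works in exactly the same way (a sketch based on the dbo.Child table above):

DELETE FROM dbo.Child
OUTPUT DELETED.MainId, DELETED.Name
WHERE Name = '1111111111';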

02/08/2015

Oracle VM VirtualBox and Windows 8.1


In my day to day work I use a 64 bit version of Windows 8.1 Pro N. I needed virtualization software so I decided to use the free Oracle VM VirtualBox. Everything was ok up to the moment when I wanted to install a x64 version of an operating system on a fresh virtual machine. To my surprise, VirtualBox reported the following error:

VT-x/AMD-V hardware acceleration has been enabled, but is not operational. Your 64-bit guest will fail to detect a 64-bit CPU and will not be able to boot.

Please ensure that you have enabled VT-x/AMD-V properly in the BIOS of your host computer.


After some time I noticed that VirtualBox had stopped showing 64 bit versions in the Version list. Well, it was actually good because I couldn't use 64 bit virtual machines anyway ;) But I still didn't know why it happened.

I checked the BIOS settings and they seemed ok. I searched the Internet for an answer but everyone recommended verifying the BIOS configuration, which I'd already done. I needed a new VM quickly, so at that point I installed a x86 version of Windows.

A few days later my colleague Przemek suggested that the problem might be a conflict between Hyper-V and VirtualBox and that I should disable Hyper-V. It was strange because I had never installed Hyper-V. However, I checked and I discovered that the Hyper-V features were enabled on my computer. It seems to me that they are installed by default with the operating system.



The solution was easy. I pressed Win+S and typed Turn Windows features on or off. Then, I cleared the box next to Hyper-V and restarted the computer. After that I was able to install a x64 version of an operating system on a virtual machine.

To sum up, if:
  • Your host system is a x64 version of Windows 8.1.
  • Virtualization is enabled in BIOS.
  • You use VirtualBox.
And you cannot install a x64 operating system on a virtual machine, then try disabling Hyper-V.

27/07/2015

A hint how to use TaskCompletionSource<T>


Some time ago I wrote about using the TaskCompletionSource<T> class in order to take advantage of the async/await keywords. In that post I included the following code:
public Task<Stream> ProcessFileAsync(string key, string secret, string path)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();
   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
Now, let's assume that we want to provide a possibility to cancel the task returned from the ProcessFileAsync method. We can do something like this:
public Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Register(tcs.SetCanceled);

   client.GetFileAsync(path, response => tcs.SetResult(new MemoryStream(response.RawBytes)), tcs.SetException);
   return tcs.Task;
}
I used the CancellationToken.Register method in order to register a callback that will be executed when the token is canceled. This callback is responsible for notifying TaskCompletionSource<T> that the underlying task should be cancelled.

You may say that it is not enough because this code doesn't inform DropNetClient that the action should be cancelled. You are right. However, to my knowledge the DropNet API doesn't provide such a possibility.

It leads to a situation when the task is cancelled but DropNetClient continues processing and finally the TaskCompletionSource.SetResult method will be executed. This will cause an ObjectDisposedException because this method cannot be executed for a disposed task. What can we do in this case?

The first solution is to check if the task is cancelled before calling the SetResult method. However, it can still happen that the task will be cancelled after this check but before calling the SetResult method.

My proposition is to use the methods from the TaskCompletionSource.Try* family. They don't throw exceptions for disposed tasks.
public Task<Stream> ProcessFileAsync(string key, string secret, string path, CancellationToken ct)
{
   var client = new DropNetClient(key, secret);
   //...
   var tcs = new TaskCompletionSource<Stream>();

   ct.Register(() => tcs.TrySetCanceled());

   client.GetFileAsync(path, response => tcs.TrySetResult(new MemoryStream(response.RawBytes)), tcs.TrySetException);
   return tcs.Task;
}
I'm aware that it is not a perfect solution because it actually does not cancel the processing. However, without modifying the DropNet code it is not possible. In the case of my application it is an acceptable solution, but it is not a rule.