13/02/2017

C++ for C# developers - automatic garbage collection



Title: Sushi on the fish market in Tokyo, Source: own resources, Authors: Agnieszka and Michał Komorowscy

In my previous post I wrote that C++ doesn't have the automatic garbage collection. However, it's not entirely true. Let's think about it more carefully. To start, look at the following method in C#. It simply creates an instance of MyObject class.
public void Fun() 
{
   var x = new MyObject();
   //...
}
When this object will be collected by GC? When it is needed but not sooner than the execution of Fun method is finished. Can we do the same thing in C++? Of course yes, here is an example:
void fun() {
   MyObject x;
   //...
}
If you are not familiar with C++ you may say that I only declared the variable x in this example and that I didn't create any object. Well, in C++ the following 3 commands are equivalent:

MyObject x;
MyObject x();
MyObject x = MyObject();

Personally, I still cannot used to that ;) But let's return to C++ and ask the similar question as for C#. When a destructor will be called for created instance of MyObject? The answer is easy - when the execution of fun method is over. Or in other words when an object goes out of scope. It's worth nothing that this behaviour is called the automatic storage duration. Usually it is implemented by using stack however it's not the rule. Now, let's consider this code in C++:
void fun() {
   MyObject* x = new MyObject();
   //...
}
It looks almost the same like in C#. However, this time we're creating an object dynamically with new keyword. And this kind of objects won't be removed automatically from the memory, even if the execution of fun is over. This is also known as the dynamic storage duration. How to release this kind of objects?

In the past a C++ developer had to use delete keyword to do so. I did the same in the code from my post about virtual destructors. However, since C++ 11 we can use something else i.e. the explicit automatic garbage collection. More precisely I'm talking about smart pointers. Here is the rewritten version of Node class:

class Node {
 public:
  static int _count;
  
  Node(int i) : _value(i) { Node::_count++; }
  ~Node() {
    std::cout << "~Node " << _value;
    _count--;
  }
  
  int _value;

  std::unique_ptr<Node> _left = nullptr;
  std::unique_ptr<Node> _right = nullptr;
};

int Node::_count = 0;
In this code I defined ~Node destructor only for its side effects i.e. to decrease a counter. I didn't use delete in this code at all. Instead I wrapped pointers with std::unique_ptr. It works in this way that it releases the pointer when it goes out of scope. Nothing more nothing less. But thanks to that we don't have to remember about delete. Almost like in C#. Here is the testing code:
int main()
{
    Node* root = new Node(1);
    root->_left = std::unique_ptr<Node>(new Node(2));
    root->_right = std::unique_ptr<Node>(new Node(3));

    std::cout << " Existing nodes: " <<  Node::_count;
    delete root;
    std::cout << " Existing nodes: " <<  Node::_count;
}
I didn't wrap root in a smart pointer because I wanted to explicitly use delete and verify the final number of nodes. Easy, isn't it?

At the end it's worth mentioning that there are also different types of smart pointers. shared_ptr should be used when the same pointer can have many owners. Whereas std::weak_ptr represents the same concept as C# WeakReference class. Last but not least, except the automatic storage duration and the dynamic storage duration we also have the static storage duration and the thread storage duration. The former is used to store static variables which are release at the end of program (pretty the same as in C#) and the latter to store variables that will survive until the end of a thread (in C# we can use TheradLocalStorage for the similar effect). More reading can be found here.

06/02/2017

C++ for C# developers - virtual destructors



Title: Tokio, Source: own resources, Authors: Agnieszka and Michał Komorowscy

In C# it's simple, we use destructors a.k.a. finalizers almost never. The only one case when they are inevitable is the implementation of Disposable pattern. In C++ the situation is different because we don't have the automatic garbage collection. It means that if we create a new object with new keyword we have to destroy it later by using delete keyword. And if the object being deleted contains pointers to other objects created dynamically, they also need to be deleted. It's where destructors come to game. Here is an example with a class Node which models a binary tree. It's simplified and it is why all fields are public, don't do it in the production! Node::_count is a static field that I'm using to count created objects.
#include <stdexcept>
#include <iostream>

class Node {
 public:
  Node(int i) : _value(i) { Node::_count++; }
  
  ~Node() {
    std::cout << " ~Node " << _value <<;
    
    if(_left != nullptr) delete _left;
    if(_right != nullptr) delete _right;
    
    _count--;
  }

  static int _count;
 
  int _value;
  
  Node* _left = nullptr;
  Node* _right = nullptr;
};

int Node::_count = 0;
Here is a testing code. If you run it you should see the result as follows: Existing nodes: 3 ~Node 1 ~Node 2 ~Node 3 Existing nodes: 0. We can see that all nodes have been deleted and that a destructor was executed 3 times.
int main()
{
    Node* root = new Node(1);
    root->_left = new Node(2);
    root->_right = new Node(3);

    std::cout << " Existing nodes: " << Node::_count;
    delete root;
    std::cout << " Existing nodes: " << Node::_count;
}
Now let's derived a new class from Node in the following way:
class DerivedNode : public Node {
 public:
  DerivedNode(int i) : Node(i) {
  }
  ~DerivedNode() {
    std::cout << " ~DerivedNode " << _value;
  }
};
And modify a testing code a little bit in order to use our new class:
int main()
{
    Node* root = new DerivedNode(1);
    root->_left = new DerivedNode(2);
    root->_right = new DerivedNode(3);

    std::cout << " Existing nodes: " << Node::_count;
    delete root;
    std::cout << " Existing nodes: " << Node::_count;
}
The expectation is that ~DerivedNode destructor should be called together with the base class destructor ~Node. However, if you run the above code you'll notice see that it's not true i.e. you'll see the same result as earlier. To explain what's going look at the C# code below and answer the following question: Why I see "I'm A" if I created an instance of class B
public class A
{
   public void Fun() { Console.WriteLine("I'm A"); }
}

public class B: A
{
   public void Fun() { Console.WriteLine("I'm  B"); }
}

A a = new B();
a.Fun();
I hope that it's not a difficult question. The answer is of course because Fun is not a virtual method. In C++ we have the same situation. Now you may say "Wait a minute, but we're talking about destructors not methods". Ya, but destructors are actually a special kind of methods. The fix is simple we just need to use a concept completely unknown in C# i.e. a virtual destructor.
virtual ~Node() {
  ...
}
This time the test code will give the following result Existing nodes: 3 ~DerivedNode 1 ~Node 1 ~DerivedNode 2 ~Node 2 ~DerivedNode 3 ~Node 3 Existing nodes: 0 .

30/01/2017

C++ for C# developers - var and foreach



Title: A-Bomb dome in Hiroshima, Source: own resources, Authors: Agnieszka and Michał Komorowscy

When I returned to programming in C++ after years of using C# a few things were especially painful. Today I'll wrote about 2 at the top of the list. The first one was a need to explicitly declare types of local variables. For example:
std::vector< std::string > v = someMethod();
std::map< std::string, std::map<std::string, std::string> > m = someMethod2();
It looks terrible and is simply cumbersome. However, as you may noticed I used the past tense. It turned out that it's not needed any more. Glory and honor to C++11!!! Now, I can write something like this.
auto v = someMethod();
auto m = someMethod2();
The second problem was the lack of foreach operator. For example let's write a code that iterates through a map from the example above:
typedef std::map<std::string, std::map<std::string, std::string>>::iterator outer_iterator;
typedef std::map<std::string, std::string>::iterator inner_iterator;
    
for(outer_iterator it1 = m.begin(); it1 != m.end(); it1++) {
   for(inner_iterator it2 = it1->second.begin(); it2 != it1->second.end(); it2++) {
      std::cout<< it1->first << " " << it2->first << " " << it2->second << std::endl;
   }
}
Again it looks terrible and is cumbersome. All this begin(), end(), typedef are horrible. We can fix it a little bit if we use auto keyword:
for(auto it1 = m.begin(); it1 != m.end(); it1++) {
 for(auto it2 = it1->second.begin(); it2 != it1->second.end(); it2++) {
  std::cout<< it1->first << " " << it2->first << " " << it2->second << std::endl;
 }
}
But even the better result we will achieve if we use a new for loop syntax:
for(auto it1 : m) {
   for(auto it2 : it1.second) {
      std::cout<< it1.first << " " << it2.first << " " << it2.second << std::endl;
   }
}
The difference is striking! It's so much readable and easier to write and uderstand.

23/01/2017

C++ for C# developers - code like in Google



Title: Elephant Retirement Camp in the vicinity of Chiang Mai, Source: own resources, Authors: Agnieszka and Michał Komorowscy

In the post Nuget in C++ rulez I wrote that I returned to programming in C++. It is like a new world for me but it's better and better. I'm even reminding myself things that I learned many years ago so it's not bad with me ;) Recently, I've discovered a C++ alternative for .NET StyleCop. StyleCop is a tool that analyses C# code in order to check if it is consistent with given rules and good practices. What is obvious there is a similar thing for C++ I'm talking about a tool called CppLint that was created by Google. It's written in Python and is fairly easy in use. However, please note that CodeLint requires the old Python 2.7. I tried and it won't work with Python 3.5.

When I run CppLint on my code it turned out that my habits from C# don't fit to C++ world according to Google. Here is an example of Hello Word written in C++ but in C# style.
#include <iostream>

namespace sample 
{
    class HelloWorld
    {
        public:
            void Fun()
            {
                std::cout << "Hello World Everyone!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" << std::endl;
            }
    };
}

int main()
{
  sample::HelloWorld hw = sample::HelloWorld();
  hw.Fun();
  
  return 0;
}
If we verify this code, we will get the following errors:
a.cpp:0:  No copyright message found.  You should have a line: "Copyright [year] <Copyright Owner>"  [legal/copyright] [5]
a.cpp:3:  Line ends in whitespace.  Consider deleting these extra spaces.  [whitespace/end_of_line] [4]
a.cpp:4:  { should almost always be at the end of the previous line  [whitespace/braces] [4]
a.cpp:5:  Do not indent within a namespace  [runtime/indentation_namespace] [4]
a.cpp:6:  { should almost always be at the end of the previous line  [whitespace/braces] [4]
a.cpp:7:  public: should be indented +1 space inside class HelloWorld  [whitespace/indent] [3]
a.cpp:9:  { should almost always be at the end of the previous line  [whitespace/braces] [4]
a.cpp:10:  Lines should be <= 80 characters long  [whitespace/line_length] [2]
a.cpp:13:  Namespace should be terminated with "// namespace sample"  [readability/namespace] [5]
a.cpp:16:  { should almost always be at the end of the previous line  [whitespace/braces] [4]
a.cpp:19:  Line ends in whitespace.  Consider deleting these extra spaces.  [whitespace/end_of_line] [4]
a.cpp:21:  Could not find a newline character at the end of the file.  [whitespace/ending_newline] [5]
At the beginning of each line we have the line number where an error was detected. The number in square brackets at the end of each line informs you how confident CppLint is about each error i.e. 1 - it may be a false positive, 5 - extremely confident. In order to fix all these problems I did the following things:
  • Added Copyright 2016 Michał Komorowski.
  • Removed whitespaces at the end of lines.
  • Added a new line at the end of file.
  • Added a comment // namespace sample
  • Move curly braces. This one I don't like the most.
  • Break a too long line. It's also a little bit strange to me. 80 doesn't seem to be a lot. However, shorter lines makes working with multiple windows easier (see also this answer).
Then I run a tool again and I got some new errors:
a.cpp:6:  Do not indent within a namespace  [runtime/indentation_namespace] [4]
a.cpp:7:  public: should be indented +1 space inside class HelloWorld  [whitespace/indent] [3]
I also fixed them and the final version of Hello Worlds compliant with Google rules looks as follows: Here is the correct version:
// Copyright 2016 Michal Komorowski

#include <iostream>

namespace sample {
class HelloWorld {
 public:
  void Fun() {
    std::cout
      << "Hello World Everyone!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
      << std::endl;
  }
};
}  // namespace sample

int main() {
  sample::HelloWorld hw = sample::HelloWorld();
  hw.Fun();
  return 0;
}

It's worth adding that CppLint has many configuration options for example you can disable some rules if you don't agree with them or change the maximum allowed length of a line (default is 80). Options can be also read from the configuration file CPPLINT.cfg.

16/01/2017

When Excel is better than machine learning?



Title: Ruins of a castle in southern Poland, Source: own resources, Authors: Agnieszka and Michał Komorowscy

I can bet that some of you think that I'm crazy because I'm saying such blasphemies! Surely everyone knows that Excel is not for real developers ;) If you think so, I'll tell you a short story.