Cunning Planning

How Compilers Work - The Problem of Parsing

Parsing is the second stage of compilation; it comes right after lexical analysis. While lexical analysis focuses on breaking down the input text into a set of tokens, parsing focuses on ensuring that the relationships between those tokens are valid. Lexical analysis is about valid words; parsing is about grammatically correct sentences. The problem with parsing is that at this point in most discussions, the text moves headfirst into theory - and there is quite a bit of it.
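To make the words-versus-sentences distinction concrete, here is a minimal sketch in C++. The toy grammar (expr := NUMBER ('+' NUMBER)*) and the names Token, lex and parse are assumptions made up for this illustration; they are not taken from the post.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for a toy grammar: expr := NUMBER ('+' NUMBER)*
enum class TokenKind { Number, Plus };

struct Token {
    TokenKind kind;
    std::string text;
};

// Lexical analysis: split the input into tokens ("valid words").
// Fails only when a character cannot start any token.
bool lex(const std::string& input, std::vector<Token>& tokens) {
    tokens.clear();
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            const std::size_t start = i;
            while (i < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[i]))) {
                ++i;
            }
            tokens.push_back({TokenKind::Number, input.substr(start, i - start)});
        } else if (c == '+') {
            tokens.push_back({TokenKind::Plus, "+"});
            ++i;
        } else {
            return false;  // not a valid "word"
        }
    }
    return true;
}

// Parsing: check that the token sequence forms a valid "sentence".
bool parse(const std::vector<Token>& tokens) {
    std::size_t pos = 0;
    // A sentence must start with a number.
    if (pos >= tokens.size() || tokens[pos].kind != TokenKind::Number) return false;
    ++pos;
    // Every following '+' must be followed by another number.
    while (pos < tokens.size()) {
        if (tokens[pos].kind != TokenKind::Plus) return false;
        ++pos;
        if (pos >= tokens.size() || tokens[pos].kind != TokenKind::Number) return false;
        ++pos;
    }
    return true;
}
```

With this sketch, "1 + + 2" lexes cleanly, since every word is valid, but fails to parse, while "1 $ 2" is rejected by the lexer before the parser ever sees it.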

C# Value Types are Objects

Value types in C# are Objects, but they don’t behave exactly like Objects do. Part of the reason for this is that all value types are implicitly derived from System.ValueType. public abstract class ValueType { protected ValueType(); public override bool Equals(object obj); public override int GetHashCode(); public override string ToString(); } But wait! Structs can’t derive from classes! At first glance, this seems true. You can’t, for example, do this:

A Short History of Sorting in C#

Recently, while reading Jon Skeet’s excellent book “C# in Depth”, I came across the long and varied history of sorting in the C# language. It provides a tantalising view of how the language has evolved over the years. Apart from being a great example of C# in particular, it’s also a good example of how languages have improved over the years to become more expressive, while maintaining backward compatibility with earlier versions.

C# Abstract Properties, Virtual Properties and Access Modifiers

Properties in C# are first class citizens; this means that they can be declared as abstract, virtual, private and protected. This is not the general use case though; most of the time properties are used to encapsulate fields. class Customer { public int Number { get; set; } public string Name { get; set; } } Customer customer = new Customer(); customer.Number = 9; customer.Name = "Mrs Lovett's Pie Shop"; In the example above, auto implemented properties are used to aid encapsulation.

How Compilers Work - Lexical Analysis

Compilers are wonderful, if somewhat underappreciated, programs. Like the foundations of great buildings, they form the latticework on which modern computing is built. Unfortunately, they have been a bit neglected of late. Writing code tends to be more about innovating and creating exciting new businesses than the actual pleasure of knowing how things work, how to break them and how to fix them again. The modern day queen of compilers, Grace Hopper[1], started life by taking clocks apart and trying to put them back together.

Emacs - Debugging Code under Linux with GDB

Visual Studio provides a very nice developer experience when debugging code. Emacs, on the other hand, doesn’t really shine in this area. It relies heavily on the strong debugging tools that ship with Linux, particularly GDB. While GDB is powerful and extensive, its command line interface is dated, and can come as a shock to a developer used to the Visual Studio way of doing things. Let’s take the play program below, which introduces an obvious bug involving an array overflow which causes stack corruption.

Emulating Classical Inheritance in Javascript

Javascript isn’t a classical language, at least not yet. It’s prototypal, which is a bit like every object being a class in waiting. You can create graphs of objects where one object inherits from another, but the idea of classes doesn’t really exist. This can be a drawback, especially with code encapsulation. It becomes difficult to hide private implementation details from public interfaces to the object because Javascript provides no native access modifiers.

C++ Constructors and Move Semantics

In order to do almost anything with a class in C++, you need to define a few constructors. If you don’t, depending on usage, the compiler will generate those constructors for you. For example, consider the following code: class Person { public: string name; int age; }; The above class does not explicitly define a constructor of any kind and if you don’t use the class at all, the compiler will just ignore it.
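As a rough illustration of what "the compiler will generate those constructors for you" means, the sketch below reuses the Person class from the excerpt and checks at compile time that default, copy and move construction are all available even though none were written. The makePerson helper is a hypothetical name added only for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

using std::string;

// The class from the excerpt: no constructor of any kind is declared.
class Person {
public:
    string name;
    int age;
};

// Because nothing was declared, the compiler implicitly provides these.
static_assert(std::is_default_constructible<Person>::value,
              "implicit default constructor");
static_assert(std::is_copy_constructible<Person>::value,
              "implicit copy constructor");
static_assert(std::is_move_constructible<Person>::value,
              "implicit move constructor");

// Hypothetical helper, not from the post: builds a Person using only
// the implicitly generated default constructor.
Person makePerson(string name, int age) {
    Person p;                    // implicit default constructor; age is
                                 // uninitialised until assigned below
    p.name = std::move(name);
    p.age = age;
    return p;
}
```

Copying a Person (Person b = a;) performs a member-wise copy, while Person c = std::move(a); moves the string member and leaves a.name in a valid but unspecified state.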

CMake Platform Specific Actions

Sometimes you need to link one library on Windows and another on Linux (or vice versa). CMake has a number of variables that allow you to do this easily:
UNIX - TRUE on all UNIX-like OS’s including Apple OS X and CygWin
WIN32 - TRUE on Windows including CygWin
APPLE - TRUE on Apple systems. **
MINGW - TRUE when using the MinGW compiler in Windows
MSYS - TRUE when using the MSYS developer environment in Windows
CYGWIN - TRUE on Windows when using the CygWin version of cmake
** Note this does not imply the system is Mac OS X, only that APPLE is #defined in C/C++ header files.

Pessimistic vs Optimistic Concurrency

Concurrency is when multiple jobs see incremental progress within the same time period. It may be that the concurrency is parallel, where both jobs progress together at the same time, or time sliced, wherein parts of both jobs are done one after the other by a single processor. Whichever the case, concurrency creates problems for shared resources. When there is only one job executing at a given point in time, the resource can be viewed as consistent on any access.
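The two approaches named in the title can be sketched on the simplest possible shared resource, a counter. This is only an illustration under assumptions: the class names LockedCounter and OptimisticCounter and the hammer driver are invented here, and a compare-and-swap retry loop stands in for optimistic concurrency in general.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Pessimistic: assume a conflict will happen, so lock the resource
// before every access.
class LockedCounter {
    std::mutex mMutex;
    long mValue = 0;
public:
    void increment() {
        std::lock_guard<std::mutex> guard(mMutex);
        ++mValue;
    }
    long value() {
        std::lock_guard<std::mutex> guard(mMutex);
        return mValue;
    }
};

// Optimistic: assume no conflict, do the work on a snapshot, and commit
// only if the resource is unchanged; otherwise retry.
class OptimisticCounter {
    std::atomic<long> mValue{0};
public:
    void increment() {
        long seen = mValue.load();
        while (!mValue.compare_exchange_weak(seen, seen + 1)) {
            // Another job committed first (or the weak exchange failed
            // spuriously); 'seen' now holds the current value, so retry.
        }
    }
    long value() const { return mValue.load(); }
};

// Hypothetical driver: runs 'threads' jobs that each increment the
// shared counter 'perThread' times, then returns the final value.
template <typename Counter>
long hammer(Counter& counter, int threads, int perThread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i) counter.increment();
        });
    }
    for (auto& w : workers) w.join();
    return counter.value();
}
```

Both counters end up with the same total. The difference is where the cost lands: the pessimistic version pays for the lock on every access, while the optimistic version only pays when a conflict actually forces a retry.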

CMake find_package

In the last scroll on CMake we discussed adding libraries that you write to your own code. This time, we look at using third party libraries out of the box. This turns out to be fairly easy with CMake - though there are exceptions (alternatively spelled “bugs”). To add a third party package such as the precious Boost libraries for C++, add a call to FIND_PACKAGE in your CMakeLists.txt file like so:

Redis vs Azure

Redis and Azure table storage have a lot in common. Neither enforces a schema on the objects it stores. Both have limited complex query capabilities and both are touted as being “post SQL” storage. However, there is one major difference between the two that tends to dampen comparisons, and that is how they deal with concurrency. Redis deals with concurrency by queuing requests and handling them one at a time.

High Performance Code - Make No Assumptions

Concurrent bear – parallel fish
Often when writing high performance code one can fall into the trap of making assumptions about the nature of the problem. Donald Knuth said “Premature optimisation is the root of all evil”. If that is the case then prejudiced optimisation is the seed. I learned the bitter lesson myself. Recently, while building an architecture to support crunching what I thought was a moderately large amount of data, I made the mistake of designing it with multiple threads of execution.

Emacs - Frames, Windows and Buffers

First, some terminology. What Emacs calls “frames” is what everyone else now calls windows. This is because Emacs started out as a full screen terminal tool and GUIs came along later. Because of that, what Emacs calls windows is what most Visual Studio programmers would know as split views. You can experiment with creating new frames, which is by default bound to [C-x 5 2], and deleting the created frame by selecting it and pressing [C-x 5 0].

CMake add_library

In the last post we built an executable from a source file and a header only library. However, sometimes we need to build shared library binaries that can be linked into an executable. This is fairly simple with CMake. Instead of using add_executable like the last time, we now simply need to use add_library like so:
    # Sources
    file(GLOB Library_SOURCES *.cpp)
    file(GLOB Library_HEADERS *.hpp)
    # Library (SHARED asks for a shared library; without it CMake builds a static one unless BUILD_SHARED_LIBS is ON)
    add_library(LibProject SHARED ${Library_SOURCES} ${Library_HEADERS})
CMake will now get the build system that you choose to create a shared library that can be linked into an executable.

CMake Include Directories

To do almost anything today with C++ requires the use of a number of different third party libraries. Some of these come in the form of header only libraries containing only include files (.h and .hpp). The references to these libraries are easy to resolve with CMake. Let’s contrive an example. Consider the following folder structure.
[CMakeHome]
[CMakeHome]\src - Contains all the source code for the project
[CMakeHome]\headerlib\ - Contains a small header only library with a number of different applications

CMake The Beautiful Beast

CMake is a mix of things. It’s a cross platform makefile generator that makes porting your C/C++ code from one OS to another, or from one compiler to another inside the same OS, really simple. However, it complicates things a bit as well, since you now need to write an extra “CMakeLists.txt” file for each of your code directories in order to allow CMake to generate those files for you.

Machines and Morality

The murder of Jews and other “undesirables” was at first carried out by special SS outfits called Einsatzgruppen. The method was to get the intended victims to dig ditches and then line them up and shoot them. However, this took a toll on the executioners as they often had to shoot unarmed men, women and children at point blank range. When SS Reichsführer Himmler himself saw such a shooting, it’s said that he nearly fainted and mandated that a more humane process should be created for the final destruction of the Jews.

World War and Economic Unions

It’s interesting that when the European project kicked off it was against the backdrop of the end of the Second World War. Europe had seen two world wars within a generation with much resultant suffering and destruction. There must have been an urgency therefore to stop a third from happening on the continent by creating a pan national entity that binds many nations together. Perhaps the thought was that if one nation desired to start a campaign of aggression against another, the cost of that aggression would hit home very quickly because of mutual economic dependence, and the desire for it would quickly fizzle out in the home nation.

CMake Trickery - Platform Specific Linking

Quite often when you’re working on cross platform code you need to build applications that use one set of libraries when compiled under Microsoft Windows and another set when compiled under Unix/Linux. Fortunately CMake is a cunning tool to help you on your way. It has a bunch of predefined constants that you can use to switch libraries based on the platform like so: # SomeCoolLibrary IF (WIN32) message(STATUS "Building on Microsoft Windows.

Unicode on Linux

At first the place of Unicode in Linux looks simple. It is deceptively so. Linux C/C++ has support for both char and wchar_t. Let’s take a look at the char version: #include <cstring> #include <iostream> int main(int argc, char* argv[]) { const char text[] = "Άλφα"; std::cout << text << std::endl; std::cout << "sizeof(char): " << sizeof(char) << std::endl; std::cout << "sizeof(text): " << sizeof(text) << std::endl; } This results in the output:

The Configuration of Emacs

Emacs can only be configured naked by moonlight. I kid, I kid – the moonlight is optional. Emacs has a number of “modes”. Much has been said about them on the interwebs so I’ll take the easy way out and link to one such article. There are, as you would have seen, two kinds of modes: major modes and minor modes. Major modes change what Emacs believes is the content, for example there’s a major mode for C and a major mode for Python.

Emacs Key Bindings

Emacs is like your average dragon, very cranky before breakfast and set in its ways. Take for instance the key bindings. For you to do anything – and I mean anything at all – you need to type in an arcane set of shortcut keys, like for instance the basic Control Meta X, Control Sacrifice Goat, Enter Enter, Control F command which launches the built in dishwasher. You might be tempted to change the key bindings to better suit the feel of your Windows environment but that is not a very good idea for the vast majority of cases.

The Bumpy Road to Emacs

The Road goes ever on and on
Down from the door where it began.
Now far ahead the Road has gone,
And I must follow, if I can,
-Bilbo Baggins, The Fellowship of the Ring
Sometimes the completely respectable Visual Studio programmer is called upon to follow the arduous road to places unknown where danger and dragons await him at every turn. One such dragon is called Emacs the Keybinder and subduing this beast requires a great deal of effort from those who have lived in the relative comfort of Visual Studio land.

Big Data and Intelligent Machines

Big data has been getting a lot of attention lately as a field in computing that could significantly increase the value of the vast amounts of data that are captured by digital networks and devices. Phrases like “Data is the new oil” have been flying around glibly for a while now, but it’s difficult to extract the truth from the vast amount of data on big data that’s out there. It sounds like a job for Big Data… but I get ahead of myself.

The pImpl Idiom – Hiding Private Implementation Details

One key principle behind the design of classes is that of the encapsulation of the implementation from the clients using the code. Here a client is any bit of source code that uses the classes that you write. C++ provides an imperfect implementation of this idea in the way it defines rules for classes. Consider the following class which is declared in event.hpp. #include "system_clock.hpp" class Event { private: SystemClock mClock; std::string mMessage; public: Event(std::string message); ~Event(); }; The Event class models an event that can happen at some point in time.
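For a feel of where the idiom in the title ends up, here is a minimal single-file sketch of a pImpl version of Event. It is an assumption-laden illustration rather than the post's own code: the Impl struct, the message() accessor and the stand-in SystemClock are all invented here, and the comments mark where the header and source files would normally split.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- event.hpp: everything a client gets to see ----
class Event {
public:
    Event(std::string message);
    ~Event();                              // defined where Impl is complete
    Event(Event&&) noexcept;
    Event& operator=(Event&&) noexcept;
    const std::string& message() const;    // hypothetical accessor
private:
    struct Impl;                           // forward declaration only
    std::unique_ptr<Impl> mImpl;           // the single private member
};

// ---- event.cpp: private details, invisible to clients ----
// Stand-in for the SystemClock declared in system_clock.hpp (assumption).
struct SystemClock {
    long ticks = 0;
};

struct Event::Impl {
    SystemClock mClock;
    std::string mMessage;
};

Event::Event(std::string message) : mImpl(new Impl()) {
    mImpl->mMessage = std::move(message);
}

// These must live here, after Impl is a complete type, so that
// std::unique_ptr<Impl> knows how to destroy and move the object.
Event::~Event() = default;
Event::Event(Event&&) noexcept = default;
Event& Event::operator=(Event&&) noexcept = default;

const std::string& Event::message() const { return mImpl->mMessage; }
```

With this layout the header no longer needs to include system_clock.hpp, so clients of Event neither see the private members nor recompile when they change.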

Sociable Programming

Programmers are not known to be social creatures. It’s an unfortunate reputation really, considering that programmers spend most of their lives writing their thoughts to each other - they should be right up there in the social landscape with Paris Hilton and Lady Gaga. The problem of course is that programmers never recognise the social aspect of their work. They think they write code for computers. Nothing could be further from the truth.

Step on that Code

In Sociable Code I mentioned that an often overlooked consideration when writing code is the strain that it puts on the programmers who end up reading it. Difficult to read code tends to mash the wetware and ends up causing more bugs and longer resolution times. One way to reduce that effort is to write code that makes control flow explicit when being stepped through with a debugger. Take a look at this bit of code.

Unicode and Windows

Unicode string support in Windows is a strange beast. When Microsoft hired Dave Cutler and others to write a new OS for them they were thinking about writing a new version of OS/2. In fact the new Microsoft OS was a combined effort by Microsoft and IBM that would have resulted in OS/2 version 3.0 if not for the success of Microsoft’s own Windows 3 which made them reconsider that approach.


For many years programmers and strings existed in a state of balance, or as much balance as was possible under the constant threat of undefined behaviour from sociopathic string functions. It was not to last, however, for their lives of relative stability were forever changed with the advent of Other Languages into string buffers. In this part of our continual study of strings we delve into the intriguing world of internationalisation, multi byte character sets and Unicode.

The Private Lives of Strings

Ah, Strings. Surprisingly simple for most C# and Java programmers yet so complex for those of us left behind in the world of C and C++, Strings are perhaps one of the strangest beasts in the programming landscape. Yet they began so simple, perhaps deceptively so like a chameleon on a plain coloured leaf waiting to deceive the observer with its next background. Yet I have tracked this beast down through the ages, yea, verily I have identified its habits and its lair and here I present to you in true Attenboroughish fashion the Private Life of Strings.