Once again I missed a Sunday post. Oh well. Those of you who actually read my posts should know that, even though I post every Sunday, I only write an interesting post every three weeks or so.

Today, I have news from the VISlib: the VISlib is now finally released as version 1.0!

With that, the VISlib has reached its final development state. We will continue to fix bugs based on that release, but we will not introduce new functionality. Instead, we are implementing The.Vislib.Legacy-Project in the trunk of the VISlib repository. This project is a step-by-step migration towards TheLib.

Last week I wrote about legacy code. That topic is still with us.

There are two fundamental views of legacy code:

  1. Some people believe legacy code is trash and needs to be removed as fast as possible.
  2. Some other people believe legacy code is a gold mine to be harvested as fast as possible.

The truth is not somewhere in the middle, as it often is. Instead, both arguments are right. Legacy code is a trash pile full of gold! The gold must be harvested and the trash must be removed. However, that, of course, takes a lot of effort. One should not be greedy. With each project, one should remove a little bit of the trash and reveal a little bit of the gold. Over time, everything will get better.

For TheLib we have that problem as well, of course! To implement the “small steps” I was writing about, we started the the.vislib.legacy project.

Whenever you work on software that is not trivial or small, you have to work with legacy code. I am talking about old code. Either one wants to re-use it or one does not. Regardless, you always have to marry the new code with the old, and that is never easy. It gets especially hard when you have interfaces to the outside world, i.e. other software using at least parts of the software you are working on. Facing such tasks separates the men from the boys. :-P I am not talking about programming. Programming is easily learned and taught. I am talking about software development, design, and architecture.

Currently, I am working with friends on the creation of TheLib, as successor to and replacement of the infamous VISlib. The VISlib has several design errors which we cannot fix due to strong dependencies with other software projects. Of course, we are not writing TheLib from scratch. It is based on large portions of the VISlib, streamlined and corrected. But here we face a huge foundation of legacy code we need to cope with. It is simply not simple.

Probably my most important coding project, one that affects my private programming as well as my work, is TheLib. The basic idea is to collect all the classes which we (two friends and myself) wrote and used over and over again in several different projects. These classes usually are wrappers for compatibility or convenience around API calls or library calls (e.g. STL, Boost, whatever). That’s where the name of our lib comes from: Totally Helpful Extensions. And, it is just cool to write: #include "the/exception.h".

However, I hear very often: “Why do you write a lib? There are plenty already for all tasks.”

If that were true, none of us would write programs anymore and we would only “compose” programs from libs. Well, we don’t. Or rather, I don’t. Meaning: TheLib really is helpful. It is not a replacement for the other libs. It’s a complement, an extension.

One example: strings!

The string functionality in TheLib is not nearly as powerful as one would require to write a fully fledged text processor. This is not the goal of TheLib. We wrote these functions to provide functionality somewhat beyond the basics. The idea is to enable simple applications or prototypical applications to easily implement nice-to-use interfaces for the user.

Especially under Linux (but also under Windows) there are usually a total of three different types of strings:

  1. char * or std::string which store ASCII or ANSI strings with locale dependent character sets
  2. char * or std::string which store multi-byte strings, e.g. using UTF-8 encoding
  3. wchar_t * or std::wstring which store Unicode strings.

Depending on these types different API functions need to be called, e.g. for determining the length of the string:

  1. strlen
  2. multiple calls of mbrlen
  3. wcslen
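
To make the difference concrete, here is a minimal sketch of the multi-byte case next to the other two. Note that multibyte_length is a hypothetical helper for illustration, not a TheLib function. It assumes the string is encoded in the encoding of the current C locale, and it simply calls mbrlen once per character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical helper (not TheLib API): counts the characters of a
// multi-byte string by calling mbrlen once per character. The result
// depends on the encoding of the current C locale.
// Returns (size_t)-1 if the string is not valid in that encoding.
std::size_t multibyte_length(const char* s) {
    assert(s != nullptr);
    const std::size_t failure = static_cast<std::size_t>(-1);
    const std::size_t incomplete = static_cast<std::size_t>(-2);

    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    std::size_t remaining = std::strlen(s);   // bytes, not characters
    std::size_t count = 0;

    while (remaining > 0) {
        const std::size_t n = std::mbrlen(s, remaining, &state);
        if (n == failure || n == incomplete) {
            return failure;  // invalid or truncated byte sequence
        }
        if (n == 0) {
            break;  // reached a null character
        }
        s += n;          // skip all bytes of this character
        remaining -= n;
        ++count;
    }
    return count;
}
```

For plain ASCII text all three calls agree: strlen("hello"), multibyte_length("hello") and wcslen(L"hello") all yield 5. They only start to differ once a UTF-8 locale is active and the string contains multi-byte characters: strlen then counts bytes, while the mbrlen loop counts characters.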

One issue that arises between cases 1 and 2 is that modern Linux often uses a locale which stores UTF-8 strings within the standard strings. As long as strings are only written, stored, and displayed, this is a great way to maintain compatibility and gain the modern feature of special character availability. However, as soon as you perform a more complex operation (like creating a substring), this approach results in unexpected behaviour, as the bytes of a single multi-byte character are treated like independent characters.

Example:

  • Your user is a geek and enters “あlptraum” as input string.
  • This string is stored in a std::string using UTF-8 encoding.
  • Your application now wants to extract the first character for some reason (e.g. to produce typographic capitalization using a specialized font).
  • The normal way of doing this is accessing char first_char = s[0]; and std::string remaining = s.substr(1);
  • Because the Japanese “あ” (U+3042) occupies three bytes in UTF-8 (0xE3 0x81 0x82), this results in first_char holding only the byte 0xE3, and remaining starting with the two stray bytes 0x81 0x82 followed by “lptraum”. Neither part is valid text any more.
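
A minimal sketch of how the split could be done without tearing the character apart, assuming the string really is UTF-8. The names utf8_sequence_length and split_first_char are hypothetical and only serve as illustration; they are not TheLib functions. The idea is to read the length of the first character from its lead byte instead of assuming one byte per character.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not TheLib API): number of bytes in the UTF-8
// sequence that starts with the given lead byte. Returns 1 for ASCII
// and for any byte that is not a valid multi-byte lead byte.
std::size_t utf8_sequence_length(unsigned char lead) {
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 1;                             // ASCII, or a stray byte
}

// Splits a UTF-8 string into its first character and the remainder.
std::pair<std::string, std::string> split_first_char(const std::string& s) {
    if (s.empty()) {
        return std::make_pair(std::string(), std::string());
    }
    std::size_t n = utf8_sequence_length(static_cast<unsigned char>(s[0]));
    if (n > s.size()) {
        n = s.size();  // truncated input: take what is there
    }
    return std::make_pair(s.substr(0, n), s.substr(n));
}
```

Called with the UTF-8 bytes of “あlptraum”, split_first_char returns the complete three-byte “あ” and “lptraum”, whereas s.substr(1) starts in the middle of the character. The sketch does not validate continuation bytes and knows nothing about combining characters, so it is a starting point rather than a complete solution.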

This applies not only to Japanese characters, but obviously to almost all characters with diacritics. What is even more important: this issue also results in unexpected behaviour when using (or implementing) string operations which ignore case, e.g. comparisons.

Example of changing a string to lower case:

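// we will do this the STL-way:
// http://notfaq.wordpress.com/2007/08/04/cc-convert-string-to-upperlower-case/
#include <algorithm>
#include <ctype.h>   // ::tolower
#include <string>

std::string data;
// 'data' is set to "あlptraum", encoded as UTF-8

std::transform(data.begin(), data.end(), data.begin(), ::tolower);
// well, tolower has now been applied to every single byte on its own.
// What happens to the three bytes of "あ" depends on the locale (where
// char is signed, passing their negative values to tolower is even
// undefined behaviour). In any case, no multi-byte character gets a
// proper case conversion this way.
// ...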
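
For comparison, here is a minimal sketch of a conversion that works on whole characters instead of bytes. It takes the detour over wide characters using only standard C library calls (mbstowcs, towlower, wcstombs). The name to_lower_multibyte is hypothetical and this is not how TheLib necessarily does it. The sketch assumes the string is encoded in the encoding of the current C locale and contains no embedded null characters.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper (not TheLib API): lower-cases a multi-byte string
// character by character, using the encoding of the current C locale.
// Returns the input unchanged if it is not valid in that encoding.
std::string to_lower_multibyte(const std::string& in) {
    const std::size_t failure = static_cast<std::size_t>(-1);

    // A string never decodes to more wide characters than it has bytes.
    std::vector<wchar_t> wide(in.size() + 1);
    const std::size_t n = std::mbstowcs(wide.data(), in.c_str(), wide.size());
    if (n == failure) {
        return in;
    }

    // Now every element is one complete character.
    for (std::size_t i = 0; i < n; ++i) {
        wide[i] = static_cast<wchar_t>(
            std::towlower(static_cast<std::wint_t>(wide[i])));
    }

    // Each wide character needs at most MB_CUR_MAX bytes.
    std::vector<char> out(n * MB_CUR_MAX + 1);
    const std::size_t m = std::wcstombs(out.data(), wide.data(), out.size());
    if (m == failure) {
        return in;
    }
    return std::string(out.data(), m);
}
```

In the default "C" locale this behaves like the byte-wise version for ASCII text. The benefit only shows after the application has selected a UTF-8 locale with setlocale, because then mbstowcs decodes multi-byte characters as a whole before towlower sees them.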

To avoid this problem, TheLib internally initializes the system locale for the application and detects whether the locale uses UTF-8 encoding. If it does, all TheLib string functions will call the multi-byte API functions to work as expected. In addition, TheLib provides some functions to explicitly convert from or to UTF-8 strings (e.g. for file I/O).
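
How such a detection might look on a POSIX system is sketched below. This is an assumption for illustration and not TheLib's actual code: the function names are hypothetical, and nl_langinfo(CODESET) from <langinfo.h> is POSIX only, so Windows would need a different approach.

```cpp
#include <cctype>
#include <clocale>
#include <langinfo.h>  // POSIX only: nl_langinfo, CODESET

// Hypothetical helper (not TheLib API): true if a codeset name denotes
// UTF-8. Compares case-insensitively and ignores '-', so that "UTF-8",
// "utf-8" and "utf8" all match.
bool is_utf8_codeset(const char* name) {
    if (name == nullptr) {
        return false;
    }
    const char* expected = "utf8";
    for (; *name != '\0'; ++name) {
        if (*name == '-') {
            continue;  // skip separators
        }
        if (*expected == '\0') {
            return false;  // name is longer than "utf8"
        }
        if (std::tolower(static_cast<unsigned char>(*name)) != *expected) {
            return false;
        }
        ++expected;
    }
    return *expected == '\0';  // all of "utf8" must have been matched
}

// Adopts the locale from the environment and reports whether its
// character encoding is UTF-8.
bool locale_uses_utf8() {
    std::setlocale(LC_ALL, "");
    return is_utf8_codeset(nl_langinfo(CODESET));
}
```

An application would call locale_uses_utf8() once at start-up and then pick the byte-wise or the multi-byte code path for its string functions accordingly. Keep in mind that setlocale changes global state, so it should happen before any other threads are started.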

Of course, you don’t need TheLib to do this. You can use another lib (probably; I only know the IBM-Unicode-Lib, which seems like a huge hulk), or you can use your own workarounds, or you can ignore such problems as “they will not occur in your application scenarios”. However, having TheLib do the job is just handy. Nothing more.

Drat! I wanted to write a post each Sunday. Oh well. I will try to minimize these slips and I will try to catch up as quickly as possible.

The news: The TheLib project now uses a Subversion repository!

After Mercurial failed to prove its value, our eagerness for experiments had dried up. We thus switched to a system that we knew worked, and that we knew how to work with.

Of course, there are countless projects happily working with Mercurial and, of course, there are countless good reasons why everything would be better using Git. To be honest: I don’t care.

The development of TheLib continues.

One thing I learned during my work as a developer of scientific software is that large software projects at universities are extremely difficult to manage. Mainly because the people involved are “scientists” and not “developers”. The experience, the processes, and the structures to reach common goals (compromises) are mostly missing. At the same time, many problems of today’s sciences cannot be solved without large and complex software. Thus, most software created in academia is practically not usable. This is a fact I cannot accept.

The approach of TheLib is based on the idea of “divide and conquer”. Instead of developing a huge software which can do everything, we concentrate on a clean development of smaller pieces of the software. This way, we can increase the usability, the maintainability and thus the re-usability of these components. TheLib provides fundamental base classes. I see this project as an “extension” to existing projects like the STL or Boost. It is not a competitor and it does not aim to be one. Therefore, we pay close attention to reaching very high interoperability with these projects.

Even though we reduced the scope of TheLib to a moderate minimum, there is still a lot to do. Whatever. This is how it should be. If it were easy, anyone could do it. ;-)

The year 2012 is almost gone. Let’s use this opportunity to reflect on what has happened. Ok, ok. No one likes annual reviews (me neither), but still…

Early this year I defended my dissertation, and finished my Ph.D. With this, I also finished my work at the Visualization Research Center of the University of Stuttgart. And I mean “at”, not “with”. I still continue working with the “guys from Stuttgart”, and we have some pretty exciting ideas under development.

Then I moved to Dresden and started working at the TU Dresden in the project VICCI. It’s a great city and a great group working here at the computer graphics lab. I really enjoy working and we have some fascinating science projects going on here.

What else? TheLib started. Together with two friends and ex-colleagues from Stuttgart, we decided to fix the design issues of the VISlib, by creating a new, clean library. It is a lot of work, but it is worth it.

And, of course, there is my private game project: Springerjagd (Knights Hunt). Although I already started to work on the rules last year, it was this year that the game finally got its name. And its website, although there is not much to see there. But this project is something I will definitely continue.

And with this we reach the New Year’s resolutions (although it’s 1-2 days early): basically, I only want to do what I can to make 2013 as successful as 2012 was. No, I will make it even more successful. However, I believe there is no need for more detailed plans :-)

And, because I do not plan to post again on New Year’s Eve, I wish you all:

“A happy new Year!”

The TheLib project was started on 22 March.

It is the follow-up project of the VISlib, an openly available library of Totally Helpful Extensions for C++ and C# with a focus on scientific visualization. TheLib is created in a cooperation between the Visualization Research Center of the University of Stuttgart and the Computer Graphics and Visualization lab of the University of Dresden. Source code and error tickets are hosted at SourceForge.

I am eager to see how the project is going to develop.