QtCon: Squish for Qt Training in Berlin


On September 1st, as part of the QtCon conference, our partner KDAB hosts a day of training. This training day allows you to gain knowledge in several Qt-related topics, including automated Qt GUI testing with Squish for Qt.

froglogic’s ISTQB-certified Senior Software Trainer Florian Turck will conduct the full-day Squish for Qt training and share his in-depth experience of effectively using Squish. Seats for the training are still available.

Register for QtCon here:


Boost dependencies and bcp

Recently I generated diagrams showing the header dependencies between Boost libraries, or rather, between various Boost git repositories. Diagrams showing dependencies for each individual Boost git repo are here along with dot files for generating the images.

The monster diagram is here:

Edges and Incidental Modules and Packages

The directed edges in the graphs represent that a header file in one repository #includes a header file in the other repository. The idea is that, if a packager wants to package up a Boost repo, they can’t assume anything about how the user will use it. A user of Boost.ICL can choose whether ICL will use Boost.Container or not by manipulating the ICL_USE_BOOST_MOVE_IMPLEMENTATION preprocessor macro. So, the packager has to list Boost.Container as some kind of dependency of Boost.ICL, so that when the package manager downloads the boost-icl package, the boost-container package is automatically downloaded too. The dependency relationship might be a ‘suggests’ or ‘recommends’, but the edge will nonetheless exist somehow.

In practice, packagers do not split Boost into packages like that. At least for Debian, compiled static libraries are split into packages such as libboost-serialization1.58, while all the headers (including every header-only library) go into a single libboost1.58-dev package. Perhaps packagers bundle everything together because there is little value in separating the header-only content of the monolithic Boost if it all gets installed anyway, or perhaps the sheer number of repositories makes splitting impractical. This is in contrast to KDE Frameworks, which does consider such edges and dependency-graph size when determining where functionality belongs. Typically KDE aims to define the core functionality of a library on its own, in a loosely coupled way with few dependencies, and then adds integration and extensions for other types in higher-level libraries (if at all).

Another feature of my diagrams is that repositories which depend circularly on each other are grouped together in what I called ‘incidental modules’. The name is inspired by the ‘incidental data structures’ which Sean Parent describes in detail in one of his ‘Better Code’ talks. From a packager’s point of view, the Boost.MPL repo and the Boost.Utility repo are indivisible because at least one header of each repo includes at least one header of the other. That is, even if packagers wanted to split Boost headers in some way, the ‘incidental modules’ would still have to be grouped together into larger packages.

As far as I am aware such circular dependencies don’t fit with Standard C++ Modules designs or the design of Clang Modules, but that part of C++ would have to become more widespread before Boost would consider their impact. There may be no reason to attempt to break these ‘incidental modules’ apart if all that would do is make some graphs nicer, and it wouldn’t affect how Boost is packaged.

My script for generating the dependency information is simply grepping through the include/ directory of each repository and recording the #included files in other repositories. This means that while we know Boost.Hana can be used stand-alone, if a packager simply packages up the include/boost/hana directory, the result will have dependencies on parts of Boost because Hana includes code for integration with existing Boost code.
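As an illustrative sketch only (my actual script is not shown here, and `boost_deps` is a name invented for this example), the grep-based idea can be expressed as a small shell function:

```shell
# Sketch, not the actual script: list which other top-level Boost
# libraries a set of headers #includes.
boost_deps() {
    grep -rhoE '^[[:space:]]*#[[:space:]]*include[[:space:]]*[<"]boost/[^/>".]+' "$@" \
        | sed -E 's|.*boost/||' | sort -u
}
```

Running something like this over each repository's include/ directory and recording the results gives the edge list the diagrams are built from, with the caveat described above: no ifdef awareness.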

Dependency Analysis and Reduction

One way of defining a Boost library is to consider the group of headers which are gathered together and documented together to be a library (there are other ways which some in Boost prefer – it is surprisingly fuzzy). That is useful for documentation at least, but as evidenced it appears to not be useful from a packaging point of view. So, are these diagrams useful for anything?

While Boost header-only libraries are not generally split in standard packaging systems, the bcp tool is provided to allow users to extract a subset of the entire Boost distribution into a user-specified location. As far as I know, the tool scans header files for #include directives (ignoring ifdefs, like a packager would) and gathers together all of the transitively required files. That means that these diagrams are a good measure of how much stuff the bcp tool will extract.

Note also that these edges do not contribute time to your slow build – reducing edges in the graphs by moving files won’t make anything faster. Rewriting the implementation of certain things might, but that is not what we are talking about here.

I can run the tool to generate a usable Boost.ICL which I can easily distribute. I delete the docs, examples and tests from the ICL directory because they make up a large chunk of the size. Such a ‘subset distribution’ doesn’t need any of those. I also remove 3.5M of preprocessed files from MPL. I then need to define BOOST_MPL_CFG_NO_PREPROCESSED_HEADERS when compiling, which is easy and explained at the end:

$ bcp --boost=$HOME/dev/src/boost icl myicl
$ rm -rf myicl/libs/icl/{doc,test,example}
$ rm -rf myicl/boost/mpl/aux_/preprocessed
$ du -hs myicl/
15M     myicl/

Ok, so it’s pretty big. Looking at the dependency diagram for Boost.ICL you can see an arrow to the ‘incidental spirit’ module. Looking at the Boost.Spirit dependency diagram you can see that it is quite large.

Why does ICL depend on ‘incidental spirit’? Can that dependency be removed?

For those ‘incidental modules’, I selected one of the repositories within the group and named the group after that one repository. To see why ICL depends on ‘incidental spirit’, we have to examine all 5 of the repositories in the group to find which one is responsible for the dependency edge.

boost/libs/icl$ git grep -Pl -e include --and \
  -e "thread|spirit|pool|serial|date_time" include/

Formatting wide terminal output is tricky in a blog post, so I had to make some compromises in the output here. Those ICL headers are including Boost.DateTime headers.

I can further see that gregorian.hpp and ptime.hpp are ‘leaf’ files in this analysis. Other files in ICL do not include them.

boost/libs/icl$ git grep -l gregorian include/
boost/libs/icl$ git grep -l ptime include/

As it happens, my ICL-using code also does not need those files. I’m only using icl::interval_set<double> and icl::interval_map<double>. So, I can simply delete those files.

boost/libs/icl$ git grep -l -e include \
  --and -e date_time include/boost/icl/ | xargs rm

and run the bcp tool again.

$ bcp --boost=$HOME/dev/src/boost icl myicl
$ rm -rf myicl/libs/icl/{doc,test,example}
$ rm -rf myicl/boost/mpl/aux_/preprocessed
$ du -hs myicl/
12M     myicl/

I’ve saved 3M just by understanding the dependencies a bit. Not bad!

The size difference is mostly accounted for by no longer extracting boost::mpl::vector, and secondly by the Boost.DateTime headers themselves.

The dependencies in the graph are now so few that we can consider each one and ask why it is there and whether it can be removed. For example, there is a dependency on the Boost.Container repository. Why is that?

include/boost/icl$ git grep -C2 -e include \
   --and -e boost/container
#   include <boost/container/set.hpp>
#   include <set>

#   include <boost/container/map.hpp>
#   include <boost/container/set.hpp>
#   include <map>

#   include <boost/container/set.hpp>
#   include <set>

So, Boost.Container is only included if the user defines ICL_USE_BOOST_MOVE_IMPLEMENTATION, and otherwise not. If we were talking about C++ code here we might consider this a violation of the Interface Segregation Principle, but we are not, and unfortunately the realities of the preprocessor mean this kind of thing is quite common.

I know that I’m not defining that and I don’t need Boost.Container, so I can hack the code to remove those includes, e.g.:

index 6f3c851..cf22b91 100644
--- a/include/boost/icl/map.hpp
+++ b/include/boost/icl/map.hpp
@@ -12,12 +12,4 @@ Copyright (c) 2007-2011:
-#   include <boost/container/map.hpp>
-#   include <boost/container/set.hpp>
 #   include <map>
 #   include <set>
-#else // Default for implementing containers
-#   include <map>
-#   include <set>

This and following steps don’t affect the filesystem size of the result. However, we can continue to analyze the dependency graph.

I can break apart the ‘incidental fusion’ module by deleting the iterator/zip_iterator.hpp file, removing further dependencies from my custom Boost.ICL distribution. I can also delete the iterator/function_input_iterator.hpp file to remove the dependency on Boost.FunctionTypes. The result is a graph which you can at least imagine being used by an interval tree library like Boost.ICL, quite a change from our starting point with that library.
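Spelled out, the deletions described above look something like this (paths assume the same Boost checkout used for the earlier bcp runs; the two headers live in the Boost.Iterator repository):

```shell
$ cd $HOME/dev/src/boost/libs/iterator/include/boost/iterator
$ rm zip_iterator.hpp function_input_iterator.hpp
```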

You might shudder at the thought of deleting zip_iterator if it is an essential tool to you. Partly I want to explore in this blog post what will be needed from Boost in the future when we have zip views from the Ranges TS or use the existing ranges-v3 directly, for example. In that context, zip_iterator can go.

Another feature of the bcp tool is that it can scan a set of source files and copy only the Boost headers that are included transitively. If I had used that, I wouldn’t need to delete the ptime.hpp or gregorian.hpp etc because bcp wouldn’t find them in the first place. It would still find the Boost.Container etc includes which appear in the ICL repository however.
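For reference, a scan-mode invocation would look roughly like this, where `myapp/*.cpp` stands in for a hypothetical set of my own sources (check the bcp documentation for the exact syntax):

```shell
$ bcp --scan --boost=$HOME/dev/src/boost myapp/*.cpp myicl-scan
```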

In this blog post, I showed an alternative to the bcp --scan approach to minimalism. My approach is to use bcp to export useful and as-complete-as-possible libraries. I don’t have a lot of experience with bcp, but it seems that in scanning mode I would have to re-run the tool any time I used an ICL header which I had not used before. With the modular approach, re-running the tool would be necessary less often (only when directly using a Boost repository I hadn’t used before), so it seemed an approach whose limitations were worth exploring.

Examining Proposed Standard Libraries

We can also examine other Boost repositories, particularly those which are being standardized in newer C++ standards, because we know that any, variant and filesystem can be implemented with only standard C++ features and without Boost.

Looking at Boost.Variant, it seems that use of the Boost.Math library makes that graph much larger. If we want Boost.Variant without all of that Math stuff, one thing we can choose to do is copy the one math function that Variant uses, static_lcm, into the Variant library (or somewhere like Boost.Core or Boost.Integer for example). That does cause a significant reduction in the dependency graph.

Further, I can remove the hash_variant.hpp file to remove the Boost.Functional dependency.

I don’t know if C++ standardized variant has similar hashing functionality or how it is implemented, but it is interesting to me how it affects the graph.

Using a bcp-extracted library with Modern CMake

After extracting a library or set of libraries with bcp, you might want to use the code in a CMake project. Here is the modern way to do that:

add_library(boost_mpl INTERFACE)
target_compile_definitions(boost_mpl INTERFACE
    BOOST_MPL_CFG_NO_PREPROCESSED_HEADERS
)
target_include_directories(boost_mpl INTERFACE
    ${CMAKE_CURRENT_SOURCE_DIR}/myicl
)

add_library(boost_icl INTERFACE)
target_link_libraries(boost_icl INTERFACE boost_mpl)
target_include_directories(boost_icl INTERFACE
    ${CMAKE_CURRENT_SOURCE_DIR}/myicl
)
add_library(boost::icl ALIAS boost_icl)

Boost ships a large chunk of preprocessed headers for various compilers, which I mentioned above. The reasons for that are probably historical and obsolete, but the files will remain, they are used by default when using GCC, and that will not change. To diverge from that default it is necessary to set the BOOST_MPL_CFG_NO_PREPROCESSED_HEADERS preprocessor macro.

By defining an INTERFACE boost_mpl library and setting its INTERFACE target_compile_definitions, any user of that library gets that magic BOOST_MPL_CFG_NO_PREPROCESSED_HEADERS define when compiling its sources.

MPL is just an internal implementation detail of ICL though, so I won’t have any of my CMake targets using MPL directly. Instead I additionally define a boost_icl INTERFACE library which specifies an INTERFACE dependency on boost_mpl with target_link_libraries.

The last ‘modern’ step is to define an ALIAS library. The alias name is boost::icl and it aliases the boost_icl library. To CMake, the following two commands generate an equivalent buildsystem:

target_link_libraries(myexe boost_icl)
target_link_libraries(myexe boost::icl)

Using the ALIAS version has a different effect however: If the boost::icl target does not exist an error will be issued at CMake time. That is not the case with the boost_icl version. It makes sense to use target_link_libraries with targets with :: in the name and ALIAS makes that possible for any library.

QtWebKit: I'm back!

Hello world!


Five years have passed since the last entry in this blog, and almost 3 years since the infamous "Changes in QtWebKit development" thread at webkit.org. Fortunately, we've lately made quite a different kind of change in QtWebKit development, and it is much more exciting.

QtWebKit is back again!

If you were following QtWebKit development after 2013, you know that development never actually stopped: each release got a bunch of bug fixes and even brand-new features. However, the WebKit engine itself had not been updated since the Qt 5.2 release. That's why it didn't support recent changes in web standards that happened after 2013, including the new JavaScript language standard ES2015 (also known as ES6), as well as improvements in the DOM API and CSS.

However, things changed in 2016, and now we have revived QtWebKit! The core engine code was updated to its current state, and as a result we (and you!) can use all the improvements made by the WebKit community during these 3 years without any changes to the code of existing Qt applications!

You may be wondering why anyone would want to use QtWebKit in 2016, when the shiny new QtWebEngine is available. There are a number of reasons:
  • When used in a Qt application, QtWebKit has a smaller footprint because it shares a lot of code with Qt. For example, it uses the same code paths for drawing and networking that your regular Qt code uses. This is especially important for embedded systems, where both storage space and memory are scarce resources. It's possible to go further and cut away features which are not crucial for your application, using WebKit's flexible configuration system.
  • On Linux, QtWebKit uses GStreamer as the default media player backend. This means that application users will be able to use patent-encumbered codecs (if this is legal in their area) without getting you (as application developer or distributor) into legal trouble.
  • Lots of existing open source applications depend on QtWebKit, but without security updates their users are left exposed to vulnerabilities. There are only two ways around this problem: port applications away from QtWebKit (often a hard task, because QtWebKit allows much deeper integration with application code than alternative solutions), or update QtWebKit itself, which makes that large porting effort unnecessary.
  • QtWebKit is more portable than Chromium: it can run on any CPU architecture supported by Qt and on virtually any Unix-like OS (as well as Windows and Mac). The only requirement is a C++11 compiler.
  • Non-interactive user agents like PhantomJS or wkhtmltopdf don't gain any benefit from a multi-process architecture, so using the single-process WebKit 1 API gives them a smaller resource footprint and a simpler flow of execution.

    Q: I've heard that WebKit engine is not relevant anymore, since the crowd is working on Blink these days!

    A: This is not true. Despite Google's departure, WebKit remains one of the leading browser engines and is progressing at a fast pace. If you don't believe it, read on! You may also want to read the release announcements of Safari Technology Preview and WebKitGTK, which highlight other WebKit features under development.

    Now let's see what we can do with QtWebKit in 2016!

    JavaScript engine improvements and ES2015 status


    Most ES2015 features are now supported (for comparison, QtWebKit 5.6 has a 10% rating). Note that WebKit is the first web engine to provide proper tail calls, which means you can enjoy functional programming without unnecessary stack growth in tail recursion!

    WebKit gained a new tier of JavaScript JIT compilation, called FTL. The first implementation was based on the LLVM compiler infrastructure, but now we are shipping the B3 compiler, which is more lightweight, does not pull in additional dependencies, and also compiles faster. FTL usually gets activated for computationally intensive JS code, and is especially useful for running native code compiled to asm.js.

    Code produced by the JavaScript JIT now uses the normal C stack, reducing overall memory usage and fragmentation.

    The JIT compiler now uses background threads, so compilation does not block execution of other code.

    New (and old) CSS properties

    Web standards evolve rapidly, and more and more CSS properties find their way into specifications. Most of them have been available for a long time, but under the -webkit vendor prefix, as they were non-standard extensions at the time of their introduction. Now they (finally!) have a formal description which all vendors are obliged to follow (though the standardization process sometimes changes the behavior of old properties). Standardized properties are available without vendor prefixes, and web page authors are actively adopting the new spellings.

    Unfortunately, the new spellings sometimes break compatibility with old browsers that implement only the prefixed properties, with disastrous consequences. Here are screenshots of a site that uses unprefixed flexbox properties, defined in CSS3:

      QtWebKit 5.6

      QtWebKit TP3

      CSS Selector JIT

      Besides JavaScriptCore, WebKit now features yet another JIT compiler. Its aim is to speed up the application of CSS style sheets to the elements of the page DOM, so-called style resolution. The average performance gain is about 2x, but for complex selectors and/or pages with lots of DOM nodes the gain may be substantially larger.

      Selector JIT also makes querySelector() and querySelectorAll() faster, though the speed-up factor may differ.


      This is a new CSS property that allows a page author to create a "drop cap" effect without much hassle. For this effect to work correctly with calligraphic fonts, Qt 5.8 (not yet released) is required.

      Other improvements

      • Responsive images support (<picture> element, srcset and sizes attributes)
      • ellipse() method in the Canvas API
      • CSS selectors ::read-write and ::read-only
      • HTML <template> element
      • APNG images

      We also support the following web features, with experimental status and only for the GStreamer media player backend:
      • Media Source Extensions
      • WebAudio

        The path ahead

        Unfortunately, porting Qt-specific code to the new WebKit is not always easy, and we had to disable certain features until the code behind them is properly ported. So far, the following prominent features are not yet working:
        • QML API
        • WebGL and CSS 3D transforms
        • Accelerated compositing
        • Private browsing
        However, don't be discouraged! Work is in progress, and we hope to make these features available soon. But we are short on manpower, so we cannot work on many things in parallel. If you want your favorite feature ready sooner rather than later, please join our project. We have a lot of work to do, most items don't require any prior knowledge of WebKit, and some don't even require you to know C++ (yes, there is work for those of you who know only HTML + CSS + basic JavaScript, or only Python). Another way to help us is to report bugs you find, or to help track down known issues.

        You can follow development of QtWebKit at the GitHub repository; however, if you want to obtain bleeding-edge sources, use another repository - the latter is much smaller than the original, but still contains all the files required to build QtWebKit. See our wiki for build instructions and additional information about the project.


        Today marks 10 years since the first chunk of QtWebKit code was merged into the WebKit repository. See https://bugs.webkit.org/show_bug.cgi?id=10466 for more details.


        Technology Preview 3 is now available: release notes, tarball. Binaries for Windows and macOS will be uploaded a bit later.

        Introducing the Qt Lite project—Qt for any platform, any thing, any size

        by Nils Christian Roscher-Nielsen (Qt Blog)

        We believe in a future of great software and hardware, developed together, delivered quickly, and that you can have fun in the process. Embedded development should be just as simple as all other software development, and you should immediately see the result of your ideas running on your device.

        The number of devices and things surrounding us is rapidly increasing; they are becoming more intelligent and require software that runs on a greater variety of hardware—everything from IoT devices with or without a screen, through smart watches, to high-end smart TVs and industrial-grade PCs. As the requirements and the world of software development change, so does Qt. We have taken action and are now unveiling the Qt Lite Project. This is a whole range of changes to Qt, allowing you to strip Qt down and bring in exactly what you need in order to create your device for more or less any platform and any thing, regardless of size. Qt Lite is neither a separate product nor a fork of Qt—it is all built into Qt, allowing us to efficiently develop and maintain it as part of the whole Qt framework. As such, many of these changes will benefit all Qt users, but especially those targeting resource-constrained devices.

        For the past 20 years, Qt has been used on a massively wide range of operating systems and embedded devices. It didn’t take long before embedded Linux was as important for Qt as its desktop counterpart, but many other embedded operating systems have also followed this trend, and Qt has supported a wide range of Linux, Microsoft and various real time operating systems (RTOS).

        However, to utilize Qt efficiently on these operating systems, and especially on those embedded devices (special as they often are), it has sometimes been challenging and time-consuming to configure Qt to make efficient use of the different hardware components and available libraries, and to strip out the parts of Qt and the OS that are not needed.

        Over the past six months we have looked at many of these challenges—and more—and have been working on making Qt a much more targeted framework that facilitates the whole development cycle and lifetime of embedded device-based products. In this blog post we will look at some of the changes we have made, as well as the path beyond them. All of these efforts are part of “Project Qt Lite”.

        The configuration system

        We know that Qt is used in many different projects, in varying industries, and for vastly different purposes, so making one change, or one optimal version of Qt, is not feasible. Therefore the starting point, and the biggest code change coming as part of our embedded effort for Qt 5.8, is a new configuration system. When we introduced Qt 5, we focused a lot on the modularization of Qt, so it would be less monolithic: the modules became less dependent on each other and could easily be developed, tested and deployed independently. But configuring the content of each module was still difficult, so optimizing for a resource-constrained embedded system was not as straightforward as we would like. If you needed a specific feature, like a particular way of handling internationalization, audio functionality, or broader multimedia features, you often needed to add several new modules of which you would use only a fraction of the functionality. Enabling just one single feature required a lot of manual tweaking, and that took a lot of time.

        The new configuration system in Qt allows you to define the content you need from each module in much more detail for your project, and easily allows for feature-based tailoring of the Qt modules. We are starting by enabling this fully for Qt Core, Qt Network, Qt GUI, Qt QML and Qt Quick. You can now fine-tune which features from these modules you want to include in your project; there is no longer any need to include unnecessary features. We will also make this more granular and cover more modules in the time to come.
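        As a sketch, this tailoring happens through configure-time feature switches; the feature names below are illustrative examples only (the real set is listed in the Qt 5.8 build documentation):

```shell
$ ./configure -prefix /opt/qt-lite -release \
      -no-feature-bearermanagement -no-feature-imageformat_ppm
```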

        Developer Workflow

        Moving forward, we want to focus on a development workflow that has optimization in mind from the very beginning. In a world where hardware is getting cheaper, most frameworks do not care much about footprint or memory consumption — all libraries are included from the get-go, all features enabled and options checked. This makes feature development simple, but optimization so much harder. Qt Lite now allows you to start with a minimal deployable configuration and simply add any additional feature you require while developing your project.

        This leaves you in complete control, with a continuous understanding of the consequences of your actions, and allows for transparency of the development project throughout the team. How big is the application becoming? Is this web browser really needed? Does cutting these corners actually make sense? Every included feature and added module is immediately visible, and you will know how it affects the overall footprint of the application.

        To facilitate this, we will start by providing two different reference configurations as part of Qt Lite:

        Firstly, a full prototyping environment, for example the configuration behind our demo images as shipped with Qt for Device Creation today. This is a great starting point for, say, a mid-cost, low-volume product: it has all features enabled and can quickly and easily be used in products.

        In addition, we also want to add a Qt configuration that is as minimal as possible. This provides a great starting point for software that needs a smaller footprint and high performance while still being delivered quickly to the market. By significantly reducing the time spent on optimization at the end of the project, products can have a much faster time-to-market.

        No OpenGL Requirement

        One of the main drivers behind the Qt Quick and QML technology was to introduce a rendering architecture optimized for OpenGL. However, that also meant that OpenGL became a requirement for all Qt Quick based projects. For several good reasons, we see the need for cheaper, more efficient or specially certified hardware that does not support OpenGL. We have therefore introduced a fully integrated, supported and efficient 2D software renderer for Qt Quick. This allows you to use all the power of the QML language to create beautiful user interfaces on embedded devices without OpenGL hardware.

        The Qt Quick 2D Renderer can work in software only, but it is also designed to utilize accelerated 2D operations for devices that pack a little more punch but still don’t have full OpenGL support.


        Along with the new configuration system, we have also developed a new graphical tool for configuring, selecting and setting various options when building Qt. These configurations can be saved and reused. This will also make it easier to modify your configurations for new hardware, or changing requirements.

        The Qt configuration tooling is now more powerful and feature-rich than ever before. By making all the available options easily accessible, integrating the documentation, and providing reasonable default starting configurations for various use cases, it gives you a simple and efficient way to squeeze a lot more juice out of your existing projects.

        We are currently working on a way to sort configuration options into groups, so that you can easily see which configurations need to work together to enable use cases like internationalization, multimedia, web capabilities or other features. You can of course save these configurations and profiles to continue using them with other builds, versions of Qt, or new hardware. These tools will be integrated as part of Qt for Device Creation.


        A major part of our focus is on extending the range of hardware on which you can easily and efficiently deploy Qt based applications. There are several devices and project types that can benefit from our current efforts. A typical example is devices with RAM and flash in the 32 MB or even 16 MB range, with the intention to go much lower in the future. Also, there is no longer any need for OpenGL hardware to use Qt Quick, which significantly extends the number of devices where Qt can be used.

        The main usage of this is still expected to be the Cortex-A based architecture, or similar, but we are also aiming at the ARM Cortex-M7, as one example.

        And the list goes on

        There is a myriad of other features, all enhancing the embedded developer experience and the device creation workflow on resource-constrained devices, coming with Qt 5.8. We are further developing Qt Quick Controls 2, which are specially designed for touch-enabled devices, and are introducing many new features as well as improvements and new themes.

        We have put a lot of effort into our new Over-the-Air update mechanism. It is also part of Qt for Device Creation for Qt 5.8, and we have already blogged about it in great detail. This is part of our continuous push to make device creators’ lives simpler, shorten time-to-market, and reduce the total cost of a project by providing an extremely powerful way of managing your device life cycle.

        The Qt Wayland based compositor makes it simple to create fully fledged multi-application devices. But we are also improving EGLFS and enhancing the multi-screen capabilities.

        And the Qt Emulator that ships with Qt Creator makes it very simple to iterate quickly over designs and optimize applications, even without the target hardware being available to all developers in the project.

        An open road ahead

        We have long been putting a lot of emphasis on the embedded space, for example with our Qt for Device Creation product, and we will continue this effort relentlessly. We don’t want that effort to be just an internal project; we want you to know about it, because it is all about you and what you can achieve when creating your products. Our aim is to improve Qt, making it more lightweight, easier to use, and better performing than ever before. To achieve this, we need your feedback.

        We will continue our work of making Qt a better framework for embedded projects of all kinds, running on devices in a wide range of industries. We have many exciting plans, and we are working with some really interesting customers to bring great projects to the market. Examples include automotive systems based on Qt, usage in the avionics industry, and the work we do with home appliances, among many other Qt based projects. IoT is another important part of our strategy ahead, and making sure that all devices can be developed with a Qt based platform, communicate over supported protocols, and that the software can easily be extended to the next-generation device is extremely important in a wide range of industries today.

        The next stage of Qt Lite—as soon as the essentials are in place—will be along three major lines.

        Firstly, code optimizations to improve run-time performance and RAM consumption. This will require a lot of code changes in many different places in Qt. Some of these changes might not be fully source compatible with Qt, but we believe that such embedded projects can make that sacrifice for the sake of performance. This is important but difficult work, and some of our best developers are on it.

        Secondly we will spend a lot of time on the configuration of the full stack, not just the Qt libraries. With Qt for Device Creation we offer an out-of-the-box embedded Linux stack based on Yocto. We will also extend the new configuration system to cover and optimize the complete Linux stack as well as the Qt build. This will allow you to easily and efficiently improve the total footprint, boot time and complexity of your system, not just the Qt bits.

        The third avenue of improvement will be to more fully integrate all the tooling around this, to bring all the elements into the same tool, and integrate this into Qt Creator. We think this can improve not only the developer experience, but also the communication in the whole team, provide more transparency towards other stakeholders and reduce the total time/cost of a project.

        In summary, we have now laid the foundation for addressing embedded development more efficiently and making the most of resource-constrained hardware: a configuration architecture that makes it simpler to build Qt according to your needs and improves performance on resource-constrained devices. We have improved Qt a lot, but we have also staked out a clear path towards further improvements. The focus going forward will be on making Qt even faster, smaller and easier to work with. We are very much looking forward to your feedback and feature requests, and hope all your projects are successful. If you are interested in participating in that future, providing feedback or learning more about this, both our CTO Lars Knoll and myself will be talking about this subject at the Qt World Summit in San Francisco, October 18-20. We look forward to seeing you there, and to gaining your feedback!

        The post Introducing the Qt Lite project—Qt for any platform, any thing, any size appeared first on Qt Blog.


        I’ve just booked flights and hotel for QtCon. It is going to be great to see all the Qt people and some of my fellow Pelagicorians from our Munich office. For those who want to hear me speak, I’ll share my view of the world on Friday at 11:30.

        The Qt Quick Graphics Stack in Qt 5.8

        This is a joint post with Andy. In this series of posts we are going to take a look at some of the upcoming features of Qt 5.8, focusing on Qt Quick.

        OpenGL… and nothing else?

        When Qt Quick 2 was made available with the release of Qt 5.0, it came with the limitation that support for OpenGL (ES) 2.0 or higher was required. The assumption was that, moving forward, OpenGL would continue its trajectory as the hardware acceleration API of choice for desktop, mobile and embedded development. Fast forward a couple of years to today, and the graphics acceleration story has become more complicated. One assumption we made was that the price of embedded hardware with OpenGL GPUs would continue to drop and that such GPUs would be ubiquitous. This is true, but at the same time there are still embedded devices available without OpenGL-capable GPUs on which customers wish to deploy Qt Quick applications. To remedy this we released the Qt Quick 2D Renderer as a separate plugin for Qt Quick in Qt 5.4.

        At the same time it turned out that Qt Quick applications deployed on a wide range of machines, including older systems, often have issues with OpenGL due to missing or unavailable drivers, on Windows in particular. Around Qt 5.4 the situation improved with the ability to dynamically choose between OpenGL proper, ANGLE, or a software OpenGL rasterizer. However, this does not solve all the problems, and full-blown software rasterizers are clearly not an option for low-end hardware, in particular in the embedded space. All this left us with two questions: why not focus more on the platforms’ native, potentially better supported APIs (for example, Direct3D), and why not improve the 2D Renderer and integrate it more closely with the rest of Qt Quick instead of keeping it a separate module with a somewhat arcane installation process?

        Enter other APIs

        Meanwhile, the number of available graphics hardware APIs has increased since the release of Qt Quick 2. Now, rather than the easy-to-understand Direct3D vs. OpenGL choice, there is a new generation of lower-level graphics APIs available: Vulkan, Metal, and Direct3D 12. So for Qt 5.8 we decided to explore how to make Qt Quick more future-proof, as introduced in this previous post.


        The main goal of the ScenegraphNG project was to modularize the Qt Quick scene graph API and remove the OpenGL dependencies in the renderer. By removing the strong bindings to OpenGL and enhancing the scene graph adaptation layer, it is now possible to implement additional rendering backends, either built into Qt Quick itself or deployed as dynamically loaded plugins. OpenGL will still be the default backend, with full compatibility for all existing Qt Quick code. The changes are not just about plugins and moving code around, however. Some internal aspects of the scene graph, for instance the material system, exhibited such a strong OpenGL coupling that it could not be worked around in a 100% compatible manner when it comes to the public APIs. Therefore some public scene graph utility APIs got deprecated and a few new ones were introduced. At the time of writing, work is still underway to modularize and port some additional components, like the sprite and particle systems, to the new architecture.

        To prove that the changes form a solid foundation for future backends, Qt 5.8 introduces an experimental Qt Quick backend for Direct3D 12 on Windows 10 (both traditional Win32 and UWP applications). In the future it will naturally be possible to create a Vulkan backend as well, if it is deemed beneficial. Note that all this has nothing to do with the approaches for integrating custom rendering into QWidget-based or plain QWindow applications. There, adding Vulkan or D3D12 instead of OpenGL is already possible with the existing Qt releases; see for instance here and here.

        Qt Quick 2D Renderer, integrated

        The Qt Quick 2D Renderer was the first non-OpenGL renderer, but when released, it lived outside of the qtdeclarative code base (which contains the QtQml and QtQuick modules) and carried a commercial-only license. In Qt 5.7 the Qt Quick 2D Renderer was made available under GPLv3, but still as a separate plugin with the OpenGL requirement inherited from Qt Quick itself. In practice this was solved by building Qt against a dummy libGLESv2 library, but this was neither nice nor desirable long-term. With Qt 5.8 the Qt Quick 2D Renderer is merged into qtdeclarative as the built-in software rendering backend for the Qt Quick scene graph. The code has also been relicensed to have the same licenses as QtDeclarative. This also means that the stand-alone 2D Renderer plugin is no longer under development and the qtdeclarative-render2d repository will become obsolete in the future.

        Supercharging the 2D Renderer: Partial updates

        The 2D Renderer, which is now mostly referred to as the software backend (or renderer, or adaptation), is getting one huge new feature that was not present in the previous standalone versions: partial updates. Previously it would render the entire scene every frame from front to back, which meant that a small animation in a complicated UI could be very expensive CPU-wise, especially when moving towards higher screen resolutions. With 5.8 the software backend is capable of rendering only what has changed between two frames. So, for example, if you have a blinking cursor in a text box, only the cursor and the area under the cursor are rendered and copied to the window surface, not unlike how traditional QWidgets operate. This is a huge performance improvement for any platform using the software backend.

        QQuickWidget with the 2D Renderer

        Another big feature that the new software backend introduces with Qt 5.8 is support for QQuickWidget. The Qt Quick 2D Renderer was not available for use in combination with QQuickWidget, which made it impossible for applications like Qt Creator to fall back to the software renderer. Now, because of the software renderer’s closer integration with QtDeclarative, it was possible to enable support for it with QQuickWidget. This means that applications using simple Qt Quick scenes without effects and heavy animation can use the software backend in combination with QQuickWidget and thus avoid potential issues when deploying onto older systems (think the OpenGL driver hassle on Windows, the trouble with remoting and X forwarding, etc.). It is important to note that not all types of scenes will perform as well with software as they do with OpenGL (think scrolling larger areas, for instance), so the decision has to be made after investigating both options.

        No OpenGL at all? No problem.

        One big limitation of the Qt Quick 2D Renderer plugin was that in order to build QtDeclarative, you still had to have OpenGL headers and libraries available. So on devices that did not have OpenGL available you had to use provided “dummy” libraries and headers to trick Qt into building QtDeclarative, and then ensure your developers did not call any code that could call into OpenGL. This always felt like a hack, but with the hard OpenGL requirement in QtDeclarative there were no better options available. Until now. In Qt 5.8 this is not an issue, because QtDeclarative can now be built without OpenGL. In this case the software renderer becomes the default backend instead of OpenGL. So whenever Qt is configured with -no-opengl, or the development environment (sysroot) lacks OpenGL headers and libraries, the QtQuick module is no longer skipped. In 5.8 it will build just fine and default to the software backend.

        Switching between backends

        Now that there are multiple backends that can render Qt Quick we also needed to provide a way to switch between which API is used. The approach Qt 5.8 takes mirrors how QPA platform plugins or the OpenGL implementation on Windows are handled: the Qt Quick backend can be changed on a per-process basis during application startup. Once the first QQuickWindow, QQuickView, or QQuickWidget is constructed it will not be possible to change it anymore.

        To specify the backend to use, either set the environment variable QT_QUICK_BACKEND (the name QMLSCENE_DEVICE, inherited from previous versions, is also recognized) or use the C++ API of the static functions QQuickWindow provides. When no request is made, a sensible default is used. This is currently the OpenGL backend, except in Qt builds that have OpenGL support completely disabled.

        As an example, let’s force the software backend in our application:

        int main(int argc, char **argv)
        {
            // Force the software backend always.
            QQuickWindow::setSceneGraphBackend(QStringLiteral("software"));

            QGuiApplication app(argc, argv);
            QQuickView view;
            ...
        }
        Or launch our application with the D3D12 backend instead of the default OpenGL (or software):

        C:\MyCoolApp>set QT_QUICK_BACKEND=d3d12

        To verify what is happening during startup, set the environment variable QSG_INFO to 1 or enable the logging category qt.scenegraph.general. This will lead to printing a number of helpful log messages to the debug or console output, depending on the type of the application. To monitor the debug output, either run the application from Qt Creator or use a tool like DebugView.

        With an updated version of the Qt 5 Cinematic Experience demo the result is something like this:

        Qt 5 Cinematic Experience demo application running on Direct3D 12

        Everything in the scene is there, including the ShaderEffect items that provide a HLSL version of their shaders. Unsupported features, like particles, are gracefully ignored when running with such a backend.

        Now what happens if the same application gets launched with QT_QUICK_BACKEND=software?

        Qt5 Cinematic Experience demo application running on the Software backend

        Not bad. We lost the shader effects as well, but other than that the application is fully functional. And all this without relying on a software OpenGL rasterizer or other extra dependencies. No small feat for a framework that started out as a strictly OpenGL-based scene graph.

        That’s it for part one. All this is only half of the story – stay tuned for part two, where we are going to take a look at the new Direct3D 12 backend and what the multi-backend Qt Quick story means for applications using advanced concepts like custom Quick items.

        The post The Qt Quick Graphics Stack in Qt 5.8 appeared first on Qt Blog.

        From Visual Studio Add-In to Qt VS Tools (Beta)

        It has been almost three years since the latest official release of the Qt Visual Studio Add-in, but now we have something new to show you: Qt VS Tools. You can download the Beta version from Qt Downloads for testing. We are happy to be able to tell you that the package size has gone down from 200MB to 7MB.

        In the future, we plan to make Qt VS Tools available in the Visual Studio Gallery and directly installable from within Visual Studio 2013 and 2015. Note that we have dropped support for older Visual Studio versions. Also, before installing Qt VS Tools, make sure to uninstall the old Qt Visual Studio Add-in, because the two do not play well together.

        When you start using Qt VS Tools, you will find the Qt New Item and New Project templates in Templates|Visual C++|Qt. Do not use any items from the wizards named Qt5, because they are artifacts of the Add-in.

        Main Changes

        • Supports Visual Studio 2013 and 2015
        • Major code refactoring, build system updates, and code cleanup
        • New wizard system based on the Visual Studio extension system
        • Out of the box support for Qt Type C++ Debugger Visualizers (natvis)

        Known Issues

        • Missing QML support
        • Missing F1 help support
        • Missing localization support
        • Supports only installed versions of Qt

        Get Qt VS Tools Beta

        The Qt Company has prepared convenient installers for the Qt VS Tools Beta, in the hopes that you will download, test and provide feedback so that we can deliver an awesome final product. To try out the new features you can download it from your Qt Account or from download.qt.io. For any issues you may find with the Beta, please submit a detailed bug report to bugreports.qt.io (after checking for duplicates). You are also welcome to join the discussions in the Qt Project mailing lists, development forums and to contribute to Qt.

        The post From Visual Studio Add-In to Qt VS Tools (Beta) appeared first on Qt Blog.

        Gamescom 2016: Meet Us in Cologne!

        Gamescom 2016 is just around the corner and the V-Play team will be in attendance again this year. Want to get the inside track on cross-platform mobile game development? Meet us in Cologne and you can take a look at the latest game development features from V-Play!

        Request a Gamescom Meeting!

        You can meet us at Gamescom or in the Cologne area on Wednesday 17th and Thursday 18th August. Let us know if you’ll be there because we’d love to meet you for a chat!

        There’ll be two new V-Play showcases for you at this year’s event. This opportunity is exclusive to Gamescom, so come along and get your questions answered in person!

        What to Expect at Gamescom!

        You can get an in-depth look at some of V-Play’s newest features at Gamescom 2016. We’ll be giving exclusive presentations on V-Play Multiplayer and the V-Play Platformer Level Editor.

        V-Play Multiplayer

        The latest addition to the V-Play Game Engine enables easy multiplayer integration in less than 100 lines of code. It includes intelligent matchmaking with an ELO rating system, a friend system, in-game chat, cloud synchronization and social features!

        This feature has been shown to boost engagement & retention metrics far beyond industry standards!

        Note: You can see an example of V-Play Multiplayer in action by downloading ONU, an open-source multiplayer game from V-Play.



        The V-Play Platformer Level Editor

        This feature turns any platformer game into a Super Mario Maker-style game where players have the ability to create and share their own levels from their mobile device. With a live item editor and instant testing, it’s also a great tool for reducing development time.

        The V-Play Platformer Level Editor is the best way to capitalize on user-generated content and create a community around your game!

        Note: Try the V-Play Platformer Level Editor for yourself by downloading the open-source game example!



        V-Play Roadmap & Upcoming Additions

        Gamescom 2016 is the best place to find out about the V-Play Roadmap and what to expect for the rest of 2016. You can put your questions to the V-Play team in-person and find out how we plan to further expand the game engine’s capabilities!

        Get in Touch

        If you’ve met us at events like Gamescom before, we’d love to reconnect with you.

        Available time slots will soon run out. Schedule a meeting right now!

        Request a Gamescom Meeting!




        The post Gamescom 2016: Meet Us in Cologne! appeared first on V-Play Engine.

        QReadWriteLock gets faster in Qt 5.7

        Already in Qt 5.0, QMutex was revamped to be fast. In the non-contended case, locking and unlocking is basically only a simple atomic instruction, and it does not allocate memory, making it really lightweight. QReadWriteLock, however, did not get the same optimizations. Now, in Qt 5.7, it is on par with QMutex.


        QReadWriteLock's purpose is about the same as that of QMutex: to protect a critical section. The difference is that QReadWriteLock offers two different locking modes: read or write. This allows several readers to access the critical section at the same time, and can therefore be more efficient than a QMutex. At least, that was the intention. The problem is that, before Qt 5.7, QReadWriteLock's implementation was not as optimized as QMutex's. In fact, QReadWriteLock was internally locking and unlocking a QMutex on every call (read or write). So QReadWriteLock was in fact slower than QMutex, unless a read section was held for a very long time under contention.

        For example, the internals of QMetaType were using a QReadWriteLock for the QMetaType database. This makes sense because that database is accessed very often for reading (every time one creates or operates on a QVariant) and very seldom for writing (only when you need to register a new type, the first time it is used). However, the QReadWriteLock locking (for read) was so slow that it took a significant amount of time in some QML applications that use lots of QVariants, for example with Qt3D.
        It was even proposed to replace the QReadWriteLock with a QMutex within QMetaType, which would have saved 40% of the time of QVariant creation. This turned out not to be necessary, because I improved QReadWriteLock in Qt 5.7 to make it at least as fast as QMutex.


        QMutex is itself quite efficient already. I described the internals of QMutex in a previous article. Here is a reminder of the important aspects of QMutex:

        • sizeof(QMutex) == sizeof(void*), without any additional allocation.
        • The non-contended case is basically only an atomic operation for lock or unlock
        • In case we need to block, fallback to pthread or native locking primitives

        QReadWriteLock in Qt 5.7

        I optimized QReadWriteLock to bring it on par with QMutex, using the same implementation principle.

        QReadWriteLock only has one member: a QAtomicPointer named d_ptr. Depending on the value of d_ptr, the lock is in one of the following states:

        • When d_ptr == 0x0 (all the bits are 0): unlocked and non-recursive. No readers or writers are holding or waiting on it.
        • When d_ptr & 0x1 (the least significant bit is set): one or several readers are currently holding the lock. No writers are waiting and the lock is non-recursive. The number of readers is (d_ptr >> 4) + 1.
        • When d_ptr == 0x2: the lock is locked for write and nobody is waiting.
        • In any other case, when the two least significant bits are 0 but the remaining bits contain data, d_ptr points to a QReadWriteLockPrivate object, which means either that the lock is recursive, or that it is currently locked and threads are possibly waiting. The QReadWriteLockPrivate has a condition variable allowing it to block and wake threads.

        In other words, the two least significant bits encode the state. When d_ptr is a pointer to a QReadWriteLockPrivate, those two bits will always be 0, since pointers must be properly aligned to 32- or 64-bit addresses for the CPU to use them.

        This table recaps the state depending on the two least significant bits:

        00 If d_ptr is fully 0, the lock is unlocked. Otherwise, d_ptr is a pointer to a QReadWriteLockPrivate.
        01 One or several readers are currently holding the lock. The number of readers is (d_ptr >> 4) + 1.
        10 One writer is holding the lock and nobody is waiting.

        We therefore define a few constants to help us read the code.

        enum {
            StateMask = 0x3,
            StateLockedForRead = 0x1,
            StateLockedForWrite = 0x2,
        };

        const auto dummyLockedForRead = reinterpret_cast<QReadWriteLockPrivate *>(quintptr(StateLockedForRead));
        const auto dummyLockedForWrite = reinterpret_cast<QReadWriteLockPrivate *>(quintptr(StateLockedForWrite));

        inline bool isUncontendedLocked(const QReadWriteLockPrivate *d)
        { return quintptr(d) & StateMask; }

        Aside: The code assumes that the null pointer value is equal to binary 0, which is not guaranteed by the C++ standard, but holds true on every supported platform.
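        To make the encoding concrete, here is a standalone sketch in plain C++ (no Qt types; the constant names mirror the ones above, but this is not the Qt code) showing how a reader count is packed into a pointer-sized word alongside the two state bits:

```cpp
#include <cassert>
#include <cstdint>

// Plain C++ sketch of the tagging scheme. The two least significant bits
// hold the state; for the locked-for-read state, the reader count is
// stored (minus one) in the bits from bit 4 upwards.
enum : std::uintptr_t {
    StateMask           = 0x3,
    StateLockedForRead  = 0x1,
    StateLockedForWrite = 0x2,
};

// readers >= 1; stored as (readers - 1) << 4, tagged with the read state
std::uintptr_t encodeReaders(unsigned readers)
{ return (std::uintptr_t(readers - 1) << 4) | StateLockedForRead; }

// recovers the count: (d >> 4) + 1, exactly as in the table above
unsigned decodeReaders(std::uintptr_t d)
{
    assert((d & StateMask) == StateLockedForRead);
    return unsigned(d >> 4) + 1;
}
```

        A real QReadWriteLockPrivate pointer is at least 4-byte aligned, so its two low bits are always 0 and can never collide with the tag values.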


        The really fast case happens when there is no contention. If we can atomically swap from 0 to StateLockedForRead, we have the lock and there is nothing else to do. If there already are readers, we need to increase the reader count, atomically. If a writer already holds the lock, then we need to block. In order to block, we will assign a QReadWriteLockPrivate and wait on its condition variable. We call QReadWriteLockPrivate::allocate(), which pops an unused QReadWriteLockPrivate from a lock-free stack (or allocates a new one if the stack is empty). Indeed, we can never free any of the QReadWriteLockPrivate objects, as another thread might still hold a pointer to one and dereference it. So when we release a QReadWriteLockPrivate, we put it back on the lock-free stack.

        lockForRead actually calls tryLockForRead(-1), passing -1 as the timeout means "wait forever until we get the lock".

        Here is the slightly edited code. (original)

        bool QReadWriteLock::tryLockForRead(int timeout)
        {
            // Fast case: non contended:
            QReadWriteLockPrivate *d;
            if (d_ptr.testAndSetAcquire(nullptr, dummyLockedForRead, d))
                return true;

            while (true) {
                if (d == nullptr) {
                    if (!d_ptr.testAndSetAcquire(nullptr, dummyLockedForRead, d))
                        continue;
                    return true;
                }

                if ((quintptr(d) & StateMask) == StateLockedForRead) {
                    // locked for read, increase the counter
                    const auto val = reinterpret_cast<QReadWriteLockPrivate *>(quintptr(d) + (1U<<4));
                    if (!d_ptr.testAndSetAcquire(d, val, d))
                        continue;
                    return true;
                }

                if (d == dummyLockedForWrite) {
                    if (!timeout)
                        return false;

                    // locked for write, assign a d_ptr and wait.
                    auto val = QReadWriteLockPrivate::allocate();
                    val->writerCount = 1;
                    if (!d_ptr.testAndSetOrdered(d, val, d)) {
                        // lost the race: return val to the free list and retry
                        val->writerCount = 0;
                        val->release();
                        continue;
                    }
                    d = val;
                }

                // d is an actual pointer;
                if (d->recursive)
                    return d->recursiveLockForRead(timeout);

                QMutexLocker lock(&d->mutex);
                if (d != d_ptr.load()) {
                    // d_ptr has changed: this QReadWriteLock was unlocked before we had
                    // time to lock d->mutex.
                    // We are holding a lock to a mutex within a QReadWriteLockPrivate
                    // that is already released (or even is already re-used). That's ok
                    // because the QFreeList never frees them.
                    // Just unlock d->mutex (at the end of the scope) and retry.
                    d = d_ptr.loadAcquire();
                    continue;
                }
                return d->lockForRead(timeout);
            }
        }

        lockForWrite follows exactly the same principle as lockForRead, but we also block if there are readers holding the lock.

        bool QReadWriteLock::tryLockForWrite(int timeout)
        {
            // Fast case: non contended:
            QReadWriteLockPrivate *d;
            if (d_ptr.testAndSetAcquire(nullptr, dummyLockedForWrite, d))
                return true;

            while (true) {
                if (d == nullptr) {
                    if (!d_ptr.testAndSetAcquire(d, dummyLockedForWrite, d))
                        continue;
                    return true;
                }

                if (isUncontendedLocked(d)) {
                    if (!timeout)
                        return false;

                    // locked for either read or write, assign a d_ptr and wait.
                    auto val = QReadWriteLockPrivate::allocate();
                    if (d == dummyLockedForWrite)
                        val->writerCount = 1;
                    else
                        val->readerCount = (quintptr(d) >> 4) + 1;
                    if (!d_ptr.testAndSetOrdered(d, val, d)) {
                        // lost the race: return val to the free list and retry
                        val->writerCount = val->readerCount = 0;
                        val->release();
                        continue;
                    }
                    d = val;
                }

                // d is an actual pointer;
                if (d->recursive)
                    return d->recursiveLockForWrite(timeout);

                QMutexLocker lock(&d->mutex);
                if (d != d_ptr.load()) {
                    // The lock was unlocked before we had time to lock d->mutex.
                    // We are holding a mutex within a QReadWriteLockPrivate that is already released
                    // (or even is already re-used), but that's ok because the QFreeList never frees them.
                    d = d_ptr.loadAcquire();
                    continue;
                }
                return d->lockForWrite(timeout);
            }
        }

        The API has a single unlock() for both read and write, so we don't know whether we are unlocking from reading or writing. Fortunately, we can tell from the state encoded in the lower bits. If we were locked for read, we need to decrease the reader count, or set the state to 0x0 if we are the last one. If we were locked for write, we need to set the state to 0x0. If there is a QReadWriteLockPrivate, we need to update the data there and possibly wake up the blocked threads.

        void QReadWriteLock::unlock()
        {
            QReadWriteLockPrivate *d = d_ptr.load();
            while (true) {
                Q_ASSERT_X(d, "QReadWriteLock::unlock()", "Cannot unlock an unlocked lock");

                // Fast case: no contention: (no waiters, no other readers)
                if (quintptr(d) <= 2) { // 1 or 2 (StateLockedForRead or StateLockedForWrite)
                    if (!d_ptr.testAndSetRelease(d, nullptr, d))
                        continue;
                    return;
                }

                if ((quintptr(d) & StateMask) == StateLockedForRead) {
                    Q_ASSERT(quintptr(d) > (1U<<4)); //otherwise that would be the fast case
                    // Just decrease the reader's count.
                    auto val = reinterpret_cast<QReadWriteLockPrivate *>(quintptr(d) - (1U<<4));
                    if (!d_ptr.testAndSetRelease(d, val, d))
                        continue;
                    return;
                }

                if (d->recursive) {
                    d->recursiveUnlock();
                    return;
                }

                QMutexLocker locker(&d->mutex);
                if (d->writerCount) {
                    Q_ASSERT(d->writerCount == 1);
                    Q_ASSERT(d->readerCount == 0);
                    d->writerCount = 0;
                } else {
                    Q_ASSERT(d->readerCount > 0);
                    // decrease the reader count; if readers remain, we are done
                    if (--d->readerCount > 0)
                        return;
                }

                if (d->waitingReaders || d->waitingWriters) {
                    // wake the waiting threads; the private stays in place
                    d->unlock();
                } else {
                    Q_ASSERT(d_ptr.load() == d); // should not change when we still hold the mutex
                    // no waiters: reset d_ptr and return the private to the free list
                    d_ptr.storeRelease(nullptr);
                    d->release();
                }
                return;
            }
        }

        Here is the benchmark that was run: https://codereview.qt-project.org/167113/. The benchmark was run with Qt 5.6.1, GCC 6.1.1. What I call Qt 5.7 below is in fact Qt 5.6 + the QReadWriteLock patch, so the comparison isolates this patch.


        Uncontended

        This benchmark compares the different types of locks by having a single thread run a loop 1,000,000 times, locking and unlocking the mutex and doing nothing else.

        QReadWriteLock (Qt 5.6) 38 ms ███████████████████
        QReadWriteLock (Qt 5.7) 18 ms █████████
        QMutex 16 ms ████████
        std::mutex 18 ms █████████
        std::shared_timed_mutex 33 ms ████████████████▌

        Contended Reads

        This benchmark runs as many threads as there are logical cores (4 in my case). Each thread locks and unlocks the same mutex 1,000,000 times. We do a small amount of work inside and outside the lock. If no other work was done at all and the threads were only locking and unlocking, we would put huge pressure on the mutex, but that would not be a fair benchmark. So this benchmark does a hash lookup inside the lock and a string allocation outside of it. The more work is done inside the lock, the more QMutex is disadvantaged compared to QReadWriteLock, because threads stay blocked for a longer time.

        QReadWriteLock (Qt 5.6) 812 ms ████████████████████▍
        QReadWriteLock (Qt 5.7) 285 ms ███████▏
        QMutex 398 ms ██████████
        std::mutex 489 ms ████████████▎
        std::shared_timed_mutex 811 ms ████████████████████▎

        Futex Version

        On platforms that have futexes, QMutex does not even need a QMutexPrivate; it uses the futex itself to hold the lock. Similarly, we could do the same with QReadWriteLock. I made an implementation of QReadWriteLock using futexes (in fact, I made it before the generic version). But it is not in Qt 5.7 and is not yet merged in Qt; perhaps it will land in a future version if I find the motivation and time to get it merged.

        Could we get even faster?

        As always, nothing is perfect and there is still room for improvement. A flaw of this implementation is that all the readers still need to perform an atomic write to the same memory location (in order to increment the reader count). This causes contention if there are many reader threads. For cache performance, we would not want the readers to write to the same memory location. Implementations where each reader writes to a different location are possible and would make the contended case faster, but they would take more memory and might be slower in the non-contended case.
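        That contention point can be illustrated with a standalone sketch (plain C++, not the QReadWriteLock code): several threads bump one shared atomic counter, which is the contended pattern this implementation has, versus each thread bumping its own cache-line-padded counter, which is what a distributed reader count would do. Both totals come out the same; only the cache traffic differs.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// One shared counter (all threads write the same cache line, contended)
// versus one padded counter per thread (uncontended). Illustrative only.
struct Counters {
    static const int threads = 4;
    std::atomic<long> shared{0};                         // every thread writes here
    struct alignas(64) Slot { std::atomic<long> v{0}; }; // one cache line per slot
    std::array<Slot, threads> slots;

    void run(int iters) {
        std::vector<std::thread> ts;
        for (int t = 0; t < threads; ++t)
            ts.emplace_back([this, t, iters] {
                for (int i = 0; i < iters; ++i) {
                    shared.fetch_add(1, std::memory_order_relaxed);     // contended write
                    slots[t].v.fetch_add(1, std::memory_order_relaxed); // private write
                }
            });
        for (auto &th : ts)
            th.join();
    }

    long distributedTotal() const {
        long sum = 0;
        for (const auto &s : slots)
            sum += s.v.load();
        return sum;
    }
};
```

        The trade-off mentioned above shows up here too: the per-thread slots cost a full cache line each, which is why such a scheme takes more memory.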


These benchmarks show the huge improvement in QReadWriteLock in Qt 5.7. The Qt classes have nothing to envy in their libstdc++ counterparts. std::shared_timed_mutex, which would be the standard equivalent of QReadWriteLock, is surprisingly slow. (I heard rumors that it might get better.)
QReadWriteLock is optimized for the usual case in Qt of relatively low contention. It takes a very small amount of memory, which makes it a pretty decent implementation of a read-write lock.

In summary, you can now use QReadWriteLock whenever there are many reads and seldom writes. This is only about non-recursive mutexes. Recursive mutexes are always slower and should be avoided, not only because they are slower, but also because they are harder to reason about.

        Opt-in header-only libraries with CMake

Using a C++ library, particularly a 3rd party one, can be a complicated affair. Library binaries compiled on Windows/OSX/Linux can not simply be copied over to another platform and used there. Linking works differently, compilers bundle different code into binaries on each platform, etc.

        This is not an insurmountable problem. Libraries like Qt distribute dynamically compiled binaries for major platforms and other libraries have comparable solutions.

        There is a category of libraries which considers the portable binaries issue to be a terminal one. Boost is a widespread source of many ‘header only’ libraries, which don’t require a user to link to a particular platform-compatible library binary. There are also many other examples of such ‘header only’ libraries.

        Recently there was a blog post describing an example library which can be built as a shared library, or as a static library, or used directly as a ‘header only’ library which doesn’t require the user to link against anything to use the library. The claim is that it is useful for libraries to provide users the option of using a library as a ‘header only’ library and adding preprocessor magic to make that possible.

However, there is yet a fourth option: the consumer can compile the source files of the library themselves. This has the advantage that the .cpp file is not #included into every compilation unit, but it still avoids the platform-specific library binary.

I decided to write a CMake buildsystem which would achieve all of that for a library. I don’t have an opinion on whether it is a good idea in general for libraries to do things like this, but if people want to do it, it should be as easy as possible.

        Additionally, of course, the CMake GenerateExportHeader module should be used, but I didn’t want to change the source from Vittorio so much.

        The CMake code below compiles the library in several ways and installs it to a prefix which is suitable for packaging:

cmake_minimum_required(VERSION 3.3)

# define the library
add_library(library_static STATIC ${library_srcs})
add_library(library_shared SHARED ${library_srcs})

add_library(library_iface INTERFACE)
target_compile_definitions(library_iface
  INTERFACE LIBRARY_HEADER_ONLY)

add_library(library_srcs INTERFACE)
target_sources(library_srcs INTERFACE ${library_srcs})

# install and export the library
install(TARGETS library_static library_shared library_iface library_srcs
  EXPORT library_targets
  RUNTIME DESTINATION bin
  LIBRARY DESTINATION lib
  ARCHIVE DESTINATION lib
  INCLUDES DESTINATION include)
install(EXPORT library_targets
  NAMESPACE example_lib::
  DESTINATION lib/cmake/example_lib)
install(FILES example_lib-config.cmake
  DESTINATION lib/cmake/example_lib)

        This blog post is not a CMake introduction, so to see what all of those commands are about start with the cmake-buildsystem and cmake-packages documentation.

There are four add_library calls. The first two build the library as a static library and as a shared library.

The next two are INTERFACE libraries, a concept I introduced in CMake 3.0 back when it looked like Boost might use CMake. INTERFACE targets can be used to specify header-only libraries because they specify usage requirements for consumers, such as include directories and compile definitions.

        The library_iface library functions as described in the blog post from Vittorio, in that users of that library will be built with LIBRARY_HEADER_ONLY and will therefore #include the .cpp files.

        The library_srcs library causes the consumer to compile the .cpp files separately.

        A consumer of a library like this would then look like:

cmake_minimum_required(VERSION 3.3)
find_package(example_lib REQUIRED)
add_executable(myexe main.cpp)
# uncomment only one of these!
# target_link_libraries(myexe
#     example_lib::library_static)
# target_link_libraries(myexe
#     example_lib::library_shared)
# target_link_libraries(myexe
#     example_lib::library_iface)
# target_link_libraries(myexe
#     example_lib::library_srcs)

        So, it is up to the consumer how they consume the library, and they determine that by using target_link_libraries to specify which one they depend on.

        Qt Creator 4.1 RC1 released

        We are pleased to announce the release of Qt Creator 4.1.0 RC1.

        Read the beta blog post for an overview of the new features coming in the 4.1 release. Since then we have been fixing bugs and polishing the new features. We believe that what we have now is very close to what we can release as the final 4.1 release soon.

A few visual improvements that happened since the Beta are the addition of the dark and light Solarized editor color schemes, a new “Qt Creator Dark” editor color scheme as a companion for the new Flat Dark theme, and further polish for the new Flat Dark and Flat Light themes.

See the new themes in action:
Flat Dark Theme - Qt Creator 4.1
Flat Light Theme - Qt Creator 4.1

Now is the best time to head over to our download page or the Qt Account Portal, get and install the RC and give us your feedback, preferably through our bug tracker. You can also find us on IRC in #qt-creator on chat.freenode.net, and on the Qt Creator mailing list.

        Known issue #1
Unfortunately, an incompatibility between MSVC 2015 Update 3 and Clang 3.8 causes an issue with the Clang Static Analyzer plugin. We will try to iron this out before the final release.

        Known issue #2
        The changes-4.1.0 file in the source package does not contain what happened (and who additionally contributed) between 4.1.0-beta1 and this 4.1.0-rc1. Here is the updated changes-4.1.0.md

        The post Qt Creator 4.1 RC1 released appeared first on Qt Blog.

        Coin – Continuous Integration for Qt

        Testing is important for every software product, and Qt is no different. It is quite intriguing how much work goes into testing to ensure that a Qt release is the very best. Although I’m not directly involved with getting the release into your hand, I’ve lately learned a lot about the infrastructure (Continuous Integration or COIN) used by The Qt Company to build, test, and package. SHA-1s in this post are made up and resemble real SHA-1s only by coincidence (Coin-cidence?).

        Qt is supported on a variety of platforms, and we need to make sure that it actually works on all of them. It’s no longer realistic to expect that everyone contributing to Qt has all supported platforms available all the time. It would also take a tremendous amount of time to check that one patch does not break any code on any of the platforms. Therefore, we chose to build a centralized infrastructure commonly known as a continuous integration system that allows us to build and test changes on all platforms.

        Let’s assume we have a patch (or “change” in Gerrit terminology) that is approved,  and we want this change to become part of the official Qt release. After approval, Gerrit offers to “stage” each change that is targeted to a particular branch.
        This is where the continuous integration infrastructure becomes active. The system starts testing changes by moving them from the “staged” state to “integrating” in Gerrit. Integrating changes are being built on a variety of platforms and the automated tests of the module are run in succession. The system will finally approve or reject the tested change(s), which is again visible in Gerrit.

        We looked at various tools that allow continuous integration to be run in a convenient and easy fashion and eventually concluded that none of the existing tools really fit our needs. I’d like to go into more details of what we noted down as requirements, but in this post, I’ll focus on just one important aspect: modularization. With the advent of Qt 5, we modularized our code base (you can find many modules on code.qt.io), but until recently, we still tested as if Qt was just one monolithic blob.

        We improved the time for integrations by taking advantage of modularization. The idea is to build the module to be tested and its dependencies as needed. Let’s assume we want to change Qt Declarative, the module containing the Qt QML and Qt Quick libraries.  Coin keeps bare clones (see --bare in the git docs) of Qt’s Git repositories around and updates them as they change. With the copies of the repositories it can quickly find out about dependencies and provide the source code to the build machines. It checks the module for a “sync.profile” file containing the module’s dependencies (some of the details, such as the name, “sync.profile” and its syntax, will change in the future as we’re trying to make the files describing the dependencies nicer). In the case of qtdeclarative, we find qtxmlpatterns and qtbase are required to build the module. Both of these modules are then checked for the latest state of the respective branch.
        In the end, we have a list of modules and their SHA-1s as a tree structure. We find that we’ll need qtbase at abcdef, qtxmlpatterns at def123 and qtdeclarative at badbab.

        As Coin runs the integrations for qtbase, it’s bound to have built and tested qtbase/abcdef before. We keep a cache of recent builds that are tested successfully, so instead of re-building qtbase, we can simply get it from the cache, skipping the build entirely. For qtxmlpatterns, we check if a build of qtxmlpatterns/def123 with the exact same qtbase/abcdef is around, but assuming qtbase recently changed, it’s unlikely, even though qtxmlpatterns might be unchanged for a while. We want to guarantee that all modules are consistently built on top of the same base artifacts, thus qtxmlpatterns/def123 gets rebuilt, if it hasn’t already been built for that SHA-1 of qtbase. The qtdeclarative SHA-1 comes from the staging branch and will be new, so it will be built.
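The reuse rule described above amounts to a cache keyed on the module, its SHA-1, the platform and the exact SHA-1s of its dependency artifacts. A simplified illustration (names and structure are made up, not Coin's actual code):

```cpp
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Simplified illustration: an artifact is reusable only if the module, its
// SHA-1, the platform and all dependency SHA-1s match exactly.
struct BuildKey {
    std::string module;
    std::string sha1;
    std::string platform;
    std::vector<std::string> depSha1s;  // SHA-1s of the dependency builds

    bool operator<(const BuildKey &o) const {
        return std::tie(module, sha1, platform, depSha1s)
             < std::tie(o.module, o.sha1, o.platform, o.depSha1s);
    }
};

class ArtifactCache {
    std::set<BuildKey> built;
public:
    bool needsBuild(const BuildKey &key) const { return built.count(key) == 0; }
    void store(const BuildKey &key) { built.insert(key); }
};
```

With this rule, a new qtbase SHA-1 invalidates every key that mentions it, which is exactly why qtxmlpatterns gets rebuilt even when its own SHA-1 has not changed.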

Now the dependencies of the Git modules are clear. Coin looks up the list of platforms that the change is to be tested on; these are the reference platforms for the branch that we target. It then creates a lot of jobs – called work items inside the Coin code. Each work item is a build or test of a particular module on one platform. Build items create artifacts of the result; the compiled libraries and headers of the module are then added to the cache. For our example, the first round of build items will be qtbase/abcdef on all 27 platforms that are currently supported for the 5.7 branch. Then, there is a round of 27 qtxmlpatterns/def123 builds, each of them dependent on the build of qtbase/abcdef. After that, there are 27 qtdeclarative/badbab builds based on the qtxmlpatterns/def123 builds. Once the building is complete, testing for qtdeclarative/badbab finally begins on each respective build. For the three rounds of module builds plus one round of testing, we get (3 + 1) * 27 = 108 jobs which need to pass for a single change to make it into Qt. At this point we have all the work items which need to be processed.

The next step is actually running the work items. It starts with launching the qtbase builds. In our example these are done right away, because Coin finds previous artifacts that can be used: we just finished the first 27 items in no time (stat’ing a few files on disk). We create these jobs in order to have a system that can start with an empty hard disk; in that case it will start by creating the missing artifacts by compiling qtbase.

When a work item is finished, items depending on it start immediately. Coin creates virtual machines in our vSphere instance (or waits until it has the capacity to create and run them). Once we have a VM with the right OS running, we can launch the build on it. The build is just a list of instructions; if you look at our build logs (https://testresults.qt.io/coin), you’ll see them: set some environment variables, go to the right directory, run configure/make/nmake/jom, and others.
        Once all instructions have run, the result is compressed and uploaded. Some builds will finish earlier, for them we move on to the next module on the same platform while still waiting for others to finish building.

The qtdeclarative builds run in the same fashion, starting as soon as each dependency is done. Once the builds are done, the testing can start. It’s quite similar to the build step: Coin downloads the module that has now been built along with its dependencies. Test machines end up downloading the qtbase, qtxmlpatterns and qtdeclarative artifacts, plus the qtdeclarative sources (also a compressed file, fresh from our Git repository cache). After getting the needed artifacts, the machine runs instructions along the lines of “make check”. Assuming all of our 108 jobs finished successfully, Coin approves the build branch in Gerrit and the qtdeclarative repository is updated.

        The post Coin – Continuous Integration for Qt appeared first on Qt Blog.

        QRPC: A Qt remoting library

        This project of mine has been around for quite a while already. I’ve never much publicised it, however, and for the past year the code hasn’t seen any changes (until a few days ago, anyway). But related to my current job, I’ve found a new need for remote procedure calls, so I came back to the code after all.

        QRPC is a remoting library tightly integrated with Qt: in simplified terms, it’s an easy way to get signal-slot connections over the network. At least that was its original intention, and hence comes the name (QRPC as in “Qt Remote Procedure Call”).

But actually, it’s a little more than that. QRPC pipes the whole of Qt’s meta-object functionality over a given transport. That can be a network socket, a Unix domain socket or something more high-level like ZeroMQ. But it could, for example, also be a serial port (not sure why anybody would want that, though).
        The actual remoting system works as follows: A server marks some objects for “export”, meaning they will be accessible by clients. Clients can then request any such object by its identifier. The server then serialises the whole QMetaObject hierarchy of the exported object and sends it to the client. There, it is reconstructed and used as the dynamic QMetaObject of a specialised QObject. Thus, the client not only has access to signals and slots, but also to properties and Q_CLASSINFOs (actually anything that a QMetaObject has to offer). The property support also includes dynamic properties, not only the statically defined QMetaProperties.
        Method invocations, signal emissions and property changes are serialised and sent over the transport – after all, a QMetaObject without the dynamics isn’t that useful ;).
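To illustrate the client-side proxy idea in plain C++, here is a deliberately simplified stand-in, not QRPC's implementation (all names are made up): real QRPC reconstructs a full dynamic QMetaObject, while this sketch reduces a remote object to an int-valued property table with change callbacks.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a client-side proxy of a remote object: replicated
// property state plus change notifications, instead of a dynamic QMetaObject.
class RemoteObjectProxy {
    std::map<std::string, int> properties;  // simplified to int-valued properties
    std::vector<std::function<void(const std::string &, int)>> observers;
public:
    // Invoked when a serialised property change arrives over the transport.
    void applyRemoteChange(const std::string &name, int value) {
        properties[name] = value;
        for (auto &callback : observers) callback(name, value);
    }
    // Reads are served locally from the replicated state.
    int property(const std::string &name) const {
        auto it = properties.find(name);
        return it == properties.end() ? 0 : it->second;
    }
    // Stands in for connecting to a change signal on the proxy.
    void onChanged(std::function<void(const std::string &, int)> callback) {
        observers.push_back(std::move(callback));
    }
};
```

The real library generalises this to methods, signals, Q_CLASSINFOs and arbitrary registered types; the sketch only shows the replicate-and-notify shape of the mechanism.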

        To give an impression of how QRPC is used, here’s an excerpt from the example included in the repository:


Widget::Widget(QWidget *parent) :
    QWidget(parent), ui(new Ui::Widget),
    localServer(new QLocalServer(this)),
    exporter(new QRPCObjectExporter(this))
{
    ui->setupUi(this);
    // Export the QSpinBox with the identifier "spinbox".
    exporter->exportObject("spinbox", ui->spinBox);
    // Handle new connections, serve under the name "qrpcsimpleserver".
    connect(localServer, &QLocalServer::newConnection, this, &Widget::handleNewConnection);
    localServer->listen("qrpcsimpleserver");
}

void Widget::handleNewConnection()
{
    // Get the connection socket.
    QLocalSocket *socket = localServer->nextPendingConnection();
    // Create a transport...
    QRPCIODeviceTransport *transport = new QRPCIODeviceTransport(socket, socket);
    // ... and the server communicating over the transport, serving objects exported from the
    // exporter. Both the transport and the server are children of the socket so they get
    // properly cleaned up.
    new QRPCServer(exporter, transport, socket);
}


Widget::Widget(QWidget *parent) :
    QWidget(parent), ui(new Ui::Widget), socket(new QLocalSocket(this)), rpcClient(0)
{
    ui->setupUi(this);
    // ...
    // Connect to the server on button click.
    connect(ui->connect, &QPushButton::clicked, [this]() {
        socket->connectToServer("qrpcsimpleserver");
    });
    // Handle connection events.
    connect(socket, &QLocalSocket::connected, this, &Widget::connectionEstablished);
    connect(socket, &QLocalSocket::disconnected, this, &Widget::connectionLost);
}

void Widget::connectionEstablished()
{
    // Create a transport...
    QRPCIODeviceTransport *transport = new QRPCIODeviceTransport(socket, this);
    // ... and a client communicating over the transport.
    QRPCClient *client = new QRPCClient(transport, this);
    // When we receive a remote object (in this example, it can only be the spinbox),
    // synchronise the state and connect the slider to it.
    connect(client, &QRPCClient::remoteObjectReady, [this](QRPCObject *object) {
        connect(ui->horizontalSlider, SIGNAL(valueChanged(int)), object, SLOT(setValue(int)));
    });
    // Request the spinbox.
    // Clean up when we lose the connection.
    connect(socket, &QLocalSocket::disconnected, client, &QObject::deleteLater);
    connect(socket, &QLocalSocket::disconnected, transport, &QObject::deleteLater);
    // ...
}
            // ...

        Argument or property types have to be registered metatypes (same as when using queued connections) and serialisable with QDataStream. The QDataStream operators also have to be registered with Qt’s meta type system (cf. QMetaType docs).
At the moment, one shortcoming is that references to QObjects (i.e. “QObject*”) can’t be transferred over the network, even if the relevant QObjects are “exported” by the server. Part of the problem is that Qt doesn’t allow us to register custom stream operators for “QObject*”. This could be solved by manually traversing the serialised values, but that would negatively impact performance. Usually one can work around that limitation, though.

        Furthermore, method return values are intentionally not supported. Supporting them would first require a more complex system to keep track of state and secondly impose blocking method invocations, both of which run contrary to my design goals.

        In some final remarks, I’d like to point out two similar projects: For one, there’s QtRemoteObjects in Qt’s playground which works quite similarly to QRPC. I began working on QRPC at around the same time as QtRemoteObjects was started and I didn’t know it existed until quite some time later. Having glanced over its source code, I find it to be a little too complex and doing too much for my needs without offering the same flexibility (like custom transports). I must admit that I haven’t checked it out in more detail, though.
        Then there’s also QtWebChannel which again offers much of the metaobject functionality over the web, but this time specifically geared towards JavaScript/HTML5 clients and Web apps. I thought about reusing part of its code, but ultimately its focus on JS was too strong to make this a viable path.

        Collecting network traffic, ØMQ and packetbeat

As part of running infrastructure, it might make sense or even be required to store logs of transactions. A good way is to capture the raw, unmodified network traffic. For our GSM backend this is what we have to do, and I wrote a client that uses libpcap to capture data and send it to a central server for storing the trace. The system is rather simple and in production at various customers. The benefits of having a central server include access to a lot of storage without granting too many systems and users access, central log rotation and compression, an easy way to grab all relevant traces, and more.

Recently the topic of doing real-time processing of captured data came up. I wanted to add some kind of side channel that distributes data to interested clients before writing it to the disk. E.g. one might analyze an RTP audio flow for packet loss and jitter without actually storing the personal conversation.

I didn't create a custom protocol but decided to try ØMQ (ZeroMQ). It has many built-in strategies (publish/subscribe, round-robin routing, pipeline, request/reply, proxying, ...) for connecting distributed systems. The framework abstracts DNS resolution, connect and re-connect, and makes it very easy to build the standard message exchange patterns. I opted for the publish/subscribe pattern because the collector server (acting as publisher) does not care if anyone is consuming the events or data. The messages I send are quite simple as well. There are two kinds of multi-part messages, one for events and one for data. A subscriber can easily filter for events or data, and filter for a specific capture source.
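The filtering works because ZeroMQ SUB sockets match by prefix: a message is delivered when one of the socket's subscription strings is a prefix of the message's first frame. A minimal sketch of that matching rule (the topic strings below are made up, not the collector's actual frame names):

```cpp
#include <string>

// ZeroMQ-style subscription filtering: a SUB socket receives a message when
// the subscription string is a prefix of the first frame (the topic).
bool matchesSubscription(const std::string &subscription, const std::string &topic) {
    return topic.compare(0, subscription.size(), subscription) == 0;
}
```

Subscribing to the empty string therefore delivers everything, which is how a consumer would receive both the event and the data messages at once.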

The support for ZeroMQ was added in two commits. The first adds basic ZeroMQ context/socket support and configuration, and the second sends out the events and data in a fire-and-forget manner. In a simple test set-up it seems to work just fine.

Since moving to Amsterdam I try to attend more meetups. Recently I went to a talk at the local Elasticsearch group and found out about packetbeat. It is a program written in Go that uses a PCAP library to capture network traffic, has protocol decoders written in Go that do IP re-assembly and decoding, and uploads the extracted information to an instance of Elasticsearch. In principle it sits somewhere between my PCAP system and a distributed Wireshark (without the same number of protocol decoders). In our network we wouldn't want the edge systems to talk directly to the Elasticsearch system, and I wouldn't want to run decoders as root (or at least with extended capabilities).

As an exercise to learn a bit more about the Go language, I tried to modify packetbeat to consume trace data from my new data interface. The result can be found here, and I do understand (though I am still hooked on Smalltalk/Pharo) why a lot of people like Go. The built-in fetching of dependencies from GitHub is very neat, and the module and interface/implementation approach is easy to comprehend and powerful.

The result of my work allows a setup like in the picture below. First we centralize traffic capturing at the pcap collector, and then packetbeat picks up the data, decodes it and forwards it for analysis into Elasticsearch. Let's see if upstream merges my changes.

        Injeqt 1.1 and testing Qt applications

I've developed Injeqt as a way to improve Kadu's quality. Its code was filled with singletons and hidden dependencies. Now, with Injeqt 1.1 released and Kadu 4.0 getting close to release, I can say that I've fulfilled my goals.

The first step in making Kadu testable was to get rid of singletons and replace them with injected objects. With a quick grep I discovered that the phrase ::instance() had more than 2600 occurrences, with more than 150 singleton classes in the core alone (not counting plugins). It took me almost two months just to add Injeqt setters to each class that used these singletons and fix things that broke during that phase.

One immediately visible benefit of this effort was that dependencies between classes suddenly became obvious (just look at the header file for INJEQT_SET setters). And what follows from that: the realization that some of them do not make any sense at all, and that some classes have way too many dependencies (several classes with 10 or more dependencies and tens of classes with over 5). So it's a good starting point for refactoring.

But you can't just take a 500000-line project and refactor it. So I did what I think is the most reasonable way to do it: refactor as you go (and add unit tests!).

        This brings us to the core of this post - how easy it is to test classes and services using Injeqt.

The first feature that I've added with this mindset was JumpList support on Windows. Jump lists are additional menu items that show in the context menu over taskbar buttons, just like here:

The idea was simple: for Kadu, it should display all currently open and recently used chats in two groups. This feature requires two services, OpenChatRepository and RecentChatRepository; these hold lists of open/recent chats (in reality just wrapped sets with signals). As these classes didn't exist at the time, I implemented them and added tests (as these classes do not depend on anything, Injeqt is not used there).

Then came the more interesting things: a JumpList abstract class to act as an adaptor for Qt's QWinJumpList, and WindowsJumpListService to handle the chat repositories and the JumpList instance. Thanks to JumpList being an abstract interface, it was easy to create JumpListStub and test the whole thing in the windows-jump-list-service.test.cpp file. The core of these tests is in the makeInjector method, which contains a private class module with all the classes required to execute the tests - including our stub class:

injeqt::injector WindowsJumpListServiceTest::makeInjector() const
{
    class module : public injeqt::module
    {
    public:
        module()
        {
            add_type<JumpListStub>();
            add_type<OpenChatRepository>();
            add_type<RecentChatRepository>();
            add_type<WindowsJumpListService>();
        }
    };

    auto modules = std::vector<std::unique_ptr<injeqt::module>>{};
    modules.emplace_back(std::make_unique<module>());
    return injeqt::injector{std::move(modules)};
}

So I was able to recreate all the required dependencies for WindowsJumpListService with ease. If OpenChatRepository or RecentChatRepository had their own dependencies, it would be necessary to mock them too, but fortunately that was not the case.

If you would like to try Injeqt in your own project, or if you just have any questions, feel free to send me an email and I'll be happy to answer.

        Release 2.9.0: V-Play Multiplayer Launch

        V-Play Multiplayer is officially released.

        Following its successful beta release, you can now access V-Play Multiplayer when you update V-Play – even as a free user. This new feature can be integrated into your game with less than 100 lines of code!

        V-Play Multiplayer allows you to create real-time and turn-based multiplayer games for all supported V-Play platforms. This includes iOS, Android and Windows Phone, as well as Windows, Linux and Mac OS X. You can also enjoy an intelligent matchmaking system with ELO rating, an interactive chat feature, push notifications, cloud synchronization of player profiles & many social features.

        V-Play Multiplayer has already been used to launch a successful 4-player card game, ONU, developed internally by V-Play. ONU has been available on the App Store and Google Play Store since the beginning of July and has garnered 100,000+ downloads within the first 3 weeks of its release. This is thanks to the multiplayer feature and word-of-mouth marketing. The player retention rates and engagement metrics are also way above industry standards, thanks to the multiplayer features.

        The full source code for ONU, based on the popular card game UNO, is available for free in the V-Play SDK. As a developer, you can use the full source code as a best practice for multiplayer integration and create your own multiplayer games within a few days.


        To get V-Play Multiplayer, update to the latest version of V-Play or download it here for free!

        Update Now!

        Real-Time & Turn-Based Multiplayer Support

        V-Play Multiplayer supports both real-time and turn-based gameplay, so you can use it to make many different types of multiplayer games. It’s perfect for making player-vs-player games like ‘Words with Friends’ or games with a large amount of players, such as ‘Clash of Clans’.

        Many of the most successful mobile games are intended for a multiplayer audience, such as ‘Draw Something’ and ‘Hay Day’. With V-Play Multiplayer, you can now rival these games and make whatever kind of multiplayer game you like. It’s also perfect for making games like:

        • Hearthstone: Heroes of Warcraft
        • Quiz Up
        • Pokemon Go
        • NBA Jam
        • Worms 4
        • Modern Combat 4
        • Minecraft: Pocket Edition
        • Muffin Knight

        Best of all, your multiplayer game will work on iOS, Android and Desktop devices using a single codebase. You can code your game once and publish it anywhere! See the new feature in action here:

        Matchmaking & ELO Rating System

        V-Play Multiplayer includes a matchmaking feature that allows you to play with your friends or join already running games. It also includes an ELO Rating System that matches players against opponents of a similar skill level to ensure competitive gameplay.


        The matchmaking system also helps you to create games with specific opponents, like your friends. This feature is sure to boost your player engagement as players can enjoy regular gameplay with their own friends, as well as millions of players worldwide.

        ELO Rating System

        The ELO rating system ensures you play with opponents of a similar skill level. This is a great tool for improving your games’ retention rates. With this ELO Rating System in place, players of all levels are guaranteed challenging gameplay for as long as possible. This means you can create a game that continues to entertain weeks after a player’s first engagement.

        Social Features: Friend System & Leaderboards

        V-Play Multiplayer lets you add your friends or make new friends with players you meet in-game. To make playing with friends easier, you see your friends at the top of the players list when starting a new game.

        player selection

You can also compare your highscore with your friends’ in a dedicated Friends leaderboard. The standard leaderboard shows players from all over the world, even on different platforms.


        These leaderboards are sure to increase your retention and engagement rates as players compete against friends and international players to reach the top of the rankings.

        Messaging: Interactive Chat & Push Notifications

        V-Play Multiplayer features a messaging system so you can chat with your friends, even if they’re not online. This makes it easy to discuss games or arrange future matches. Best of all, V-Play Multiplayer sends Push Notifications to players when they receive new messages. Push Notifications are sent when players receive a game invite or a new friend request. This means your players can engage each other at any time, whether they’re online or not. This is a simple but effective way to boost user engagement within your game.

        push notifications

        These Push Notifications enable you to use a ‘Late Join’ feature within the matchmaking system. When creating a new game, you can invite your offline friends and begin without waiting. They will receive a Push Notification saying “Let’s Play a Game!” while you play, and they can join you when they’re ready. Once they click the blue play button in the interactive chat, they’ll be brought straight to the game room. Players never need to wait or miss out on the fun again!

        push 2

        Player Profiles & Cloud Synchronization

        V-Play Multiplayer lets you create your own player profile. You can upload a profile picture, set a username and decide if your national flag is displayed. It’s simple to do and no additional logins are required.

        Furthermore, all player data, such as high scores or earned achievements, get synced across platforms and devices thanks to built-in cloud synchronization. This enables you to use your player profile across as many devices as you like. So you can start playing a game on your iPad and continue where you left off on your Android phone.


        The player profile gives users personalization options and leads to a much improved player experience. As a community builds around your game, it allows players to stand out on leaderboards and create their own in-game identity.

        In-Game Chat

        V-Play Multiplayer allows players to communicate during gameplay with an in-game chat feature. Players can use it to discuss game results with one another, message their friends or chat about the latest news with people from all over the world, right in a running game.

        Chat Window

        The in-game chat feature adds a strong social element to any multiplayer game and creates an engaging experience for young and old players alike.

        Cross-Platform Multiplayer Compatibility

        Just like all V-Play features, V-Play Multiplayer is a cross-platform solution. Not only will your game work on both iOS and Android devices, but players on iOS can also play against Android users and vice versa.


        V-Play Multiplayer also works on Windows Phone and Desktop platforms, including Windows, Mac OS X and Linux.

        Easy Integration

        The V-Play Multiplayer component can be included in your game with less than 100 lines of code. This means you can integrate this feature into your game in less than 10 minutes.

        You can use the documentation found here to find out how to add this feature to your game. Just copy and paste the code snippet into the GameWindow of your game to add the V-Play Multiplayer feature!

        You can see a preview of the V-Play Multiplayer code and a sample multiplayer application here:

        Multiplayer Example

         import VPlay 2.0
         import QtQuick 2.0

         GameWindow {
           id: gameWindow

           VPlayGameNetwork {
             id: gameNetwork
             gameId: 285 // create your own gameId in the Web Dashboard
             secret: "AmazinglySecureGameSecret"
             multiplayerItem: multiplayer
           }

           VPlayMultiplayer {
             id: multiplayer
             appKey: 'dd7f1761-038c-4722-9f94-812d798cecfb'
             gameNetworkItem: gameNetwork
             multiplayerView: multiplayerView
             playerCount: 2
             onGameStarted: { // this signal is emitted when the multiplayer game starts
               matchmakingScene.visible = false // hide the matchmaking scene
               gameScene.visible = true // and show the game scene instead
             }
           }

           Scene { // our matchmaking scene
             id: matchmakingScene
             VPlayMultiplayerView { // adds the default multiplayer UI
               id: multiplayerView
             }
           }

           Scene { // our game scene
             id: gameScene
             visible: false // hidden at startup, shown when the multiplayer game starts
             StyledButton {
               anchors.centerIn: parent
               // we use property bindings to check if it's our turn
               enabled: multiplayer.myTurn // enable this button only if it is my turn
               color: multiplayer.myTurn ? "lightgreen" : "grey"
               text: multiplayer.amLeader ? "PING" : "PONG" // host: "PING", player: "PONG"
               onClicked: {
                 multiplayer.triggerNextTurn() // finished turn on button click
               }
             }
           }
         }

        ONU – Free & Open-Source Multiplayer Example Game

        To help make your own multiplayer games, you can take a look at this new game example in the V-Play Sample Launcher and in the free V-Play SDK. ONU is a turn-based card game for up to 4 players and has similar gameplay to the popular card game UNO. This sample will show you how to introduce the V-Play Multiplayer component to your own game.


        You can play ONU and watch the multiplayer features in action, even without downloading the V-Play SDK. It’s available now on the App Store and Google Play.

        App Store Google Play

        Other Multiplayer Examples

        There are more multiplayer examples with full source code available to help you to get started:

        You can find these examples in the free V-Play SDK in the folder <Your V-PlaySDK>/Examples/V-Play/examples/multiplayer.

        For the full changelog of release 2.9.0, see here.


        How to Update

        Test out these new features by following these steps:
        Step 1

        Open the V-Play SDK Maintenance Tool in your V-Play SDK directory. Choose “Update components” and finish the update process to get V-Play 2.9.0, as described in the V-Play Update Guide.

        If you haven’t installed V-Play yet, you can do so now with the latest installer from here.

        Step 2

        The V-Play Sample Launcher allows you to quickly test and run all the open-source examples and demo apps & games that come with the V-Play SDK, from a single desktop application.

        After installing V-Play, you can start the V-Play Sample Launcher from the application shortcut in your V-Play SDK directory.

        Sample Launcher-v-play

        Now you can explore all of the new features included in V-Play 2.9.0!



        More Posts like This

        How to Make a Game like Super Mario Maker with Our New Platformer Level Editor

        super mario level editor blog

        16 Great Sites Featuring Free Game Graphics for Developers

        game graphics

        The 13 Best Qt, QML & V-Play Tutorials and Resources for Beginners

        tutorials capture

        21 Tips That Will Improve Your User Acquisition Strategy

        User Acquisition

        The post Release 2.9.0: V-Play Multiplayer Launch appeared first on V-Play Engine.

        snappy sensors

        Sensors are an important part of IoT. Phones, robots and drones all have a slew of sensors. Sensor chips are everywhere, doing all kinds of jobs to help and entertain us. Modern games and game consoles can thank sensors for some wonderfully active games.

        Since I became involved with sensors and wrote QtSensorGestures as part of the QtSensors team at Nokia, sensors have only gotten cheaper and more prolific.

        I used Ubuntu Server, snappy, a Raspberry Pi 3, and the senseHAT sensor board to create a senseHAT sensors snap. Of course, this currently only runs in devmode on the Raspberry Pi 3 (and Pi 2 as well).

        To future-proof this, I wanted to get sensor data all the way up to QtSensors, for future QML access.

        I now work at Canonical. Snappy is new and still in heavy development, so I did run into a few issues. First up: QFactoryLoader, which finds and loads plugins, was not looking in the correct spot. For some reason, it uses $SNAP/usr/bin as its QT_PLUGIN_PATH. I got around this for now by using a wrapper script and setting QT_PLUGIN_PATH to $SNAP/usr/lib/arm-linux-gnueabihf/qt5/plugins

        The second issue was that QSensorManager could not see its configuration file in /etc/xdg/QtProject, which is not accessible to a snap. So I used the wrapper script to set XDG_CONFIG_DIRS to $SNAP/etc/xdg

        [NOTE] I just discovered there is a part named "qt5conf" that can be used to set up Qt's env vars by using the included command qt5-launch to run your snap's commands.

        Since there is no libhybris in Ubuntu Core, I had to decide which QtSensors backend to use. I could have used sensorfw, or maybe iio-sensor-proxy, but RTIMULib already worked for the senseHAT. It was easier to write a QtSensors plugin that used RTIMULib than to add it into sensorfw. iio-sensor-proxy is more for laptop-like machines and lacks many sensors.
        RTIMULib uses a configuration file that needs to be in a writable area, to hold additional device-specific calibration data. Luckily, one of its functions takes a directory path to look in. Since I was creating the plugin, I made it use a new variable, SENSEHAT_CONFIG_DIR, which I could then set up in the wrapper script.
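        Putting those fixes together, the wrapper script might look roughly like this. This is only a sketch: QT_PLUGIN_PATH and XDG_CONFIG_DIRS are standard Qt/XDG variables and SENSEHAT_CONFIG_DIR is the custom one described above, but the snap name, binary name and fallback paths are assumptions.

```shell
#!/bin/sh
# Hypothetical wrapper script for the senseHAT snap.
SNAP="${SNAP:-/snap/sensehat/current}"
SNAP_DATA="${SNAP_DATA:-/var/snap/sensehat/current}"

# Point Qt's plugin loader at the snap's Qt plugin directory
export QT_PLUGIN_PATH="$SNAP/usr/lib/arm-linux-gnueabihf/qt5/plugins"

# Let QSensorManager find its configuration under $SNAP/etc/xdg/QtProject
export XDG_CONFIG_DIRS="$SNAP/etc/xdg"

# Writable location for RTIMULib's calibration data
export SENSEHAT_CONFIG_DIR="$SNAP_DATA"

# "sensehat" is a placeholder for the snap's actual command
if [ -x "$SNAP/usr/bin/sensehat" ]; then
    exec "$SNAP/usr/bin/sensehat" "$@"
fi
```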

        This also runs in confinement without devmode, but requires a simple sensors snapd interface.
        One of the issues I can already see is that there are a myriad of ways to access sensors: different kernel interfaces (iio, sysfs, evdev) and different middleware (Android SensorManager/hybris, libhardware/hybris, sensorfw and others I either cannot speak of or do not know about).

        Once the snap goes through review, it will live at https://code.launchpad.net/~snappy-hwe-team/snappy-hwe-snaps/+git/sensehat, but for now, the working code is at my sensehat repo.

        Next up to snapify: the Matrix Creator sensor array! Perhaps I can use my sensorfw snap or iio-sensor-proxy snap for that.

        Swift 4.0 (Beta): What’s new

        Those of you keeping an eye on the Swift or Isode Twitter accounts will have noticed that a beta release of the new Swift 4.0 is now available for download.

        Swift 4.0 includes a number of important functional changes compared to Swift 3.0 as well as a significant change to the look and feel of the product.

        The main changes are listed in the changelog but there are two big changes that you’ll notice immediately on launching this Swift beta:

        Better Chat Monitoring

        Swift already makes it very easy to monitor events in multiple chat rooms through the use of keyword highlighting rules. In response to requests from a number of users we’ve supplemented this with the addition of a “trellis” layout option, allowing multiple chats and rooms to be tiled instead of being exclusively displayed as tabs within a single window.

        Trellis Layout

        This new option (Change Layout from the View menu) allows the user to define the number and arrangement of tiles to be displayed simultaneously and then move chats or rooms into an appropriate position. The trellis layout option and the existing tabbed layout option can be flexibly combined.

        New Chat Design

        We’ve introduced a new, cleaner chat design which we believe will enable users (especially in MUC rooms) to keep better track of their own contributions to conversations. It also allows for better display of message receipts and better indication of unread messages.

        The Swift 4.0 beta is available for Windows, Mac OS X, Ubuntu & Debian Linux. Please email (swift@swift.im) or tweet (@swift_im) any feedback you have to help us further improve Swift.

        Different malloc implementations for Qt apps

        TL;DR: tcmalloc is faster at startup than the normal malloc when tested with a sample Qt app on an embedded device.
        The system under test was a sample automotive app: the Qt application manager with the Neptune UI, running embedded Linux on a Boundary Devices Nitrogen 6 Quadcore (ARMv7) with 1 GB of memory:
        The malloc implementations tested were:
        • the standard malloc implementation from glibc
        • tcmalloc (part of Google perftools)
        • jemalloc (used in FreeBSD, among others)
        Unfortunately, using jemalloc resulted in a bus error, so there are no results for that implementation. The error backtraced to C++11 atomics and was not investigated further.
        Plugging different malloc implementations into a Qt application is easy by using LD_PRELOAD. Furthermore, to simulate a cold boot, the Linux caches were cleared before each run with the following command:
        echo 3 > /proc/sys/vm/drop_caches
        This makes Linux drop its file system caches, which makes the startup slower and thereby makes the relative speedup from different mallocs smaller. However, it is closer to the real-world scenario of powering on a device.
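        A measurement run can then be sketched as a small helper that drops the caches and preloads the chosen allocator. The tcmalloc library path and the appman invocation in the comment are assumptions and vary per distribution; dropping caches requires root and is skipped silently otherwise.

```shell
#!/bin/sh
# Run a command with a given malloc implementation preloaded,
# after dropping the file system caches (cold-boot simulation).
run_with_malloc() {
    lib="$1"; shift
    sync
    { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
    LD_PRELOAD="$lib" "$@"
}

# Example invocation (paths are illustrative):
#   run_with_malloc /usr/lib/libtcmalloc.so.4 appman -c am-config.yaml
run_with_malloc "" echo "app started"
```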
        The measured value was the startup time, i.e. the time from the beginning of the main() function until the first time something is drawn onto the screen (i.e. the first time the frameSwapped() signal of QQuickWindow is called). This yields a roughly 300 ms speedup when using tcmalloc:
        This seems like a good speedup, since the startup of a QML app consists of parsing many QML files, so the amount of waiting for file I/O might be considerable. It would be interesting to check whether the relative speedup is higher when using the QML compiler.
        Memory usage is roughly the same, with a slight advantage for standard malloc:
        The measured value was the "Rss" field of the proc file system ("grep Rss: /proc/`pidof appman`/smaps") to only measure the amount of memory actually present in RAM, as opposed to the reserved size.
        Another pitfall when measuring memory usage is to only look at the "heap" sections of smaps, which apparently only tracks memory allocations made via the (s)brk commands, while anonymously mmap'ed pages are not marked with "heap".
        The "other" value in the diagram above includes sections of shared libraries and mmap'ed files.
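        The memory measurement itself boils down to summing the Rss lines of smaps. A sketch, shown against the current process; for the setup above you would read /proc/$(pidof appman)/smaps instead:

```shell
# Sum every Rss line in smaps to get the total resident set size in kB.
rss_kb=$(grep Rss: /proc/self/smaps | awk '{ sum += $2 } END { print sum }')
echo "resident: $rss_kb kB"
```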
        What has not been measured was memory usage and fragmentation when running the program for a longer time, which seems to be the focus of jemalloc.
        A big thanks goes to Pelagicore for supplying the hardware and helping with the appman installation.

        Connectivity options for mobile M2M/IoT/Connected devices

        Many of us deal, or will deal, with (connected) M2M/IoT devices. This might mean writing firmware for microcontrollers, using an RTOS like NuttX or a full-blown Unix(-like) operating system like FreeBSD or Yocto/Poky Linux, creating and building code to run on the device, processing data in the backend, or something in between. Many of these devices will have sensors to collect data: GNSS position/time, temperature, light, acceleration, airplane sightings, lightning strikes, etc.

        The backend problem is work, but mostly "solved". One can rely on something like Amazon IoT or create a powerful infrastructure using many of the FOSS options for message routing, data storage, indexing and retrieval in C++. In this post I want to focus on the little detail of how data gets from the device to the backend.

        To make this thought experiment a bit more real, let's imagine we want to build a bicycle lock/tracker. Many of my colleagues ride their bicycles to work, and bikes being stolen remains a big tragedy. So the primary focus of such an IoT device would be to prevent theft (make other bikes an easier target), to make selling a stolen bicycle more difficult (e.g. by making it easy to check if something has been stolen) and, in case it has been stolen, to make it easier to find its current location.


        Let's assume two different architectures. One possibility is to have the bicycle actively acquire the position and then try to push this information to a server ("active push"). Another approach is to have fixed installed scanning stations or users to scan/report bicycles ("passive pull"). Both lead to very different designs.

        Active Push

        The system would need some sort of GNSS module, a microcontroller or a full-blown SoC to run Linux, an accelerometer and maybe more sensors. It should somehow fit into an average bicycle frame, have good antennas to work from inside the frame, last/work for the lifetime of a bicycle and, most importantly, have a way to bridge the air gap from the bicycle to the server.

        Push architecture

        Passive Pull

        The device would not know its position or whether it is being moved. It might be a simple barcode/QR code/NFC tag/iBeacon/etc. In the case of a barcode, it could encode the serial number of the frame and some owner/registration information. In the case of NFC, it should be a randomized serial number (if possible, to increase privacy). Users would scan the barcode/QR code, and an application would annotate the found bicycle with the current location (cell towers, Wi-Fi networks, WGS 84 coordinate) and upload it to the server. For NFC, a smartphone might be able to scan the tag, and one can try to put readers at busy locations.

        The incentive for the app user is feeling good about collecting points for scanning bicycles, and maybe some rewards if a stolen bicycle is found. Buyers could easily check whether a bicycle was reported as stolen (not considering the difficulty of how to establish ownership).

        Pull architecture

        Technology requirements

        The technologies that come to my mind are barcode, QR code, playing some humanly inaudible noise and decoding it in an app, NFC, ZigBee, 6LoWPAN, Bluetooth, Bluetooth Smart, GSM, UMTS, LTE and NB-IoT. Next I will look at the main differentiators/constraints of these technologies, provide a small explanation for each, and finish with how these constraints interact with each other.

        World wide usable

        Radio technology operates on a specific set of radio frequencies (bands). Each country may manage these frequencies separately, which can mean having to use the same technology on different bands depending on the current country. This increases the complexity of the antenna design (or requires multiple antennas), makes the mechanical design more complex, and makes software and production testing more difficult. There might also be multiple users/technologies on the same band (e.g. Wi-Fi + Bluetooth, or just too many Wi-Fi networks).

        Power consumption

        Each radio technology requires broadcasting and might require listening or permanently monitoring the air for incoming messages ("paging"). With NFC the scanner might be able to power the device, but for other technologies this is unlikely to be true. One will need to define the lifetime of the device and the size of the battery, or look into ways of replacing/recycling batteries or charging them.


        Range

        Different technologies were designed to work with sender/receiver at different minimum/maximum distances (and speeds, but that is not relevant for the lock, nor is bandwidth for our application). E.g. with Near Field Communication (NFC) the workable range is centimeters, while with GSM it will be many kilometers, and with UMTS the cell size depends on how many phones are currently using it (the cell is "breathing").

        Pick two of three

        Ideally we want something that works over long distances, requires no battery to send/receive and still pushes position/acceleration/event reports out to servers. Sadly, this is not how reality works, and we will have to set priorities.

        The more bands there are to support, the more complicated the antenna design, production, calibration and testing become. It might be that one technology does not work in all countries, or that it is not equally popular, or that the market situation differs, e.g. some cities have city-wide public hotspots, some don't.

        Higher-power transmission increases the range but increases the power consumption even more. More current is drawn during transmission, which requires a better hardware design to buffer the spikes, a bigger battery and ultimately a way to charge or efficiently replace batteries.

        Given these constraints, it is time to explore some technologies. I will use the ones already mentioned at the beginning of this section.


        | Technology | Bands | Global coverage | Range | Battery needed | Scan device needed | Cost of device | Arch. | Comment |
        | Barcode/QR code | Optical | Yes | Centimeters | No | App scanning barcode required | Extremely low | Pull | Sticker needs to be hard to remove and visible, maybe embedded into the frame |
        | Play audio | Non-human-hearable audio | Yes | Centimeters | Yes | App recording audio | Moderate | Pull | Button to play audio? |
        | NFC | 13.56 MHz | Yes | Centimeters | No | Yes | Extremely low | Pull | Privacy issues |
        | RFID | Many | Yes, but not on a single band | Centimeters to meters | Yes | Receiver required | Low | Pull | Many bands, specific readers needed |
        | Bluetooth LE | 2.4 GHz | Yes | Meters | Yes | Yes, but common | Low | Pull/Push | Competes with Wi-Fi for spectrum |
        | ZigBee | Multiple | Yes, but not on a single band | Meters | Yes | Yes | Mid | Push | Not commonly deployed, software more involved |
        | 6LoWPAN | Like ZigBee | Like ZigBee | Meters | Yes | Yes | Low | Push | Uses the ZigBee physical layer and then IPv6; requires 6LoWPAN-to-Internet translation |
        | GSM | 800/900, 1800/1900 | Almost global, besides South Korea, Japan, some islands | Kilometers | Yes | No | Moderate | Push | Almost global coverage, direct communication with the backend possible |
        | UMTS | Many | Less than GSM, but covers South Korea, Japan | Meters to kilometers, depending on usage | Yes | No | High | Push | Higher power usage than GSM, higher device cost |
        | LTE | Many | Less than GSM | Designed for kilometers | Yes | No | High | Push | Expensive, higher power consumption |
        | NB-IoT (LTE) | Many | Not deployed | Kilometers | Yes | No | High | Push | Not deployed yet, coming in the future; can be embedded into a GSM or LTE carrier equally well |


        Both a push and a pull architecture seem feasible, and they create different challenges and possibilities. A pull architecture will require at least smartphone app support and maybe a custom receiver device. It will only work in regions with lots of users, and making privacy violations and tracking more difficult is something to solve.

        For a push architecture, using GSM is a good approach. If coverage in South Korea or Japan is required, a mix of GSM/UMTS might be an option. NB-IoT seems nice, but right now it is not deployed and it is not clear if a module will require less power than a GSM module. NB-IoT might only be in the interest of base station vendors (the future will tell). Using GSM/UMTS brings its own set of problems on the device side, but that is for other posts.

        GammaRay 2.5 release

        GammaRay 2.5 has been released, the biggest feature release yet of our Qt introspection tool. Besides support for Qt 5.7, and in particular the newly added Qt 3D module, a slew of new features awaits you, such as access to QML context property chains and type information, object instance statistics, support for inspecting networking and SSL classes, and runtime-switchable logging categories.

        We also improved much existing functionality, such as the object and source code navigation and the remote view. We enabled recursive access to value type properties and integrated the QPainter analyzer into more tools.

        GammaRay is now also commercially available as part of the Qt Automotive suite, which includes integration with QtCreator for convenient inspection of embedded targets using Linux, QNX, Android or Boot2Qt.

        Download GammaRay

        The post GammaRay 2.5 release appeared first on KDAB.

        KDStateMachineEditor 1.1.0 released

        KDStateMachineEditor is a Qt-based framework for creating Qt State Machine metacode using a graphical user interface. It works on all major platforms and is now available as part of the Qt Auto suite.

        The latest release of KDAB’s KDStateMachineEditor includes changes to the view, the API and the build system.


        View

        • Button added to show/hide transition labels
        • Now using native text rendering
        • Status bar removed

        API

        • API added for context menu handling (cf. StateMachineView class)

        Build system

        • Toolchain files added for cross-compiling (QNX, Android, etc.)
        • Compilation with namespaced Qt enabled
        • Build with an internal Graphviz build allowed (-DWITH_INTERNAL_GRAPHVIZ=ON)

        KDStateMachineEditor works on all major platforms and has been tested on Linux, OS X and Windows.

        Prebuilt packages for some popular Linux distributions can be found here.

        A Homebrew recipe for OS X users can be found here.

        The post KDStateMachineEditor 1.1.0 released appeared first on KDAB.

        KDAB contributions to Qt 5.7

        Hello, and welcome to the usual appointment with a new release of Qt!

        Qt 5.7 has just been released, and once more, KDAB has been a huge part of it (we are shown in red on the graph):

        Qt Project commit stats, up to June 2016. From http://www.macieira.org/blog/qt-stats/


        In this blog post I will show some of the outstanding contributions by KDAB engineers to the 5.7 release.

        Qt 3D

        The star of Qt 5.7 is the first stable release of Qt 3D 2.0. The new version of Qt 3D is a total redesign of its architecture into a modern and streamlined 3D engine, exploiting modern design patterns such as entity-component systems, and able to scale thanks to its heavily threaded design. This important milestone was the result of a massive effort by KDAB in coordination with The Qt Company.


        If you want to know more about what Qt 3D can do for your application, you can watch this introductory webinar recorded by KDAB’s Dr. Sean Harmer and Paul Lemire for the 5.7 release.

        Qt on Android

        Thanks to KDAB’s BogDan Vatra, this release of Qt saw many improvements to its Android support. In no particular order:

        • Qt can now be used to easily create Android Services, that is, software components performing background tasks that are kept alive even when the application that started them exits. See here for more information.
        • The QtAndroidExtras module gained helper functions to run Runnables on the Android UI thread. They are extremely useful for accessing Android APIs from C++ code that must run on the Android UI thread. More info about this is available in this blog post by BogDan.
        • Another addition to the QtAndroidExtras module is the QtAndroid::hideSplashScreen function, which allows a developer to programmatically hide the splash screen of their applications.
        • The QtGamepad module gained Android support.

        Performance and correctness improvements

        A codebase as big as Qt needs constant fixes, improvements and bugfixes. Sometimes these come from bug reports, sometimes by reading code in order to understand it better, and in some other cases by analyzing the codebase using the latest tools available. KDAB is committed to keeping Qt in a great shape, and that is why KDAB engineers spend a lot of time polishing the Qt codebase.

        Some of the results of these efforts are:

        • QHash gained equal_range, just like QMap and the other STL associative containers. This function can be used to iterate over all the values of a (multi)hash that have the same key, without performing any extra memory allocation. In other words, this code:

          // BAD: allocates a temporary QList
          // for holding the values corresponding to "key"
          foreach (const auto &value, hash.values(key)) {
              // ...
          }

          can be changed to

          const auto range = hash.equal_range(key);
          for (auto i = range.first; i != range.second; ++i) {
              const auto &value = i.value();
              // ...
          }

          which never throws (if hash is const), expands to less code and does not allocate memory.

        • Running Qt under the Undefined Behavior Sanitizer revealed dozens of codepaths where undefined behaviour was accidentally triggered. The problems ranged from potential signed integer overflows and shift of negative numbers to misaligned loads, invalid casts and invalid calls to library functions such as memset or memcpy. KDAB’s Senior Engineer Marc Mutz contributed many fixes to these undefined behaviours, fixes that made their way into Qt 5.6.1 and Qt 5.7.
        • Some quadratic loops were removed from Qt and replaced with linear or linearithmic ones. Notably, an occurrence of such loops in the Qt Quick item views caused massive performance degradations when sorting big models, which was fixed in this commit by KDAB’s engineer Milian Wolff.
        • Since Qt 5.7 requires the use of a C++11 compiler, we have started porting foreach loops to ranged for loops. Ranged for loops expand to less code (because there is no implicit copy taking place), and since compilers recognize them as a syntactic structure, they can optimize them better. Over a thousand occurrences were changed, leading to savings in Qt both in terms of library size and runtime speed.
        • We have also started using C++ Standard Library features in Qt. While Qt cannot expose STL types because of its binary compatibility promise, it can use them in its own implementation. A big advantage of using STL datatypes is that they’re generally much more efficient, have more features and expand to a lot less code than their Qt counterparts. For instance, replacing some QStack usages with std::stack led to 1KB of code saved per instance replaced; and introducing std::vector in central codepaths (such as the ones in QMetaObjectBuilder) saved 4.5KB.
        • While profiling Qt3D code, we found that the mere act of iterating over resources embedded in an application (by means of QDirIterator) uncompressed them. Then, reading a given resource via QFile uncompressed it again. This was immediately fixed in this commit by KDAB’s Director of Automotive, Volker Krause.
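        The allocation-free equal_range pattern from the first bullet exists in the standard library too. This hypothetical helper shows the same loop shape with std::multimap, so no Qt is needed to try it:

```cpp
#include <cassert>
#include <map>
#include <string>

// Visit every value stored under one key without allocating a
// temporary container, via equal_range (standard-library analogue
// of the QHash pattern shown above).
int sumValuesForKey(const std::multimap<std::string, int> &map,
                    const std::string &key)
{
    int sum = 0;
    const auto range = map.equal_range(key);
    for (auto it = range.first; it != range.second; ++it)
        sum += it->second;
    return sum;
}
```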

        Other contributions

        Last but not least:

        • It is now possible to use the Qt Virtual Keyboard under QtWayland compositors.
        • The clang-cl mkspec was added. This mkspec makes it possible to build Qt using the Clang frontend for MSVC. Stay tuned for more blog posts on this matter. 🙂
        • A small convenience QFlag::setFlag method was added, to set or unset a flag in a bitmask without using bitwise operations.
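        To picture the convenience from the last bullet: setting or clearing a flag depending on a boolean is roughly this hand-written bit twiddling, sketched here on a plain unsigned mask (not Qt's actual implementation):

```cpp
#include <cassert>

// Set or clear `flag` in `mask` depending on `on` -- the bitwise
// dance that a setFlag-style convenience method replaces.
unsigned setFlag(unsigned mask, unsigned flag, bool on)
{
    return on ? (mask | flag) : (mask & ~flag);
}
```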

        About KDAB

        KDAB is a consulting company dedicated to Qt, offering a wide variety of services and providing training courses.

        KDAB believes that it is critical for our business to invest in Qt3D and Qt, in general, to keep pushing the technology forward, ensuring it remains competitive.

        The post KDAB contributions to Qt 5.7 appeared first on KDAB.

        Four Habit-Forming Tips to Faster C++

        Are you a victim of premature pessimisation? Here’s a short definition from Herb Sutter:

        Premature pessimization is when you write code that is slower than it needs to be, usually by asking for unnecessary extra work, when equivalently complex code would be faster and should just naturally flow out of your fingers.

        Despite how amazing today’s compilers have become at generating code, humans still know more about the intended use of a function or class than can be specified by mere syntax. Compilers operate under a host of very strict rules that enforce correctness at the expense of faster code. What’s more, modern processor architectures sometimes compete with C++ language habits that have become ingrained in programmers from decades of previous best practice.

        I believe that if you want to improve the speed of your code, you need to adopt habits that take advantage of modern compilers and modern processor architectures—habits that will help your compiler generate the best-possible code. Habits that, if you follow them, will generate faster code before you even start the optimisation process.

        Here are four habit-forming tips that are all about avoiding pessimisation and that, in my experience, go a long way to creating faster C++ classes.

        1) Make use of the (named-) return-value optimisation

        According to Lawrence Crowl, (named-) return-value optimisation ((N)RVO) is one of the most important optimisations in modern C++. Okay—what is it?

        Let’s start with plain return-value optimisation (RVO). Normally, when a C++ method returns an unnamed object, the compiler creates a temporary object, which is then copy-constructed into the target object.

        MyData myFunction() {
            return MyData(); // Create and return unnamed obj
        }

        MyData abc = myFunction();

        With RVO, the C++ standard allows the compiler to skip the creation of the temporary, treating both object instances—the one inside the function and the one assigned to the variable outside the function—as the same. This usually goes under the name of copy elision. But what is elided here is the temporary and the copy.

        So, not only do you save the copy constructor call, you also save the destructor call, as well as some stack memory. Clearly, elimination of extra calls and temporaries saves time and space, but crucially, RVO is an enabler for pass-by-value designs. Imagine MyData was a large million-by-million matrix. The mere chance that some target compiler could fail to implement this optimisation would make every good programmer shy away from return-by-value and resort to out parameters instead (more on those further down).

        As an aside: don’t C++ Move Semantics solve this? The answer is: no. If you move instead of copy, you still have the temporary and its destructor call in the executable code. And if your matrix is not heap-allocated, but statically sized, such as a std::array<std::array<double, 1000>, 1000>, moving is the same as copying. With RVO, you mustn’t be afraid of returning by value. You must unlearn what you have learned and embrace return-by-value.

        Named Return Value Optimization is similar but it allows the compiler to eliminate not just rvalues (temporaries), but lvalues (local variables), too, under certain conditions.

        What all compilers these days (and for some time now) reliably implement is NRVO in the case where there is a single variable that is passed to every return, and declared at function scope as the first variable:

        MyData myFunction() {
            MyData result;           // Declare return val in ONE place
            if (doing_something) {
                return result;       // Return same val everywhere
            }
            // Doing something else
            return result;           // Return same val everywhere
        }

        MyData abc = myFunction();

        Sadly, many compilers, including GCC, fail to apply NRVO when you deviate even slightly from the basic pattern:

        MyData myFunction() {
            if (doing_something)
                return MyData();     // RVO expected
            MyData result;
            // ...
            return result;           // NRVO expected
        }

        MyData abc = myFunction();

        At least GCC fails to use NRVO for the second return statement in that function. The fix in this case is easy (go back to the first version), but it’s not always that easy. It is an altogether sad state of affairs for a language that is said to have the most advanced optimisers available to it for compilers to fail to implement this very basic optimisation.

        So, for the time being, get your fingers accustomed to typing the classical NRVO pattern: it enables the compiler to generate code that does what you want in the most efficient way enabled by the C++ standard.

        If diving into assembly code to check whether a particular pattern makes your compiler drop NRVO isn’t your thing, Thomas Brown provides a very comprehensive list of compilers tested for their NRVO support, and I’ve extended Brown’s work with some additional results.

        If you start using the NRVO pattern but aren’t getting the results you expect, your compiler may not automatically perform NRVO transformations. You may need to check your compiler optimisation settings and explicitly enable them.
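
        If assembly isn’t your thing, you can also observe (N)RVO directly from C++. Here’s a minimal sketch using a hypothetical instrumented type (Tracked is my name, not anything from the article) that counts its copy and move constructions, so you can see whether your compiler elided them for the classical NRVO pattern:

        ```cpp
        #include <cstdio>

        // Hypothetical instrumented type: counts copies and moves so we can
        // observe whether (N)RVO fired with the current compiler and flags.
        struct Tracked {
            static int copies, moves;
            Tracked() = default;
            Tracked(const Tracked &) { ++copies; }
            Tracked(Tracked &&) noexcept { ++moves; }
        };
        int Tracked::copies = 0;
        int Tracked::moves = 0;

        Tracked makeTracked() {
            Tracked result;   // classical NRVO pattern: one variable, ...
            return result;    // ... returned from every return statement
        }

        int main() {
            Tracked t = makeTracked();
            // With NRVO applied, both counters stay at zero; without it you
            // would see a move (or, pre-C++11, a copy) instead.
            std::printf("copies=%d moves=%d\n", Tracked::copies, Tracked::moves);
        }
        ```

        Note that even in C++17, which made elision mandatory for plain RVO (returning a prvalue), NRVO remains an optional optimisation, so the counters you see here can legitimately differ between compilers.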

        2) Return parameters by value whenever possible

        This is pretty simple: don’t use “out-parameters”. The result for the caller is certainly kinder: we just return our value instead of having the caller allocate a variable and pass in a reference. Even if your function returns multiple results, nearly all of the time you’re much better off creating a small result struct that the function passes back (via (N)RVO!):

        That is, instead of this:

        void convertToFraction(double val, int &numerator, int &denominator) {
            numerator = /* calculation */ ;
            denominator = /* calculation */ ;
        }

        int numerator, denominator;
        convertToFraction(val, numerator, denominator); // or was it "denominator, numerator"?

        You should prefer this:

        struct fractional_parts {
            int numerator;
            int denominator;
        };

        fractional_parts convertToFraction(double val) {
            int numerator = /* calculation */ ;
            int denominator = /* calculation */ ;
            return {numerator, denominator}; // C++11 braced initialisation -> RVO
        }

        auto parts = convertToFraction(val);

        This may seem surprising, even counter-intuitive, for programmers that cut their teeth on older x86 architectures. You’re just passing around a pointer instead of a big chunk of data, right? Quite simply, “out” parameter pointers force a modern compiler to avoid certain optimisations when calling non-inlined functions. Because the compiler can’t always determine if the function call may change an underlying value (due to aliasing), it can’t beneficially keep the value in a CPU register or reorder instructions around it. Besides, compilers have gotten pretty smart—they don’t actually do expensive value passing unless they need to (see the next tip). With 64-bit and even 32-bit CPUs, small structs can be packed into registers or automatically allocated on the stack as needed by the compiler. Returning results by value allows the compiler to understand that there isn’t any modification or aliasing happening to your parameters, and you and your callers get to write simpler code.

        3) Cache member-variables and reference-parameters

        This rule is straightforward: take a copy of the member-variables or reference-parameters you are going to use within your function at the top of the function, instead of using them directly throughout the method. There are two good reasons for this.

        The first is the same as the tip above—because pointer references (even member-variables in methods, as they’re accessed through the implicit this pointer) put a stick in the wheels of the compiler’s optimisation. The compiler can’t guarantee that things don’t change outside its view, so it takes a very conservative (and in most cases wasteful) approach and throws away any state information it may have gleaned about those variables each time they’re used anew. And that’s valuable information that can help the compiler eliminate instructions and references to memory.

        Another important reason is correctness. Take an example provided by Lawrence Crowl in his CppCon 2014 talk “The Implementation of Value Types”: instead of this complex number multiplication:

        template <class T>
        complex<T> &complex<T>::operator*=(const complex<T> &a) {
            real = real * a.real - imag * a.imag;
            imag = real * a.imag + imag * a.real;
            return *this;
        }
        You should prefer this version:

        template <class T>
        complex<T> &complex<T>::operator*=(const complex<T> &a) {
            T a_real = a.real, a_imag = a.imag;
            T t_real =   real, t_imag =   imag; // t == this
            real = t_real * a_real - t_imag * a_imag;
            imag = t_real * a_imag + t_imag * a_real;
            return *this;
        }
        This second, non-aliased version will still work properly if you use value *= value to square a number; the first one won’t give you the right value because it doesn’t protect against aliased variables.
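
        To see the aliasing bug concretely, here is a self-contained sketch (a minimal stand-in for the templated class above, not std::complex) whose operator*= caches both operands’ parts first, so that squaring via value *= value stays correct:

        ```cpp
        #include <cassert>

        // Minimal stand-in for the complex class above. Caching a's parts
        // and this's parts up front protects against a aliasing *this.
        struct Complex {
            double real, imag;
            Complex &operator*=(const Complex &a) {
                double a_real = a.real, a_imag = a.imag;
                double t_real = real, t_imag = imag;    // t == this
                real = t_real * a_real - t_imag * a_imag;
                imag = t_real * a_imag + t_imag * a_real;
                return *this;
            }
        };

        int main() {
            Complex v{3.0, 4.0};
            v *= v;   // aliased call: a and *this are the same object
            // (3 + 4i)^2 = (9 - 16) + 24i = -7 + 24i
            assert(v.real == -7.0 && v.imag == 24.0);
        }
        ```

        The naive version would compute imag using the already-overwritten real, silently producing the wrong imaginary part for exactly this aliased case.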

        To summarise succinctly: read from (and write to!) each non-local variable exactly once in every function.

        4) Organize your member variables intelligently

        Is it better to organize member variables for readability or for the compiler? Ideally, you pick a scheme that works for both.

        And now is a perfect time for a short refresher about CPU caches. Data coming from memory is, of course, very slow compared to data coming from a cache. An important fact to remember is that data is loaded into the cache in (typically) 64-byte blocks called cache lines. The cache line—that is, the 64-byte block containing your requested data—is loaded on your first request for memory absent from the cache. Because every cache miss silently penalises your program, you want a well-considered strategy for reducing cache misses wherever possible. Even if the first memory access is outside the cache, structuring your accesses so that a second, third, or fourth access lands within the cache will have a significant impact on speed. With that in mind, consider these tips for your member-variable declarations:

        • Declare the most-frequently-used member variables first
        • Declare the least-frequently-used member variables last
        • If variables are often used together, group them near each other
        • Try to reference variables in your functions in the order they’re declared
        • Keep an eye on the alignment requirements of member variables, lest you waste space on padding

        Nearly all C++ compilers organize member variables in memory in the order in which they are declared. And grouping your member variables using the above guidelines can help reduce cache misses that drastically impact performance. Although compilers can be smart about creating code that works with caching strategies in a way that’s hard for humans to track, the C++ rules on class layout make it hard for compilers to really shine. Your goal here is to help the compiler by stacking the deck on cache-line loads that will preferentially load the variables in the order you’ll need them.
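
        The padding point is easy to demonstrate. In this sketch (the member names and layouts are mine, purely illustrative), the same four members are declared in two different orders; on a typical 64-bit ABI the careless order wastes ten bytes per object on padding:

        ```cpp
        #include <cstdint>
        #include <cstdio>

        // Careless order: 1 + 7(pad) + 8 + 1 + 3(pad) + 4 = 24 bytes
        // (on a typical 64-bit ABI where uint64_t is 8-byte aligned)
        struct Padded {
            std::uint8_t  flag;
            std::uint64_t id;
            std::uint8_t  kind;
            std::uint32_t count;
        };

        // Widest-first order: 8 + 4 + 1 + 1 + 2(pad) = 16 bytes
        struct Packed {
            std::uint64_t id;
            std::uint32_t count;
            std::uint8_t  flag;
            std::uint8_t  kind;
        };

        int main() {
            std::printf("Padded: %zu bytes, Packed: %zu bytes\n",
                        sizeof(Padded), sizeof(Packed));
        }
        ```

        Smaller objects mean more of them fit into each 64-byte cache line, so trimming padding serves the same goal as the ordering tips above. When access frequency and padding pull in opposite directions, you’ll have to weigh them against each other for your class.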

        This can be a tough one if you’re not sure how frequently things are used. While it’s not always easy for complicated classes to know what member variables may be touched more often, generally following this rule of thumb as well as you can will help. Certainly for the simpler classes (string, dates/times, points, complex, quaternions, etc) you’ll probably be accessing most member variables most of the time, but you can still declare and access your member variables in a consistent way that will help guarantee that you’re minimizing your cache misses.


        The bottom line is that it still takes some amount of hand-holding to get a compiler to generate the best code. Good coding habits are by no means the be-all and end-all, but they are certainly a great place to start.

        The post Four Habit-Forming Tips to Faster C++ appeared first on KDAB.

        Programmation Qt Quick (QML)

        Paris, August 22 – 26

        This August, treat yourself to Qt training in French with an expert.

        Learn the techniques for developing modern graphical applications, using the Qt Quick technology (based on the QML language) as well as the Qt/C++ object technology.

        “My C++ team was delighted with this training. I hope to implement Qt in our apps ASAP.” CGG Veritas, Massy, France

        Find out more!

        See other customer testimonials.


        The post Programmation Qt Quick (QML) appeared first on KDAB.