
Optimizing Wine on OS X

I've been doing some performance analysis of EVE running under Wine on OS X. My main test cases are a series of scenes run with the EVE Probe - our internal benchmarking tool. This is far more convenient than running the full EVE client, as it focuses purely on the graphics performance and does not require any user input.

Wine Staging

One thing I tried was to build Wine Staging. On its own, that did not really change anything. Turning on CSMT (command stream multithreading), on the other hand, made quite a difference, taking the average frame time down by 30% for the test scene I used. While the performance boost was significant, there were also significant glitches in the rendering, with parts of the scene flickering in and out. Too bad: it means I can't consider this for EVE yet, but I will keep monitoring its progress.

OpenGL Profiler

Apple has the very useful OpenGL profiler available for download. I tried running one of the simpler scenes under the profiler to capture statistics on the OpenGL calls made.

One thing that stood out to me was the high number of glGetError calls. Disabling them turned out to be easy enough, by recompiling Wine with -DWINE_NO_DEBUG_MSGS=1 added to CFLAGS. Running that same scene again took the number of glGetError calls down from 7.4 million to 18 thousand. The overall impact on performance was not significant, though.
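
To illustrate why a single define can remove that many calls, here is a minimal sketch of the general pattern, assuming the error check is wrapped in a macro that expands to a real check in debug builds and to nothing otherwise. The names used here (CHECK_GL_CALL, SKETCH_NO_DEBUG_MSGS, fake_gl_get_error, fake_gl_draw, render_frames) are made up for this illustration and are not Wine's actual identifiers. glGetError is replaced by a counting stub so the sketch runs without an OpenGL context.

```c
#include <stdio.h>

/* Hypothetical stand-in for glGetError: it only counts how often it is
 * called, so the sketch needs no OpenGL context. */
static unsigned long error_queries = 0;

static int fake_gl_get_error(void)
{
    ++error_queries;
    return 0; /* 0 plays the role of GL_NO_ERROR */
}

/* Hypothetical stand-in for any OpenGL call issued while rendering. */
static void fake_gl_draw(void)
{
}

#ifdef SKETCH_NO_DEBUG_MSGS
/* Debug messages compiled out: the check disappears entirely. */
#define CHECK_GL_CALL(what) do { } while (0)
#else
/* Debug messages compiled in: every checked call costs one error query. */
#define CHECK_GL_CALL(what) \
    do { \
        int err_ = fake_gl_get_error(); \
        if (err_ != 0) \
            fprintf(stderr, "%s failed: error %d\n", (what), err_); \
    } while (0)
#endif

/* Issues one checked call per frame and returns the number of error
 * queries that were made along the way. */
static unsigned long render_frames(unsigned long frames)
{
    unsigned long i;

    error_queries = 0;
    for (i = 0; i < frames; ++i) {
        fake_gl_draw();
        CHECK_GL_CALL("fake_gl_draw");
    }
    return error_queries;
}
```

Compiled as is, render_frames(1000) reports 1000 error queries. Compiled with -DSKETCH_NO_DEBUG_MSGS, it reports 0, because the macro expands to an empty statement and the stub is never called.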

The profiler can also capture a full trace of all OpenGL calls made by an application. Looking at the full trace can be interesting, if somewhat daunting: a minute-long capture of this fairly simple scene yields 5.8M function calls. I plan to do further tests later on with ultra-simple scenes, to better understand how DirectX calls get translated to OpenGL calls.

Multithreaded OpenGL engine

Apple's OpenGL implementation has something called the multithreaded OpenGL engine, which needs to be enabled explicitly. Once enabled, the engine creates a worker thread and transfers some of its calculations to that thread. On a multicore system (as Macs are today), this allows internal OpenGL calculations performed on the CPU to run in parallel with Wine, improving performance.

It took me a little while to figure out where to put the code to enable this, and it didn't help that the CGLEnable function did not complain when I accidentally passed in an invalid context. Eventually I added the following lines in dlls/winemac.drv/opengl.c, in the function create_context:
    TRACE("Enabling the multithreaded OpenGL engine\n");
    err = CGLEnable(context->cglcontext, kCGLCEMPEngine);
    if (err != kCGLNoError)
        WARN("Enabling the multithreaded OpenGL engine failed\n");

Enabling the multithreaded engine did help with performance, although initial tests indicated the opposite. It wasn't until I tested with both the glGetError calls disabled and the multithreaded engine enabled that I saw positive results. The boost in performance was more significant for the heavier scenes, such as our deathcube of 1000 ships.

The future

I'm just starting to get to know the Wine codebase, and it's hard to say what optimization opportunities are in there. One thing is certain, though: I will spend a lot more time analyzing it and trying out different things, both in our codebase and in Wine.

