Sunday, May 9, 2010

#4, Graphics Programming, Part 1

Graphics programming has always been a topic of interest for me as it combines two of my big passions: graphics and programming (duh!). Last week I mentioned the love I had for that miracle of digital electronics, the 8051 micro-controller (in fact they were a family: the 8051 was an all-in-one controller with RAM and ROM on chip, the 8031 had no ROM, and the 8751 had EPROM). We lived happily together till the mid-90s when she passed away (CPU years are even shorter than dog years!). In all that time, the only thing that came close to the 8051 in my heart was the VGA card. The Video Graphics Array (VGA), coming after CGA and EGA, was the first serious and worthy graphics card standard that let you do real graphics. The EGA/VGA Programmer's Guide was my bible. It taught me the beauty of graphics done in hardware rather than software, and that is key to the series of posts I'm going to write on graphics programming in OSX and Windows. But first, and for most of this week, I have to explain some basics to make sure we are all on the same page. To those of you who already know these, sorry. Just read it as a story, but don't you dare skip ahead.

Back in the late 80s and early 90s, I was working on a real-time signal acquisition system that needed to plot the value of external signals (read through an analog-to-digital card), kinda like what you see in labs and hospitals. It would look like a continuously shifting plot where new data comes in from one side and old data leaves from the other. This basically involves scrolling the screen. The first solution that comes to the mind of a novice programmer is to set the pixels on screen and, with each new data point, erase the old ones and redraw the existing pixels shifted in position. One of the earliest things you learn as a graphics programmer is that drawing pixels on screen one by one is slow. Your graphics card has a memory that holds the screen data. This memory, usually called the video buffer, is slow for the CPU to access, so writing to it should be avoided as much as possible. The common technique (which my IMD-2004 students should know about) is the infamous double-buffering: using a secondary buffer in RAM (the primary buffer being the one on the graphics card), drawing all the pixels for a frame in the secondary buffer, and then copying the whole buffer to the card in a single operation. For normal applications, especially these days, this is fast enough. On our old PCs, and with the real-time data we had, it wasn't.

Now what this process illustrates is a graphics operation done in software, meaning by your CPU. My first pleasant surprise with the VGA card was its ability to help you speed this up. Assuming your graphics card had more memory than needed for the screen data, you could set the starting address of the screen. Hopefully some of you can guess what this means: by changing the starting address you could scroll through the data and only write the new values. I was amazed by how fast I could shift the plot on screen, and it took me a while before I explained this miracle to my clueless colleagues. This was the first example I saw of doing things "in hardware" (the VGA card also had built-in support for split screens, so you could have a shifting part and a static part on screen. Now that really made some other programmers jealous).
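To make the difference concrete, here's a little sketch in plain C with the "card" simulated by an array. The sizes, names, and the wrap-around behavior are all made up for illustration; real VGA start-address scrolling was done by writing CRT Controller registers, not by calling functions like these. The point is the amount of memory touched: software scrolling rewrites every visible pixel, hardware scrolling writes one new pixel and bumps a register.

```c
#include <assert.h>
#include <string.h>

#define VRAM_SIZE 16  /* pretend the card has 16 pixels of memory   */
#define SCREEN_W   4  /* and the visible screen shows 4 of them     */

static unsigned char vram[VRAM_SIZE]; /* simulated video memory               */
static int start_addr = 0;            /* simulated display-start register     */

/* Software scrolling: shift every visible pixel left, then draw the
 * new one at the right edge. Touches the whole visible buffer.
 * (Assumes start_addr is still 0, i.e. we never scrolled in hardware.) */
static void scroll_in_software(unsigned char new_pixel) {
    memmove(&vram[0], &vram[1], SCREEN_W - 1);
    vram[SCREEN_W - 1] = new_pixel;
}

/* Hardware scrolling: write ONE pixel just past the visible window,
 * then advance the start address so the card displays a different
 * slice of its memory. Nothing else is copied. */
static void scroll_in_hardware(unsigned char new_pixel) {
    vram[(start_addr + SCREEN_W) % VRAM_SIZE] = new_pixel;
    start_addr = (start_addr + 1) % VRAM_SIZE;
}

/* What the monitor would show at column i: the window at start_addr. */
static unsigned char visible(int i) {
    return vram[(start_addr + i) % VRAM_SIZE];
}
```

Both functions produce the same picture on screen; the hardware version just does a tiny fraction of the work, which is why the plot could shift so fast.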

Those were of course simple examples, but they were the start of thinking about the graphics card as a processor capable of processing data rather than just refreshing the screen. Obviously people started to think of more advanced processing done in hardware (so your CPU time didn't have to be spent on it). In 2D graphics, one example was copying multiple buffers and sprites to the graphics card and using them intelligently when needed. But the real beauty of "graphics accelerators" came with 3D. When you create a 3D world, you have XYZ values for all the points, and you define lighting, camera location, and that sort of thing. The image you see on screen, though, is usually 2D, so some processing is needed to map your 3D world to a 2D image. In 3D applications this mapping is called rendering (or sometimes shading, although the two are not exactly the same). Performing rendering in software is a time-consuming process, and that's why it took the computer industry a while to get real 3D (there were cheats, like the one in Doom). The real cause of advances in 3D programs was the arrival of graphics cards with built-in support for 3D rendering. This support was provided to software through display drivers, the code written by the hardware manufacturer that receives data and commands from software and passes them to the hardware. Eventually this resulted in the Graphics Processing Unit (GPU): a real processor (now with its own programming languages) that runs on the graphics card instead of the computer's motherboard.
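At the heart of that 3D-to-2D mapping is a projection. Here's a minimal sketch of the idea, a pinhole camera at the origin looking down the z axis, with a made-up focal distance f; real renderers do this with 4x4 matrices and add clipping, lighting, and rasterization on top, but the "divide by depth" step is the essence:

```c
#include <assert.h>

/* A 2D image-plane coordinate produced by the projection. */
typedef struct { double x, y; } Point2D;

/* Project a 3D point (x, y, z), with z > 0 in front of the camera,
 * onto an image plane at focal distance f. Similar triangles give
 * the classic perspective divide: screen = f * world / depth. */
static Point2D project(double f, double x, double y, double z) {
    Point2D p = { f * x / z, f * y / z };
    return p;
}
```

The divide by z is what makes farther objects smaller, and doing millions of these per frame (plus everything downstream) is exactly the workload that moved from the CPU to the accelerator.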

Hold this thought for a minute.

Operating systems provide a series of software modules and standard methods for applications to use, for example for memory management, disk access, and graphics. We usually call these an Application Programming Interface (API). There are two reasons to use APIs: (1) to make programmers' lives easier so they don't have to write all the code themselves, and (2) to make sure access to system resources is done in a safe way (so that, for example, two programs don't write to each other's memory or windows). On Windows the initial API was Win32, and its display-related part was the Graphics Device Interface (GDI). To be safe and general enough, the initial graphics APIs had a lot of overhead and were quite slow. Microsoft soon realized that this could cause problems for applications that needed high-performance graphics, e.g. games. They came up with a couple of unsuccessful solutions like WinG, and finally developed DirectDraw, a new API that allowed direct and fast access to the graphics card. It also supported the new graphics accelerators we were talking about before. DirectDraw later evolved into a set of modules called DirectX, including audio, input devices, and Direct3D for 3D graphics (and also 2D in later versions, by practically removing DirectDraw and having just one API, sometimes called DirectX Graphics). DirectX is obviously a Windows-specific API. People in the Unix world (originally Silicon Graphics), on the other hand, came up with a similar 3D API called OpenGL, which is now an open standard available for all major platforms and supports 2D graphics as well as 3D. Similar to Direct3D, OpenGL supports graphics accelerators too. After a couple of other approaches, Apple decided to use OpenGL as the foundation of the graphics system in their Unix-based OSX. So far this means that comparing professional graphics programming in Windows and OSX comes down to comparing Direct3D and OpenGL. But there are a few complications here:

1- On Windows, you can use OpenGL or Direct3D.
2- On Windows, you can still use GDI (for apps that are not graphics-intensive).
3- On Windows (more exactly, the .NET framework) you may have extra APIs on top of Direct3D that simplify special tasks like game programming. The best example is XNA, a cross-platform game development API.
4- On both Windows and OSX, you can use 3rd-party APIs which in turn use the native APIs of the operating system, so they are obviously less efficient but can be easier to use.
5- On Mac, OpenGL is just the start. Apple, who like Microsoft is not a big fan of open standards and interoperability (I'll talk about this general issue later), also came up with a whole series of other APIs that are OSX-specific and usually (but not always) run on top of OpenGL. These APIs are the recommended way of programming on OSX, are not portable, and have different features and performance characteristics compared to basic OpenGL. Examples are OpenCL, Core Graphics, Core Video, Core Animation, Quartz, and Cocoa.
6- All of these APIs, as I mentioned before, are "somehow" supported on other platforms by the same companies, but I'm not going to get into that right now.

So what does this mean? For Windows we have standard Direct3D and Direct3D on the .NET framework (I'll explain the difference shortly). For OSX we have OpenGL and all those other Apple APIs. Since OpenGL is a 3rd-party add-on on Windows, I won't consider it an option there for the comparison. To compare, I consider run-time performance, complexity of programming, and available features. Finally, here are some initial observations:

1- Standard Direct3D on Windows is basically a C/C++ API. When Microsoft released the .NET framework as a basis for interoperability between its different platforms, they developed C# as its native language. The Direct3D API is available for C# with pretty much the same structure as in C/C++. The differences are mainly due to the differences between C# and C/C++, and usually mean the C# version is somewhat easier. If you go to XNA it gets even easier than that, without a huge loss in performance ("huge" being the keyword here).

2- OpenGL is a portable standard. This means that if you stick to it, you can compile your code for all major platforms with minimal change. On the other hand, it is more complicated than Direct3D, a lot more complicated than XNA, and it doesn't offer a major performance advantage either. In fact, because most hardware manufacturers consider Windows the major non-console platform for high-performance graphics (a market thing), the Direct3D drivers are usually better than the OpenGL ones, which means using Direct3D is not only easier but also better performance-wise.

3- The 2D graphics APIs on OSX provide more functionality than those on Windows. Many features in Core Video, Core Animation, and Quartz are not available through Windows APIs and need extra programming and libraries. When it comes to 3D (e.g. games), though, Windows has better performance and more native features (such as XNA, if we consider it native).

4- The OSX C/C++ APIs are a bit more complex than the Windows ones, and considerably more complex than the .NET versions.

5- As I said before, Apple is pushing its Cocoa framework, which is based on Objective-C. I haven't had the chance to do much with it yet, but it doesn't seem to be better than the C/C++ APIs in terms of complexity or performance.

All of these are still initial observations based on typical examples like displaying an image or accessing bitmap or 3D model data. For now it seems to me that OSX provides more built-in 2D graphics features, while the Windows APIs have done a better job providing 3D features. In terms of code complexity and performance (especially 3D), Windows is more likely to be the winner.

My next step is focusing on 2D and 3D programming in more detail, but before I finish, here are some of the new tools I found for my MBP:
1- iAntiVirus on OSX, free, for Mac-only threats so faster and lighter
2- LimeWire on OSX, free, file-sharing on Gnutella and BitTorrent networks (also available on Windows)
3- HFS Explorer on Windows, free, for access to Mac HD when booting with Windows (VMWare Fusion allows such access when using virtual machine)

I'LL BE BACK!
