Resurrected Entertainment

Archive for December, 2010

Microsoft Quick C Compiler

December 21, 2010

Quick C CompilerWhen I first came in contact with this compiler, I was just starting high school and eager for the challenges ahead (except for the material which didn’t interest me — basically non-science courses). When I went to pick the courses for the year, I noticed a couple which taught computer programming. The first course, which was a pre-requisite for the second, taught BASIC while the second course taught C programming. At this point in my life, I was an old hand at BASIC, so I basically breezed through first programme. The second course intrigued me much more. I was familiar with C programming from my relatively brief experience with the Amiga, but I had a lot left to learn. My high school didn’t use the Lattice C compiler, but a Microsoft C compiler instead. I located the gentleman who taught the course and he pointed me to a book called Microsoft C Programming for the PC written by Robert LaFore and the Microsoft QuickC Compiler software. I had a job delivering newspapers at the time, so I could just barely afford the book using salary and tips saved from two weeks doing hard time ($50 at the time), but the compiler was just too expensive. So I did what any highly effective teenager would do, basically I dropped really big hints around the house (including the location and price of the compiler package I wanted) until my parents purchased a copy for me on my birthday.

There are a number of differences between the BASIC and C programming languages. One of the more obscure differences lies in how the C programming language deals with special variables that can hold memory addresses. These variables are called pointers and are an integral part of the syntax and functionality of the language. BASIC did have a few special functions which could accept and address locations in memory – I’m thinking of the CALL and USR functions specifically, although there were others. However, a variable holding an address was the same as one holding any other number since BASIC lacked the concept of strong types. The grammar of the C language is also much more complex than BASIC; it had special characters and symbols to express program scope and perform unary operations, which introduced me to the concept of coding style. When a programmer first learns a particular style of coding, it can turn into a religion, but I hadn’t really been exposed to the language long enough to form an opinion. That would come later, and then be summarily discarded once I had more experience.

There were libraries of all sorts which provided functionality for working with strings, math functions, standard input and output, file functions, and so on. At the time, I thought C’s handling of strings (character data) was incredibly obtuse. Basically, I thought the need to manage memory was a complete nuisance. BASIC never required me to free strings after I had declared them, it just took care of it for me under the hood. Despite the coddling I received, I was familiar with the concept of array allocations since even BASIC had the DIM command which dimensioned array containers; re-allocation was also somewhat familiar because of REDIM. However, there were many more functions and parameters in C related to memory management, and I just thought the whole bloody thing was a real mess. The differences between heap and stack memory confused me for a while.

There were many features of the language and compiler I did enjoy, of course. Smaller and snappier programs were a huge benefit to the somewhat sluggish software produced by the QuickBASIC compiler and the BASIC interpreter. The compiled C programs didn’t have dependencies on any run-time libraries either, even though there was probably a way to statically link the QuickBASIC modules together. Pointers were powerful and were loads of fun to use in your programs, especially once I learned the addresses for video memory which introduced me to concepts like double buffering when I began learning about animation. Writing directly to video memory sounds pretty trivial to me right now, but it was so intoxicating at the time. I was more involved in game programming by then and these techniques allowed me to expand into areas I never considered. It allowed for flicker-free animation, lightning fast ASCII/ANSI window renderings via my custom text windowing library, and special off-screen manipulations that allowed me to easily zip buffers around on the screen. A number of interesting text rendering concepts came from a book entitled Teach Yourself Advanced C in 21 Days by Bradley L. Jones, which is still worth reading to this day.

At around this time, I also started to learn about serial and network communications. The latter didn’t happen until my last year at high school. Basically, I wanted to learn how to get my computers to talk to one another. It all started when I became enchanted by the id Software game called DOOM, which allowed you to network a few machines together and play against each other in a vicious winner takes all death-match style combat. Incidentally, games like Doom, Wolfenstein 3D, or Blake Stone: Aliens of Gold led me down another long-winding path: 3D graphics, but that didn’t happen until a few months later. Again, the book store came to the rescue by providing me with a book entitled C Programmer’s Guide to Serial Communications by Joe Campbell. I was somewhat familiar with programming simple software which could use a MODEM for communication, since BASIC supported this functionality through the OPEN function, but I knew very little about the specifics. Once I dug into the first few chapters, I knew that was all going to change.

Dissecting DOSBox

December 20, 2010

If you’re a gamer and have been for years, then you’ve probably heard of and quite possibly used DOSBox. If you haven’t, then let me introduce it to you. DOSBox is great little program for running all of your favorite classic games. Games which were originally built for monitors and video cards which since been retired, and legacy audio systems like SoundBlaster 16, Audio Galaxy, or Gravis Ultrasound. Specifically, it supports games and programs which were written for the MS-DOS or compatible operating system. Although, the software specializes in supporting games, you may have success in running other programs. Although it doesn’t make any guarantees regarding these legacy applications, which can require different features provided by unsupported drivers, I have had success in running complex software like the DJGPP compiler but run into a bit of trouble when running an older version of the TDE (Thomson-Davis Editor).

I’ll wait here while you go and download your copy of the source code…

Now that you’ve downloaded DOSBox source and presumably unpacked it, you are ready to get your hands dirty. This isn’t going to be an article about how to use DOSBox, but rather, how does it work under the hood, exactly? What are the major software gears and wheels used for handling such programs? Why don’t all of your favourite games have 100% compatibility?

The first thing to understand about DOSBox is that it’s an emulator for x86 CPU instructions, floating point unit instructions, and various functions within MS-DOS compatible operating systems. Specifically, it emulates functions around Interrupt 21H (hexadecimal) and a couple around 20H, 25H, 26H, and 27H. It also installs an NULL handler for interrupts 28H and 29H which do nothing. But before we get ahead of ourselves, let’s take a step back and look at the various modules that make up the program.

CPU Emulation. At the heart of any emulation program is the CPU emulation core. Every program (.EXE or .COM file) on your DOS powered computer contains machine code and data. When using DOSBox, it’s the CPU emulator’s job to process that machine code; therefore, each program that is teased apart and executed by DOSBox arrives at this sacred chunk of memory sooner or later.

DOSBox can be configured to emulate the x86 core in a few different ways. Therefore, when we talk about emulation for CPU cores, we’re either talking about emulating each instruction found in the program one at a time, or the emulator will choose to batch process these instructions and operands and translate them to native instructions for direct execution on the host CPU (they call this mode ‘dynamic’). This direct execution mode can provide into better performance on some machines, but may be slower on others so the original emulation mode is still available.

Emulation can be a little confusing if you’ve never heard of it before, so let’s go over it once again. An executable program contains a bunch or binary data representing machine operation codes (or op codes), operands (arguments to that opcode), and data. This information can be fed to your CPU by your operating system, or they can be read by another program, just like any other file, and interpreted one byte at a time. In the last case, it would be the emulation program which decides how to implement the functionality of the CMPSB instruction (used for comparing bytes in memory), for example, rather than the CPU providing the hardware implementation. It’s this act of interpretation that differentiates emulation from direct execution on the host processor.

FPU emulation. In the world of electronic gaming, the software powering those fantastic explosions, shattering those fragile glass windows, and hurling those flying projectiles often need to do a series of calculations to determine values for acceleration, direction, and manipulating various bits of trigonometry. These calculations can involve irrational numbers like PI (3.141592654…), which I’m sure you all remember from school. In programming terms, these numbers are often stored in variables which follow a standard method of encoding the information about the significand and the exponent (along with the sign of the number); one such standard is IEEE 754. Let’s not get too mired in the intricacies of how floating points are stored, which is quite boring after all and not conducive to an interesting read on a Friday afternoon. Instead, let’s evade this topic and push on to describing operation codes in which floating point numbers are used as operands (parameters or arguments to a function). In this case, these op codes may need to be emulated just like the ones used by the CPU. Only this time, if the encoding standard differs from the host processor (in other words, if it doesn’t use IEEE 754 and you’re using a PC to run DOSBox), then the emulator will need to convert that format into something it can use natively, which an expensive operation (in terms of adding processing cycles and increasing execution time).

The take away piece of information for all this is that floating point emulation can be slow in the worst case scenario, and fast in the best case. In either case, it’s not a good topic at parties so let’s just slowly walk away from it…

Hardware emulation. This is certainly one of the more interesting layers in DOSBox and also the most prone to hacks and tweaks within the source code. The science of taking an analog output and reproducing it digitally is prone to approximations, since the nature of analog is approximate and being digital is exactly the opposite. Most of the analog operations come from sound cards where the device is capable of producing a variety of analog wave forms which is used to create music and sound effects.

The operating system layer. Before the days of operating systems employing graphical user interfaces to shield the user from arcane console commands, and providing a host of time wasting games like Solitaire and Mind-sweeper, a sizable chunk of the PC market used DOS. Whether that was PC-DOS, MS-DOS, or DR-DOS is not really that important since the other versions typically remained closely compatible with MS-DOS. The DOS platform offered a host of utility programs to the user, along with a few drivers which included support for specific implementations for memory management, mouse drivers, and access to generic CD-ROM drives. Users could always install a specific version of a driver for piece of hardware they bought, like a Sound Blaster card, and after setting a jumper of two to configure interrupt and DMA channels, they would be off to the races. Unless something went horribly wrong…

Unbeknownst to many users but knownst to a few geeks around the Megaverse, the motherboard BIOS code provides a set of default drivers so that the boot sequence and the operating system can access a few essential devices like the hard drive, floppy drive, and keyboard when they start up. These drivers are usually ignored or replaced by the operating system so that it can provide its own, more advanced versions, but some of them are still used when your Windows operating system boots into safe mode, for example. The kernel, which is a core component of any operating system, provides mechanisms for switching between drives, accessing disks and partitions, and in the case of DOS, providing fixed names for devices like “LPT1:” (printer), “COM1:” (communications port), or “NUL:” (the abyss). These device names and indeed the drivers themselves provided a level of abstraction for the user and higher-level programs. The user could print a text file, for example, by issuing a command like “TYPE FILE.TXT > LPT1:” directly as a shell command, but they could also use a program like WordPerfect which has its own set of specialized printer drivers, so that it could do more tasks requiring advanced printing like graphics and italicized or bold lettered text.

DOSBox provides limited support for a few of these commands, but really it’s only enough to get your games up and running since that is its modus operandi after all. These commands can take one of two forms: an executable program or a keyword command available in the shell. I provide the list of available keyword commands in the Shell section below.

The interrupt layer. Much of the hard work when creating DOSBox probably rose up from the requirements around the CPU/FPU emulation, DOS and hardware abstraction support, and the support for interrupts. The bulk of the core system code for the DOS lies in supporting the features for every required interrupt function. Interrupts are specific routines which can accept parameters from the calling program and then return the results of the function in special variables. All of this happens by loading up certain register variables and invoking the interrupt CPU instruction. If you were to embed a small assembler routine into a C function, it may look like this:

void mouse_hide(void)
{
   asm {
      mov ax,02h;
      int 33h;
   }
}

In this example, the value “02H” is being loaded into the AX register which represents the interrupt function to hide the mouse cursor, and the interrupt “33H” is an entry in the interrupt vector table to access available mouse functions (it acts like and index). DOSBox supports many interrupt functions, but their focus is around the ones necessary to run your favourite games. The important thing to remember about interrupts is that they do exactly that, they interrupt the CPU and force it to run the requested function.

Without going into too many technical details, interrupt authors generally follow two design rules: the functions must execute quickly and you shouldn’t call an interrupt from within another interrupt. The programmers working on DOSBox need to implement those interrupt functions in whatever way makes the most amount of sense on the running platform. So, if the game invokes an interrupt requesting a change of screen resolution and color mode, then the DOSBox emulator needs to adjust the resolution of the game window and invoke software support for VGA and EGA video modes, or a nice CGA video mode with a four-colour palette. Pretty.

The abstract front-end layer. This would be the charming side of DOSBox, if the project actually provided a graphical user interface out of the box. Instead, they have designed it one level deeper and abstracted the program’s front-end so that it could use different media and windowing libraries provided by the host’s operating system. By default, it uses the SDL library (SDL stands for Simple DirectMedia Layer) to handle the creation of the application container, window frame, sound, input functions and graphics modes. And lucky for them, SDL is available for Linux, Mac OS X, and Windows (and varied support for other platforms too), so there may never be a reason to move to a different library… until the SDL project is retired or if their senior programmer gets hit by a bus. If the project didn’t use a library like SDL, then the application would be tied to a specific set of operating system libraries, or it would need to provide an assortment of implementation modules for each OS target that followed a nice, clean little interface… like the ones provided by SDL.

I’ll pause while you give a big hug to the people working on that project. Don’t you feel better now? Wait, it wouldn’t be right to ignore DOSBox, since they are the stars of this little side show. Let’s spread the love around and try not to get too messy in the process.

The scaler layer. Strictly speaking, I wouldn’t really call this a layer, but the architecture does abstract it somewhat, and a lot of people like this feature so it’s worth discussing it a bit. When you decide to fire your favourite game up in DOSBox, you may notice the Window it creates is a little on the small side depending on your host’s current resolution. Wouldn’t it be nice if you could make the window bigger and still have it look good? That’s the job for the scaling routines. They take what would normally be a pixelated image (assuming you don’t like that sort of thing) and smooth out some of the rough spots. As with most scaling routines (other than piecewise-constant algorithms like nearest neighbour), there can be a bit of blurring but if the algorithms use a smaller set of surrounding pixels for their sample set, like an EPX or Scale2x routine then the result looks quite good and still maintains an acceptable level of sharpness and detail.

The shell layer. The shell provides a sub-set of the total native keyword commands available to DOS: DIR, CHDIR, ATTRIB, CALL, CD, CHOICE, CLS, COPY, DEL, DELETE, ERASE, ECHO, EXIT, GOTO, HELP, IF, LOADHIGH, LH, MKDIR, MD, PATH, PAUSE, RMDIR, RD, REM, RENAME, REN, SET, SHIFT, SUBST, TYPE, and VER. It also provides an execution environment for running batch files.

Hopefully, when you fire up your next DOSBox powered game (a number of product use this software, including game services like Steam), you’ll think of the long hours and tedious bits of programming that went into developing this stellar product, and maybe choose to send a bit more love their way this Christmas season.