Apple II LLVM-MOS Port, Part 1

by: TheHans255

August 8, 2023

One of the first devices I ever learned to write software for was the Apple //e, an 8-bit home computer based on the MOS 6502 processor and packaged with BASIC. I put myself through all the ups and downs of GOTO and two-character variable names in order to write games and build my software skills. I loved that thing a lot and still do today, and have recently added parts such as an interface card that adds an Ethernet jack. It still works great today!

A picture of my Apple //e running The Oregon Trail

So, when I recently learned of LLVM-MOS, an initiative to make the powerful LLVM optimizing compiler target 6502 platforms, I knew I had to write some Apple software with it. And since, as of this writing, the current list of supported platforms does not include the Apple II family, there was just one thing to do: write in that support myself!

Try it yourself!

My work is available on GitHub if you'd like to try it out!

https://github.com/TheHans255/apple-ii-port-work

If you'd like to compile and run the examples yourself, here's a step-by-step how-to guide.

Download the prerequisite software, or relevant alternatives:
- CMake. This is used to set up Makefiles from the project templates.
- The LLVM-MOS SDK, downloaded to a directory of your choice. This includes the LLVM-MOS compiler and related tools for actually producing the executables.
- A ProDOS 2.4.2 disk image. ProDOS is one of several major disk operating systems for the Apple II, and the OS that the examples are linked against.
- AppleCommander, a Java application for manipulating Apple II disk images (plus a Java runtime for running it). This will be used to get the programs onto a disk once you've compiled them.
- An Apple //e emulator, such as AppleWin (which also works great under Wine on Linux and macOS). This is used to actually run your disks!
Clone the GitHub repository to a directory of your choice.
Set up the CMake files with the command $ cmake -D CMAKE_C_COMPILER="/path/to/mos-common-clang" -S src -B build, where /path/to/mos-common-clang is the path to mos-common-clang in your LLVM-MOS SDK installation. (Re-run this step if you ever change the CMakeLists.txt files, such as when adding a new project).
Compile the examples with make in the build directory.
In AppleCommander, open your ProDOS disk image and click the Import button. Navigate to the .sys program in your example's build directory (e.g. the Hello program is in build/example/hello/hello.sys) and import it into the disk. (You may need to delete some other files first.) Once you're done, click Save to save the disk image.
Open up AppleWin, click the Drive 1 button on the right to load your disk, and click the Apple button to boot it up! Then, navigate to the program you loaded up and press Return!

If you want to test a change, make again, delete the previous copy of the program in AppleCommander, load up the new one, save the disk, reload the disk in AppleWin, and boot.

If you want to put your program on a real Apple II, you can use ADTPro to transfer your disk image to it. Of course, I would recommend doing this only if you can confirm that AppleWin runs the program correctly with no side effects.

Behind The Scenes

So, how did I go about doing this? Fortunately, a lot of the effort needed to generate high-quality 6502 code has already been put in by the LLVM-MOS community, and my part primarily involves pointing that work at the Apple II.

Linking to ProDOS

First things first: since the Apple II is an embedded platform, getting your program into a format that it's prepared to run is not a trivial problem. While on Windows or Linux I can just build a PE or ELF file, on an embedded platform I have to put together a binary package that the platform knows how to load and execute. Fortunately, on the Apple, I have quite a few formats to choose from, due to the Apple's support for BASIC, CALL instructions, and multiple operating systems.

For the best possible end user experience, I chose to compile a ProDOS system program, which is the standard format for machine language programs on the ProDOS operating system. Rather than produce a BASIC prompt, ProDOS boots into a menu, allowing the program to be selected from a list. ProDOS also has support for large file systems (such as 3.5" floppies and hard drives, up to 32 MB) and directories, which is great for Serious Business programs such as text editors.

Producing a ProDOS executable file is simple enough - part of LLVM-MOS's compilation process includes providing a linker script that describes where parts of the program should go and what the output binary should look like. The file I have for Apple II ProDOS is listed below:

/* Apple //e ProDOS 2.4.2 Linker Script */

/* Use the first 32 bytes of zero page as imaginary registers.
 * Note that 0x20 thru 0x4F are used by the System Monitor
 * and ProDOS.
 */
__rc0 = 0x0000;
INCLUDE imag-regs.ld
ASSERT(__rc0 == 0x00, "Inconsistent zero page map.")
ASSERT(__rc31 == 0x1f, "Inconsistent zero page map.")

/* Available RAM goes from 0x0800 to 0xBF00.
 * The program is loaded at 0x2000, and the 6K below is
 * used for the soft stack.
 * Zero page is available from 0x50 to the end 
 * (0x3A to 0x4F is reserved by ProDOS).
 */
MEMORY { 
    zp : ORIGIN = 0x0050, LENGTH = (0x0100 - 0x0050)
    ram (rw) : ORIGIN = 0x2000, LENGTH = 0x9F00
}

/* Set the operand stack to the 6K memory region behind our program */
__stack = 0x1fff;

SECTIONS {
    INCLUDE c.ld
}

OUTPUT_FORMAT { 
    TRIM(ram)
}
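The addresses in that script are easy to get wrong by a page, so here is a minimal sketch in plain C that re-checks the arithmetic on a desktop compiler. The constant and function names are mine and purely illustrative; only the numbers come from the script.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the linker script above. The names are hypothetical. */
enum {
    RAM_ORIGIN  = 0x2000, /* ORIGIN of the ram region (program load address) */
    RAM_LENGTH  = 0x9F00, /* LENGTH of the ram region */
    STACK_TOP   = 0x1FFF, /* __stack */
    FREE_BOTTOM = 0x0800  /* start of available RAM, per the script's comment */
};

/* Compile-time checks: the build fails if the map stops adding up. */
_Static_assert(RAM_ORIGIN + RAM_LENGTH == 0xBF00,
               "ram region should end where available RAM ends");
_Static_assert(STACK_TOP - FREE_BOTTOM + 1 == 6 * 1024,
               "soft stack region should be 6K");

/* First address past the end of the ram region. */
static uint32_t ram_end(void) {
    return (uint32_t)RAM_ORIGIN + RAM_LENGTH;
}

/* Bytes between FREE_BOTTOM and STACK_TOP, inclusive. */
static uint32_t soft_stack_bytes(void) {
    return (uint32_t)STACK_TOP - FREE_BOTTOM + 1;
}

/* Run-time version of the same checks. */
static void check_memory_map(void) {
    assert(ram_end() == 0xBF00);
    assert(soft_stack_bytes() == 6144);
}
```

Calling check_memory_map() does nothing if the numbers agree and aborts if they don't, which makes it a cheap guard to keep around when experimenting with other load addresses.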

However, there were some other challenges that were a bit more difficult to manage than they would be for a simple, CALL-able binary. In particular, most of the platforms currently available for LLVM-MOS exit either with a simple RTS (ReTurn from Subroutine) instruction or by going into an endless loop (as on the NES). ProDOS instead requires you to make a QUIT syscall, which closes all files and reloads the selector menu. LLVM-MOS already has a "custom exit" option for programs that require this sort of thing, but I still needed to write an _exit routine in assembly that makes this system call before I could build any programs.

One thing that will also be a challenge in the future is writing programs that use the Apple II's high resolution (HIRES) graphics. Because ProDOS by default places programs at address $2000, which is exactly where the HIRES frame buffers are stored, I will need to write some kind of loader program that copies the rest of the program to a different location.

Character Output

LLVM-MOS already provides a simple implementation of printf, the quintessential C routine for printing strings and numbers, alongside the simpler puts. However, because character output is different on different platforms, I need to write an implementation of putchar myself.

Fortunately for me, the Apple II firmware includes a robust character output system that not only manages a scrolling text window, but allows redirection to devices such as printers and serial cards, and ProDOS can even redirect these routines to files. In order to access this firmware, I need to call the subroutine at $FDED with the desired character to print in the accumulator register.

In my C libraries, it looks like this:

void appleii_cout(unsigned char c) {
    // Call the COUT routine located at address $FDED.
    // "asm volatile" tells LLVM that this assembly should never be optimized out,
    // "__attribute__((leaf))" tells LLVM that this assembly does not call any
    // of our other C functions (and thus it doesn't need to worry about
    // function reentrancy).
    // The '"a" (c)' expression in the input-operand section (after the second ":")
    // says that the variable c should be loaded into the accumulator prior to making this call.
    __attribute__((leaf)) asm volatile (
        "jsr 0xfded"
        : 
        : "a" (c) 
        :
        ); 
}

void __putchar(char c) {
    // Use COUT for character output, inserting the high bit
    if (__builtin_expect(c == '\n', 0)) c = '\r';
    appleii_cout(c | -128);
}

Of course, COUT works a bit differently than how C code usually works. For one, ASCII only uses 7 out of the typical 8 bits in a byte, and C code usually leaves that extra high bit cleared while the Apple II expects that bit to be set (otherwise it uses the alternate Inverse character set). Apple II also uses the CR (Carriage Return) character while C code usually uses the LF (Line Feed) character. I do both of these translations inside the __putchar routine.
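To make those two translations concrete, here is a small sketch that performs them without touching the firmware, so it can be checked on any desktop C compiler. The helper name to_cout_byte is hypothetical and not part of my library; it just returns the byte that __putchar would end up handing to COUT.

```c
#include <assert.h>

/* Hypothetical helper: compute the byte that would be passed to COUT. */
static unsigned char to_cout_byte(char c) {
    if (c == '\n') c = '\r';              /* C's LF becomes the Apple's CR */
    return (unsigned char)c | 0x80;       /* set the high bit for normal text */
}

/* A few spot checks of the translation. */
static void check_cout_translation(void) {
    assert(to_cout_byte('A')  == 0xC1);   /* 0x41 with the high bit set */
    assert(to_cout_byte(' ')  == 0xA0);   /* 0x20 with the high bit set */
    assert(to_cout_byte('\n') == 0x8D);   /* LF mapped to CR (0x0D), high bit set */
    assert(to_cout_byte('\r') == 0x8D);   /* CR passes through unchanged */
}
```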

A smaller, though certainly extremely frustrating, issue I ran into with this was that none of my data was appearing correctly. With a bit of experimentation, I determined that I had broken varargs somehow, and it turned out that all I had to do was add a compiler option to initialize the "operand stack" at the start of the program (essentially, LLVM C code keeps a separate stack for variables on 6502 platforms, since the "real" stack is much too small).

ProDOS Syscalls

Once I had a basic program structure, I went through the rest of the ProDOS manual to determine if there were other things I had to do to properly register and write a program. The manual claimed that I needed to set the "system bit map", a set of bits at the end of RAM that determine what other pages are in use by ProDOS, so to get a little clarity on this, I wrote a program that made some more ProDOS system calls and saw how the bitmap was used.

A quirk that makes ProDOS calls awkward from C is that the call parameters are not passed in registers or on the stack. Instead, the call number and a pointer to the arguments are expected to sit directly after the JSR instruction that makes the syscall, inside the code itself. An example in 6502 assembly, from the ProDOS manual, is shown below:

 SOURCE   FILE #01 =>TESTCMD
 ----- NEXT OBJECT FILE NAME IS TESTCMD.0
 2000:        2000    1         ORG  $2000
 2000:                2 *
 2000:        FF3A    3 BELL    EQU  $FF3A     ;Monitor BELL routine
 2000:        FD8E    4 CROUT   EQU  $FD8E     ;Monitor CROUT routine
 2000:        FDDA    5 PRBYTE  EQU  $FDDA     ;Monitor PRBYTE routine
 2000:        BF00    6 MLI     EQU  $BF00     ;ProDOS system call
 2000:        00C0    7 CRECMD  EQU  $C0       ;CREATE command number
 2000:                8 *
 2000:20 06 20        9 MAIN    JSR  CREATE    ;CREATE "/TESTMLI/NEWFILE"
 2003:D0 08   200D   10         BNE  ERROR     ;If error, display it
 2005:60             11         RTS            ;Otherwise done
 2006:               12 *
 2006:20 00 BF       13 CREATE  JSR  MLI       ;Perform call
 2009:C0             14         DFB  CRECMD    ;CREATE command number
 200A:17 20          15         DW   CRELIST   ;Pointer to parameter list
 200C:60             16         RTS
 200D:               17 *
 200D:20 DA FD       18 ERROR   JSR  PRBYTE    ;Print error code
 2010:20 3A FF       19         JSR  BELL      ;Ring the bell
 2013:20 8E FD       20         JSR  CROUT     ;Print a carriage return
 2016:60             21         RTS
 2017:               22 *
 2017:07             23 CRELIST DFB  7         ;Seven parameters
 2018:23 20          24         DW   FILENAME  ;Pointer to filename
 201A:C3             25         DFB  $C3       ;Normal file access permitted
 201B:04             26         DFB  $04       ;Make it a text file
 201C:00 00          27         DFB  $00,$00   ;AUX_TYPE, not used
 201E:01             28         DFB  $01       ;Standard file
 201F:00 00          29         DFB  $00,$00   ;Creation date (unused)
 2021:00 00          30         DFB  $00,$00   ;Creation time (unused)
 2023:               31 *
 2023:10             32 FILENAME DFB ENDNAME-NAME ;Length of name
 2024:2F 54 45 53    33 NAME    ASC  "/TESTMLI/NEWFILE" ;followed by the name
 2034:        2034   34 ENDNAME EQU  *

This isn't the most pleasant situation for C, especially since I both want to be able to change parameters as I go and not have to reserve space for syscalls I'm not using. Fortunately, I could leverage a smattering of assembly code to make a single spot where syscalls are issued, with the call number and arguments accessible as global variables.

; Core gadget for making syscalls to ProDOS
; Set the number and param variables, then call _prodos_syscall.
; Note that none of these symbols should be used
; outside of the module that calls _prodos_syscall.
.global _prodos_syscall
.global _prodos_syscall_number
.global _prodos_syscall_param

; NOTE: PC, S, D, I, RS0, and RS10-RS15 need to be saved by us
; according to LLVM-MOS's C calling convention:
; https://llvm-mos.org/wiki/C_calling_convention
; We don't use any of those, so we're good for now.

_prodos_syscall:
    jsr 0xBF00
_prodos_syscall_number:
    .byte 0x00
_prodos_syscall_param:
    .byte 0x00, 0x00
__prodos_syscall_end:
    rts

Once I had that, I could just set those global variables in C and make syscalls as I pleased, such as this call for reading files:

char prodos_read(char ref_num, char *data_buffer, unsigned int request_count, unsigned int *trans_count) {
    struct prodos_read_param {
        char param_count;
        char ref_num;
        char *data_buffer;
        unsigned int request_count;
        unsigned int trans_count;
    } __attribute((packed)) p;
    p.param_count = 4;
    p.ref_num = ref_num;
    p.data_buffer = data_buffer;
    p.request_count = request_count;
    p.trans_count = 0;
    _prodos_syscall_number = PRODOS_SYSCALL_READ;
    _prodos_syscall_param = (void *) &p;
    char error = _prodos_syscall();
    if (error == PRODOS_ERROR_NONE) {
        *trans_count = p.trans_count;
    }
    return error;
}
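For anyone who wants to see the byte-level picture, here is a host-side sketch of what the gadget and a READ parameter block look like in memory. It is a model only: the array gadget, the struct read_param, and the function patch_gadget are hypothetical names, and the pointer field is a uint16_t so that the layout matches the 6502's 16-bit addresses even when compiled on a desktop machine.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 7 bytes of the gadget:
 * JSR $BF00, call number, parameter pointer (little-endian), RTS. */
static uint8_t gadget[7] = { 0x20, 0x00, 0xBF, 0x00, 0x00, 0x00, 0x60 };

/* READ parameter block, laid out with 6502-sized fields. */
struct read_param {
    uint8_t  param_count;
    uint8_t  ref_num;
    uint16_t data_buffer;    /* 16-bit address of the destination buffer */
    uint16_t request_count;
    uint16_t trans_count;
} __attribute__((packed));

_Static_assert(sizeof(struct read_param) == 8,
               "READ parameter block should be 8 bytes");

/* Patch the call number and parameter address into the gadget,
 * which is what assigning the two global variables does on the Apple. */
static void patch_gadget(uint8_t number, uint16_t param_addr) {
    gadget[3] = number;
    gadget[4] = (uint8_t)(param_addr & 0xFF);  /* low byte first */
    gadget[5] = (uint8_t)(param_addr >> 8);    /* then high byte */
}

/* Reproduce the CREATE call from the manual listing: command $C0,
 * parameter list at $2017, giving the bytes 20 00 BF C0 17 20 60. */
static void check_gadget_matches_listing(void) {
    patch_gadget(0xC0, 0x2017);
    assert(gadget[0] == 0x20 && gadget[1] == 0x00 && gadget[2] == 0xBF);
    assert(gadget[3] == 0xC0);
    assert(gadget[4] == 0x17 && gadget[5] == 0x20);
    assert(gadget[6] == 0x60);
}
```

The check at the end patches in the same command number and address as the manual's CREATE example and confirms the resulting bytes match the listing at $2006 through $200C.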

A screenshot of the Apple II emulator reading a Readme

As for what that bitmap is for, it turns out that ProDOS doesn't do anything interesting with it except prevent accidents where files are read over your program or over critical system pages. Therefore, I'm not going to worry much about it until we start considering how malloc is going to work on the Apple.
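As a rough illustration of how such a bitmap can be manipulated, here is a sketch that works on an ordinary array rather than the real ProDOS memory. It rests on assumptions from my reading of the ProDOS manual rather than on code in this project: that the map is 24 bytes long (one bit for each 256-byte page of the lower 48K) and that the high bit of the first byte stands for page 0. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size: 24 bytes x 8 bits = 192 pages = 48K. */
#define BITMAP_BYTES 24

/* Mark one 256-byte page as in use.
 * Assumes bit 7 of byte 0 represents page 0. */
static void mark_page(uint8_t bitmap[BITMAP_BYTES], uint8_t page) {
    assert(page < BITMAP_BYTES * 8);
    bitmap[page / 8] |= (uint8_t)(0x80 >> (page % 8));
}

/* Return nonzero if the given page is marked as in use. */
static int page_in_use(const uint8_t bitmap[BITMAP_BYTES], uint8_t page) {
    assert(page < BITMAP_BYTES * 8);
    return (bitmap[page / 8] & (0x80 >> (page % 8))) != 0;
}
```

Under those assumptions, marking pages $00, $01, and $04 through $07 sets the first byte to $CF, and marking page $BF sets the last byte to $01.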

Monitor Calls

Once I had some ProDOS calls and a few Monitor calls for input and output, it felt natural to write interfaces for all of the other Monitor calls so that people could get started quickly on writing games. This turned out to be remarkably easy - all I needed to do was comb through my old Apple II manuals and make C stubs for their memory locations and registers used.

// BELL: Print a BELL character to the current output device
void appleii_bell() {
    // The "a" in the final ":" section indicates that the accumulator
    // is "clobbered", meaning that LLVM cannot rely on its value
    // staying intact.
    __attribute__((leaf)) asm volatile ("jsr 0xff3a": : : "a"); 
}

// HOME: Clear the entire text window and put the cursor in the
// upper-left corner
void appleii_home() { __attribute__((leaf)) asm volatile ("jsr 0xfc58": : :); }

// PRBYTE: Print a hexadecimal byte
void appleii_prbyte(unsigned char byte) { __attribute__((leaf)) asm volatile ("jsr 0xfdda": : "a" (byte) : "a"); }

// RDKEY: Read a character from the current input device, inserting a
// blinking cursor at the cursor location.
unsigned char appleii_rdkey() { 
    unsigned char c;
    // The "=a" (c) section indicates that the accumulator is overwritten
    // with the value that we want to go back into c - the RDKEY
    // routine returns the read character into the accumulator.
    __attribute__((leaf)) asm volatile ("jsr 0xfd0c": "=a" (c) : :);
    return c;
}

// PREAD: Read the given paddle input, returning an analog value from 0 to 255.
// Index should be between 0 to 3, or the behavior is undefined.
unsigned char appleii_pread(unsigned char index) {
    unsigned char result;
    // All three! This calls $FB1E, takes the X register as input,
    // returns the Y register as output, and destroys the accumulator and status register.
    __attribute__((leaf)) asm volatile ("jsr 0xfb1e": "=y" (result) : "x" (index) : "a", "p");
    return result; 
}

I also added macros for special memory locations, such as hardware switches and the zero page locations used to configure the Monitor subroutines:

// The left edge of the text window. Should range from 0 to 39 inclusive
#define APPLEII_MONITOR_WNDLFT ((volatile unsigned char *) 0x20)
// The width of the text window. This plus WNDLFT
// should not exceed 40.
#define APPLEII_MONITOR_WNDWDTH ((volatile unsigned char *) 0x21)

// ...

// Apple II keyboard data. If the high bit is set, then the
// lower seven bits contain the last key that was pressed
// since the keyboard strobe was cleared.
#define APPLEII_KEYBOARD_DATA ((volatile char *) 0xC000)
// Apple II keyboard strobe. Clear this to prepare to read
// the next key from the keyboard.
#define APPLEII_KEYBOARD_STROBE ((volatile char *) 0xC010)
// Reads the current state of VBlank on the Apple //e
// (if the bit is high, then the Apple is in VBlank)
#define APPLEIIE_VBLANK ((volatile char *) 0xC019)
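The keyboard comment above describes a small decoding step, which can be sketched on its own without the hardware. The helper decode_key below is hypothetical and takes the raw byte as an argument instead of reading APPLEII_KEYBOARD_DATA, so it can be tried on a desktop compiler.

```c
#include <assert.h>

/* Hypothetical helper: given a raw byte read from the keyboard data
 * location, return the 7-bit key code, or -1 if no key is waiting
 * (that is, the high bit is clear). */
static int decode_key(unsigned char raw) {
    if ((raw & 0x80) == 0) return -1;  /* high bit clear: nothing new */
    return raw & 0x7F;                 /* strip the high bit */
}

/* A few spot checks of the decoding rule. */
static void check_decode_key(void) {
    assert(decode_key(0xC1) == 'A');   /* 0x41 with the high bit set */
    assert(decode_key(0x8D) == '\r');  /* Return key */
    assert(decode_key(0x41) == -1);    /* high bit clear, so no key */
}
```

On the Apple itself, the raw byte would come from *APPLEII_KEYBOARD_DATA, and the program would then access APPLEII_KEYBOARD_STROBE to get ready for the next key.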

All of that culminated in porting Steve Wozniak's Breakout program, originally written in Integer BASIC, to C. After ironing out some issues in the syscall definitions I was using (and remembering that Integer BASIC routines start counting from 1 while these Monitor routines count from 0), I finally had something playable!

A screenshot of the Apple II emulator with a game of Breakout in progress

What's Next?

Quite a bit, actually! Here are some of the main additions I have coming up in the near future:

Add endpoints for the rest of the ProDOS syscalls
Implement the other stdio.h routines, including fopen, fprintf, and others, in order to enable the standard C way of opening and reading files.
Implement malloc and free, in order to support C++'s new and delete operators. LLVM-MOS already comes with a simple implementation of malloc, but since ProDOS files require you to create a 1 KB buffer that's aligned to a 256-byte page boundary, I will need to write a page allocator that both malloc and the file library can use.
Add some more linker targets:
- A linker target for ProDOS that copies everything to address $6000 to open up access to HIRES pages.
- A partial linker target for dynamically loadable subroutines, which rely on the stack and STDLIB already being initialized
- A linker target that does not use ProDOS, such as a DOS 3.3 binary program.
Submit the whole thing to the LLVM-MOS SDK, so that future users can just run mos-apple-ii-clang to compile new Apple programs.

And some features to add in the farther future:

Add some HIRES graphics routines
Add an implementation of Berkeley Sockets for the aforementioned Uthernet II Ethernet Card.
Modify the getchar and putchar routines to simulate a VT100 terminal emulator, to more closely match the behavior expected by other C programs
Port some C programs! I'm particularly interested in trying a program like SSH or Telnet, so that I can utilize the Apple as a terminal for another computer.