Three Rings for the Z80

Over the past few years I’ve implemented a number of interfaces for Z80 peripherals based on the principle of the interrupt driven ring buffer. Each implementation of a ring exhibits its own peculiarities, based on the specific hardware. But essentially I have but one ring to bring them all and in the darkness bind them.

This is some background on how these interfaces work, why they’re probably fairly optimal at what they do, and things to consider if extending these to other platforms and devices.

The ring buffer is a mechanism which allows a producer and a consumer of information to each work at a timing that suits their own needs, without having to coordinate with each other.

Wikipedia defines a circular buffer, or ring buffer, as a data structure that uses a single fixed-size buffer as if it were connected end-to-end. The most useful property of the ring buffer is that it does not need to have its elements relocated as they are added or consumed. It is best suited to be a FIFO buffer.

Background

Over the past few years, I’ve used the ring buffer mechanism written by Dean Camera in many AVR projects. These include interrupt driven USART interfaces, a digital audio delay loop, and a packet assembly and play-out buffer for a digital walkie-talkie.

More recently, I’ve been working with Z80 platforms and I’ve taken that experience into building interrupt driven ring buffer mechanisms for peripherals on the Z80 bus. These include three rings for three different USART implementations, and a fourth ring for an Am9511A APU.

But firstly, how does the ring buffer work? For the details, the Wikipedia entry on circular buffers is the best bet. But quickly, the information (usually a byte, but not necessarily) is pushed into the buffer by the producer, and it is removed by the consumer.

The producer maintains a pointer to where it is inserting the data. The consumer maintains a pointer to where it is removing the data. Both producer and consumer have access to a count of how many items there are in the buffer and, critically, the act of counting entries present in the buffer and adding or removing data must be synchronised or atomic.
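As a rough illustration of the mechanism just described, here is a minimal sketch in C. It is not the AVR or Z80 code itself; the names (ring_t, ring_put, ring_get) and the 16 byte size are made up for the example, and indices stand in for the pointers.

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 16u   /* hypothetical size, chosen only for illustration */

typedef struct {
    uint8_t data[RING_SIZE];
    uint8_t in;     /* where the producer inserts the next item */
    uint8_t out;    /* where the consumer removes the next item */
    uint8_t count;  /* items currently held, visible to both sides */
} ring_t;

/* Producer side: returns false, and drops the byte, when the ring is full. */
static bool ring_put(ring_t *r, uint8_t byte)
{
    if (r->count == RING_SIZE)
        return false;
    r->data[r->in] = byte;
    r->in = (uint8_t)((r->in + 1u) % RING_SIZE);
    r->count++;     /* on real hardware the store and the count must be atomic */
    return true;
}

/* Consumer side: returns false when there is nothing to remove. */
static bool ring_get(ring_t *r, uint8_t *byte)
{
    if (r->count == 0)
        return false;
    *byte = r->data[r->out];
    r->out = (uint8_t)((r->out + 1u) % RING_SIZE);
    r->count--;     /* likewise, the read and the count must be atomic */
    return true;
}
```

Neither side ever moves the stored data. Only the two indices and the count change, which is the property that makes the ring a good FIFO.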

8 Bit Optimisation

The AVR example code is written in C and is not optimised for the Z80 platform. By using some platform specific design decisions it is possible to substantially optimise the operation of a general ring buffer, which is important as the Z80 is fairly slow.

The first optimisation is to assume that the buffer is exactly one page or 256 bytes. The advantage we have there is that addressing in Z80 is 16 bits and if we’re only using the lowest 8 bits of addressing to address 256 bytes, then we simply need to align the buffer onto a single 256 byte page and then increment through the lowest byte of the buffer address to manage the pointer access.

If 256 bytes is too many to allocate to the buffer, then we can use a power of 2 buffer size and align the buffer within the memory so that it falls on the boundary of the buffer size. The calculation for the pointers then becomes simple masking (rather than a decision and jump). Simple masking ensures that no jumps are taken, which means that the code flow or delay is constant no matter which place in the buffer is being written or read.
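To make the masking idea concrete, here is a small sketch in C. The 32 byte size and the function names are assumptions for the example only. On the Z80 the alignment allows the mask to be applied directly to the low byte of the buffer address; here an index stands in for that low byte.

```c
#include <stdint.h>

#define TX_SIZE 32u                 /* hypothetical size; must be a power of 2 */
#define TX_MASK (TX_SIZE - 1u)

/* Advance an index by masking: no compare, no jump, constant time. */
static uint8_t next_masked(uint8_t index)
{
    return (uint8_t)((index + 1u) & TX_MASK);
}

/* The same step written with a decision and jump, for comparison. */
static uint8_t next_branching(uint8_t index)
{
    index++;
    if (index == TX_SIZE)
        index = 0;
    return index;
}
```

Both functions give the same result for every valid index, but only the masked version takes the same path through the code every time.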

Note that although the number of bytes allocated to the buffer is 256, the buffer cannot be filled completely. A completely full 256 byte buffer cannot be discriminated from a zero fullness buffer. This does not apply where the buffer is smaller than the full page.
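The reason is easy to see if the fullness is kept in a single byte, which is what the sketch below assumes (the structure and function names are invented for the example). An 8 bit count can hold 0 to 255, so accepting a 256th byte would wrap the count back to zero and the full buffer would look empty.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t data[256];  /* one full page */
    uint8_t in;         /* an 8 bit index wraps from 255 to 0 by itself */
    uint8_t out;
    uint8_t count;      /* 8 bits: can hold 0..255, never 256 */
} page_ring_t;

/* Refuses the 256th byte, because storing it would wrap count to zero. */
static bool page_put(page_ring_t *r, uint8_t byte)
{
    if (r->count == 255u)
        return false;
    r->data[r->in++] = byte;    /* no mask needed on a full page */
    r->count++;
    return true;
}
```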

With these two optimisations in place, we can now look at three implementations of USART interfaces for the Z80 platform. These are the MC6850 ACIA, the Zilog SIO/2, and the Z180 ASCI interface. There is also the Am9511A interface, which is a little special as it has multiple independent ring buffers, and has multi-byte insertion.

Implementations

To start the discussion, let us look at the ACIA implementation for the RC2014 CP/M-IDE bios. I have chosen this file because all of the functions are contained in one file, which provides an easier overview. The functions are identical to those found in the z88dk RC2014 ACIA device directory.

Using the ALIGN key word of the z88dk, the ring buffer itself is placed on a page boundary, in the case of the receive buffer of 256 bytes, and on the buffer size boundary, in the case of the transmit buffer of 2^n bytes.

Note that where the buffer is smaller than a full page, all of the bytes in the buffer could be used, because the buffer counter won’t overflow. I haven’t made that additional optimisation in my code, so no matter how many bytes are allocated to a buffer, one byte always remains unused.

Once the buffer is located, the process of producing and consuming data is left to either put or get functions which write to, or read from, the buffer as and when they choose to. There is no compulsion for the main program flow to write or read at a particular time, and therefore the flow of code is never delayed. This is optimum from the point of view of minimising delay and maximising compute time. Additional functions such as flush, peek, and poll are also provided to simplify program flow, along with init to set up the peripheral and initialise the buffers on first use.

With the buffer available then the interrupt function can do its work. Once an interrupt from the peripheral is signalled, the interrupt code checks to see whether a byte has been received. If not then the interrupt (in the case of the ACIA and ASCI) must have been triggered by the transmit hardware becoming available.

If in fact a byte has been received by the peripheral then the interrupt code recovers the byte, and checks there is room in the buffer to store it. If not, then the byte is simply abandoned. If there is space, then the byte is stored, and the buffer count is incremented. It is critical that these two items happen atomically, which in the case of an interrupt is the natural situation.

If the transmission hardware has signalled that it is free, then the buffer is checked for an available byte to transmit. If none is found then the transmit interrupt is disabled. Otherwise the byte is retrieved from the buffer and written to the transmit hardware while the buffer count is decremented.

If the transmit buffer count reaches zero when the current byte is transmitted, then the interrupt must disable further transmit interrupts to prevent the interrupt being called unnecessarily (i.e. with the buffer fullness being empty).
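The interrupt flow described above can be sketched in C roughly as follows. This is only an outline under stated assumptions: the hw_ variables are hypothetical stand-ins for the peripheral status and data registers, the buffer sizes are arbitrary, and the real drivers are written in Z80 assembly.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the peripheral's registers and control bits. */
static bool    hw_rx_ready;         /* a received byte is waiting */
static uint8_t hw_rx_data;
static bool    hw_tx_empty;         /* the transmitter can take a byte */
static uint8_t hw_tx_data;
static bool    hw_tx_int_enabled;

#define RX_SIZE 256u                /* receive ring: one full page */
#define TX_SIZE 16u                 /* transmit ring: hypothetical 2^n size */

static uint8_t rx_buf[RX_SIZE], rx_in, rx_count;
static uint8_t tx_buf[TX_SIZE], tx_out, tx_count;

static void serial_isr(void)
{
    if (hw_rx_ready) {
        uint8_t byte = hw_rx_data;          /* recover the byte first */
        hw_rx_ready = false;
        if (rx_count < RX_SIZE - 1u) {      /* room left in the ring? */
            rx_buf[rx_in++] = byte;         /* 8 bit index wraps by itself */
            rx_count++;
        }                                   /* otherwise it is abandoned */
    }

    if (hw_tx_empty && hw_tx_int_enabled) {
        if (tx_count == 0) {
            hw_tx_int_enabled = false;      /* nothing to send */
        } else {
            hw_tx_data = tx_buf[tx_out];
            hw_tx_empty = false;
            tx_out = (uint8_t)((tx_out + 1u) & (TX_SIZE - 1u));
            tx_count--;
            if (tx_count == 0)
                hw_tx_int_enabled = false;  /* that was the last byte */
        }
    }
}
```

Because the whole routine runs with further interrupts held off, the store and the count update happen together without any extra locking.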

Multi-byte Receive

Both the SIO and ASCI have multi-byte hardware FIFO buffers available. This is to prevent over-run of the hardware should the CPU be unable to service the receive interrupt in sufficient time. This could happen if the CPU is left with its general interrupt disabled for some time.

In this situation, the SIO receive interrupt and the ASCI interrupt have the capability to check for additional bytes before continuing.

Transmit cut-through

One additional feature worth discussing is the presence of a transmit cut-through, which minimises delay when writing the “first byte”. Because the Z80 processor is relatively slow compared to a serial interface, it is common for the transmit interface to be idle when the first byte of a sequence of bytes is written. In this situation writing the byte into the transmit buffer, and then signalling a pseudo interrupt (by calling the interrupt routine) would be very costly. In the case of the first byte it is much more effective simply to cut-through and write directly to the hardware.
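A put function with cut-through might look something like the sketch below. Again the hw_ variables, irq_disable() and irq_enable() are hypothetical placeholders (on the Z80 the last two would be DI and EI), and returning false on a full buffer is a simplification chosen for the example.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the transmit hardware. */
static bool    hw_tx_empty = true;
static uint8_t hw_tx_data;
static bool    hw_tx_int_enabled;

#define TX_SIZE 16u                 /* hypothetical 2^n transmit ring size */
static uint8_t tx_buf[TX_SIZE], tx_in, tx_count;

/* Placeholders for DI and EI; empty here so the sketch runs anywhere. */
static void irq_disable(void) {}
static void irq_enable(void) {}

static bool serial_putc(uint8_t byte)
{
    /* Cut-through: ring empty and transmitter idle, so bypass the ring. */
    if (tx_count == 0 && hw_tx_empty) {
        hw_tx_data = byte;
        hw_tx_empty = false;
        return true;
    }

    if (tx_count >= TX_SIZE - 1u)   /* one byte always remains unused */
        return false;

    irq_disable();                  /* store, count and enable as one unit */
    tx_buf[tx_in] = byte;
    tx_in = (uint8_t)((tx_in + 1u) & (TX_SIZE - 1u));
    tx_count++;
    hw_tx_int_enabled = true;
    irq_enable();
    return true;
}
```

The first byte of a burst goes straight to the hardware, and only the following bytes pay the cost of the ring and the transmit interrupt.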

Atomicity

For the ring buffer to function effectively, the atomicity of specific operations must be guaranteed. During an interrupt in Z80 further interrupts are typically not permitted, so within the interrupt we have a degree of atomicity. The only exception to this rule is the Z80 Non Maskable Interrupt (NMI), but since this interrupt is not compatible with CP/M it has never been used widely and is therefore not a real issue.

For the buffer get function the only concern is that the retrieval of a byte is atomically linked to the number of bytes in the buffer.

For the put function it is similar. However, as the transmit interrupt needs to be enabled by the put function, atomicity is required to ensure that this process is not interrupted.
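For the get side, a sketch of the idea in C is below. The names are invented for the example, and irq_disable() and irq_enable() are empty placeholders for DI and EI. In assembly the count update can be a single instruction, so the bracket may not be needed there; C gives no such guarantee, which is why the sketch brackets the read and the count together.

```c
#include <stdint.h>
#include <stdbool.h>

static uint8_t rx_buf[256], rx_out, rx_count;

/* Placeholders for DI and EI; empty here so the sketch runs anywhere. */
static void irq_disable(void) {}
static void irq_enable(void) {}

/* Returns false when the receive ring is empty. */
static bool serial_getc(uint8_t *byte)
{
    if (rx_count == 0)
        return false;

    irq_disable();              /* keep the read and the count together */
    *byte = rx_buf[rx_out++];   /* 8 bit index wraps at the page boundary */
    rx_count--;
    irq_enable();
    return true;
}
```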

Interrupt Mode

Across the three implementations there are three different Z80 interrupt modes in play. The Motorola ACIA is not a Zilog Z80 peripheral, so it can only signal a normal interrupt, and can therefore (without some dirty tricks) only work in Interrupt Mode 1. For the RC2014 implementation it is attached to INT or RST38 and therefore when an interrupt is triggered it is up to the interrupt routine to determine why an interrupt has been raised. This leads to a fairly long and slow interrupt code.

The Z180 ASCI has two ports and is attached to the Z180 internal interrupt structure, which works much like the Z80 Interrupt Mode 2, although it is actually independent from the Z80 interrupt mode. Each Z180 internal interrupt is separately triggered, however it still cannot discern between a receive and a transmit event. So the interrupt handling is essentially similar to that of the ACIA.

The Zilog SIO/2 is capable of being attached to the Z80 in Interrupt Mode 2. This means that the SIO can be configured to place a specific vector on the Z80 data bus during the interrupt acknowledge cycle, with a different vector for each interrupt cause. The interrupts for transmit empty, received byte, transmit error, and receive error are all signalled separately via an IM2 Interrupt Vector Table. This leads to concise and fast interrupts, specific to the cause at hand. The SIO/2 is the most efficient of all the interfaces described here.

Multi-byte buffers

For interest, the Am9511A interface uses two buffers, one for the one byte commands, and one for the two byte operand pointers. The command buffer is loaded with actions that the APU needs to perform, including some special (non hardware) commands to support loading and unloading operands from the APU FILO.

A second Am9511A interface also uses two buffers, one for one byte commands, and one for either two or four byte operands. This mechanism is not as nice as storing pointers as in the above driver, but is required for situations where the Z180 is operating with paged memory.

I’ve since revised the above solution again, and it now uses three byte operand (far) pointers, as that makes for a much simplified user experience. The operands don’t have to be unloaded by the user. They simply appear auto-magically…

Old Sunshine – migrating Ultra5 to Sparc64

I pulled a Sun Microsystems Ultra5 machine out of the e-waste some time ago, and have been running various versions of debian sparc or Ubuntu on it for the last few years. The final version was debian Wheezy, the current old old stable.

However, since there is no further work on the debian old old stable, and as there was no working https support in any browser, it was time to upgrade to a more modern release. But, for sparc processors I couldn’t find anything suitable. Solaris 10 was last upgraded in 2013. A path was illuminated when I read this email from John Paul from June 2016, asking the 82 remaining users of debian sparc distribution to migrate to the Sparc64 port. I guess I was one of those last 82 hold outs. And that was 2 years ago. So, Sparc64 became the target port.

Hardware

The Ultra5 I pulled from the e-waste has been improved over the years, and it is now no longer a machine that could have been purchased from Sun. I’ve added 1GB of 50ns RAM, by cutting (hacking in the literal sense) the 2nd hard drive carrier to make space, and have upgraded the CPU to 440MHz, which was only supported in the Ultra10.

I disconnected the jet engine cooling fan, and replaced it with a quiet slow fan sitting on top of the CPU heat sink, and replaced the hard drive by a 64GB PATA SSD.

I’ve also added a PGX64 video card and a USB card to enable modern devices to be supported.

My final hack was to convert the NVRAM to use a VERY LARGE battery. The NVRAM is used to store the MAC address, and other important system configurations. The standard chip has about a 2 year lifetime, if the machine is mainly turned off. The new lithium CR123A battery should last about 150 years.

Sparc64

Following a number of false starts, the upgrade to Sparc64 went very easily. The April 4th netinstall ISO is good, and can be used as a reference. Of course newer ISOs will be regularly provided, but at least I’m sure that a working machine can be built from the Internet Archive April 4th snapshot. The installer is started from the OpenBoot command line:

> boot cdrom expert libata.dma=0

The instructions for the upgrade are very standard debian, using the netinstall ISO. The only special instruction is to enter the archive mirror details.

  * when prompted to enter mirror data, use the following:
    - mirror: http://ftp.ports.debian.org
    - directory: /debian-ports/

The installer automatically realises that the hard disk controller is incapable of DMA and configures it to PIO4 mode. Later, the radeon modesetting can be prevented by adding an options line to the local.conf file.

At this point you should have a working Sparc64 installation.

Using the PGX64

The PGX64 has some additional memory, which allows higher screen resolutions than the inbuilt PGX24 graphics. But, unfortunately, it is no faster than standard.

In order to get it to work without conflict, it is necessary to disable the inbuilt PGX24 device, located on PCI Bus B location 2, by configuring the pcib-probe-list.

Building a Desktop

Getting the Ultra5 to be a desktop machine again requires a working X11 graphics adapter. The PGX64 (and the inbuilt PGX24) graphics use the ATI Rage chip, supported by the mach64 driver.

> sudo apt-get install xserver-xorg-video-mach64

Unfortunately, sometime around 2013, the mach64 driver support disappeared, around the time that the security aspects of the Linux kernel were tightened.

The mach64 driver is still packaged for Sparc64, but it produces an error when it is loaded.

From /var/log/Xorg.0.log, the driver is unable to map its mmio aperture.

[ 84.251] (II) MACH64(0): Creating default Display subsection in Screen section
 "Default Screen Section" for depth/fbbpp 24/32
[ 84.251] (==) MACH64(0): Depth 24, (--) framebuffer bpp 32
[ 84.252] (==) MACH64(0): Using XAA acceleration architecture
[ 84.252] (EE) Unable to map mmio aperture. Invalid argument (22)
[ 84.252] (WW) MACH64: Mach64 in slot 2:1:0 could not be detected!
[ 84.252] (II) UnloadModule: "mach64"
[ 84.253] (EE) Screen(s) found, but none have a usable configuration.
Fatal server error:
[ 84.253] (EE) no screens found(EE)

The only options are to rebuild a kernel with the security disabled, or to find another method of getting a working display driver.

Fortunately, it is possible to use the old framebuffer method for driving the ATI Rage graphics chip. An xorg.conf needs to be built, to direct the xserver to load the fbdev driver.

Section "ServerLayout"
    Identifier "Xorg Ultra5"
    Screen 0 "Screen0" 0 0
EndSection

Section "Monitor"
    Identifier "S24B420B"
    VendorName "Samsung"
    ModelName "Samsung S24B420B"
    HorizSync 30 - 81
    VertRefresh 56 - 63
    DisplaySize 518 324
    Option "DPMS" "true"
EndSection

Section "Device"
    Identifier "PGX64"
    Driver "fbdev"
#   Driver "mach64"
    Card "ATI Rage Pro - Sun PGX64"
#   Option "DMAMode" "sync"
#   Option "ForcePCIMode" "true"
#   Option "BufferSize" "8"
#   Option "ExaNoComposite" "true"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "PGX64"
    Monitor "S24B420B"
    DefaultDepth 24
    SubSection "Display"
        Depth 8
        Modes "1920x1200" "1920x1080" "1600x900" "1600x1200" "1440x900" "1280x1024"
    EndSubSection
    SubSection "Display"
        Depth 24
        Modes "1440x900" "1280x1024"
    EndSubSection
EndSection

Section "Module"
    Load "type1"
    Load "freetype"
EndSection

Section "DRI"
    Mode 0666
EndSection

The above xorg.conf gets the required outcome. A listing from Xorg.0.log is below.

[ 704.327] (II) LoadModule: "fbdev"
[ 704.328] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so
[ 704.329] (II) Module fbdev: vendor="X.Org Foundation"
[ 704.329] compiled for 1.19.0, module version = 0.4.4
[ 704.329] Module class: X.Org Video Driver
[ 704.329] ABI class: X.Org Video Driver, version 23.0
[ 704.329] (II) FBDEV: driver for framebuffer: fbdev
[ 704.330] (II) Loading sub module "fbdevhw"
[ 704.330] (II) LoadModule: "fbdevhw"
[ 704.332] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so
[ 704.333] (II) Module fbdevhw: vendor="X.Org Foundation"
[ 704.333] compiled for 1.19.6, module version = 0.0.2
[ 704.333] ABI class: X.Org Video Driver, version 23.0
[ 704.335] (**) FBDEV(0): claimed PCI slot 2@0:1:0
[ 704.335] (II) FBDEV(0): using default device
[ 704.336] (**) FBDEV(0): Depth 24, (--) framebuffer bpp 32
[ 704.336] (==) FBDEV(0): RGB weight 888
[ 704.336] (==) FBDEV(0): Default visual is TrueColor
[ 704.336] (==) FBDEV(0): Using gamma correction (1.0, 1.0, 1.0)
[ 704.337] (II) FBDEV(0): hardware: ATY Mach64 (video memory: 8176kB)
[ 704.337] (II) FBDEV(0): checking modes against framebuffer device...
[ 704.337] (II) FBDEV(0): mode "1440x900" ok
[ 704.337] (II) FBDEV(0): mode "1280x1024" ok
[ 704.337] (II) FBDEV(0): checking modes against monitor...
[ 704.338] (--) FBDEV(0): Virtual size is 1440x1024 (pitch 1440)
[ 704.338] (**) FBDEV(0): Default mode "1440x900": 106.5 MHz (scaled from 0.0 MHz), 55.9 kHz, 59.9 Hz
[ 704.338] (II) FBDEV(0): Modeline "1440x900"x0.0 106.50 1440 1520 1672 1904 900 903 909 934 -hsync +vsync (55.9 kHz d)
[ 704.338] (**) FBDEV(0): Default mode "1280x1024": 108.0 MHz (scaled from 0.0 MHz), 64.0 kHz, 60.0 Hz
[ 704.338] (II) FBDEV(0): Modeline "1280x1024"x0.0 108.00 1280 1328 1440 1688 1024 1025 1028 1066 +hsync +vsync (64.0 kHz d)
[ 704.338] (**) FBDEV(0): Display dimensions: (518, 324) mm
[ 704.339] (**) FBDEV(0): DPI set to (70, 80)
[ 704.339] (II) Loading sub module "fb"
[ 704.339] (II) LoadModule: "fb"
[ 704.340] (II) Loading /usr/lib/xorg/modules/libfb.so
[ 704.341] (II) Module fb: vendor="X.Org Foundation"
[ 704.341] compiled for 1.19.6, module version = 1.0.0
[ 704.342] ABI class: X.Org ANSI C Emulation, version 0.4
[ 704.342] (**) FBDEV(0): using shadow framebuffer
[ 704.342] (II) Loading sub module "shadow"
[ 704.342] (II) LoadModule: "shadow"
[ 704.343] (II) Loading /usr/lib/xorg/modules/libshadow.so
[ 704.344] (II) Module shadow: vendor="X.Org Foundation"
[ 704.345] compiled for 1.19.6, module version = 1.1.0
[ 704.345] ABI class: X.Org ANSI C Emulation, version 0.4
[ 704.345] (==) Depth 24 pixmap format is 32 bpp
[ 704.392] (==) FBDEV(0): Backing store enabled
[ 704.396] (**) FBDEV(0): DPMS enabled

That is all that is required to get the desktop working.

Experimenting with both xfce and lxde, the lxde desktop is clearly faster. But, unfortunately neither desktop is particularly workable as the framebuffer display driver is quite slow. Responsiveness compared to debian Wheezy, which used the mach64 accelerated driver, is poor.

Next Steps

Just purchased an old Sun XVR-100 PCI adapter to give it a go. The Sun XVR-100 is an ATI Radeon 7000 64 MByte board that contains a SUN ROM to allow it to be recognised by OpenBoot, and to work in the Sun environment.

OpenBoot show-devs

After configuring the OpenBoot to boot with the XVR-100 as the default screen and, to avoid conflicts, disabling the internal PGX graphics PCI interface, we are welcomed by the following boot screen.

Ultra5 – OpenBoot with XVR-100

So it is looking good! But unfortunately, the radeon driver and radeonfb drivers are not working properly. The first sign of trouble is early in dmesg when BAR locations can’t be allocated.

And then again later in the boot sequence the radeonfb driver complains that it can’t map the XVR-100 ROM, and that it is unable to claim BAR 6 to assign a bridge window.

And this leads to the xorg xserver loading the radeon driver but then being unable to properly address the XVR-100, so it bails out leaving us with no X screen. Luckily, the console is still working.

To be continued.

Cyber eyes and how to get them

Ever since I can remember, I’ve been substantially myopic or short sighted. As a child, I would lie awake at night waiting for any kind of night-time creature to emerge from the blur and eat me, before I had a chance to see it and run. But luckily, when wearing the prescribed lenses, my corrected visual acuity has always been on the better side of average. This has meant that I’m quite acutely aware of what good sight looks like, and what it looks like when it is bad.

Growing up, I remember each new prescription would snap the twigs in distant trees back into my consciousness. Something that most people walking around blissfully with uncorrected vision would never perceive.

So for the past 40 years, I’d woken up, put on my glasses, lived an entire life, removed my glasses, and gone back to sleep. But sadly, last year, something changed. I contracted the dreaded presbyopia disease. Presbyopia is not really a disease. But it is a sure sign that I was getting old. Really old. Old enough that for the first time in my life I needed to have reading glasses, as well as normal distance glasses. This is a major problem, as I was always walking around with the wrong glasses on my face.

So why not bifocal lenses, or for that matter graduated progressive or multi-focal lenses? Well for me it comes down to visual acuity. I am not at all happy to have parts of my vision obscured, and need to look at people down my nose, or whatever is needed to get the small piece of corrected vision between me and the object of interest. So, I needed a viable solution.

I’ve written here about obtaining my personal cyber eyes because there are plenty of medical reports and advertorial information sources out there, speaking highly of the outcomes. But, not very many individuals have actually written about their own experience of vision enhancement.

PRK or LASIK

For some time I’ve considered, and discarded, the idea of PRK or LASIK as being “a bit of a bodge” at best, and a long term science experiment at worst. In my view, scarring the cornea to adjust the optical characteristics of an optical device with a lens just seems in every way wrong headed. Adjusting the lens characteristics should be the essence of the solution. Reports of the reduction in dilated low-contrast visual acuity (i.e. the nighttime) from LASIK do not reduce my perception that it is a bodge.

Perhaps 15 years ago I considered the idea of getting intraocular lenses which, although being very expensive, seemed like the right way to solve my problem of myopia, with no other vision issues. So 6 months ago with this resolution in mind, off to the surgeon I went.

Intraocular Lens (IOL)

Following a very short discussion, the surgeon disqualified me as a candidate for the intraocular lens (IOL) procedure. This relates back to the original reason why I presented, being presbyopia. Simply put, there is no reason to place an auxiliary lens adjacent to an old immobile human lens. For younger people the IOL is IMHO the right way to go, to preserve all of the options for future surgical advancement. But, there is another procedure that is prescribed for older presbyopic myopia sufferers, like me.

Cataract or Clear Lens Replacement (CLR)

Once you have the onset of presbyopia, then there is little that can be done with the existing human eye lens. Because of weakness in the muscles of the eye, it becomes a fixed focal distance device, that suffers from UV aging, and degeneration. At a certain age, most (all) people begin to suffer from changes to the lens through clouding, which is termed cataract. The signs of cataract development can be detected from when you’re about 50 in most people, although the symptoms in vision may not be apparent for another 15 to 20 years.

Following up on the discussion with my surgeon, his team had found the indication of impending cataract in my lenses. This means that at some stage within the next 20 years I would need to have cataract operations to correct my vision. So the question was simply, when?

Coincidentally, the conversation I was having at that point was around Clear Lens Replacement, which is a cosmetic surgery undertaken to remediate the vision of people without the signs of lens degeneration and cataracts. The surgical procedure for Clear Lens Replacement and Cataract Removal is identical (for all intents and purposes, noting I’m not an optical surgeon and there are certainly things I don’t know about).

The Procedure

The procedure consists of making a small 1.5mm to 2mm long slice in the edge of the cornea, with the iris fully dilated. The surgeon then slices up the old lens, and vacuums it out of the lens pouch. He then injects a self unfurling lens through the slit and tucks the edges safely under the iris. The operation takes about 20 minutes under strong sedation, and is not accompanied by any pain, or even discomfort (in my case).

Waking up from the snooze, it is incredible to actually see things sharply, though everything is clouded and somewhat unstable. Later it became obvious that the blur was mainly caused by the plastic pirate eye-patch I got to wear home (and at nights for a week).

In my case, it took nearly 3 days before the iris dilating drugs wore off, and my eye could function properly in the presence of strong light. This issue led to two symptoms. Initially my iris was so dilated that stray light was entering the optical pathway, and causing “lens flare”. Later, there was just a sensitivity to outside light intensity. By the 4th day this effect had worn off, and my vision was pretty much perfect.

The interim two weeks with just one eye corrected were quite difficult. My eyes had nearly 4 dioptre difference in prescription, and the eye with the stronger prescription was operated on first. I tried to use my normal glasses with one lens popped out. This meant that my brain had to reconcile an 8 dioptre change in retinal image size in my corrected eye with the image presented by the uncorrected eye, to be able to integrate binocular vision. Basically, I couldn’t do it. So for reading I used a piece of cardboard in my reading glasses to obscure sight in my corrected eye, and for outdoors I just left my uncorrected eye to fend for itself. Using a contact lens in the uncorrected eye might work if needed, as a contact lens impacts the retinal image size to a much lesser degree than glasses.

Visual Acuity

Well it is now one month since my first operation, and two weeks since my second. And I have to report that the procedure was a success. My visual acuity is sufficiently high that I don’t need any distance correction. I can read two lines below the 6/6 (or 20/20 in Imperial) which is the equivalent of 6/4, the vision of a young person.

My surgeon was aiming for -0.25 dioptre in both eyes, as the margin for error from the mechanical calculation is 0.5 dioptre. Better surgeons will use their experience to temper the calculation and prescription and have a higher chance of getting a good outcome. My right eye (after two months) has settled down from -0.5 immediately after surgery to -0.25 dioptre. My left eye is at +0.25 after a month, and its resolving capability has been slightly improved. This is a very good outcome, and the surgeon is very happy with himself.

So how do my eyes work in the short range with effectively a fixed focus? Well this was the big question that I was worried about before this whole endeavour. Would I be able to see the speedometer whilst driving? Could I read my wristwatch, or see my phone? Well there the answer is mostly, yes. It is amazing (to me at least) how closely the human eye resembles a pinhole camera in practice, and doesn’t need to be actively focused. Although, there is no escaping the need to use reading glasses for detailed close work.

Technical Comments

My eyes had quite different prescriptions, so my surgeon installed products of quite different types, from different manufacturers. I’m sure my experience is not typical so I’ll note down the issues I’ve seen over the past few weeks.

PhysIOL MicroPure

Right eye was corrected for substantial myopia, with a PhysIOL MicroPure aspheric lens with a square edge. The square edge is to prevent the regrowth of lens cells from interfering with the replacement lens surface.

I find this square edge causes total internal reflection and halo effects in darkness with strongly off centre lighting. An example of the issue is down-lighting in a lift. I’m told this halo effect will disappear within 6 months, as the eye assimilates the lens (I presume), and reduces the interface TIR effect.

Zeiss AT Torbi 709M

Left eye was corrected for astigmatism and myopia, with a Zeiss AT Torbi 709M toric lens. The lens is very comfortable and doesn’t have the same reflection issues as the right eye, but potentially I’ll have to watch for regrowth of human lens cells which would obscure my vision.

An acquaintance experienced cell regrowth. He noticed cling wrap being layered across his vision from about 20 months post operation. The resolution is to blast the regrowing lens cells with a femtosecond laser to burn them away. This is done in the surgeon’s chair. During the original operation some human lens cells are left to support the new lenses in a pocket, but after the new lens has been grown or scarred into place, these old cells can be safely blasted out of the way. Vision is actually improved by removing them entirely.

Within 24 hours of the left eye operation, I noticed a colour perception effect, where my left eye was seeing colours somewhat colder than my right eye. A bit like the difference between a daylight globe and a warm-white globe. I was worried I was “seeing things.” It was only after I obtained the technical details of the lenses the surgeon had used that it became clear that the right eye has a UV and blue light filtered lens, but the left eye lens is unfiltered.

There is no clear preference among surgeons as to which is better, with unfiltered vision potentially leading to better sleep, but blue filtered vision being more closely aligned to young vision. I actually prefer my cold eye colours, but I also prefer daylight globes in my home. In normal daily binocular vision, there is no discrepancy to speak of. In all cases, sunglasses are recommended for outdoor vision, anyway.

Supermarket Glasses

One of the goals for this procedure was to enable me to use $5 supermarket reading glasses, and not be bound to special prescription lenses. In fact, that is the outcome I’ve obtained.

There is a handy formula or hack that my surgeon disclosed. The dioptre measurement on the supermarket readers is equivalent to the inverse of the focal distance at which they work best (assuming you’re perfect at infinity, which I am now). So a +1 dioptre lens will focus at 1/1 metres. A +3 dioptre lens will focus at 1/3 metres or 33 centimetres.
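For the arithmetically inclined, the surgeon’s rule of thumb can be written as a pair of one-line functions. This is just the inverse relationship from the paragraph above, assuming perfect focus at infinity; the function names are my own.

```c
/* Distance in metres at which a reader of the given strength focuses best,
 * assuming the eye itself is focused at infinity. */
static double focal_distance_m(double dioptres)
{
    return 1.0 / dioptres;
}

/* The inverse: reader strength needed for a given working distance. */
static double reader_strength(double distance_m)
{
    return 1.0 / distance_m;
}
```

So a +1.5 reader works out to roughly 67 centimetres, and a +3.5 reader to roughly 29 centimetres.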

This then is a perfect result. I can purchase (or have already purchased) many $5 glasses and leave them where I need them, with the right focal length. +1 for the computer screen, +1.5 for reading, and +3.5 for electronics inspection. If my reading glasses get lost or old, who cares? In fact, after being used to wearing the same pair of glasses for several years, it is quite entertaining to be able to buy new glasses every week, just to have a fresh look!

Recommendation

To anyone who has lived with significant near or farsightedness throughout their life, just hope for the day that you can be diagnosed with the indications of cataract disease, and go with the replacement as soon as you can. There doesn’t seem to be a downside for doing this. But there are substantial upsides to achieve:

  1. Yes, the shower floor is very grimy.
  2. And yes, that’s a spider on the bedroom wall over there.

I’m going to be hanging on waiting for my upgrade to HUD, and integration with Alexa or Siri, so that I can finally remember names.

One Year Review

About 14 months following surgery, I was beginning to find that my vision from the right eye was reducing. Not at a particular distance, but rather generally. Fine detail both near and far was substantially degraded with respect to the vision in my left eye.

This was caused by cells growing on the inside surface of the new lens. As the lens is a foreign body, my eye was trying to encapsulate it and scar over it. A thickening layer of new cells was being grown over the lens surface.

Fortunately, the cure for this is to undertake a capsulotomy, which removes the cells from the inside surface of the lens. The procedure takes only a few minutes and, in my case, was immediately effective in returning full vision.


My surgeon did mention that he has stopped using the PhysIOL MicroPure IOL since my operations, as cell regrowth happens far too often and is far more prevalent than with the Zeiss lens in my left eye. It would have been nice not to have been an experiment.

The colour perception effect remains, but is only noticeable when I’m looking for it. So absolutely not a problem.

Now, I’m back to 6/4 visual acuity, and will provide further updates if needed.

Yet Another Z180 (YAZ180 v2)

Testing on the YAZ180 v1, shown below, is now complete. I don’t want to use it for further driver and platform development, because the PLCC socket for the 256kB Flash is becoming worn out.

It will continue to operate as an augmented Nascom Basic machine, with an integrated Intel HEX loader (HexLoadr) supporting direct loaded assembler or C applications.


YAZ180 v1 at full configuration.

The new PCB for the YAZ180 v2 has been ordered.

These are some screenshots of the new PCB.

 

Update

Pi Day, March 14 2017.

After dwelling on the fact that the V2 PCB was really just a clean-up of the V1 PCB, with no additional features, I decided not to build the beautiful new PCBs that arrived today.

But rather, to create a new PCB with additional features.

 

New Features

When I originally designed the YAZ180 the breakout for the 82C55 was simply an interim design, to enable me to test the board. I was thinking of making an Arduino style pin-out, or something along those lines. But this is something much better.

Recently, after reading Paul’s page on interfacing an IDE drive to an 8051 microprocessor with the 82C55, I decided that adding IDE to the YAZ180 was a must-have feature.

So there is a new connector on the YAZ180 to break out the 82C55 pins, in IDE 44-pin 2mm format. I have not followed the design provided by Paul exactly. I’d note that his design and the earlier design by Peter Fraasse were specialist designs, which don’t support the generalised usage of the 82C55 chip, beyond the IDE functionality.

By the above statement I mean that in Mode 1 and Mode 2 for Port A and Port B, the PC2, PC4, and PC6 pins of the 82C55 device are designated registered strobe input pins /STB in input mode, or peripheral acknowledge /ACK in output mode. If an inverting output buffer is connected on these lines, then the registered input and output mode capability is lost. This would restrict the functionality of the 82C55 to simply Mode 0, being the mode that is used to create the IDE functionality.

As I’ve connected the three IDE address selection pins to PC2, PC4, and PC6, and these pins are not passed through an inverting buffer in the design, it is possible to use the 82C55 in any of its modes, and therefore to use the IDE 2.5″ 44-pin form factor to connect the YAZ180 82C55 ports to extension PCBs of any type or design.
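To make the mode discussion a little more concrete, here is a sketch of how an 82C55 mode-set control word is put together. The bit layout is the standard 82C55 one, but the macro and function names are made up for this example, and which ports carry IDE data versus control signals is an assumption on my part rather than something read off the schematic.

```c
#include <stdint.h>

/* 82C55 mode-set control word fields. Names are hypothetical. */
#define PPI_MODE_SET            0x80u  /* bit 7: 1 = mode-set command     */
#define PPI_GROUP_A_MODE1       0x20u  /* bits 6-5: 01 = Port A in Mode 1 */
#define PPI_GROUP_A_MODE2       0x40u  /* bits 6-5: 1x = Port A in Mode 2 */
#define PPI_PORT_A_INPUT        0x10u  /* bit 4: Port A is an input       */
#define PPI_PORT_C_UPPER_INPUT  0x08u  /* bit 3: PC7-PC4 are inputs       */
#define PPI_GROUP_B_MODE1       0x04u  /* bit 2: Port B in Mode 1         */
#define PPI_PORT_B_INPUT        0x02u  /* bit 1: Port B is an input       */
#define PPI_PORT_C_LOWER_INPUT  0x01u  /* bit 0: PC3-PC0 are inputs       */

/* Build a control word; with no fields set this selects Mode 0 with
   all three ports as outputs. */
static uint8_t ppi_control_word(uint8_t fields)
{
    return (uint8_t)(PPI_MODE_SET | fields);
}
```

Assuming the 16 bit IDE data bus sits on Port A and Port B with the control lines on Port C, a Mode 0 write to the drive would use 0x80 (everything output) and a Mode 0 read would use 0x92 (A and B input, C output). Selecting Mode 1 input on Port A gives 0xB0, and that is the case where the Port C handshake pins need to arrive at the 82C55 without passing through an inverter.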

As a connected IDE drive or other extension board may need to interrupt the CPU, I have connected the IDE INTRQ pin to the remaining inverting buffer to provide an input to the CPU on /INT0. As the /INT0 (or actually the INTRQ) input terminates on the IDE header, an IDE drive through INTRQ, or either of the two 82C55 INTR pins, PC3 or PC0, can originate the interrupt.

I have reconfigured the Am9511A-1 to use the /NMI interrupt, where previously it was configured to use /INT0.

The new YAZ180 v2 PCB has been ordered. YAZ180_v2_Schematic.

Happy Pi Day.

Update – RetroChallenge Day 1

I’ve decided to enter the RetroChallenge 2017/04 and my challenge is to read and write to an IDE drive using the newly configured IDE interface on the YAZ180v2. But before I can write the code for the IDE interface, there’s a bit of building and testing that needs to be done.

The new PCBs arrived a few days ago, and they look great. But Arduino Day and the first day of the RetroChallenge 2017/04, 1st April, seemed like a good day to lay them out.


New PCBs. 2oz Copper, 2mm thick. Opulent.

I was hoping to build several boards at once, but somehow I forgot that there was only one RAM and one FT245 device in my component stocks. That means that I had to satisfy myself with just one board for now.

Note the suitably Retro PowerMac (circa 2001) driving the layout guide screen.


Adhoc Workspace

This is the board just before cooking. Respect to anyone who notices the substantial noob layout mistake. Anyway, after a small smokey explosion, everything was rectified.


Two YAZ180 versions, side by side.

This is the finished build of the YAZ180 v2. Looks very tasty. Retro goodness.


Fully populated YAZ180 v2 PCB.

I’m still working on fixing an issue with my code, which I noticed when experimenting with the Am9511A APU, and inserting an Interrupt Jump Table. Basically, I’m getting jumps to odd or random locations, which are detected by filling unused locations with 0x76, the HALT OpCode. The most common address at which the HALT is executed is 0x00C3.

Previously, I’d been filling unused locations with 0xFF, the RST 38H OpCode shared with the INT0 location 0x0038, which was causing the APU to be triggered inappropriately. This issue has me snookered. I can’t move on, in the software sense, until it is resolved.

 


Update – RetroChallenge Day 8

Well this week was one of the most frustrating weeks ever, in terms of time spent vs. results obtained.

There are two major projects in hand. 1. Getting the YAZ180 v2 running, and 2. resolving the software issue plaguing my initialisation code.

Hardware issues

Bringing up a new piece of hardware is never easy. Initially, nothing can be trusted to work, and everything needs to be checked against the design, and then even the design checked for correctness. Bringing up the YAZ180 v1 was very time consuming, because I had to develop the PLD design during the process, as well as checking that all the hardware was working as it should. I thought that bringing up the YAZ180 v2 would be easy. Just solder it together and win. But it has not been so simple.

Essentially, after a week of working on this every evening, I don’t know why it is not working correctly. All the standard things, volts, clock, stuck address and data lines, etc are all working correctly. But it still doesn’t work. And, it may not be just one thing that is wrong, but if anything is not perfect it just won’t work.

After a few days of testing, I found that I’d programmed the PLD devices with an old version of the CUPL code. Nearly right, but not exactly right. Once I’d isolated that issue, by ensuring the new GAL devices worked perfectly in the V1 board, I thought it would be enough. But no. There’s still something wrong.

My current thought is that somehow, through either electro-static damage or heat damage, the RAM has become unreliable. But, I’m not sure enough of this to unsolder the RAM device and replace it. I’ll be spending this weekend on resolving this problem.

Software issues

Because of the effort I’ve been putting into resolving the hardware issues, I’ve not been able to solve the software issue apparent in the YAZ180 initialisation and serial code. I’ve documented the issue on Github.

My lesson learned is NOT to fill unused memory with 0xFF bytes. This causes RST 38H jumps to the INT0 location when the PC is incorrectly loaded, and can be very distracting. Best to fill unused memory with either 0x76 HALT bytes, to see where things became broken, or with 0xC9 RET bytes to just float over the underlying issue.

I’ll need to fix this properly, but it has consumed several weeks of effort, and I’m not much closer to resolution.

Update – RetroChallenge Day 10

The weekend was unkind, but today some new eyes (literally) have brought successes.

Hardware Issues

After doing quite a bit of further testing, I’m fairly sure that I’ve damaged the RAM and will need to replace it. So, I’ve ordered a hot-air solder gun. Should have had one for a long time. Finally, I’ve got a round-‘tuit. I’ll have to order some replacement components too, which will result in being able to make additional boards as well.

Software Issues

Finally, I’ve resolved my issue. What we had here was a classic “failure to understand”. Somewhat embarrassed to leave this here for Internet eternity.

  • Z80 vectors are supported by a JUMP table.
  • Z180 vectors are supported by an ADDRESS table.

Insert JP instructions into an address table and you will have a very very bad day.

Or in my case, quite a few of them.

This issue cost so much time. But at least on the up side, I’ve written robust Z80 and Z180 vector tables, improved my ASCI code, and cleaned up initialisation code, in trying to track this down.

Also finally, I now understand. Which is the entire point, anyway.

Update – RetroChallenge Day 17

Following up on the success of last weekend, I was hoping to have a lot of achievement to write about today. Unfortunately, it has been a grind this week too.

I have been distracted back into the original project that unearthed my previous software problem, and led me along the path to getting a much better understanding of the Z180 CPU, and then solving the issue. The original project was building an interrupt driven driver for the Am9511A-1 Arithmetic Processing Unit.

I’ve spent pretty much the past week on this code, and digging through it with a fine tooth comb. I’m now of a belief that my Am9511A driver code is correct, but my hardware is not correct and may never be correct.

The issue lies with the requirement for the Am9511A to have the Address lines and Chip Select signal remain valid for 30ns following raising of the Write signal. Unfortunately, the Z180 only maintains valid address lines for 5ns following Write. This means that writing to the AM9511A APU is very much a hit and miss affair, with miss being the most likely outcome. I’m still thinking about ways to bodge this to work. But, I think that it may just be too hard to get the old APU to work with a modern CPU. More on this later.

This week I’ll be working on the PaulMon IDE code, and migrating it from 8051 to Z80 nomenclature, and trying to get it to compile.

Update – RetroChallenge Day 21

Well the last couple of days have been exciting, as I found a way to make the Am9511A APU work. A hint from a fellow competitor (on working with the MC6809 CPU) inspired me to look further for information on options to fix the hardware interface.

The Z180 E CLOCK

The Z180 has an almost undocumented feature, called the E Clock. Yes, it is documented in the datasheet that it exists, but there’s no real background that I can find as to why it exists, except that it is for a Secondary Bus Interface. This pin and signal doesn’t exist on the Z80, for example. Anyway, since it has the same name as a signal on the MC6809, I thought it might be worth looking at it. It turns out that the E Clock provides a shortened version of the WR and RD cycles. Which is exactly what we need.

One caveat, however: when running at doubled PHI rates (i.e. a 1:1 PHI to CLK ratio) the shortening of the E Clock signal is not sufficient to drive the APU successfully. At 18.432MHz, the PHI/2 timing is 27ns. Therefore, the minimum of 30ns between release of WR and CS is not always held. This means that we’ll need to keep the PHI at half CLK whilst writing data into the APU. In practice this means that using the APU requires we cut the CPU clock by 50%, with PHI/2 then being 54ns, to ensure the trailing 30ns is provided.

Anyway. Good news. With the revised timing, the Am9511A-1 is working.


Am9511A APU Floating Divide in 115us

The E Clock is not an inverted signal, so to generate the active low APU_WR signal we have to first invert it, then OR it with the WR signal. For the purposes of testing, I’ve got a little breadboard with a GAL on the side, but later I’ll build a new PCB and add in a SN74LVC1G97 little logic device to provide the APU_WR signal.
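The gating described above is a single line of logic. Here is a sketch of it in C, purely as a truth table. The function name is made up, and `true` simply means the pin is at a high level.

```c
#include <stdbool.h>

/* Level on the active low APU_WR line, given the levels on the E Clock
   pin (active high) and the Z180 /WR pin (active low).
   The E Clock is inverted first, then ORed with /WR. */
static bool apu_wr_n(bool e_clock, bool wr_n)
{
    return !e_clock || wr_n;
}
```

APU_WR is only low while E is high and /WR is low at the same time. Since E falls before /WR rises, the APU sees its write strobe end early, while the chip select is still valid.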


Am9511A APU FDIV PUPI command interval 128us


Am9511A APU FDIV in 179 Phi/6 Clock cycles

So now we see the Am9511A APU FDIV floating point divide takes about 101us to 115us when running at 1.536MHz, or from the datasheet 154 to 184 clock cycles. In 101us, the Z180 CPU at 36.864MHz produces 3,723 cycles. To produce a floating point divide using the Lawrence Livermore Library requires about 13,080 cycles, according to the AM9511A Floating Point Processor Manual by Steven Cheng. Therefore, we are still substantially faster than antique software on a modern Z180!

Update – RetroChallenge Wrap Up

Well the month of RC2017/04 didn’t go quite as planned. My original intention was to have the YAZ180v2 working very quickly, then get straight down to porting Paul’s IDE code from 8051 to Z180 to get the new IDE interface working. But, there were several speed bumps along the way.

Gaining an education

Since I just started on this whole Z80 processor and assembly language programming thing a few months ago, I don’t have a long history of coding to fall back on. I had written some code for the Z80 in the RC2014 hardware, which I then tried to use on the Z180 in the YAZ180. But, there is a subtle difference in “generation” between the way the interrupt vectors work across the two machines. Obvious, once you know about it but a real “time killer” if you don’t.

Firstly, filling unused space in your assembly program with 0xFF is a very dangerous thing to do in Z80 assembler, particularly if you don’t understand that 0xFF is the op code for RST38, which is a single byte jump to the same location as the Interrupt 0 in IM1 mode. It would make more sense to fill the unused space with 0x76, which is the HALT instruction, to trap an unexpected program counter value.

Look before you leap

Secondly, the interrupt vectors on the Z80 were designed to contain code, and the PC is just loaded with the address of the vector, and execution begins from there. So for an INT0 (or RST38) execution begins from 0x0038. But, the interrupt vectors on the Z180 are designed to hold an address. The difference being that an interrupt will load the PC with the contents of the two bytes at the vector, and then begin execution from there. I think this difference is a sign of the generational difference between the two implementations. One of the clearest differences I can find, anyway.
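The difference can be sketched in a few lines of C that model memory as a byte array. This is only an illustration: the function names, the table entry at 0x2040 and the handler at 0x2100 used in the note below are made up for the example, and are not taken from the YAZ180 code.

```c
#include <stdint.h>

#define OP_JP 0xC3u  /* Z80 opcode for JP nn */

/* Z80 style (RST 38H, or INT0 in IM 1): the PC is loaded with the fixed
   vector location itself, and whatever code is stored there is executed. */
static uint16_t pc_after_z80_style(uint16_t vector_location)
{
    return vector_location;
}

/* Z180 vector table style: the PC is loaded with the little-endian
   16 bit word stored in the table entry.
   mem must cover the full 64kB address space. */
static uint16_t pc_after_z180_style(const uint8_t *mem, uint16_t entry)
{
    return (uint16_t)(mem[entry] | (mem[entry + 1] << 8));
}

/* Store a three byte "JP target" instruction at the given address. */
static void put_jp(uint8_t *mem, uint16_t at, uint16_t target)
{
    mem[at]     = OP_JP;
    mem[at + 1] = (uint8_t)(target & 0xFFu);
    mem[at + 2] = (uint8_t)(target >> 8);
}

/* Store a bare two byte address at the given address. */
static void put_word(uint8_t *mem, uint16_t at, uint16_t value)
{
    mem[at]     = (uint8_t)(value & 0xFFu);
    mem[at + 1] = (uint8_t)(value >> 8);
}
```

With `put_jp(mem, 0x2040, 0x2100)` the table entry holds the bytes C3 00 21, so the Z180 style fetch reads the word 0x00C3 rather than 0x2100. Any handler address with a zero low byte would be misread the same way, which is at least consistent with the HALT being hit at 0x00C3. Using `put_word` instead gives the intended 0x2100.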

Timing is everything

One of the goals for the YAZ180 is to bring some old chips back to life, in a modern platform. Along with the TIL311, GAL16v8, and 82C55 devices, the Am9511A holds pride of place as the very first arithmetic processing unit ever made. I’ve invested far too much time in getting the Am9511A to work, but it is important to me that my project can make it work.

I believed that I had devices that were specified to run at 3MHz but which in fact didn’t. That may be incorrect. More likely was that I wasn’t driving them properly, because my timing was out. I will need to go back and test them all again.

Here the issue is that the Am9511A requires extended validity of data and chip select signal, following the validity of the write signal. At least 30ns is required. This is not provided by the Z180 in its normal timing, although in the configuration I have it, coincidentally because I’d buffered the data bus, it is nearly right. Only the chip select line was being incorrectly handled.

I was nearly giving up but then a tweet from a fellow RC2017/04 competitor gave me the inspiration to look further. It turns out that the Z180 has a secondary I/O timing signal called the E Clock. This signal is not present on the Z80, and as I didn’t understand its purpose I’d left it unconnected in the YAZ180.

Whilst the Zilog datasheets on the Z180 completely gloss over the purpose of the E Clock signal, by simply not mentioning it, the original Hitachi 64180 datasheets do mention it. The original purpose of the E Clock signal was to provide timing for “a large selection of 6800 type peripheral devices” including the “Hitachi 6300 CMOS series (6221 PIA, 6350 ACIA, etc) as well as the 6500 family devices”.

In summary, the E Clock provides a signal that is half a T cycle shorter than the write signal. It means that gating the write signal with the E Clock would allow me to release the APU write signal sufficiently early to maintain the extended chip select timing required. Basically, the APU won’t operate with a Z180 T Cycle any less than 60ns, or 16.67MHz. So in my implementation, the PHI clock will need to run at half speed or 9.216MHz, whilst the Z180 is using the Am9511A. Unless I cook up another plan.

Zapped

And the final note from this month is that I believe my very poor ESD protocol has led to the destruction of the SRAM on my YAZ180v2. Therefore I had to desolder it (and it looked so nicely done) to remove it, and order some new components.

Ordering new components is always a bit of a hurdle for me. I’ve collected quite a few things that I don’t use, so I tend to ration myself on purchases vs. progress. Finally, at the end of the month I ordered more components to build further YAZ180 boards, and some spare SRAM to enable me to repair the one I have made already.

It continues to amaze me just how much difference there is in the cost to build an Arduino AVR board (basically just a chip at the most essential level), vs something like the YAZ180. The YAZ180v2 bill of materials, excluding specials like the GAL16V8D, TIL311, and Am9511A devices that I have to find on eBay, comes to over $150 Australian!!! We need to export more coal, to get the Australian dollar back up there!

And that’s it for my RC2017/04 month. As soon as the parts arrive I’ll be completing the YAZ180v2, and then testing the IDE interface. I hope that will be done before the end of 2017/05.

Update – Post RetroChallenge

Well good news. The only issue was a bad solder joint on the new SRAM chip. Now the YAZ180v2 is running, and I can get onto translating the IDE code from 8051 to Z80.


YAZ180 with IDE drive attached.

I’ve sourced code from both PJRC in 8051 mnemonics for an 8255 PIO and from Retroleum in Z80 mnemonics for an 8 bit interface. Between the two of them, together with the examples from the OSDev Wiki, it should be easy to make a fairly robust implementation. And, on May 18th, the driver code was finally working.

Next activity is to integrate this into the z88dk, and then using the FAT-FS code from Elm ChaN, get the disks properly working.

Update – August 2017

Over the past few months progress has been made on various fronts with the YAZ180v2. Firstly, the IDE interface is fully working, and has been integrated into z88dk. Also, the issues with the Am9511A-1 APU have been resolved, and a working driver has been integrated into z88dk. While I still have to revise the C interface for these two pieces of code, because I’m still learning this, the development work is now done.

I am particularly happy about getting the Am9511A APU working, as this was causing me the most technical difficulty, and stretched my understanding the most. I’m also happy that the capability in the Am9511A seems to be realised through performance improvements in arithmetic computation.

Over the coming months, I hope to resolve the remaining untested components in the YAZ180. These include the parallel programming interface, to allow the YAZ180 to be “cold loaded”, to protect against bricking the system, if a user doesn’t have an EEPROM or Flash writer. I’m still in two minds about this feature, as the cost of the FT245 device and USB socket is about the same as a stand alone EEPROM writer, and the parallel programming interface consumes space that could be otherwise used for an SPI or USB interface, for example.

I also need to test the I2C interfaces, and debug the driver that I wrote back in May (still unused) to complete the feature set.

With this done, I’ve now done a new minor revision of the hardware, to clean up the issues that have been noted over the past months.

Open Issues

  • APU – Gate E clock with WR to produce shortened APU WR
  • NMI – remove this from the APU, and terminate high. Not CP/M compatible.
  • INT0 – reconnect it to the APU.
  • 5V – Power inductor spacing.

Update – October 2017

Well, I fixed the issues and then convinced myself that there needed to be Yet Another feature added to the YAZ180, before I signed it off. So now, the v2.1 board has access to the Internet through an ESP-01S AT interface.

 


Building up this new board will be November and December’s activity, together with building a complete YABIOS (Yet Another BIOS) to make use of all of the features packed into the YAZ180 board. I’ll be picking the best bits, IMHO, from the Cambridge Z88, ZX Spectrum, and CP/M 2.2 – ZSystem to build a banking capable YABIOS, supporting 1MByte of address space.

Update – August 2018

I seriously need to write another blog on the YAZ180. But I guess the Github commits just speak for themselves. There are only three things left on the to-do list. 1. finish the firmware loading program. 2. rewrite the I2C interface. And 3. implement FreeRTOS as a hypervisor allowing multiple 60kB applications to run simultaneously.

Here’s a picture of an application, which was one of the early drivers for building the YAZ180, a mass-storage platform for my HP48 calculator.


YAZ180 communicating with HP48 using Kermit on ASCI1 (TTY).

Characterising Am9511A-1 APU

I’ve built a Z180 based board, supporting the AM9511A APU. Partly for historical enjoyment. Partly because the AM9511A-1 is actually still faster at arithmetic than “modern” Z80 devices.

For example, from the Am9511/Am9512 Floating Point Processor Manual by Steven Cheng, we have comparison tables. On average the Am9511A APU (at 1.966MHz) produces a hardware floating point divide in 165.9 cycles (of a 2MHz 8080 processor). Converted to my Am9511A implementation (at 2.304MHz), we have the equivalent in 141.5 cycles of the 2MHz 8080. Converted to best case modern Z180 terms (overclocked to 36.864MHz) this is 2,609 CPU cycles to return a hardware floating point divide.

To produce an equivalent software floating point divide, using the equivalent vintage LLL floating point library, requires 13,080 cycles.

This means that floating point on the 40 year old AM9511A-1 APU is still 5.0 times faster than an overclocked Z180 running antique 8080 code. Sweet!
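The conversions behind those figures can be written down as two small helpers. This is just my working, with made-up function names, and it follows the comparison above in counting the 13,080 cycle software divide as one 8080 cycle per Z180 cycle.

```c
/* Rescale a cycle count (measured against a fixed reference clock) when
   the APU clock changes: a faster APU needs fewer reference cycles. */
static double rescale_for_apu_clock(double cycles, double from_apu_mhz, double to_apu_mhz)
{
    return cycles * from_apu_mhz / to_apu_mhz;
}

/* Convert a cycle count at one clock rate into the number of cycles of
   another clock that elapse in the same time. */
static double convert_cycles(double cycles, double clock_mhz, double other_clock_mhz)
{
    return cycles / clock_mhz * other_clock_mhz;
}
```

Rescaling 165.9 cycles from a 1.966MHz APU to a 2.304MHz APU gives about 141.6 cycles of the 2MHz 8080. Converting that from 2MHz to 36.864MHz gives about 2,609 Z180 cycles, and 13,080 divided by 2,609 is about 5.0.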

Testing

I’m integrating an Am9511A-1 APU device into my YAZ180 build. The basic device is capable of operating at 3MHz. But, I’ve found that driving one sample at 3.072MHz doesn’t work. But, it works fine at 1.536MHz, and at 2.304MHz.

This one example has 83.33ns delay between RD or WR and PAUSE signal being operated. This means that it should comfortably operate with the minimum of one wait state when the Z8S180 is running with a 18.432MHz bus.

And, I’ve got to say that these devices run hot… OHS issue hot. There is a reason they are provided in a ceramic package. They sink 70mA at 12V plus another 70mA at 5V, and all that energy has to go somewhere.

Originally, the timing for the Am9511A-1 was generated by dividing the Z8S180 18.432MHz system clock by 6. The divide of FCPU by 6 for an FAPU CLK of 3.072MHz is done with a SN74LS92N device. And for test purposes, the Z8S180 is also half rate clocked at 9.216MHz, producing 1.536MHz for the APU.


YAZ180 Test Rig – Am9511A-1

Initial Testing

The test is initially pretty simple, really just a proof of life. Will they properly push and pop data at 3.072MHz?

If not, then I’ll need to redesign the YAZ180 to operate the Am9511A-1 at a lower APU clock. But, if my initial samples are just not up to specification, then they can be secondary devices.

Ok, now what kind of devices have I got to hand, and let’s see what the results are…

Front Serial ID   Country       Rear ID   3.072MHz   1.536MHz
239KH2Z           Malaysia      9237FP    Fail       Pass
301MBZP           Malaysia      9252CP    Fail       Pass
3368YF8           Malaysia      9335CP    Fail       Pass
348W76S           Malaysia      9347DP    Fail       Pass
921BDIV           Philippines   8917WM    Fail       Pass

Kind of boring really. Pretty clear that AMD oversold the capability of its Am9511A-1 to run at 3MHz. Or, I’m not feeding them with the correct timing.

I’ll need to redesign the YAZ180v2 to provide a divided by 8 clock at 2.304MHz, and cross my fingers that they work at that clock rate.

Update September 2017

Following a substantial investment in testing of the solution in April and August, I’ve finally got a working system that can reliably use the Am9511A-1 to produce computational results.

The main issue was described in the Retro Challenge period. The Z180 CPU I/O timing is slightly different to the Z80, in that it doesn’t provide any interval between the WR/RD deselection and the CS deselection. For most peripheral devices this isn’t a problem, but for some it causes them not to work. Hitachi recognised this problem, and provided a shortened I/O signal called the “E Clock”, which can be gated together with the WR/RD signals to gain the timing desired.

The Am9511A doesn’t require the RD signal to rise before the CS signal. Its TRCS time is 0ns. However it does require the WR signal to rise 60ns (or for Am9511A-1 30ns) before the CS signal. This can be achieved by gating the WR signal together with the E Clock signal. The Z180 E Clock is 1/2 a Phi cycle shorter than the WR signal during I/O instructions. Therefore, provided that the Z180 Phi is slow enough, the required timing can be held. At 18.432MHz, the Z180 Phi cycle is 54ns long, and therefore 1/2 Phi is 27ns, which is too short to work consistently. However, the Z180 can halve its Phi speed easily: simply by changing the CMR register the Phi clock can be reduced to 9.216MHz, which in turn gives us (theoretically) 54ns TWCS for the Am9511A-1.

In practice gating the WR signal through the E Clock provides 41ns for TWCS, which is enough to work reliably with Am9511A-1 devices.
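The nominal numbers can be checked with a few lines of C. The function names are made up for this sketch, and it only models the theoretical half Phi margin. As the measured 41ns shows, the real margin is somewhat less.

```c
/* Length of one Phi cycle in nanoseconds, for a Phi clock given in MHz. */
static double phi_period_ns(double phi_mhz)
{
    return 1000.0 / phi_mhz;
}

/* Nominal TWCS gained by gating WR with the E Clock: half a Phi cycle. */
static double nominal_twcs_ns(double phi_mhz)
{
    return phi_period_ns(phi_mhz) / 2.0;
}

/* Non-zero if the nominal TWCS meets the requirement
   (30ns for the Am9511A-1, 60ns for the Am9511A). */
static int meets_twcs(double phi_mhz, double required_ns)
{
    return nominal_twcs_ns(phi_mhz) >= required_ns;
}
```

At 18.432MHz the nominal TWCS is about 27ns, which misses the 30ns needed by the Am9511A-1. At 9.216MHz it is about 54ns, which meets 30ns but would still miss the 60ns needed by the slower Am9511A.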

However, as the Am9511A-1 timing is derived from the Z180 clock (divided by 8), running Phi at 9.216MHz means that the APU is in turn under clocked to 1.152MHz. We can avoid this waste of capability, by managing the Z180 clock precisely during the command interrupt generated by the Am9511A-1. On entry to the interrupt, we reduce the Phi to allow commands (TWCS) to work, and on exit from the interrupt we return the Phi to its original setting of 18.432MHz (and the APU to 2.304MHz). This also conveniently avoids having to modify the Z180 ASCI code to manage serial communications in the light of the reduced system clock.

I have also tested the Am9511A-1 using this revised WR to CS timing, with my original clocking of divide by 6 from Z180 system clock, or 3.072MHz, and it doesn’t work. This means that we actually do need to stay within the specification, and not overclock the APU.

z88dk Driver

The z88dk supports the YAZ180 platform, so I have added a driver for the Am9511A into the device folders. The driver is based on an interrupt driven model. As noted in the Am9511A Processor Manual, there are 4 different models that can be adopted.

The Am9511A Pause signal is connected to the Z180 Wait signal, so if the CPU generates a request that requires time to fulfil, then the APU will cause the CPU to Wait by signalling Pause. Clearly this means that the CPU cannot do anything while it is in Wait mode. This model of working is called Demand Wait, and it provides the fastest APU response, but doesn’t support using the CPU effectively.

The APU status register can be read without the APU causing a Pause response. Therefore the CPU can poll the APU by reading the status register. In this situation the CPU can continue profitably, but response to APU requirements (for new operands, or commands) is limited to the polling rate. As different APU commands can be completed in different times, the optimal polling time can be difficult to calculate.

The APU End signal can be connected to a CPU interrupt, and in the case of the YAZ180 it is connected to the NMI, which is then triggered at the end of each command, when the APU is ready to receive a new command. If the APU commands are buffered in advance, then using this interrupt mechanism the APU can complete a long sentence of commands “autonomously” and in parallel to the CPU activities, interrupting only when it needs to load a new command. This is the mechanism used in the z88dk driver.
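The idea of buffering commands in advance and letting the interrupt pull them out one at a time is the same ring buffer pattern as before. Here is a minimal sketch of it in portable C. It is not the z88dk driver: the type and function names are made up, and on real hardware the producer would need to disable interrupts around its update of the count.

```c
#include <stdint.h>

#define CMD_QUEUE_SIZE 16u  /* power of two, so indexes wrap with a mask */

typedef struct {
    uint8_t buf[CMD_QUEUE_SIZE];
    uint8_t head;   /* next slot the producer (main program) writes */
    uint8_t tail;   /* next slot the consumer (interrupt) reads     */
    uint8_t count;  /* number of commands waiting                   */
} cmd_queue_t;

/* Producer: queue one command byte. Returns 0 if the queue is full. */
static int cmd_queue_put(cmd_queue_t *q, uint8_t cmd)
{
    if (q->count == CMD_QUEUE_SIZE)
        return 0;
    q->buf[q->head] = cmd;
    q->head = (uint8_t)((q->head + 1u) & (CMD_QUEUE_SIZE - 1u));
    ++q->count;
    return 1;
}

/* Consumer: what the "command finished" interrupt would do to fetch the
   next command. Returns 0 when nothing is waiting (the APU goes idle). */
static int cmd_queue_get(cmd_queue_t *q, uint8_t *cmd)
{
    if (q->count == 0)
        return 0;
    *cmd = q->buf[q->tail];
    q->tail = (uint8_t)((q->tail + 1u) & (CMD_QUEUE_SIZE - 1u));
    --q->count;
    return 1;
}
```

The main program fills the queue with a whole sentence of commands and then carries on with other work. Each interrupt takes exactly one command, so the CPU is only involved for the short time it takes to hand it over.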

Some ideas for creating an optimal interrupt driven driver for the Am9511A were inspired by reading “An Efficient Software Driver for Am9511 Arithmetic Processor Implementation”, B. Furht and P. Lee, 1984.

The final driver option is to connect the APU Service Request signals to the CPU DMA interface. In the YAZ180 these hardware signals are connected, and if desired a DMA enabled driver could be written to take advantage of this interface. However as only 4 bytes of information can be transferred in each load or unload command, at face value there seems to be little advantage in building a DMA software interface.

The C interface for the Am9511A driver supports direct access to the Am9511A, like the asm driver, and hopefully it can be integrated into the z88dk math library options, to support transparent usage of the APU where it exists.

Some simple C interface code is now available in z88dk, which allows us to do performance testing.

Performance Testing

A simple test of seeking prime numbers exercises the floating point divide and 32 bit fixed point divide. This “brute force” method is not optimised for the APU as 4x 32 bit numbers must be loaded into the APU, but only two divides and a subtraction are done, for each calculation cycle. During the operand loading process, the CPU must also be slowed down to half rate, but since this is during a non-maskable interrupt there is little effect on the overall system.

So, some comparisons. First using Nascom Basic, we have a baseline, for seeking 1000 prime numbers.

20 PRINT "LIMIT";
30 INPUT L
40 FOR N = 3 TO L
50    FOR D = 2 TO (N-1)
60      IF N/D=INT(N/D) THEN GOTO 100
70    NEXT D
80    PRINT N;
90    GOTO 110
100   PRINT ".";
110 NEXT N
120 END

200 REM 124.7 Seconds (hand timed) - Z180 36.864MHz - Nascom Basic

Adding the Am9511A to the calculation, by doing the inner loop in assembly and calling the APU for the divide and subtract duties, does speed things up.

20  PRINT "LIMIT";
30  INPUT L
40  FOR N = 3 TO L
50    IF USR(N) = 0 THEN GOTO 100
80    PRINT N;
90    GOTO 110
100   PRINT ".";
110 NEXT N
120 END

200 REM 90.4 Seconds (hand timed) - Z180 36.864MHz - Nascom Basic

This is interesting, as the Am9511A is running at 2.304MHz (straight out of 1977), yet it is STILL faster than software on a 36.864MHz Z180.

Now, what happens when using the z88dk, to do the calculation in C?

void main(void)
{
  static signed long l, n, d;
  printf("Limit: ");
  scanf("%ld", &l);
  for (n = 3; n != l; ++n)
  {
    for (d = 2; d != (n-1); ++d)
      if ((float)n/(float)d == n/d) break;
    if (d == (n-1))
      printf("%ld", n);
    else
      printf(".");
  }
}

// 67.8 Seconds (hand timed) - Z180 36.864MHz - z88dk sdcc_iy
// 62.7 Seconds (hand timed) - Z180 36.864MHz - z88dk newlib

// zcc +yaz180 -subtype=basic_dcio -vn -SO3 -lm -clib=sdcc_iy --max-allocs-per-node200000 primesC.c -o primesC -create-app
// zcc +yaz180 -subtype=basic_dcio -vn -SO3 -lm -clib=new primesC.c -o primesC -create-app

Using the assembly math routines in z88dk is now the fastest solution. But we’d expect C to be faster than Basic. What happens if we do the divides and subtractions in the Am9511A? That’s why we’re here.

void main(void)
{
  static signed long l, n, d, r;

  apu_reset( (void *) 0x2021 ); //INITIALISE THE APU with the NMI VECTOR ADDRESS

  printf("Limit: ");
  scanf("%ld", &l);

  for (n = 3; n != l; ++n)
  {
    for (d = 2; d != (n-1); ++d)
    {
      apu_cmd_ld( &n,   APU_OP_ENT32);
      apu_cmd_ld( NULL, APU_OP_FLTD);
      apu_cmd_ld( &d,   APU_OP_ENT32);
      apu_cmd_ld( NULL, APU_OP_FLTD);
      apu_cmd_ld( NULL, APU_OP_FDIV);
      apu_cmd_ld( &n,   APU_OP_ENT32);
      apu_cmd_ld( &d,   APU_OP_ENT32);
      apu_cmd_ld( NULL, APU_OP_DDIV);
      apu_cmd_ld( NULL, APU_OP_FLTD);
      apu_cmd_ld( NULL, APU_OP_FSUB);
      apu_cmd_ld( &r,   APU_OP_REM32);

      apu_isr();        // calls the ISR to trigger the process
      apu_chk_idle();   // blocks until the APU is idle

      if (r == 0) break;
    }
    if (d == (n-1))
      printf("%ld", n);
    else
      printf(".");
  }
}

// 94.9 Seconds (hand timed) - Z180 36.864MHz - z88dk

// zcc +yaz180 -subtype=basic_dcio -vn -m -SO3 -clib=sdcc_iy --max-allocs-per-node200000 @primesC_APU.lst -o primesC_APU
// appmake +glue -b primesC_APU --ihex --clean

Well, the end result is (surprisingly) similar to using the Am9511A together with Basic. The issue is that it takes a long time to load and unload the operands to the APU, and to process (cast) them, and then we only do a relatively simple operation when they’re there. For this test we’re bound by the I/O rate of the APU, which is quite slow, so it does not demonstrate the actual compute rate of the APU.

APU_OP_ENT32
APU_OP_FLTD -  56-342 Cycles
APU_OP_ENT32
APU_OP_FLTD -  56-342 Cycles
APU_OP_FDIV - 154-184 Cycles
APU_OP_ENT32
APU_OP_ENT32
APU_OP_DDIV - 196-210 Cycles
APU_OP_FLTD -  56-342 Cycles
APU_OP_FSUB -  70-370 Cycles
APU_OP_REM32
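
As a rough check on the figures listed above, the cycle ranges can be summed to see how long the APU itself is busy on each pass of the inner loop. This is only a sketch: the function names are made up for the illustration, and the ENT32 and REM32 steps are left out because no cycle count is given for them.

```c
#include <stdio.h>

/* Cycle ranges copied from the list above. ENT32 and REM32 are the
   operand load and unload steps, which have no cycle count listed,
   so they are not included in the totals. */
struct apu_op { const char *name; unsigned min_cycles; unsigned max_cycles; };

static const struct apu_op apu_ops[] = {
    { "FLTD",  56, 342 },
    { "FLTD",  56, 342 },
    { "FDIV", 154, 184 },
    { "DDIV", 196, 210 },
    { "FLTD",  56, 342 },
    { "FSUB",  70, 370 },
};

#define APU_CLOCK_HZ 2304000.0   /* 2.304MHz APU clock, from the text */

/* Sum the best case and worst case APU cycles for one inner loop pass. */
void apu_cycle_totals(unsigned *min_total, unsigned *max_total)
{
    unsigned i;

    *min_total = 0;
    *max_total = 0;
    for (i = 0; i < sizeof apu_ops / sizeof apu_ops[0]; ++i) {
        *min_total += apu_ops[i].min_cycles;
        *max_total += apu_ops[i].max_cycles;
    }
}

/* Convert a number of APU cycles to microseconds at the APU clock rate. */
double apu_cycles_to_us(unsigned cycles)
{
    return (double)cycles * 1e6 / APU_CLOCK_HZ;
}

/* Print both totals, in cycles and in microseconds. */
void apu_print_totals(void)
{
    unsigned lo, hi;

    apu_cycle_totals(&lo, &hi);
    printf("min %u cycles (%.1f us), max %u cycles (%.1f us)\n",
           lo, apu_cycles_to_us(lo), hi, apu_cycles_to_us(hi));
}
```

On these figures the listed commands need between 588 and 1790 APU cycles per pass, which is roughly 255 to 777 microseconds at 2.304MHz, before any operand transfer time is counted.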

However, this is still a good result. We’re comparing a modern overclocked Z180 at 36.864MHz with an ancient device, released in 1977, running at 2.304MHz. And we have achieved comparable results.

Update July 2018

Over the past year I’ve done quite a lot with the Am9511A. One interesting application was using it to calculate a Mandelbrot set, and generate a text image.

From the point of view of drivers, I re-wrote the interrupt based driver to use a 3 byte operand address in the ring buffer, to support access to the full memory space of the Z180.

I also uncovered a hardware bug, which was caused by resetting the APU clock divider with the system reset. This meant that the APU was not getting 5 clock cycles with Reset held high, and would therefore not reset reliably. Removing the reset signal from the divider chip solved this issue.

Z80 C code development with Eclipse and z88dk

I’m building a Z180 based development board called the YAZ180 for the 40th anniversary of the Z80 processor. As part of that process, I need to have a development environment that supports the Z80 and the Z180 processors. As I haven’t finished building the YAZ180 yet, I’ll be testing the development environment on the RC2014 platform in the interim.

[Image: RC2014 Serial I/O & CPU]

There are a couple of major differences in the workflow required to program the YAZ180 from the RC2014. The RC2014 requires an EEPROM programmer to burn the resulting HEX file into its ROM. Eventually, the YAZ180 will use a Perl program to manipulate a parallel port to program Flash memory. However, for the purposes of setting up a development environment they are essentially the same.

[Image: YAZ180 Prototype]

To set up the required environment, we’ll need to have:

  • A C compiler suite capable of generating HEX or BIN files for burning onto the hardware.
  • Applicable .CRT files to initialise the CPU and RAM, either Z80 or Z180 specific, so that the C environment can be properly launched.
  • Suitable library files for USART, and other interfaces, appropriate for the hardware in use.
  • Configuration to allow the correct tools and libraries to be found from within the Eclipse IDE.

A C Compiler Suite

There are only a few options for C compilers for the Z80 processor. There is the Zilog development environment, and the SASM Softools. On the open source side there are two options worth mentioning, being the Small Device C Compiler (SDCC) and the Z88DK Small C Compiler.

There are a few reviews on the Internet of the various options, but in summary the best outcome seems to be to use the Z88DK together with the SDCC Compiler, and the “new library”.

The Z88DK team contributed this information to the RC2014 forum, which gives an overview of the options.

There are two C compilers. One C compiler is sccz80 which is derived from small C but z88dk’s version has seen continuous development over the past 30 years so it’s had most of the limitations of small C removed. For example, floating point is supported, ANSI C declarations are supported, 8/16/32-bit integers are supported and so on. It is a little short of C89 compliance with a few notable non-compliances being multi-dimensional arrays and function pointer prototyping.

The other C compiler is a patch of sdcc, another open source compiler that attempts to implement subsets of C89, C99 and C11. sdcc is an optimizing compiler and z88dk’s patch improves on sdcc’s output by supplying some Z80 bugfixes not yet incorporated into sdcc itself and by supplying a very large set of peephole rules to further improve output.

You can choose which C compiler you use by selecting the appropriate switch on the command line. In your makefile you are using sccz80. To use sdcc, “-clib=sdcc_ix” or “-clib=sdcc_iy” would appear in the compile line.

And then there are two C libraries.

The classic C library is the C library that has always shipped with z88dk. It has many crts available for it that allows compiling for a lot of target machines out of the box. The level of library support varies by target with the best supported having sprite libraries, sound, graphics, etc supplementing the standard c library. It is mostly written in machine code and has a small stdio implementation. However, at this time it cannot be used to generate ROMable code as it mixes variables with code in the output binary. It’s also not compatible with sdcc at this time. Both of these issues are being addressed now.

The new C library is a rewrite from scratch with the intention of meeting a subset of C11 compliance. It is 100% machine code, is written to be compatible with any C compiler, and can generate ROMable code with separation of ROM and RAM data. The stdio model is object oriented and allows device drivers to be written using code inheritance from the library. Although it’s not finished (it’s missing disk io and non-blocking io), it is in an advanced state.

The choice of C library is made on the compile line. “-clib=new”, “-clib=sdcc_ix” and “-clib=sdcc_iy” all use the new C library. Anything else uses the classic C library. In order to generate ROMable code, you should really be using the new C library.

The sdcc_ix and sdcc_iy libraries are chosen when sdcc is the compiler and are selected between by either ”-clib=sdcc_ix” or ”-clib=sdcc_iy” on the compile line. The difference between the two is which index register the C library uses. “sdcc_ix” corresponds to the library using ix and “sdcc_iy” corresponds to the library using iy.

It’s always preferable to use the “sdcc_iy” version of the library because this gives sdcc sole use of ix for its frame pointer while the library uses iy. If “sdcc_ix” is selected, sdcc and the library must share ix which means the library must insert extra code to preserve the ix register when it is used. This means the “sdcc_iy” compile will be smaller.

z88dk’s C library is different from other compilers in that it is written in assembly language, so it is more compact and faster than other z80 C compilers.

Installation instructions for z88dk here and I’d recommend using a nightly build rather than the last release. z88dk is an active project and it changes quite quickly. If you run on windows or mac there are binary packages available from the nightly build. For linux or other targets there are instructions for building from source and for patching sdcc to create zsdcc, z88dk’s version of sdcc.

Just to add for the ROM target: the new C lib allows the stored data section to be lz77 compressed so this should save a few bytes in the stored binary in ROM. Another thing you could do is compile a program for RAM and store a compressed copy in ROM that gets decompressed into RAM at startup.

Z88DK & SDCC Installation

I’m installing Z88DK and SDCC onto Ubuntu 16.04 AMD64 and, since the machine has recently been refreshed, many packages that were required for the install were missing.

Install the packages needed for the build:

sudo apt-get install expect texinfo libxml2-dev flex bison gputils libboost-dev

Clone the latest nightly checked Z88DK Github package:

git clone https://github.com/z88dk/z88dk.git

This will create a populated z88dk directory in the current working directory.

To succeed in building the ‘z80svg’ graphics tool you need the ‘libxml2’ library to be previously installed, although its absence will not prevent the rest of the kit from building.

Then, just type:

cd z88dk
git submodule update --init --recursive
chmod 777 build.sh   # just in case
./build.sh

You can run z88dk from its current location. All you need to do is set the following environment variables.

Supposing you have bash (most likely it is your system default shell) and you want to keep z88dk in your home directory, you can configure it permanently in this way:

vi ~/.profile

Modify the configuration by adding these lines (with the appropriate paths).

export PATH=${PATH}:${HOME}/z88dk/bin
export ZCCCFG=${HOME}/z88dk/lib/config

A system install is not supported in this release of Z88DK.

Next, these are the instructions to install the SDCC build patched specifically for the Z80 and Z180.

Check out the current development version of sdcc. If you already have the sdcc-code tree available from a previous checkout you can instead perform an update.

svn checkout svn://svn.code.sf.net/p/sdcc/code/trunk@9958 sdcc-code
# or if you're doing this to refresh your sdcc installation...
cd sdcc-code
svn update

You will have to apply the svn patch found in sdcc_z88dk_patch.zip and build sdcc from source. Copy “sdcc-z88dk.patch” from inside sdcc_z88dk_patch.zip into the sdcc-code directory.

The supplied configuration options disable all ports other than the Z80 family ports, and turn off compilation of many libraries. This avoids errors that would otherwise stop the build from completing, and results in a smaller binary.

cd sdcc-code/sdcc
patch -p0 < ../sdcc-z88dk.patch
./configure --disable-mcs51-port --disable-gbz80-port --disable-avr-port --disable-ds390-port --disable-ds400-port --disable-hc08-port --disable-pic-port --disable-pic14-port --disable-pic16-port --disable-stm8-port --disable-tlcs90-port --disable-s08-port --disable-ucsim --disable-device-lib --disable-packihx
make

Copy the patched sdcc executable to {z88dk}/bin and rename it “zsdcc”.
Copy the sdcc preprocessor to {z88dk}/bin and rename it “zsdcpp”.

cd bin
cp sdcc {z88dk}/bin/zsdcc
cp sdcpp {z88dk}/bin/zsdcpp

Undo the patch.

cd ..
patch -Rp0 < ../sdcc-z88dk.patch

You can stop here and verify the install was successful below. Keeping the sdcc source tree in an unpatched state can allow you to update the zsdcc binary by repeating the steps above as sdcc itself is updated. Both z88dk and sdcc are active projects that see frequent updates.

To verify that sdcc is usable from z88dk, try compiling sudoku.c for the rc2014 target using sdcc:

zcc +rc2014 -subtype=rom -v -m -SO3 --max-allocs-per-node200000 --c-code-in-asm --list sudoku.c  -o sudoku -create-app

Using the C compiler

Assuming we have a source file called test.c:

#include 

main()
{
     return(0);
}

We can compile it and produce binary CODE and DATA sections. The CODE and DATA sections need to be concatenated, and then assembled into an Intel HEX file by appmake.

zcc +rc2014 -subtype=rom -v -m -SO3 --max-allocs-per-node200000 --c-code-in-asm --list test.c -o test -create-app

The binary code can be checked by installing and then using the z80dasm disassembler.

sudo apt install z80dasm
z80dasm --address --labels --origin=0x0 test.bin

Loading the Code

Eventually the YAZ180 will have a hardware USB interface, and Perl based loading mechanism to load both RAM and Flash storage. But, since I broke the only extant hardware interface, getting this function working will have to wait.

In the interim, I have to load assembled machine code into the YAZ180 via a back door, being via the YAZ180 Nascom Basic which I also have running. The back door is opened because the Basic interpreter has the capability to 1. insert or POKE arbitrary bytes into RAM located at any address, and 2. via a Basic instruction USR(x) jump into any location in RAM and begin executing code.

Because of these POKE, PEEK, and USR(x) instructions we can load our own program in two different ways. Firstly, we can encode our program as a series of POKE instructions, and then let the Basic interpreter load the program code byte by byte. Whilst this is a practical way of loading smaller programs, it is quite inefficient and it is also somewhat difficult to confirm that the program is loaded into RAM correctly. Also, this method cannot handle writing to Flash, as the POKE command is only designed for RAM.

The second method is to take a two step approach. Use the previous method of generating POKE instructions to insert a small Intel HEX format capable program, or HexLoadr, into the RAM, and then use the USR(x) instruction to launch the HexLoadr, which reads the serial port and inserts the HEX formatted bytes it receives into RAM or Flash. The first advantage of this method is efficiency, because the density of program bytes is substantially higher in Intel HEX than it is in POKE instructions. Also, because we can craft the HexLoadr with any functions we choose, we can enable it to configure the Z180 MMU using the Intel HEX Extended Segment Address, and so program the entire physical address space of the YAZ180. We can also add the capability to write Flash memory, making the written changes permanent.

HexLoadr

The goal of the HexLoadr program is to load your arbitrary program in Intel HEX format into an arbitrary location in the Z80 address space, and allow you to start the program from Nascom Basic.

There are several stages to this process.

  • The HexLoadr.asm loader program must be compiled into a binary format, HEXLOADR.BIN.
  • HEXLOADR.BIN must then be converted to a series of POKE statements using the bin2bas.py python program.
  • These POKE statements are then loaded through the serial interface into Nascom Basic to get the HexLoadr program placed correctly into the RAM of the RC2014 or YAZ180 machine.
  • The starting address of the HexLoadr program must be inserted into the correct location for the USR(x) jump out of Nascom Basic.
  • Then the HexLoadr program will initiate and look for your program’s Intel HEX formatted information on the serial interface.
  • Once the final line of the HEX code is read, the HexLoadr will return to Nascom Basic.
  • The newly loaded program starting address must be loaded into the USR(x) jump location.
  • Start the new arbitrary program by entering USR(x).
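
The conversion of a binary image into POKE statements, described in the stages above, can be sketched in a few lines of C. This is not the bin2bas.py program itself; the function names are hypothetical. It also assumes that the Basic interpreter wants addresses above 32767 written as negative signed 16-bit values, which should be checked against the interpreter in use.

```c
#include <stdio.h>
#include <stddef.h>

/* Format one POKE statement for a byte at a 16-bit address.
   Assumption: addresses above 32767 are written as negative signed
   16-bit values (address - 65536). Check this against your Basic. */
int format_poke(char *out, size_t out_size, unsigned address, unsigned char value)
{
    long signed_address = (long)(address & 0xFFFFu);

    if (signed_address > 32767L)
        signed_address -= 65536L;
    return snprintf(out, out_size, "POKE %ld,%u", signed_address, (unsigned)value);
}

/* Write one POKE line for each byte of an image that loads at 'origin'. */
void emit_pokes(FILE *stream, unsigned origin, const unsigned char *image, size_t length)
{
    char line[32];
    size_t i;

    for (i = 0; i < length; ++i) {
        format_poke(line, sizeof line, origin + (unsigned)i, image[i]);
        fprintf(stream, "%s\n", line);
    }
}
```

Calling emit_pokes(stdout, 0xFF00, image, length) on the contents of HEXLOADR.BIN would print one statement per byte. That makes the inefficiency of this method visible: every byte of program costs a whole line of text on the serial link.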

Important Addresses

There are a number of important Z80 addresses or origins that need to be modified (managed) within the assembly and python programs.

Arbitrary Program Origin

Your program (the one that you’re doing all this for) needs to start in RAM located somewhere. Some recommendations can be given.

When Nascom Basic initiates it requests the “Memory Top?” figure. For the RC2014 with 32kB of RAM, and the YAZ180 with 56kB of RAM available, setting this to 57343 (0xDFFF), or lower, will give you space from 0xE000 to 0xFFFF for your program and for the HexLoadr program.

The eXit option on my initiation routine for Nascom Basic is set to jump to 0xE000, under the assumption that if you are jumping off at restart you want a large space for your arbitrary program.

For the YAZ180 with 56kB of RAM, the arbitrary program location is set to 0x2900, to allow this to be in the Common 0 Space for the MMU. Further for the YAZ180, the MMU Bank Space is configured from 0x4000 through to 0x7FFF so that the entire address space can be written by configuring the physical location at which the HexLoader operates.

HexLoadr supports the Extended Segment Address Record Type, and will store the MSB of the ESA in the Z180 BBR Register. The LSB of the ESA is silently abandoned. When HexLoadr terminates the BBR is returned to the original value.
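
To make the record handling concrete, here is a minimal C sketch of parsing one Intel HEX record, checking its checksum, and taking the BBR value from an Extended Segment Address record as described above. It is an illustration only, not the HexLoadr source (which is Z80 assembly), and all of the names in it are hypothetical.

```c
#include <stddef.h>

#define HEX_MAX_DATA 255

enum { HEX_DATA = 0x00, HEX_EOF = 0x01, HEX_ESA = 0x02 };

struct hex_record {
    unsigned char length;              /* number of data bytes */
    unsigned address;                  /* 16-bit load offset */
    unsigned char type;                /* record type */
    unsigned char data[HEX_MAX_DATA];
};

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read two hex digits at p as one byte, or return -1 on a bad digit. */
static int hex_byte(const char *p)
{
    int hi = hex_nibble(p[0]);
    int lo;

    if (hi < 0) return -1;
    lo = hex_nibble(p[1]);
    if (lo < 0) return -1;
    return (hi << 4) | lo;
}

/* Parse one record such as ":020000021000EC".
   Returns 0 on success, -1 on a malformed record or a checksum error. */
int hex_parse_record(const char *line, struct hex_record *rec)
{
    unsigned char header[4];
    unsigned char sum = 0;
    int b, i;

    if (line[0] != ':') return -1;
    ++line;
    /* Header bytes: length, address high, address low, record type. */
    for (i = 0; i < 4; ++i) {
        b = hex_byte(line + 2 * i);
        if (b < 0) return -1;
        header[i] = (unsigned char)b;
        sum = (unsigned char)(sum + b);
    }
    rec->length = header[0];
    rec->address = ((unsigned)header[1] << 8) | header[2];
    rec->type = header[3];
    line += 8;
    for (i = 0; i < rec->length; ++i) {
        b = hex_byte(line + 2 * i);
        if (b < 0) return -1;
        rec->data[i] = (unsigned char)b;
        sum = (unsigned char)(sum + b);
    }
    /* All bytes including the checksum must sum to zero, modulo 256. */
    b = hex_byte(line + 2 * rec->length);
    if (b < 0) return -1;
    sum = (unsigned char)(sum + b);
    return sum == 0 ? 0 : -1;
}

/* BBR value from an Extended Segment Address record: the MSB of the
   ESA is kept and the LSB is dropped. Returns -1 for other records. */
int hex_esa_to_bbr(const struct hex_record *rec)
{
    if (rec->type != HEX_ESA || rec->length != 2) return -1;
    return rec->data[0];
}
```

For the record ":020000021000EC" the ESA is 0x1000, so the sketch yields 0x10 for the BBR. Keeping only the MSB works because the ESA is a 16 byte paragraph number while the BBR counts 4kB pages, so the low byte of the ESA cannot be represented.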

HexLoadr Program Origin

For convenience, the HexLoadr program is configured to load itself from 0xFF00. This means your arbitrary program can use the space from 0xE000 to 0xFEFF without compromise. Further, if you want to use a separate stack or heap space (preserving Nascom Basic) the HexLoadr program space can be overwritten, by setting the stack pointer to 0x0000 (which decrements on use to 0xFFFF).
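
The address arithmetic behind these origins is easy to check. The short C sketch below uses only the addresses quoted in the text; the macro and function names are made up for the illustration.

```c
/* Addresses quoted in the text. */
#define BASIC_MEMORY_TOP   0xDFFFu   /* 57343, the "Memory top?" answer */
#define PROGRAM_ORIGIN     0xE000u   /* start of the space left free */
#define HEXLOADR_ORIGIN    0xFF00u   /* where HexLoadr loads itself */
#define ADDRESS_SPACE_END  0xFFFFu   /* top of the 64kB logical space */

/* Bytes available to the arbitrary program, 0xE000 to 0xFEFF inclusive. */
unsigned program_space_bytes(void)
{
    return HEXLOADR_ORIGIN - PROGRAM_ORIGIN;
}

/* Bytes reserved for HexLoadr, 0xFF00 to 0xFFFF inclusive. */
unsigned hexloadr_space_bytes(void)
{
    return ADDRESS_SPACE_END - HEXLOADR_ORIGIN + 1u;
}

/* Address of the first byte written by a push. The Z80 stack pointer
   is decremented before each byte is stored and wraps at 16 bits. */
unsigned stack_first_byte(unsigned sp)
{
    return (sp - 1u) & 0xFFFFu;
}
```

With these values the arbitrary program has 7936 bytes (0x1F00) below the HexLoadr, which itself has 256 bytes. A stack pointer set to 0x0000 stores its first byte at 0xFFFF, so the stack grows down through the HexLoadr space once HexLoadr is no longer needed.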

This can be changed if substantial code or new capabilities are added to the HexLoadr program.

RST locations

For convenience, because we can’t easily change ROM code interrupt routines already present in the RC2014 or YAZ180, the serial Tx and Rx routines are reachable by calling RST instructions.

* Tx: RST 08H expects a byte in the a register.
* Rx: RST 10H returns a byte in the a register, and will loop until it has a byte to return.
* Rx Check: RST 18H will return the number of bytes in the Rx buffer (0 if buffer empty) in the a register.

Program Usage

  1. Select the preferred origin .ORG for your arbitrary program, and assemble a HEX file using your preferred assembler.
  2. Confirm your preferred origin of the HexLoadr program, and adjust it to match in the hexloadr.asm and bin2bas.py programs.
  3. Assemble hexloadr.asm using TASM to produce a HEXLOADR.BIN file using this command line.
    c:\> tasm -80 -x3 -a7 -c -l -g3 d:hexloadr.asm d:hexloadr.bin
  4. Produce the “POKE” file called hexloadr.bas by using the python command.
    $ python bin2bas.py  HEXLOADR.BIN > hexloadr.bas
  5. Start your RC2014 or YAZ180 with the Memory top? set to 57343 (0xDFFF) or lower. This leaves space for your program and for the HexLoadr program.
  6. Using a serial terminal (assuming your machine is located at device /dev/ttyUSB0) either copy and paste all of the POKE commands into the RC2014, or upload them using a slow (or timed) serial loading program. If desired the python slowprint.py program can be used for this purpose.
    $ python slowprint.py  /dev/ttyUSB0<
  7. From the ok prompt in Basic, start the HexLoadr program with PRINT USR(x).
  8. Using a serial terminal, upload the HEX file for your arbitrary program that you prepared in Step 1. If desired the python slowprint.py program can also be used for this purpose.
    $ python slowprint.py  /dev/ttyUSB0
  9. Using POKE commands relocate the address for the USR(x) command to point to .ORG of your arbitrary program.
  10. When HexLoadr has finished, and you are back at the Basic ok prompt start your arbitrary program using PRINT USR(x), or other variant if you have parameters to pass to your program.

Credits

HexLoadr is derived from the work of @fbergama and @foxweb.

RC2014 Troubleshooting

So I soldered it all together, and it doesn’t work. Typical. It looked so easy, all of the instructions are straightforward, and the boards are clear and labeled for easy assembly.

I guess this is the story for many projects and some of them never proceed past this point and end up in the junk box. But, sometimes there’s a guide for what to do when there’s trouble.

So this is my guide to how I fixed my RC2014.

1. Power supply

I installed a 7805 linear regulator into the provided slot on the backplane. Because there was no space for a protection diode I added one in series to the power input terminal. First, remove all of the cards from the backplane. We’ll start with power supplies. Using a 12V supply, let’s check that there is 5V and GND available to every backplane slot.

[Image: 7805 Regulator with 1A linear diode in Vin.]

2. Reset

I have used the new backplane Reset function. Let’s test that it is effective in providing 5V pull up normally, and pull down to GND when the reset button is depressed.

3. Clock Function

The CPU requires a Clock, and that is provided by the small PCB containing the crystal and the buffer amplifiers. I didn’t fit the Reset button or resistor on my build, because 2x resistors are not required, and only one switch was included in the materials provided.

Using an oscilloscope to watch the signal, the performance of the Clock and its crystal can be measured. So the crystal is oscillating and produces 7.353MHz with a good strong signal. And it is available across the backplane to all the slots.

[Add picture later. I’ve lost the USB stick.]

4. CPU General

Insert the CPU module and check that it has 5V power, GND, and the Clock and Reset lines are working as expected.

On the Address lines, there should be a signal at 1.232 MHz, representing the cycle of CPU running NOPs. With this signal in place, we can move on to the ROM and RAM modules.

[Add picture later, I’ve lost the USB stick.]

5. ROM & RAM

Now we check that there is power and ground at both of the ROM and RAM modules. If that is the case then it is back to the logic analyser to check what is happening on the system now that the CPU has access to instructions to read.

6. Logic Analyser

With an 8 input logic analyser we can’t look at all of the signals at one time, so let’s choose some relevant ones. The lower few address lines are interesting, because they show where the CPU is reading instructions as it starts up. Also, if the Serial I/O port is attached then the Tx and Rx lines can be monitored on the backplane too.

Reset the system to see what happens immediately after the Reset is released. Note the ASCII text message appearing on the Tx line on the backplane, showing the Grant Searle copyright, the invitation to choose cold or warm boot, and the memory size. This means that the RC2014 is living, but somehow the Serial I/O board is not functioning properly. So we need to focus attention there.

7. Serial I/O

The Serial I/O card should be connected to the FTDI Basic or other FTDI FT232R equivalent device. As I had a PL2303 based serial cable, I decided to use that, as it is much cleaner than using an FTDI adapter, and it allows the Serial I/O board to be positioned anywhere on the backplane.

[Image: Prologic PL2303 Serial Cable, exits inline with backplane connector.]

The Logic Analyser shows that the 63B50P chip is doing its job and producing characters on the Tx line. But we aren’t seeing characters on the FTDI Rx line on the Terminal. That is a problem. Note that there are 2.2kOhm resistors in series with the serial module Rx and Tx lines. That’s a bit more than I’d expect to see. Let’s reduce those resistors down to around 100 Ohm. They could be 0 Ohm, but it is better to be a bit conservative.

Reducing the series resistance doesn’t allow either the FTDI or the Prologic interface to work. There’s something else going on here.

With both the Logic Analyser and the FTDI interface attached to the RC2014 the comforting welcome from Grant Searle appears on the Terminal, urging us to cold or warm boot. Yet, when the logic analyser is removed the serial I/O is no longer working.

This looks like some kind of ground loop problem. I don’t know how this can be fixed easily. The RC2014 device only works when there is a ground provided by another source, such as the Logic Analyser.

8. Other issues

The Serial I/O device was delivered with one incorrect IC socket (14 pin instead of 16 pin), so I had to solder one chip directly to the board. Not a big issue.

There is no mention of how the ROM system needs to be configured. There are three options available for selection. The right one is with all of the address lines set to 0.

The Basic program is described on Grant Searle’s Simple Z80 web page. There is little mention that this is where to go for additional software support and programming assistance.

Anyway, with the caveat that the device has to be connected with an external earth, the RC2014 is working perfectly.

9. Earth Issues

After asking Spencer for some ideas, and doing some further circuit testing, I found that I had somehow damaged the ground wire near the centre of the backplane. That was allowing the RC2014 to function when GND was provided by both FTDI and Logic Analyser on opposite sides of the backplane, but to fail when only one GND was provided.

I resolved this by soldering an additional wire along the entire GND line. Whilst  I could have just bridged the gap, I preferred to improve the GND stability by adding conductor all along its length. I covered the wire with hot-glue to increase its stability. As a side effect it adds a non-slip characteristic to the backplane, and helps to protect my desk from being scratched.

[Image: GND wire hidden under a protective hot glue sheath.]

Now that I’ve fixed my dodgy soldering on the GND line, everything is much better, and works perfectly.

[Image: The hot glue sheath helps to hold the solder pins off my desk too. Integrated non-slip.]