Safe vsp by lft [web] | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
||||||||||||||
|
popularity : 75% |
|||||||||||||
alltime top: #2937 |
|
|||||||||||||
|
||||||||||||||
added on the 2013-02-11 13:05:34 by hedning |
popularity helper
comments
For non C64 sceners this may not look like it, but this is a milestone on the C64. The answer to a 20-25 year old mystery. Fantastic reverse engineering and brilliant work guys!
rulez added on the 2013-02-11 13:34:55 by nbm
LFT on C64. Must be a screen that looks like average, but it's really some new technical thing I don't understand. How else could LFT be on C64 :)
Ok, gonna read the scrolltext again to get it.
A fascinating read.
cool new effect
Here's LFT's full scrolltext, cleaned up for easy reading:
Technical lowdown:
The dreaded VSP crash is caused by a metastability condition in the dram. Some
have speculated that it has to do with refresh cycles, but hopefully the
detailed explanation in this scroller will crush that myth once and for all.
But first, this is what the machine behaves like from a programmer's point of
view. Let us call memory locations ending in 7 or F fragile. Sometimes when
VSP is performed, several fragile memory cells are randomly corrupted
according to the following rule: each bit in a fragile memory cell might be
changed into the corresponding bit of another fragile cell within the same
page.
This specific behaviour can be exploited in several ways: one approach is to
ensure that every fragile byte in a page is identical. If the page contains
code, for instance, corruption is avoided if all the fragile bytes are $ea
(nop). Similarly, in font definitions, the bottom line of each character could
be blank.
Another technique is to simply avoid all fragile memory locations. The
undocumented opcode $80 (nop immediate) can be used to skip them. Data
structures can be designed to have gaps in the critical places.
This latter technique is used in this demo, including the music player of
course. Data that cannot have gaps, i.e. graphics, is continuously restored
from safe copies elsewhere in memory. You can use shift lock to disable this
repair, and eventually you should see garbage accumulating on the screen. And
yet the code will keep running.
Thus, for the first time, the VSP crash has been tamed.
Now for the explanation. The c64 accesses memory twice in every clock cycle.
each memory access begins with the lsb of the address (also known as the row
address) being placed on an internal bus connected to the dram chips. As soon
as the row address is stable, the row address strobe (ras) signal is given.
each dram chip now latches the row address into a register, and this register
controls a multiplexer which connects the selected memory row to a set of
wires called sense lines. Each sense line connects to a single bit of memory.
The sense lines have been precharged to a voltage in between logical zero and
logical one. The charge stored in the memory cell affects the sense line
towards a slightly lower or higher voltage depending on the bit value. A
feedback amplifier senses the voltage difference and exaggerates it, so that
the sense line reaches the proper voltage representing either zero or one.
because the memory cell is connected (through the multiplexer) to the sense
line, the amplified charge will also flow back and refresh the memory cell.
hence, a memory row is refreshed whenever it is opened.
Vsp is achieved by triggering a badline condition during idle mode in the
visible part of a rasterline. When this happens, the vic chip gets confused
about what memory address to access during the half-cycle following the write
to $d011. It sets the internal bus lines to 11111111 in preparation for an
idle fetch, but suddenly changes its mind and tries to read from an address
with an lsb of 00000111.
Now, since electrical lines can't change voltage instantaneously, there is a
brief moment of time when each of the changing bits (bit 3 through 7) is
neither a valid one nor a valid zero. But because the vic chip changes the
address at an abnormal time, there is now a risk that the ras signal, which is
generated independently by another part of the vic chip, is sent while one or
more bus lines is within the undefined voltage range.
When an undefined voltage is latched into a register, the register enters a
metastable state, which means that its output will flicker rapidly between
zero and one several times before settling. This has catastrophic consequences
for a dram: the row multiplexer will connect several different memory rows,
one at a time, to the same sense lines. But as soon as some charge has moved
from a memory cell to the sense line, the amplifier will pull it all the way
to a one or a zero. If, at this point, another memory row is connected, then
the charge will travel from the sense line into this other memory cell. In
short, one memory cell gets refreshed with the bit value of a different memory
cell.
Note that because the bus lines change from $ff to $07, only memory rows with
an address ending in three ones are at risk of being opened simultaneously.
this explains why corruption can only occur in memory locations ending in 7 or
F.
Finally, this phenomenon hinges on the exact timing of the ras signal at the
nanosecond level, and on many machines the critical situation simply doesn't
occur. The timing (and thus the probability of a crash) depends on factors
such as temperature, vic revision, parasitic capacitance and resistance of the
traces on the motherboard, power supply ripple and interference with other
parts of the machine such as the phase of the colour carrier with respect to
the dotclock. The latter is assigned randomly at power-on, by the way, which
could be the reason why a power-cycle sometimes helps.
This is LFT signing off.
Technical lowdown:
The dreaded VSP crash is caused by a metastability condition in the dram. Some
have speculated that it has to do with refresh cycles, but hopefully the
detailed explanation in this scroller will crush that myth once and for all.
But first, this is what the machine behaves like from a programmer's point of
view. Let us call memory locations ending in 7 or F fragile. Sometimes when
VSP is performed, several fragile memory cells are randomly corrupted
according to the following rule: each bit in a fragile memory cell might be
changed into the corresponding bit of another fragile cell within the same
page.
This specific behaviour can be exploited in several ways: one approach is to
ensure that every fragile byte in a page is identical. If the page contains
code, for instance, corruption is avoided if all the fragile bytes are $ea
(nop). Similarly, in font definitions, the bottom line of each character could
be blank.
Another technique is to simply avoid all fragile memory locations. The
undocumented opcode $80 (nop immediate) can be used to skip them. Data
structures can be designed to have gaps in the critical places.
This latter technique is used in this demo, including the music player of
course. Data that cannot have gaps, i.e. graphics, is continuously restored
from safe copies elsewhere in memory. You can use shift lock to disable this
repair, and eventually you should see garbage accumulating on the screen. And
yet the code will keep running.
Thus, for the first time, the VSP crash has been tamed.
Now for the explanation. The c64 accesses memory twice in every clock cycle.
each memory access begins with the lsb of the address (also known as the row
address) being placed on an internal bus connected to the dram chips. As soon
as the row address is stable, the row address strobe (ras) signal is given.
each dram chip now latches the row address into a register, and this register
controls a multiplexer which connects the selected memory row to a set of
wires called sense lines. Each sense line connects to a single bit of memory.
The sense lines have been precharged to a voltage in between logical zero and
logical one. The charge stored in the memory cell affects the sense line
towards a slightly lower or higher voltage depending on the bit value. A
feedback amplifier senses the voltage difference and exaggerates it, so that
the sense line reaches the proper voltage representing either zero or one.
because the memory cell is connected (through the multiplexer) to the sense
line, the amplified charge will also flow back and refresh the memory cell.
hence, a memory row is refreshed whenever it is opened.
Vsp is achieved by triggering a badline condition during idle mode in the
visible part of a rasterline. When this happens, the vic chip gets confused
about what memory address to access during the half-cycle following the write
to $d011. It sets the internal bus lines to 11111111 in preparation for an
idle fetch, but suddenly changes its mind and tries to read from an address
with an lsb of 00000111.
Now, since electrical lines can't change voltage instantaneously, there is a
brief moment of time when each of the changing bits (bit 3 through 7) is
neither a valid one nor a valid zero. But because the vic chip changes the
address at an abnormal time, there is now a risk that the ras signal, which is
generated independently by another part of the vic chip, is sent while one or
more bus lines is within the undefined voltage range.
When an undefined voltage is latched into a register, the register enters a
metastable state, which means that its output will flicker rapidly between
zero and one several times before settling. This has catastrophic consequences
for a dram: the row multiplexer will connect several different memory rows,
one at a time, to the same sense lines. But as soon as some charge has moved
from a memory cell to the sense line, the amplifier will pull it all the way
to a one or a zero. If, at this point, another memory row is connected, then
the charge will travel from the sense line into this other memory cell. In
short, one memory cell gets refreshed with the bit value of a different memory
cell.
Note that because the bus lines change from $ff to $07, only memory rows with
an address ending in three ones are at risk of being opened simultaneously.
this explains why corruption can only occur in memory locations ending in 7 or
F.
Finally, this phenomenon hinges on the exact timing of the ras signal at the
nanosecond level, and on many machines the critical situation simply doesn't
occur. The timing (and thus the probability of a crash) depends on factors
such as temperature, vic revision, parasitic capacitance and resistance of the
traces on the motherboard, power supply ripple and interference with other
parts of the machine such as the phase of the colour carrier with respect to
the dotclock. The latter is assigned randomly at power-on, by the way, which
could be the reason why a power-cycle sometimes helps.
This is LFT signing off.
Now I read the whole story, then went to read older articles about the history of VSP, I can appreciate more. Wow, what a technical scroller!
hardcore hacking
Cool gears.
Great :)
good
Very interesting. Wow!
"LFT - coding on every hardware - with a multimeter in one hand". Hmm, maybe I'll send an Atari there with mysterious letter indicating there might be a hidden graphics mode which...
Cool scroller.
nice achievement in software.
This is history in the making right here :) Reminds me of reading about how people discovered how early HMOVE or the Comic Ark starfield worked.
So, how long until VICE emulates this properly?
So, how long until VICE emulates this properly?
Great hacking!
Agree!!1
Comprehensive explanation and it just hit Hacker News :)
It is a great achievement solving this old mystery. However, the number of machines suffering from the "bug" is rather small, and the workaround is not trivial (avoiding / restoring all addresses ending in $7 or $F)
So hardcore that I am surprised by the lack of mandatory coder colors. Anyways, it needs better direction. \;b>
.
visy said it. :)
Video capture please.
Cool invention.
Superb work.
submit changes
if this prod is a fake, some info is false or the download link is broken,
do not post about it in the comments, it will get lost.
instead, click here !