# pouët.net

## Raymarching Tutorial

category: code [glöplog]
The seminar by las at Demoscene Outreach Event II has made me get interested in sphere tracing / ray marching, and since I was unable to find a really good beginners' tutorial on that topic, I decided to write one myself after reading everything related I could find on the web.

Since I'm not an expert on ray marching myself, I guess my tutorial might contain some shortcomings, so please, if you find anything wrong, correct my errors.

On ray casting, ray tracing, ray marching and the like

There is a hype about ray marching in the demoscene. This is what people call a technique that is used mostly in 4k procedural graphics and 4k intros to create 3d pictures with interesting looks. Since there used to be no comprehensive tutorial on the matter, many people initially found it confusing to understand the underlying principle, which we can see in the huge number of threads in the pouet.net BBS dealing with various aspects of ray marching. The answers more experienced coders gave were partially a bit hard to grasp, and so I guess there are still some people around who do not really feel satisfied with their understanding of ray marching. For these people, I've written the following article.

History: It's all about 3d

While the first scrollers were strictly two-dimensional, 3d effects became omnipresent in demos rather soon. Already in C64 demos of the 1980s we can spot things such as rotating cubes. While first only the lines of the cubes were displayed, it did not take long until techniques to fill the sides with colours emerged: flat shading, gouraud shading, phong shading - you name them. Cubes were followed by more complex objects such as dodecahedra. To display spheres, a technique known as tesselation was used: the round surface was subdivided into a lot of small polygons; thus the shape of the sphere was approximated.

Techniques such as ray marching are a different way of displaying 3d objects. Since they are computationally more expensive, they were not often used in demos until computers became more powerful; you might remember the 64k intro Heaven Seven by Exceed from 1999, which featured realtime ray tracing. The demoscene, however, would not be the demoscene if it did not try to break barriers, and by now you can also find some sorts of ray casting in productions made for weaker systems.

Terminology

Ray casting, ray tracing, ray marching - are they different things, or are they just different names for the same thing or similar things? The latter possibility applies. As a matter of fact the original method is ray casting, and the other two are just variants of it. Ray tracing is a more sophisticated algorithm involving diffraction and reflection, thus generating images looking even more realistic; alas, it is computationally more expensive, of course. Ray marching, which is the thing that is so popular in the demoscene at the moment, is a variant of ray casting that permits the use of objects for which there is no analytic formula so that the intersection with the ray cannot be simply computed by solving an algebraic equation. Ray marching is also called sphere tracing, for a reason I will explain later on. Actually what's most widely used in the demoscene is not the general ray marching algorithm, but a variant of it involving a distance field - I'll show you what this is all about.

General Ray casting

In general, ray casting is about displaying a 3d scene involving one or several objects. The basic idea is: You have a camera (you may also call it viewport, or the human eye) that looks at a surface (the screen). The objects are located behind this surface and are projected on it. What ray casting is all about is to determine the colour of each pixel on the screen. Imagine that you send out rays from your eye, which pass through the surface. Each pixel can be accessed by one ray. If you continue following the ray behind the surface, it will sooner or later hit one of the objects in the scene. This is the object you see at that pixel. In the most basic variant of ray casting, the colour of the pixel is simply a function of the distance from the camera to the point where the ray hits the object (the intersection point). So the ray casting algorithm has to calculate the intersection points of all the rays with all the objects, pick the closest intersection point and calculate the distance to the camera.

How to compute the intersection points?

To compute the intersection point of a ray with an object, you need the formulas of both the ray and the object surface.

A ray is a straight line. In 2d, straight lines usually have a formula of the form:

y = k * x + d

In 3d, the same formula can be used, but we also need to compute the z coordinate for each x coordinate:

z = l * x + e

How to get k, d, l and e? All you have at first is the location of the camera x1, y1, z1 and the coordinates of the point on the surface (screen) the ray goes through x2, y2, z2. From these pieces of information k and l can be calculated easily:

k = (y2 - y1) / (x2 - x1)
l = (z2 - z1) / (x2 - x1)

d and e can then be computed by substituting x with x1, y with y1, z with z1 and inserting the already obtained formulas for k and l:

d = y - k * x = y1 - x1 * (y2 - y1) / (x2 - x1) = (x2 * y1 - x1 * y2) / (x2 - x1)
e = (x2 * z1 - x1 * z2) / (x2 - x1)

Eventually we get the following formulas for a straight line (ray):

y = x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)
= ((x - x1) * y2 - (x - x2) * y1) / (x2 - x1)
z = ((x - x1) * z2 - (x - x2) * z1) / (x2 - x1)

We get the intersection point of a ray with an object by inserting the formulas for y and z into the formula that describes the object and solving for x. E.g. for a sphere:

(x - x0)^2 + (y - y0)^2 + (z - z0)^2 = r^2

Here x0, y0, z0 are the coordinates of the center point.

This formula must be converted to:

x^2 - 2 * x0 * x + x0^2 + y^2 - 2 * y0 * y + y0^2 + z^2 - 2 * z0 * z + z0^2 - r^2 = 0

Now we can insert the fomulas for y and z:

x^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 = 0

To solve this to quadratic equation, a tool like Maple may be useful. I entered the equation above surrounded by solve(..., x) in Maple and got two really, really looooooooong solutions. Typing these solutions into a programming language IDE in a form the programming language understands is tedious, error-prone work. So what else can be done?

Ray marching

The ray marching algorithm has the advantage over traditional ray casting that you need not compute intersection points. Instead, you walk along the ray and simply check after each step whether you've hit an object.

The distance you walk per step may be constant or variable. In the simpler, constant case, the easiest way is to take the vector P1P2 (where P1 is the location of the camera and P2 the surface point the colour of which you want to compute) and add it to the current point at each step, starting with P2. So you need not even compute an analytic function of the ray. All you do is calculate the x, y, z coordinates for the current point and insert it into the formulas of the objects. Now if some point Pa has coordinates for which the sphere equation results in the left side being lower than the right side, and the next point Pa+1 yields the opposite result (the left side is greater than the right side), you know that somewhere between Pa and Pa+1, the sphere has been hit. By keeping track of the number of steps you have walked on the ray, you have automatically obtained the distance, and thus are able to calculate the colour of the pixel to put.

This approach is far easier to implement than standard ray casting. It may be a bit slow if the distance you walk on the rays is low and the objects are far away, but that does not seem to matter much. A real disadvantage is that it is not precise and depends on the distance you walk per step. The larger the distance per step, the faster it goes, but the less accurate it is. Large distances per steps also have the disadvantage that you might overlook objects as it may be possible that they are so small that you virtually step over them, and do not notice that the ray has hit them between two subsequent points. (You can only detect an object if one point is outside of it, and the next point inside. If both points are outside, then you will not notice anything.) So the distance per step is a crucial point.

Distance field

What is especially popular in the demoscene at the moment is distance fields. The basic idea is that the distance per step is not constant, but variable. It depends on the distance of the current point to the object that is closest to it. If you know that distance, it is safe to walk exactly that distance on the ray, as chances are zero that you will hit an object if you walk less than that.

If you are like me, your first thought on hearing about that technique will be: "WTF?! What sense does this make?" After all, ray marching is all about computing the distance from the camera to the closest object in the direction of the ray. So why compute distances to objects while walking on the ray - each of these calculations is as computationally expensive as the original problem, i.e. you try to solve an instance of a problem by solving several instances of itself; that won't work.

The answer to that paradoxon is that you do not compute the distances to the closest objects, but read them from a 3d matrix - the distance field. In other words, the distances to the closest objects from each 3d point (voxel) must be already known. Demosceners mostly use this technique to render interesting images such as the 4k procedural graphics by iq of rgba.

Other applications of ray marching

Ray marching also has the advantage over traditional ray casting that you need not have analytic formulas of the objects in the 3d scene. You may also use it with volumetric data, like a set of voxels. This is the technique that is used in 3d reconstruction of CT scans in clinical radiology, for instance.

Sphere tracing

The reason why ray marching is also called sphere tracing: You can picture the distance between two steps as the radius of a sphere. Each of these spheres touches at least one object. So you virtually follow the intersection points of these spheres with the ray.

Ray tracing

A couple of words on ray tracing: The main idea behind that is that you do not only compute the ray from the camera to the object, but also from the object to the light source, whereever it is. By checking whether an object actually gets light and taking this into consideration in the formula computing the colour of the pixel, you can get more realistic pictures. Moreover, reflection and diffraction can be implemented as well, to make the pictures looking even more like photos.

Conclusion

This article has been the most demoscene-related of the technical articles I've written so far, and I hope that it has fulfilled its purpose.

Links related to this article

The original paper about sphere tracing (page not available?)
Rendering Worlds with Two Triangles - presentation by iq/rgba
A paper about Ray tracing implicit surfaces on the GPU
An unfinished draft of a tutorial on distance fields and sphere tracing
A paper on Generalized Distance Functions
Wikipedia on Ray casting
Wikipedia on Ray tracing
Wikipedia on Distance fields
Pouet.net: Raymarching Beginners' Thread
Pouet.net: Raymarching Toolbox Thread
Pouet.net: Thread about distance field equations
Pouet.net: Problem with ray marching when applying twist distortion
Pouet.net: Raymarching idea - Invertible Surfaces
Pouet.net: Thread about the making of Cdak by Quite
Pouet.net: Trick for 1k raymarchers - lighting
Pouet.net: Implicit function to distance function
Example: Terrain Ray Marching in Processing
Example: An 1k JavaScript intro involving ray marching
Example: 704 (anther 1k JavaScript intro)
Example: Valleyball by BluFlame
Example: Legoland 2 by Fairlight (ray tracing on C64)
Example: Slisesix by Rgba
Example: Fallty by Loonies (one of the first intros using ray marching)
Example: Puls by Rrrola (256b intro using ray marching)
Example: Cdak by Quite and Orange
Example: nop by Stroboholics and Metalvotze
Oh, this is just what we needed - nobody seems to be discussing raymarching nowadays!
Adok: wow, for someone with self-proclaimed journalistic abilities, you should probably have picked up on the fact that your "Get started with.."-thread finishes up with links to 16 existing (and way better) resources for the same thing.
I am sorry but I have to tell you honestly what your article is NOT:
- interesting
- informative
- supported by nice colorful explaining images
- filled with nice simple code snippets
- uncomplicated
- straightforward
- correct

e.g.
Quote:

A ray is a straight line. In 2d, straight lines usually have a formula of the form:
y = k * x + d

Okay, I know that one from school! Yeah. y = mx + b.
That's the worst representation one can use for a line / ray in 2D. Let me ask you one simple question: What is m for this line: |.

And what has that crappy formula thing todo with raytracing, raymarching?!
I really hate the terminology part. Basically sphere tracing is just and damn intersection test. You can use that inside a raytracer.

Quote:

[...]but read them from a 3d matrix - the distance field.[...]

No comment.

Before writing such an article - you should have implemented it.
Actually given that you are in mensa - is this an tricky attempt to make me write an proper beginners tutorial?
Three simple steps for becoming a demoscene superhero:

a) hear about a great new technique that everyone has been using for 2 years

b) write a factually wrong beginner's tutorial about it from the random factoids you have gathered about it.

c) don't implement it, but feel smug about how you've enabled the demoscene to finally have a breakthrough with this amazing new technology.
For each nice (I decide what's a nice flame) upcoming flame post in this thread in the next 30 minutes I offer one free beer (at random demoparty in case we meet).

Urs just won the first.

Jeder bitte nur ein Kreuz!
Not exactly free beer material (too tired), but:

If you are like me, your first thought on hearing about that technique will be: "WTF?! What sense does this make?" After all, ray marching is all about computing the distance from the camera to the closest object in the direction of the ray.

No, it's not. And even if it were, I'd be kinda happy that whenever a mensa member thinks something along the lines of x^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z2 - x1 * z1) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 = 0 us normal people, or even an idiot (as I've been called by someone who can't even open his mouth wide enough to say hello) just calculates

distance = |(pos-center)|-radius

Yep, totally the same *sigh*
Also, don't write your "articles" outside of Hugi. As I've stated elsewhere: The moment they're readable through a medium that allows comparison to ANY other text they show their shallowness and general laughability. Yes, that's a word.

forgot the [quote] tags around the second paragraph, sorry.
Oh, and something about your fly being more open than your mouth. Also, your mother.
You'll get one too anyways ;)
Quote:
Quote:
[...]but read them from a 3d matrix - the distance field.[...]
No comment.
What's your problem, las? You are out of arguments and then think you are smart and/or funny by writing "no comment"? Now that's just lame.

Oh, and, I'd like an Augustiner.
Quote:
[...]in Maple and got two really, really looooooooong solutions.[...]

Not because it's complicated to intersect a ray with a sphere - but because the representation of your rays is crap.
chock: Yep - I'll organize that, you're welcome ;)
(due to "Zeitraumverschiebungen oder so" I'll add another 90 minutes beer-frame!)
I think Adok just got something wrong in the past. He probably read the phrase "Those who can, do. Those who can't, teach" somewhere and thought it was a tutorial.
Small correction:

Quote:
Now we can insert the fomulas for y and z:

x^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 = 0

should be:

Code:`x^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 = 0`
kb_: Please tell me if I understood you correctly:

You wrote:

distance = |(pos-center)|-radius

Does this mean that you calculate the distance of the camera to a point on the surface of a sphere by computing the length of the vector from that camera to the center of the sphere and subtracting the radius of the sphere?

If so, this will work only if the point on the surface of the sphere is located on the straight line between the camera and the center of the sphere, but if it is not, you will not get the correct result.
las: I have realized that techniques of ray tracing can be combined with ray marching and that this is exactly what is usually done in the demoscene. I'll try to update the article to make this point become more clearly expressed.
Regarding the representation of the line, the vector form P = A + k * AB is probably better suited for (intersection) calculations indeed. I'll change that as well.
Indeed, the computation of the intersection point becomes much easier with vector form. Here's the updated section on that matter:

How to compute the intersection points?

To compute the intersection point of a ray with an object, you need the formulas of both the ray and the object surface.

A ray is a straight line. Lines can be mathematically represented in various ways, one of them being the vector form:

P = A + t * AB

That means: Any point P is the sum of the coordinates of a point A plus the product of a vector AB and a scalar t. Actually we have three equations:

x = x0 + t * dx
y = y0 + t * dy
z = z0 + t * dz

We get the intersection point of a ray with an object by inserting the formulas for x, y and z into the formula that describes the object and solving for t. E.g. for a sphere:

(x - xc)^2 + (y - yc)^2 + (z - zc)^2 = r^2

Here xc, yc, zc are the coordinates of the center point.

Inserting this we get:

(x0 - xc + t * dx)^2 + (y0 - yc + t * dy)^2 + (z0 - zc + t * dz)^2 = r^2

(x0 - xc)^2 + 2 * (x0 - xc) * dx * t + dx^2 * t^2
+ (y0 - yc)^2 + 2 * (y0 - yc) * dy * t + dy^2 * t^2
+ (z0 - zc)^2 + 2 * (z0 - zc) * dz * t + dz^2 * t^2
= r^2

(dx^2 + dy^2 + dz^2) * t^2
+ ((x0 - xc) * dx + (y0 - yc) * dy + (z0 - zc) * dz) * t
+ (x0 - xc)^2 + (y0 - yc)^2 + (z0 - zc)^2 - r^2
= 0

So we have a quadratic equation with:

a = dx^2 + dy^2 + dz^2
b = (x0 - xc) * dx + (y0 - yc) * dy + (z0 - zc) * dz
c = (x0 - xc)^2 + (y0 - yc)^2 + (z0 - zc)^2 - r^2

t can now be computed by simply inserting these parameters into the general solution formula for quadratic equations:

t = -b +/- sqrt (b^2 - 4 * a * c) / (2 * a)

After obtaining t, we can compute x, y and z.
Quote:
If so, this will work only if the point on the surface of the sphere is located on the straight line between the camera and the center of the sphere, but if it is not, you will not get the correct result.

The point is that in sphere tracing one wants to calculate the distance of the current position to an object (in this case a sphere), i.e. to the closest point on the surface of the object. This is what kb's formula correctly does. You should really consider implementing at least the simplest type of sphere tracer before writing a tutorial about it.
Quote:
The ray marching algorithm has the advantage over traditional ray casting that you need not compute intersection points. Instead, you walk along the ray and simply check after each step whether you've hit an object.

Given equations A: P = origin + t * direction, and B: |P - centre| = radius, solve for t. You can do this by rewriting A AND B to isolate t (traditional raytracing), or by throwing different values of t at the problem to see what works (ray marching). In either case you're computing the intersection. So what you're saying doesn't really make sense.

Also, why you would (try to) solve the line-sphere intersection in terms of x,y,z is puzzling. You're searching a one-dimensional space (the ray), so you only have one variable. Anyway, all it takes is a Google search to find a much more elegant solution.

Quote:
The answer to that paradoxon is that you do not compute the distances to the closest objects, but read them from a 3d matrix - the distance field. In other words, the distances to the closest objects from each 3d point (voxel) must be already known. Demosceners mostly use this technique to render interesting images such as the 4k procedural graphics by iq of rgba.

This has nothing to do with how 4ks use distance fields for sphere tracing. Storing the distance field in a 3D matrix at any useful resolution would be computationally expensive, eat up loads of memory (probably more than you have) and add unnecessary complexity.

If you had actually taken the time to study the thing that you're writing about, just a bit, it would be obvious that yes, you do actually compute the distance to the nearest object, dozens of times per ray, and you do not look it up in a matrix. You cook up a single function that maps any point in space to the minimum distance to any object in your scene, and that's what a distance field is.

Quote:
The reason why ray marching is also called sphere tracing.

What? No. Sphere tracing is a specific technique for ray marching, not another word for the same thing.

Plus what everyone else has written. And more. The article is terrible in every way.

Quote:
This article has been the most demoscene-related of the technical articles I've written so far, and I hope that it has fulfilled its purpose.

What purpose would that be?
Who is that Adok person anyway?
Thanks to chock for replying for me in a meaningful way. That frees me to do more important things, eg. NOT wasting time with Adok and his plethora of mental conditions. Then again, there's one question I'd really, genuinely, un-sarcastically like to have an answer to:

Adok, why on earth do you feel obligated to write a tutorial about something that you have obviously no clue about?