[Solved] Need help on a raycasting light shader

Bigfoot71 · Post by **Bigfoot71** » Fri Jan 06, 2023 10:23 pm

Wow! Thank you for your answers, I finally got back to breaking my head on my shader.

Andlac028 wrote: ↑Fri Jan 06, 2023 7:31 am Make sure, you check only polygons in possible light radius. Also if you move only by one tile, you can make sure that at the beginning, you make add polygons from square with size of 2x radius of light and center of player and then just if player moves right, remove first collumn and add next collumn of polygons.
Also it seems like addPolygon updates light at each call, which can be expensive, so maybe try to simplify polygons or look in the lib, if light could be updated only after all changes to polygon, not on each change.

This is roughly what my code was doing even if it could be optimized. By doing tests (of the time spent on each step) it was clearly not the loops of my most problematic function, adding a single polygon could take up to 0.005 seconds some times. It's true otherwise that I had seen that in the code of Lighter (about addPolygon) but I didn't dare to touch it because I didn't want to waste my time too much especially if I broke it. I will try what you said in addition to what Darkfrei suggested.

Edit: Modification made, I removed this part in `addPolygon` and `removePolygon`:

Code: Select all

self.lightHash:each(x, y, w, h, function(light)
  updateLight(self, light)
end)

And the time of `addPolygon` was downright divided by a hundred and it still works the same in my case!

darkfrei wrote: ↑Fri Jan 06, 2023 8:30 am How about to convert tiles to rectangles/polygons?

I thought about it too but I no longer had the courage, it was very late, and I should have completely revised the function. But I kept what I had coded with fortunately so I will try to implement what you showed me!

Sasha264 wrote: ↑Fri Jan 06, 2023 7:15 pm Hi! Happy new year
There is just the shader, full example in the attachment:

And wow! I absolutely have to read this code in detail to see where I went wrong, that's exactly what I wanted to do, frankly all my respects

I would return given my advance as soon as I had done everything, the option I had chosen, etc. Thanks again !

Sasha264 · Post by **Sasha264** » Sat Jan 07, 2023 8:55 am

About raycasting shader performance: in this simple algorithm ray path have even, small steps. This is inefficient, but can help understand the basics. More suitable approach will be to use adaptive ray steps to minimize steps count. I can think of 2 different methods to do this: 1) raytracing and 2) raymarching or sphere tracing. This comes from 3d case, but will work fine in 2d as well. I think both of them could increase shader performance like ~5-20 times. Ask if you choose shader approach =)

darkfrei · Post by **darkfrei** » Sat Jan 07, 2023 11:31 am

Sasha264 wrote: ↑Fri Jan 06, 2023 7:15 pm Hi! Happy new year

There is just the shader, full example in the attachment:

Code: Select all

extern float light_radius;
extern vec2 light_pos;
extern float ambient;
extern vec2 size;
extern float wall_height;
extern Image collision_mask;

vec4 effect(vec4 color, Image texture, vec2 texture_coords, vec2 screen_coords) {

    vec4 pixel = Texel(texture, texture_coords);

    float dist = length(light_pos - screen_coords);

    if (dist > light_radius) {
        // that way I have actually rgb color multiplied, leaving alpha component untouched
        return vec4(pixel.rgb * ambient, 1.0);
    }

    // max number of ray steps
    float togo = floor(dist);

    // aproximatelly has length == 1.0, but not exactly. With this approach ray will arrive to light_pos way more accurately
    vec2 ray_step = (light_pos - screen_coords) / togo;
    // if instead we will do this there will be artifacts when light source close to a corner
    // vec2 ray_step = normalize(light_pos - screen_coords);

    vec2 current_position = screen_coords;

    // Original while (current_position != light_pos) is bad, because current_position will never be exactly == light_pos.
    // Just because they are floats. We can never compare floats with == and != only > < (or something like distance(a, b) < 0.001 if a and b is vec2, vec3)
    // Well, strictly speaking we can sometimes if floats actually are integers, like 0.0 or 2.0, but this was not the case.
    for (float passed = 0; passed < togo; passed += max(1.0, 1.0 / wall_height)) {
        // current_position changing inside (0..w-1, 0..h-1) but to access texture we need to normalize it to (0..1, 0..1), so need to divide by texture size
        if (Texel(collision_mask, current_position / size).r == 0.0) {
            return vec4(pixel.rgb * ambient, 1.0);
        }

        // it is better to check mask first, step later, not the other way around, fixes some wrongly enlighted pixels for us
        current_position += ray_step;
    }

    // I don't know how original shader managed to do light distance fade, so here I wrote some smoothstep for that purpose
    float light = ambient + (1.0f - ambient) * smoothstep(light_radius, 0.0, dist);
    return vec4(pixel.rgb * light, 1.0);
}

Something wrong with shaders on android.:

: Screenshot_20230107-003932.jpg (162.76 KiB) Viewed 1559 times

Also, is it possible to make with shaders something like this diagonal marching? You can check the tile just on the cross point between view line and diagonal of that tile, it means just Manhattan distance amount (abs(dx)+abs(dy)) of checked tile for a view line.
viewtopic.php?p=241909#p241909

Bigfoot71 · Post by **Bigfoot71** » Sat Jan 07, 2023 1:01 pm

darkfrei wrote: ↑Sat Jan 07, 2023 11:31 am Something wrong with shaders on android.

Nothing very serious, it is simply necessary to modify `float passed = 0` by `float passed = 0.0` I suppose. I am trying the modifications proposed by Sasha264 and I will test on Android at the end to debug all this, I will share what I would have done once finished, I also have to do tests between the technique with and without shader to be sure of the best option, I will share all this.

darkfrei wrote: ↑Sat Jan 07, 2023 11:31 am Also, is it possible to make with shaders something like this diagonal marching? You can check the tile just on the cross point between view line and diagonal of that tile, it means just Manhattan distance amount (abs(dx)+abs(dy)) of checked tile for a view line.
viewtopic.php?p=241909#p241909

I'm also going to be interested in what you say about that personally too, maybe Sasha264 could answer that directly.

Sasha264 · Post by **Sasha264** » Sat Jan 07, 2023 2:30 pm

Bigfoot71 wrote: ↑Sat Jan 07, 2023 1:01 pm ...it is simply necessary to modify `float passed = 0` by `float passed = 0.0` I suppose.

Exactly! My mistake. I fixed my original post. Some systems can be capricious about these 0 vs 0.0 vs 0.0f things for floats...

Bigfoot71 · Post by **Bigfoot71** » Sat Jan 07, 2023 4:06 pm

Its good ! I can finally say with certainty that the shader is much more efficient!

I've done a lot of time testing so I'll just share the most significant:

Displaying the game without shader takes me ~0.00013 seconds on average.

With shader it takes on average ~0.00008 seconds.

The average is calculated over 100 repetitions each. So if the `update` function is also taken into account to update the polygons, regardless of the optimizations made, it will be the shader that wins in the end.

I still tried to make the proposed optimizations to the shader but I'm starting to waste a lot of time on this problem which is finally solved but here is my very slightly modified version:

Code: Select all

extern float light_radius;
extern vec2 light_pos;

extern float ambient;
extern vec2 size;

extern float wall_height;
extern Image collision_mask;
extern vec2 mask_pos;

vec4 effect(vec4 color, Image texture, vec2 texture_coords, vec2 screen_coords) {

    vec4 pixel = Texel(texture, texture_coords);

    float dist = length(light_pos - screen_coords);

    if (dist > light_radius) {
        return vec4(pixel.rgb * ambient, 1.0);
    }

    float togo = floor(dist);
    vec2 ray_step = (light_pos - screen_coords) / togo;

    vec2 current_position = screen_coords - mask_pos;

    for (float passed = 0.0; passed < togo; passed += max(1.0 / wall_height, 1.0 / dist)) {

        if (Texel(collision_mask, current_position / size).r == 0.0) {
            return vec4(pixel.rgb * ambient, 1.0);
        }

        current_position += ray_step;

    }

    float light = ambient + (1.0f - ambient) * smoothstep(light_radius, 0.0, dist);
    return vec4(pixel.rgb * light, 1.0);
    
}

I added the value vec2 mask_pos because either the map is centered on the screen, or moves due to the camera.

And also after several failures to reduce the number of iterations of the loop by calculating adaptive ray steps based on the distance between the position of the light and the current pixel I finally settled on this simple modification:

Code: Select all

passed += max(1.0 / wall_height, 1.0 / dist)

Instead of:

Code: Select all

passed += max(1.0, 1.0 / wall_height)

I suppose it should slightly reduce the number of iterations (I would like to have a tip to easily debug this kind of thing in the shaders, because I don't know how).

I was also interested in how to determine if a line of sight intersects a tile diagonally using the Manhattan distance but I assumed it might be even more expensive since there would be a lot of tiles to calculate and it would have I had to find a way to reduce this number of tiles to calculate and in short it would have taken me even longer but I keep this idea if it can be useful for something else.

I may have misunderstood the principle with the distance of Manathan, maybe we could do without a loop in some cases by doing something like that if I understand correctly (this is just a theoretical example):

Code: Select all

vec2 checked_tile_coords = /* coordinates of the checked tile */;
vec2 diagonal_tile_coords = /* diagonal tile coordinates */;

float manhattan_distance = abs(diagonal_tile_coords.x - checked_tile_coords.x) + abs(diagonal_tile_coords.y - checked_tile_coords.y);

if (manhattan_distance >= length(light_pos - checked_tile_coords)) {
    // The line of sight does not cross the tile diagonally.
    // Use the calculated luminosity without considering the diagonal tile obstruction.
    float light = ambient + (1.0f - ambient) * smoothstep(light_radius, 0.0, dist);
    return vec4(pixel.rgb * light, 1.0);
} else {

    // The line of sight may cross the tile diagonally.
    // Use a "ray loop" to check if the line of sight actually intersects the tile diagonally and, if so,
    // perform the calculations to determine the brightness of the checked tile taking into account the obstruction of the tile diagonally.

    float togo = floor(dist);
    vec2 ray_step = (light_pos - screen_coords) / togo;
    vec2 current_position = screen_coords - mask_pos;

    for (float passed = 0.0; passed < togo; passed += max(1.0 / wall_height, 1.0 / dist)) {

        if (Texel(collision_mask, current_position / size).r == 0.0) {
            return vec4(pixel.rgb * ambient, 1.0);
        }

        current_position += ray_step;
    }

    // The line of sight does not cross any obstacle.
    // Use maximum brightness.
    return vec4(pixel.rgb, 1.0);
}

In any case, thank you very much for your help, I can finally say that it is completely solved.!

Here is a small animation of the final rendering:

Sasha264 · Post by **Sasha264** » Sat Jan 07, 2023 8:03 pm

Bigfoot71 wrote: ↑Sat Jan 07, 2023 4:06 pm Its good !

Hey! Nice animation =)

darkfrei wrote: ↑Sat Jan 07, 2023 11:31 am Also, is it possible to make with shaders something like this diagonal marching?

I think that diagonal marching is a pretty clever idea for that case, when we do not have smaller shadow-casting objects. It should reduce steps count by ~16 (our tile size), but each step will be a little heavier.

It took me a while to figure out how to do that without a lot of ifs. Here is my variant:

Code: Select all

extern float light_radius;
extern vec2 light_pos;
extern vec2 debug_pos;
extern float show_debug_ray;
extern float ambient;
extern vec2 size;
extern float tile_size;
extern Image collision_mask;

void swap(inout vec2 a, inout vec2 b) {
    vec2 temp = a;
    a = b;
    b = temp;
}

// convert v from original coord system to diagonal
// so that diagonal lines became horizontal or vertical, and distance between them became 1.0
// it is combination of: rotate by pi/4.0 to the "dir" direction, scale by sqrt(2.0) and scale by 1.0/tile_size
void toDiagonalCoord(inout vec2 v, float dir) {
    v = vec2(v.x - dir * v.y, dir * v.x + v.y) / tile_size;
}

// convert back from diagonal coord system to original
void fromDiagonalCoord(inout vec2 v, float dir) {
    v = vec2(v.x - dir * v.y, dir * v.x + v.y) * 0.5 * tile_size;
}

// used only for debug purposes: to show single ray and its steps
float debugRay(vec2 a, vec2 b, vec2 uv) {
    float result = 0.0;

    if (a.x > b.x) swap(a, b);
    float dir = (b.y < a.y) ? 1.0 : -1.0;
    toDiagonalCoord(a, dir);
    toDiagonalCoord(b, dir);
    toDiagonalCoord(uv, dir);

    for (float x = ceil(a.x); x <= floor(b.x) + 0.01; x++) {
        float part = (x - a.x) / (b.x - a.x);
        float y = mix(a.y, b.y, part);
        vec2 p = vec2(x, y);
        float add = smoothstep(0.0, tile_size, 1.0 / distance(p, uv));
        add += pow(add, 0.6);

        fromDiagonalCoord(p, -dir);
        result += mix(0.1 * add, add, Texel(collision_mask, p / size).r);
    }

    return result;
}

vec4 effect(vec4 color, Image texture, vec2 texture_coords, vec2 screen_coords) {

    vec4 pixel = Texel(texture, texture_coords);

    // dark color, used in many places where there is no light
    vec4 dark = vec4(pixel.rgb * ambient, 1.0);

    // for debug ray visualization
    vec2 a = debug_pos;
    vec2 b = light_pos;
    if (show_debug_ray > 0.5) {
        return dark + debugRay(a, b, screen_coords);
    }

    // order is not important: a and b can be swapped
    // it does not matter if we trace ray from point to light or from light to point
    a = screen_coords;
    b = light_pos;

    // if start of the ray or end of the ray on the wall >> no light here
    if (Texel(collision_mask, a / size).r <= 0.5) {
        return dark;
    }
    if (Texel(collision_mask, b / size).r <= 0.5) {
        return dark;
    }

    // if distance more then max light radius >> no light here
    float dist = distance(a, b);
    if (dist > light_radius) {
        return dark;
    }

    // convert a and b points to convinient coordinate system
    // so that b.x > a.x and abs(b.x-a.x) > abs(b.y-a.y)
    // so that instead of whole plane we will need to work only with quarter from -pi/4 to +pi/4, like a pie "<"
    if (a.x > b.x) swap(a, b);
    float dir = (b.y < a.y) ? 1.0 : -1.0; // this is almost sign(a.y - b.y), but it can not be 0.0, and it is important
    toDiagonalCoord(a, dir);
    toDiagonalCoord(b, dir);

    // trace ray by x direction (thanks to convinient coordinate system x is dominant under y)
    for (float x = ceil(a.x); x <= floor(b.x) + 0.01; x++) {
        float part = (x - a.x) / (b.x - a.x);
        float y = mix(a.y, b.y, part);
        vec2 p = vec2(x, y);

        // convert result point to original coord system, so I can take a probe from collision_mask
        fromDiagonalCoord(p, -dir);
        if (Texel(collision_mask, p / size).r < 0.5) {
            return dark;
        }
    }

    // I don't know how original shader managed to do light distance fade, so here I wrote some smoothstep for that purpose
    float light = ambient + (1.0 - ambient) * smoothstep(light_radius, 0.0, dist);
    return vec4(pixel.rgb * light, 1.0);
}

Final result looks exactly the same as before.
And here is this: Hold left mouse to see debug ray with its probe points:

: debug-ray.png (164.18 KiB) Viewed 1476 times

Sasha264 · Post by **Sasha264** » Sat Jan 07, 2023 8:51 pm

I would like to have a tip to easily debug this kind of thing in the shaders, because I don't know how

Yes, debugging shaders is a tricky exercise:

You can return some vivid colors, like red, in areas where some condition is met.
You can return float values as color components. For example, there is a calculation of H, and H can be from 0 to 15. So, you write something like result.r = H / 15.0, and see how red are result pixels.
For moving-animated values it is good to convert them to "zebra"-values. For example if G is gradient from left to right that is sliiiiightly increasing with time, you can do something like result.rgb = vec3(0.8 + 0.2 * sin(G * 50.0)). Or not 50, but other "big" value. So you will see moving lines along that desired gradient.
Personally I like to send mouse coord to the shader, and change some calculations based on mouse coord. So, for example, to the left of my mouse I see variant 1, and to the right I see variant 2. Then I move my mouse from left to right. Very interactive =)

Bigfoot71 · Post by **Bigfoot71** » Sat Jan 07, 2023 9:22 pm

Wow ! I now understand the principle that had to be implemented and frankly my respects once again.

I had fun comparing all these shaders and it's pretty much the same thing. I based myself on 10 seconds of execution, doing 5 tries on each, counting only the display time and trying to do the same pattern with the mouse and here are the results (made from the Lua code of Sasha264 ).

First version by Sasha264:

0.000027 s
0.000026 s
0.000027 s
0.000035 s
0.000028 s

Second version I published:

0.000031 s
0.000026 s
0.000029 s
0.000023 s
0.000028 s

Latest version submitted by Sasha264 (having removed all the debug parts):

0.000030 s
0.000029 s
0.000029 s
0.000028 s
0.000029 s

The tests I performed are done on an NVIDIA G98M [Quadro NVS 160M] so rather an old thing

But it still allows you to get an idea (the one that it's almost the same in my opinion).

And personally I prefer the latter because I find that it reacts better when the point of light is very close to a wall.

The process I used for testing:

Code: Select all

local exec_time = 0
function love.update(dt)

	exec_time = exec_time + dt
	if exec_time >= 10 then
		love.event.quit(0)
	end

	--[[ ... ]]

end

local times = {}
function love.draw()

	local t1 = os.clock()

	love.graphics.setShader(shader)
	love.graphics.draw(visible)

	local t2 = os.clock()
	times[#times+1] = t2-t1

end

local function average(numbers)
	local sum = 0
	for _, number in ipairs(numbers) do
	  sum = sum + number
	end
	return sum / #numbers
end

function love.quit()
	print(string.format("%.6f", average(times)))
end

Edit: And thank you very much for these debugging tips, it's true that displaying things seems so obvious in the end, it's going to help me a lot !

Sasha264 · Post by **Sasha264** » Sun Jan 08, 2023 7:50 am

Hmm... Measuring performance is always an interesting thing, yes

I noticed some problems with this method some time ago. I fired up some pretty heavy shaders and measured time like you do. And results were just the same as for shaders that do almost nothing. That made me think that love.graphics.draw() does not actually rushes to draw things, it is kind of queue it to be drawn later. Maybe when love.graphics.present() is called inside https://love2d.org/wiki/love.run or something like that. I am not sure of that, I did not read specifics, maybe someone will explain it better. It is just my experience. Maybe love even have frame queue with multiple frames to be drawn in future, so the gpu will newer wait while slow lua code deside what to draw next. It is pretty common in game engines, but I don't know for sure...

The second idea would be to just measure fps with vsync turned off (as well as all other internal and external framerate limiters). But it is not so accurate, because there are a lot of things going on inside one frame along with my shader code, starting with default love.timer.sleep(0.001) every frame =) Also, my gpu starts to make some pretty scary sound when rendering something simple with like 3000 fps

So, for me I found another way to measure performance of some arbitrary shader or particle effect. I draw that effect many-many times inside 1 frame and tweak that draw count, so that fps drops to 60 (or choose another low enough framerate). And when it does make sure that gpu load is close to 100%, so it is bound by gpu, not some other nasty single-threaded cpu things.
-----------------------------------------------------------
Danger: make sure you gpu will not overheat xD
-----------------------------------------------------------
And I measured that way my first implementation:

Code: Select all

love.graphics.setShader(oldShader)
for i = 1, 1020 do
    love.graphics.draw(visible)
end

...and got 1020 times to draw this, when fps hits 60 and gpu load is solid 99%

But for my later implementation of optimization suggested by darkfrei I got:

Code: Select all

love.graphics.setShader(newShader)
for i = 1, 4500 do
    love.graphics.draw(visible)
end

So, it seems that the method indeed is about ~4 times faster.
At least for my gpu and for that scene size and topology, and for that light radius, and (maybe most important) for that tile size.
Maybe for smaller tiles, like 4 pixels (?) will be faster to just go pixel by pixel.

[Solved] Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: Need help on a raycasting light shader

Re: [Solved] Need help on a raycasting light shader

Re: [Solved] Need help on a raycasting light shader

Re: [Solved] Need help on a raycasting light shader

Who is online