Re: speed
Posted: Sun Jun 23, 2024 6:06 pm
by pgimeno
1Minus2P1Stringer2 wrote: ↑Sun Jun 23, 2024 5:12 pm
Ahhhhhhhh. So is the code just repeatedly asking the CPU to ask the GPU to draw a pixel, over and over again? If so, I get why it's going so slow. Yes, cross-platform is what I am going for. OK, so that's why. That makes me wonder: why is TurboWarp so fast at drawing, then? Does it use GPU acceleration?
You're using 230,400 calls to a GL drawing routine. These calls are typically slow, basically for the reasons you've stated. I don't even know what turbowarp is so I can't answer the question.
I've rewritten large parts of your code to make it work with FFI and use a single draw call. I've tried to respect your indentation style, even though it's customary to unindent the `end` and the `else`. Here's what I got.
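The rewritten code itself is not reproduced in this thread. As a rough, independent sketch of the general idea (not the actual rewrite), assuming LÖVE 11.3 or later for `getFFIPointer`: fill an ImageData through an FFI pointer, upload it once, and draw it with a single call. The names `fillPixels`, `WIDTH` and `HEIGHT` are hypothetical, and the gradient is only a placeholder for real pixel data.

```lua
local WIDTH, HEIGHT = 640, 360 -- 640 * 360 = 230,400 pixels

-- Writes an opaque gradient into an RGBA8 buffer (4 bytes per pixel, zero-based).
-- buf can be an FFI uint8_t pointer or, for testing, a plain Lua table.
local function fillPixels(buf, w, h)
    for y = 0, h - 1 do
        for x = 0, w - 1 do
            local i = (y * w + x) * 4
            buf[i]     = x % 256 -- red
            buf[i + 1] = y % 256 -- green
            buf[i + 2] = 0       -- blue
            buf[i + 3] = 255     -- alpha
        end
    end
end

if love then -- the callbacks are only defined when running inside LÖVE
    local ffi = require("ffi")
    local imageData, image, ptr

    function love.load()
        imageData = love.image.newImageData(WIDTH, HEIGHT) -- rgba8 by default
        image = love.graphics.newImage(imageData)
        ptr = ffi.cast("uint8_t*", imageData:getFFIPointer())
    end

    function love.draw()
        fillPixels(ptr, WIDTH, HEIGHT)  -- plain memory writes, no GL calls
        image:replacePixels(imageData)  -- one texture upload per frame
        love.graphics.draw(image, 0, 0) -- one draw call for the whole frame
    end
end
```

The per-pixel work stays on the CPU, but the driver is only involved twice per frame instead of once per pixel.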
Re: speed
Posted: Sun Jun 23, 2024 6:24 pm
by UnixRoot
So you don't want to use the GPU to draw your geometry, but you want to write a software rasterizer?
Then you should use the FFI approach.
Something like this. It's just a mockup.
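Code:
    local ffi = require("ffi")

    local width, height = 640, 360

    -- You could use ByteData instead if you don't need to render it.
    local depthBuffer = {width = width, height = height}
    depthBuffer.data = love.image.newImageData(width, height, "r32f")
    depthBuffer.ptr = ffi.cast("float*", depthBuffer.data:getFFIPointer())

    -- Store one depth value; x and y are zero-based.
    local function drawDepthBuffer(x, y, depth)
        depthBuffer.ptr[y * depthBuffer.width + x] = depth
    end

    -- Reset every entry to "infinitely far away".
    local function clearDepthBuffer(depthBuffer)
        for i = 0, depthBuffer.width * depthBuffer.height - 1 do
            depthBuffer.ptr[i] = math.huge
        end
    end

    clearDepthBuffer(depthBuffer)
    local depth = 1.0 -- placeholder; a real rasterizer computes this per pixel
    for y = 0, height - 1 do
        for x = 0, width - 1 do
            drawDepthBuffer(x, y, depth)
        end
    end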
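The mockup stores every depth unconditionally. A z-buffer is normally paired with a comparison, so here is a minimal sketch of that step. It assumes that a smaller depth means nearer to the camera, and it uses a plain Lua table in place of the FFI pointer so it runs without LÖVE. `newDepthBuffer` and `depthTest` are hypothetical names.

```lua
-- Creates a buffer cleared to "infinitely far away"; entries are zero-based.
local function newDepthBuffer(width, height)
    local buf = {width = width, height = height, ptr = {}}
    for i = 0, width * height - 1 do
        buf.ptr[i] = math.huge
    end
    return buf
end

-- Records the depth and returns true only if the new fragment is nearer
-- than whatever is already stored at (x, y).
local function depthTest(buf, x, y, depth)
    local i = y * buf.width + x
    if depth < buf.ptr[i] then
        buf.ptr[i] = depth
        return true
    end
    return false
end

local buf = newDepthBuffer(640, 360)
local drewFar    = depthTest(buf, 10, 20, 5.0) -- empty pixel: accepted
local drewNear   = depthTest(buf, 10, 20, 2.0) -- nearer: accepted
local drewBehind = depthTest(buf, 10, 20, 3.0) -- farther than 2.0: rejected
```

The rasterizer would write the colour of a pixel only when `depthTest` returns true.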
Re: speed
Posted: Sun Jun 23, 2024 6:34 pm
by 1Minus2P1Stringer2
I don't care if I use the GPU or not. In fact, I probably should use the GPU.
Re: speed
Posted: Sun Jun 23, 2024 6:38 pm
by UnixRoot
You edited your post. You clearly wrote that you wanted a pixel buffer/depth buffer; that's why I posted a software rendering approach for a z-buffer. I'm out of this; you're more confusing than Lua.
Re: speed
Posted: Sun Jun 23, 2024 6:45 pm
by 1Minus2P1Stringer2
I'm sorry, I did edit my post. I want to make a "Z-buffer" for my 3D engine. I'M REALLY SORRY for confusing you. In fact, the above post is perfect.
Re: speed
Posted: Sun Jun 23, 2024 6:50 pm
by 1Minus2P1Stringer2
pgimeno wrote: ↑Sun Jun 23, 2024 6:06 pm
I've rewritten large parts of your code to make it work with FFI and use a single draw call. I've tried to respect your indentation style, even though it's customary to unindent the `end` and the `else`. Here's what I got.
NICE! How does this even work tho?
"
renderFunc = require("Render")
renderFunc(pixel)
"
Isn't renderFunc already defined as including "Render"?
Re: speed
Posted: Sun Jun 23, 2024 8:01 pm
by pgimeno
require() returns whatever the loaded file returns. renderFunc is the result returned by the included file "Render". If you go to Render.lua, you'll see that the chunk has a return statement at the end that returns a function.
That way, it's loaded only once and then reused, rather than being loaded again every frame. Loading a resource every frame is a bad idea in general, whether it's a Lua file, an image, a font, a sound, or anything else.
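A small sketch of that behaviour, under the assumption that Render.lua ends by returning a function. `package.preload` stands in for the real file so the snippet runs on its own, and `loadCount` is a hypothetical counter added only to show how often the chunk body executes.

```lua
local loadCount = 0

-- Stand-in for Render.lua: require() will call this loader instead of
-- searching for a file on disk.
package.preload["Render"] = function()
    loadCount = loadCount + 1 -- counts how often the chunk body runs
    return function(pixel)    -- the chunk's return value
        return "rendered " .. tostring(pixel)
    end
end

local renderFunc = require("Render") -- runs the chunk once, returns its function
local again = require("Render")      -- served from package.loaded, chunk not re-run

local output = renderFunc(3)
```

Both `require` calls hand back the same function value, which is why calling `require` once at the top and reusing `renderFunc` every frame is enough.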
I just simplified your scheme. Also, rather than checking for errors everywhere, you can let it error out. It's next to impossible for a Lua file included in the distribution to fail to load, so it's OK to let Löve throw an error and stop the program if it does. If that's not acceptable for errors in general, you can always write your own error handler. Once debugged, the called function should never throw an error either, so I didn't use pcall() to call it.
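For comparison, a minimal sketch of the two call styles in plain Lua. `brokenRender` and `workingRender` are hypothetical stand-ins, not functions from the original project.

```lua
local function brokenRender(pixel)
    error("bad pixel: " .. tostring(pixel))
end

local function workingRender(pixel)
    return pixel * 2
end

-- Defensive style: pcall traps the error and returns false plus the message
-- instead of stopping the program.
local ok, err = pcall(brokenRender, 7)

-- Direct style: no wrapper. If this call failed, the error would propagate
-- up to whatever error handler is installed.
local result = workingRender(7)
```

In LÖVE 11, the `love.errorhandler` callback is the usual place for a custom handler if the default error screen is not wanted.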
Re: speed
Posted: Sun Jun 23, 2024 8:23 pm
by 1Minus2P1Stringer2
Ok thanks! This topic can now be closed!