speed
Posted: Sat Jun 08, 2024 10:59 pm
by 1Minus2P1Stringer2
I'm making a Z buffer, so I'm testing speed. Why does this run so slowly?
Re: speed
Posted: Mon Jun 10, 2024 1:11 am
by RNavega
You're doing it in a brute force way with lots of redundant work. "love.draw()" is supposed to only have very fast code, as it executes at real-time speeds like 60FPS or more.
But in your case, love.draw() ends up calling love.filesystem.load() to read a file from disk, interpreting it as Lua, and executing draw commands (also in an unoptimized way), all on every single frame.
This is why throughout the wiki you have these warnings on the pages of API functions that load stuff from the hard disk:
[Attachment: temp.png]
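A minimal sketch of the fix, assuming the drawing script lives in a file called scene.lua (a hypothetical name): read and compile each file once, keep the resulting function, and only call that function from love.draw(). makeCachedLoader is a helper made up for this example, not part of LÖVE.

```lua
-- Wraps a "path -> function" loader (such as love.filesystem.load) so that
-- each file is read from disk and compiled only the first time it is requested.
local function makeCachedLoader(loadChunk)
    local cache = {}
    return function(path)
        local chunk = cache[path]
        if chunk == nil then
            -- loadChunk returns nil plus an error message on failure.
            chunk = assert(loadChunk(path))
            cache[path] = chunk
        end
        return chunk
    end
end

-- Hook-up inside LÖVE. The guard only exists so this file also loads in plain Lua.
if love then
    local getChunk = makeCachedLoader(love.filesystem.load)
    local drawScene

    function love.load()
        drawScene = getChunk("scene.lua") -- hypothetical file name
    end

    function love.draw()
        drawScene() -- no disk access here, just a function call
    end
end
```

With this arrangement the disk is touched once at startup, and each frame only pays for a regular Lua function call.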
For software rasterization you can make it much faster by plotting pixels directly into the byte data of an image, then drawing the image to screen.
I'm using this code to plot a grid image, but you can repurpose it for whatever you want:
Code:
local ffi = require('ffi')

local UINT8_PTR_TYPEOF = ffi.typeof('uint8_t*')
local FLOAT_PTR_TYPEOF = ffi.typeof('float*')
local SIZEOF_FLOAT = ffi.sizeof('float')

-- Returns a LÖVE ByteData object, as well as its uint8_t FFI pointer.
-- The pointer is for modifying the contents.
-- Use makeFloatData() when it's for a GLSL uniform.
local function makeByteData(totalBytes)
    local data = love.data.newByteData(totalBytes)
    return data, UINT8_PTR_TYPEOF(data:getFFIPointer())
end

-- For use with GLSL uniforms.
local function makeFloatData(totalFloats)
    local data = love.data.newByteData(totalFloats * SIZEOF_FLOAT)
    return data, FLOAT_PTR_TYPEOF(data:getFFIPointer())
end

local function plotImage(width, height)
    local BYTES_PER_PIXEL = 1
    local data, ptr = makeByteData(width * height * BYTES_PER_PIXEL)
    for y = 0, height - 1 do
        local rowIndex = y * width
        for x = 0, width - 1 do
            local alpha = ... -- Somehow calculate a floating point in the range [0, 1].
            ptr[x + rowIndex] = math.floor(alpha * 255.0)
        end
    end
    local imageData = love.image.newImageData(width, height, 'r8', data)
    local image = love.graphics.newImage(imageData)
    image:setWrap('clamp', 'clamp')
    return image
end
You can also use the hardware-accelerated Z buffer, as that would be even faster. Read more here:
https://love2d.org/wiki/love.graphics.setDepthMode
Edit: that plotImage() function creates a new image each time it's called, because that was my use case (create an image once so I can draw it over and over). If you want to render different things onto it on each frame, then you need to modify that function to also return the 'imageData' object, so you can keep modifying it and re-uploading it with image:replacePixels(imageData). That way, the only thing that changes each frame is the contents of the imageData object, pointed to by imageData:getFFIPointer().
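As a rough sketch of that per-frame variant: the ImageData and Image are created once, and each frame only rewrites the bytes and re-uploads them. This assumes LÖVE 11.3 or newer (for getFFIPointer()); fillGradient, WIDTH and HEIGHT are names made up for this example, and the scrolling gradient is just a stand-in for whatever you actually want to plot.

```lua
-- Writes one byte per pixel into any 0-based buffer (an FFI pointer in LÖVE).
-- 'shift' is an integer that scrolls the gradient sideways.
local function fillGradient(ptr, width, height, shift)
    for y = 0, height - 1 do
        local rowIndex = y * width
        for x = 0, width - 1 do
            ptr[rowIndex + x] = (x + shift) % 256
        end
    end
end

if love then -- guard so the helper above can also be tried in plain Lua
    local ffi = require('ffi')
    local WIDTH, HEIGHT = 256, 256
    local imageData, image, ptr

    function love.load()
        -- Created once; only the bytes change afterwards.
        imageData = love.image.newImageData(WIDTH, HEIGHT, 'r8')
        ptr = ffi.cast('uint8_t*', imageData:getFFIPointer())
        image = love.graphics.newImage(imageData)
    end

    function love.update(dt)
        fillGradient(ptr, WIDTH, HEIGHT, math.floor(love.timer.getTime() * 60))
        image:replacePixels(imageData) -- upload the modified bytes to the GPU
    end

    function love.draw()
        love.graphics.draw(image)
    end
end
```

Since the format is 'r8' (one byte per pixel), the image shows up in shades of red unless a shader remaps it.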
Re: speed
Posted: Mon Jun 10, 2024 9:41 pm
by 1Minus2P1Stringer2
"You're doing it in a brute force way with lots of redundant work. "love.draw()" is supposed to only have very fast code, as it executes at real-time speeds like 60FPS or more.
But in your case your love.draw() will end up calling love.filesystem.load() to read a hard disk file, interpreting it as Lua, and executing draw commands (also in an unoptimized way)"
Oh, I see. So are you saying that since love.draw is a real-time updater, based on frames rather than runtime, it can only do so much that fast, and that instead I should load the image using a faster method and then draw it? Also, when you say "(also in an unoptimized way)", do you mean the code? Would there be a way to write it so it executes faster? I split it up into multiple files to load functions and operations in a cleaner way. In my case I'm using a Z buffer to make a 3D engine. So, depending on what needs to be loaded, I need to load that image and then draw it, but if I can't use love.draw, how will I update it? I fear that love.update will act the same way.
Re: speed
Posted: Mon Jun 10, 2024 9:45 pm
by 1Minus2P1Stringer2
Also, thanks very much for the code. I don't know anything about how byte data works, because I haven't looked into the format of images yet. I should probably learn that.
Re: speed
Posted: Mon Jun 10, 2024 9:52 pm
by dusoft
1Minus2P1Stringer2 wrote: ↑Mon Jun 10, 2024 9:41 pm
Oh, I see. So are you saying that since love.draw is a real-time updater, based on frames rather than runtime, it can only do so much that fast, and that instead I should load the image using a faster method and then draw it? Also, when you say "(also in an unoptimized way)", do you mean the code? Would there be a way to write it so it executes faster? I split it up into multiple files to load functions and operations in a cleaner way. In my case I'm using a Z buffer to make a 3D engine. So, depending on what needs to be loaded, I need to load that image and then draw it, but if I can't use love.draw, how will I update it? I fear that love.update will act the same way.
You should move your asset loading to love.load or to scene/state load functions. love.update is meant just for updates (math, recalculations, computing, speed regulation via dt, and partially input handling), and love.draw is for drawing on screen.
Re: speed
Posted: Mon Jun 10, 2024 10:35 pm
by 1Minus2P1Stringer2
Not for assets; I mean other stuff like pixel color, position, things like that. I see. So love.update DOESN'T go by frames, it goes by runtime, right?
Re: speed
Posted: Tue Jun 11, 2024 12:41 am
by RNavega
1Minus2P1Stringer2 wrote: ↑Mon Jun 10, 2024 9:41 pm
Oh, I see. So are you saying that since love.draw is a real-time updater, based on frames rather than runtime, it can only do so much that fast, and that instead I should load the image using a faster method and then draw it?
The difference between (or rather, the meanings of) love.update(dt) and love.draw() is in how they're called by LÖVE, more specifically inside the love.run() function. By default it contains the main loop of your LÖVE program. You can (expertly and carefully, of course) override it to customize how your program runs, but usually this isn't needed; it's just something useful to know about.
Anyway, the default code to love.run() can be found in here:
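This is not the actual love.run() source, just a simplified model of what it does on every frame: take the time elapsed since the previous frame (dt), call love.update(dt) once, then call love.draw() once. runFrames and newGame below are hypothetical stand-ins made up for this example, and the frame durations are hard-coded instead of measured.

```lua
-- Simplified model of the main loop: one update(dt) and one draw() per frame.
local function runFrames(callbacks, frameDurations)
    for _, dt in ipairs(frameDurations) do
        if callbacks.update then callbacks.update(dt) end
        if callbacks.draw then callbacks.draw() end
    end
end

local SPEED = 100 -- pixels per second

local function newGame()
    local game = { position = 0, drawCalls = 0 }
    -- Movement is scaled by dt, so it depends on elapsed time, not frame count.
    function game.update(dt) game.position = game.position + SPEED * dt end
    function game.draw() game.drawCalls = game.drawCalls + 1 end
    return game
end

local fast = newGame()
runFrames(fast, {0.25, 0.25, 0.25, 0.25}) -- four quick frames, 1 second total

local slow = newGame()
runFrames(slow, {0.5, 0.5}) -- two slow frames, also 1 second total

print(fast.position, fast.drawCalls)
print(slow.position, slow.drawCalls)
```

Both runs end at the same position even though one drew twice as many frames. So love.update and love.draw are both called once per frame; what makes update "go by time" is only the dt value it receives.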
Also, when you say "(also in an unoptimized way)", do you mean the code? Would there be a way to write it so it executes faster? I split it up into multiple files to load functions and operations in a cleaner way. In my case I'm using a Z buffer to make a 3D engine. So, depending on what needs to be loaded, I need to load that image and then draw it, but if I can't use love.draw, how will I update it? I fear that love.update will act the same way.
Sorry, I should've been clearer.
For real-time, or even near-real-time, it's not efficient to have a Lua function call for each pixel on screen, plus one or more love.graphics.* calls for each pixel. It's a lot of overhead.
Using a couple of nested X and Y FOR loops to iterate over each pixel and setting its R, G, B, A etc. bytes in an FFI buffer, like shown, is more efficient, especially after LuaJIT compiles this frequently executed Lua code into fast machine code (still not as fast as using the Z buffer from the GPU, but at least faster than what's giving you the slow result right now).
So to reiterate: create an ImageData object and an Image object to plug it into. Then, on each frame, update the byte data pointed to by the pointer of the ImageData, and call :replacePixels() on that Image object so it's refreshed. Together with moving the love.filesystem.load() call out of frequently called functions, as @dusoft mentioned, this should all be very fast.
Re: speed
Posted: Tue Jun 11, 2024 2:13 am
by RNavega
...also, you only need to bother with ImageData + Image objects if you actually need to visualize / debug the depth buffer at all by drawing it on screen.
Usually it's an internal buffer used to keep track of the depth of pixels of triangles that were already rendered on screen, so you technically don't have to bother visualizing it unless it's for some debugging purpose.
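To make the idea concrete, here is a minimal sketch of a depth buffer using plain Lua tables. It assumes that a smaller z means "closer to the camera"; newDepthBuffer and plot are names made up for this example, and a real renderer would use FFI arrays and rasterize whole triangles rather than single pixels.

```lua
-- One depth value per pixel, all starting out "infinitely far away".
local function newDepthBuffer(width, height)
    local depth = {}
    for i = 0, width * height - 1 do
        depth[i] = math.huge
    end
    return depth
end

-- Writes the pixel only if it is closer than what is already stored there.
-- Returns true when the pixel was written, false when it was hidden.
local function plot(depth, color, width, x, y, z, value)
    local i = y * width + x
    if z < depth[i] then
        depth[i] = z
        color[i] = value
        return true
    end
    return false
end

local WIDTH, HEIGHT = 4, 4
local depth = newDepthBuffer(WIDTH, HEIGHT)
local color = {}

plot(depth, color, WIDTH, 1, 2, 10.0, "far red")
plot(depth, color, WIDTH, 1, 2, 3.0, "near blue")
plot(depth, color, WIDTH, 1, 2, 7.0, "middle green") -- rejected: 7 is behind 3
```

After these three calls the pixel at (1, 2) holds "near blue", regardless of the order in which the surfaces were drawn. The depth table itself is never shown on screen.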
Re: speed
Posted: Sat Jun 15, 2024 8:52 pm
by 1Minus2P1Stringer2
AHHHHHHHHHHHHHHHHHH Lua is too confusing. Yes, it's not efficient to render all pixels raw because of the real time . So we store byte data instead and then render it all at once. But bytecode and floats and words like buffer or FFI, I have no idea what those are. Well, I know what bytecode is, and I know what floats are, but I typically never use them. But buffers and FFI are completely unknown to me, so the only thing I get is the fact that I think you want me to . But don't think I'm bad at coding, because I made a full 3D renderer in Scratch with ease. Since I'm so used to Scratch, I have zero clue what most other things are. I mainly learned Lua because it was JIT, and somewhat similar to Scratch, but it's ending up being very confusing. I would use some sort of assembly, but they are ALL different, so a JIT it is, and Lua is the fastest I've seen. So the main thing I want to ask is this:
One: if real time is the fastest type of rendering, why would it be bad to load each pixel bit by bit? Isn't that what real time is for? Don't computers do that in machine code?
Two: as far as I'm aware, every computer reads machine code differently, and the only thing I currently know about machine code is pixels and their 255 amount, so how does the byte function work?
Also, considering how different TurboWarp/Scratch is from other coding languages, by A LOT, should I go back to it? TurboWarp is pretty fast, but idk.
Re: speed
Posted: Sun Jun 16, 2024 12:28 am
by RNavega
I think you're confusing bytecode (a compact format for program instructions, executed by an interpreter or virtual machine such as Lua's) with bytes (a unit of data, 8 bits in size on practically all modern hardware, used to represent numbers).
So a buffer is made of bytes, and each byte can store a value in the range [0, 255]. The program that creates those buffers and other things is what's described as instructions: bytecode in the case of Lua, machine code in the case of the .EXE file.
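A small sketch of how that [0, 255] range shows up in practice, assuming color channels are given as floats in [0, 1] and pixels are stored as 4 bytes each (R, G, B, A). toByte and pixelOffset are hypothetical helpers made up for this example. Note that toByte rounds to the nearest value, whereas the plotImage() code posted earlier in the thread truncates with math.floor(alpha * 255.0); either choice works.

```lua
-- Converts a float in [0, 1] to a whole number in [0, 255],
-- clamping out-of-range input and rounding to the nearest value.
local function toByte(value)
    if value < 0 then value = 0 end
    if value > 1 then value = 1 end
    return math.floor(value * 255 + 0.5)
end

-- Index of the first byte (the R channel) of pixel (x, y) in a 0-based
-- buffer that stores 4 bytes per pixel, one row after another.
local function pixelOffset(x, y, width)
    return (y * width + x) * 4
end

-- Example: write an opaque orange pixel at (2, 1) into a 10-pixel-wide buffer.
local buffer = {}
local offset = pixelOffset(2, 1, 10)
buffer[offset + 0] = toByte(1.0) -- red
buffer[offset + 1] = toByte(0.5) -- green
buffer[offset + 2] = toByte(0.0) -- blue
buffer[offset + 3] = toByte(1.0) -- alpha
```

Here a plain table stands in for the buffer; with an FFI uint8_t pointer the indexing is the same.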
But the important part is this:
1Minus2P1Stringer2 wrote: ↑Sat Jun 15, 2024 8:52 pm
Also, considering how different TurboWarp/Scratch is from other coding languages, by A LOT, should I go back to it? TurboWarp is pretty fast, but idk.
If you just want to make a 3D game and not have to worry about these low-level technical details, then I have great news for you. For decades there have been "graphics APIs" that handle all of this for you like OpenGL, Direct3D etc, and the programmer is required only to: A) send data to the GPU (the graphics card) using the API functions, and B) issue commands to control when and how the drawing is done by the GPU, including commands that set up its internal Z buffer.
You don't have to worry about implementing a Z buffer yourself.
If you want to make 3D games then just grab a 3D engine and have at it. You'll only deal with the top-level juicy parts of making a game like gameplay, characters, menus and such, and you'll be well served with a high-level 3D engine like:
- Raylib (I linked to the LuaJIT binding of Raylib, but there are bindings for many other languages)
Additionally, when you need encyclopedic knowledge like "what is bytecode?" or "what are bytes?", you should really use ChatGPT or Gemini. When used in this way they have little chance of producing errors, and they save a lot of your time since you can ask follow-up questions about a particular thing that they said. I've learned a lot with them.