@ghost ghost commented Jul 19, 2023

A significant performance increase can be had in ImageDrawRectangleRec by filling only the first row of the rectangle and then copying that row to all other rows.
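
To illustrate the idea, here is a minimal sketch of a row-copy rectangle fill on a raw pixel buffer. It is not the code from this pull request or raylib's API: `FillRectByRowCopy` and its parameters are hypothetical, and it assumes a tightly packed RGBA8 image and a rectangle that lies fully inside it.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not part of raylib): fills a w x h rectangle at (x, y)
// in a tightly packed RGBA8 buffer that is imgWidth pixels wide.
// Assumption: the caller guarantees the rectangle lies fully inside the image.
static void FillRectByRowCopy(uint8_t *pixels, int imgWidth,
                              int x, int y, int w, int h,
                              const uint8_t rgba[4])
{
    if ((pixels == NULL) || (w <= 0) || (h <= 0)) return;

    const size_t bpp = 4;                          // Bytes per RGBA8 pixel
    const size_t stride = (size_t)imgWidth*bpp;    // Bytes per image row
    uint8_t *firstRow = pixels + (size_t)y*stride + (size_t)x*bpp;

    // Step 1: write the first row of the rectangle pixel by pixel
    for (int i = 0; i < w; i++) memcpy(firstRow + (size_t)i*bpp, rgba, bpp);

    // Step 2: copy that finished row into every remaining row
    const size_t rowBytes = (size_t)w*bpp;
    for (int j = 1; j < h; j++) memcpy(firstRow + (size_t)j*stride, firstRow, rowBytes);
}
```

The per-pixel work happens once per rectangle width rather than once per pixel of the whole area; the remaining rows are each filled by a single `memcpy`, which is likely where most of the gain comes from.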
@raysan5
Owner

raysan5 commented Jul 19, 2023

@smalltimewizard did you measure the performance gain? Could you please provide more info, such as an example or some metrics?

@ghost
Author

ghost commented Jul 20, 2023

The following code performs a barebones test.

// gcc main.c -o bench.exe -I "./include" -L "./lib" -lraylib -lgdi32 -lwinmm

#include <stdio.h>
#include <time.h>

#include "raylib.h"

#define GRIDW 64
#define GRIDH 64
#define CELLW 24
#define CELLH 24
#define RUNS 10000

int main(void) {
	SetTraceLogLevel(LOG_WARNING);
	
	// Transparent RGBA image large enough to hold the whole grid
	Image benchimage = GenImageColor(GRIDW * CELLW, GRIDH * CELLH, (Color){0,0,0,0});
	ImageFormat(&benchimage, PIXELFORMAT_UNCOMPRESSED_R8G8B8A8);
	
	clock_t timerstart = clock();
	
	// Each run redraws every cell of the grid with a random opaque color
	for (int runs = 0; runs < RUNS; runs++) {
		for (int i = 0; i < GRIDW * GRIDH; i++) {
			Rectangle cell = {(i % GRIDW) * CELLW, (i / GRIDW) * CELLH, CELLW, CELLH};
			Color color = {GetRandomValue(0,255), GetRandomValue(0,255), GetRandomValue(0,255), 255};
			ImageDrawRectangleRec(&benchimage, cell, color);
		}
	}
	
	double elapsed = (double) (clock() - timerstart) / CLOCKS_PER_SEC;
	printf("%d calls took %f seconds to run.\n", GRIDW * GRIDH * RUNS, elapsed);
	
	UnloadImage(benchimage); // Free the image data before exiting
	return 0;
}

This is 40,960,000 calls, fully replacing a 64x64 grid of 24x24 randomly-colored squares 10,000 times.
Without the change (libraylib.a from raylib-4.5.0_win64_mingw-w64), it takes ~105 seconds to run on my machine.
With the change (libraylib.a compiled with mingw32-make), it takes ~19 seconds.
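
To put the two timings above on a common scale, the arithmetic can be sketched as follows. The helper names `Speedup` and `MicrosPerCall` are hypothetical and only illustrate the calculation; the inputs are the approximate figures quoted above.

```c
// Hypothetical helpers for summarizing a before/after timing pair

// Ratio of the old run time to the new run time
static double Speedup(double beforeSec, double afterSec)
{
    return beforeSec/afterSec;
}

// Average cost of one call, in microseconds
static double MicrosPerCall(double totalSec, long calls)
{
    return totalSec*1e6/(double)calls;
}
```

With these numbers, `Speedup(105.0, 19.0)` is roughly 5.5, and the average cost per call drops from about 2.56 microseconds (`MicrosPerCall(105.0, 40960000L)`) to about 0.46 microseconds.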

@ghost
Author

ghost commented Jul 20, 2023

The program where I was trying to use this, before and after the change (55 FPS vs 128 FPS):
[Two screenshots: before and after]
It is an 85x64 grid refreshed each frame with randomly colored 24x24 squares, plus a bit of other stuff that shouldn't affect the overhead much.

@raysan5 raysan5 merged commit 1310617 into raysan5:master Jul 20, 2023
@raysan5
Owner

raysan5 commented Jul 20, 2023

@smalltimewizard thank you very much for the benchmark. I reviewed it and verified that the improvement is considerable: it went from 160 seconds to 14 seconds on my system (11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz, Windows 10 Pro 64bit, compiled with GCC 12.2.0).
