Posts tagged ‘CPU’

Benchmarks: my second thermal erosion

Introduction

This benchmark session has confirmed what I introduced in the previous post: the new erosion algorithm is really fast!

As usual, all the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating point textures.

GPU version

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each group of iterations you can see the fastest (min), the average (avg) and the slowest (max) time over 10 tests.

iterations = 10 -> min = 288 ms. - avg = 292 ms. - max = 295 ms.

iterations = 30 -> min = 413 ms. - avg = 416 ms. - max = 420 ms.

iterations = 50 -> min = 540 ms. - avg = 541 ms. - max = 544 ms.

iterations = 70 -> min = 665 ms. - avg = 666 ms. - max = 668 ms.

iterations = 100 -> min = 851 ms. - avg = 853 ms. - max = 856 ms.

As you can see, the times are not proportional to the number of iterations: they grow slowly, by roughly 6 ms per added iteration on top of a fixed cost of about 230 ms.

The new algorithm is about 55 to 65% faster than the previous one: a huge improvement!

CPU version

The program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 226 ms.

iterations = 10 -> 2252 ms.

iterations = 50 -> 11182 ms.

The CPU version is slightly faster (about 2%) than the CPU version of the previous algorithm.

Benchmarks: my thermal erosion

Introduction

I’ve run some tests on my erosion algorithm, and the results are really encouraging.

All the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating point textures.

GPU version

As I explained in my previous post, the GPU version of the thermal erosion is a reversed version of the original algorithm.

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each group of iterations you can see the fastest (min), the average (avg) and the slowest (max) time over 10 tests.

iterations = 10 -> min = 632 ms. - avg = 640 ms. - max = 651 ms.

iterations = 30 -> min = 1013 ms. - avg = 1016 ms. - max = 1025 ms.

iterations = 50 -> min = 1393 ms. - avg = 1398 ms. - max = 1422 ms.

iterations = 70 -> min = 1769 ms. - avg = 1766 ms. - max = 1782 ms.

iterations = 100 -> min = 2332 ms. - avg = 2347 ms. - max = 2372 ms.

As you can see, the times are not proportional to the number of iterations: they grow slowly, by roughly 19 ms per added iteration.

This algorithm is about 9% faster than the first one and about 2% faster than the second one.

CPU version

The program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 230 ms.

iterations = 10 -> 2280 ms.

iterations = 50 -> 11350 ms.

The CPU version is much faster than the CPU version of the first algorithm (more than 90%), even if it’s a bit slower (about 10%) than the second algorithm.

Benchmarks: thermal erosion 2

Introduction

I’ve run some tests of the second erosion algorithm I’ve developed, and the results are really good.

All the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating point textures.

GPU version

As I explained in my previous post, the GPU version of the thermal erosion is a reversed version of the original algorithm.

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each group of iterations you can see the fastest (min), the average (avg) and the slowest (max) time over 10 tests.

iterations = 10 -> min = 643 ms. - avg = 651 ms. - max = 659 ms.

iterations = 30 -> min = 1030 ms. - avg = 1039 ms. - max = 1068 ms.

iterations = 50 -> min = 1418 ms. - avg = 1424 ms. - max = 1432 ms.

iterations = 70 -> min = 1804 ms. - avg = 1819 ms. - max = 1846 ms.

iterations = 100 -> min = 2374 ms. - avg = 2397 ms. - max = 2434 ms.

As you can see, the times are not proportional to the number of iterations: they grow slowly, by roughly 19 ms per added iteration.

This algorithm is about 7% faster than the first one, but it needs more iterations to produce visible results, so the comparison is not entirely fair.

CPU version

The CPU version is based on the original version of the algorithm; the program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 208 ms.

iterations = 10 -> 2051 ms.

iterations = 50 -> 10257 ms.

The CPU version is much faster than the CPU version of the first algorithm (more than 90%).

Benchmarks: thermal erosion

Introduction

I’ve run some tests of the latest erosion algorithm I’ve developed, and the results are really good.

All the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating point textures.

GPU version

As I explained in my previous post, the GPU version of the thermal erosion is a reversed version of the original algorithm.

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each group of iterations you can see the fastest (min), the average (avg) and the slowest (max) time over 10 tests.

iterations = 10 -> min = 673 ms. - avg = 682 ms. - max = 692 ms.

iterations = 30 -> min = 1089 ms. - avg = 1101 ms. - max = 1133 ms.

iterations = 50 -> min = 1496 ms. - avg = 1524 ms. - max = 1548 ms.

iterations = 70 -> min = 1881 ms. - avg = 1911 ms. - max = 1935 ms.

iterations = 100 -> min = 2514 ms. - avg = 2552 ms. - max = 2602 ms.

As you can see, the times are not proportional to the number of iterations: they grow slowly, by roughly 21 ms per added iteration.

CPU version

The CPU version is based on the original version of the algorithm; the program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 2748 ms.

iterations = 5 -> 20483 ms.

iterations = 10 -> 42540 ms.

Benchmarks: Perlin Noise

Introduction

After a bit of polishing I’ve run some tests using an 800×800 16-bit floating point texture (required by the algorithm because of the negative values).

I’ve also implemented a CPU version of the original pseudo-code; this time too, the shader program is much faster than the “classic” one running on the CPU.

Perlin noise benchmarks

The following data are the execution times needed to complete different numbers of octaves of the generation phase. For each group of octaves you can see the fastest (min), the average (avg) and the slowest (max) time over 10 tests.

octaves = 2 -> min = 236 ms. - avg = 240 ms. - max = 247 ms.

octaves = 4 -> min = 254 ms. - avg = 256 ms. - max = 258 ms.

octaves = 6 -> min = 270 ms. - avg = 273 ms. - max = 277 ms.

octaves = 8 -> min = 289 ms. - avg = 291 ms. - max = 293 ms.

As you can see, the average times barely change going from 2 to 8 octaves (about 8.5 ms per additional octave), and this is really good!

The situation is quite different for the C program run on the CPU:

octaves = 2 -> 1544 ms.

octaves = 4 -> 3090 ms.

octaves = 6 -> 4614 ms.

octaves = 8 -> 6136 ms.

Here the times grow linearly with the number of octaves, at roughly 770 ms per octave, so doubling the octaves doubles the time.

To all these times you have to add the average time required for the data normalization (37 ms with the shader version).
