Posts tagged ‘GPU’

Benchmarks: Thermal Erosion algorithms

Introduction

I’ve run new benchmarks to measure how much the latest optimizations have improved the performance of the thermal erosion shaders.

All the benchmarks use a 1024×1024 16-bit floating-point texture.

I ran the benchmarks on my graphics card, a GeForce 7600 GT, and on my friend Encelo’s graphics card, a GeForce 8600 GT.

The following data are the execution times needed to complete different numbers of iterations of the erosion. For each iteration count you can see the fastest (min), the average (avg) and the slowest (max) time over 10 runs.
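The measurement procedure described above can be sketched in C as follows. This is only a minimal sketch, not the harness actually used for these posts: the names `timing_stats`, `summarize` and `benchmark` are made up for illustration, and it assumes a C11 compiler (for `timespec_get`) and a callback that blocks until the GPU has really finished (for example by calling `glFinish`, if the shaders run under OpenGL).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Summary of repeated timings, in milliseconds. */
typedef struct {
    double min_ms;
    double avg_ms;
    double max_ms;
} timing_stats;

/* Reduce n individual run times to fastest / average / slowest. */
timing_stats summarize(const double *samples_ms, size_t n)
{
    assert(samples_ms != NULL && n > 0);
    timing_stats s = { samples_ms[0], 0.0, samples_ms[0] };
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (samples_ms[i] < s.min_ms) s.min_ms = samples_ms[i];
        if (samples_ms[i] > s.max_ms) s.max_ms = samples_ms[i];
        sum += samples_ms[i];
    }
    s.avg_ms = sum / (double)n;
    return s;
}

/* Time `runs` calls of work(iterations) with the wall clock and
   summarize them. The callback is assumed to return only when the
   work is complete; otherwise only the submission would be timed. */
timing_stats benchmark(void (*work)(int), int iterations, int runs)
{
    enum { MAX_RUNS = 64 };
    double samples[MAX_RUNS];
    assert(work != NULL && runs > 0 && runs <= MAX_RUNS);
    for (int r = 0; r < runs; ++r) {
        struct timespec t0, t1;
        timespec_get(&t0, TIME_UTC);
        work(iterations);
        timespec_get(&t1, TIME_UTC);
        samples[r] = (double)(t1.tv_sec - t0.tv_sec) * 1e3
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
    }
    return summarize(samples, (size_t)runs);
}
```

With a hypothetical callback `run_erosion` that executes the shader for the given number of iterations, `benchmark(run_erosion, 100, 10)` would produce one row of the lists below.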

First thermal erosion

These are the results for the 7600 GT:

iterations = 10 -> min = 135 ms. - avg = 136 ms. - max = 137 ms.

iterations = 30 -> min = 406 ms. - avg = 407 ms. - max = 408 ms.

iterations = 50 -> min = 676 ms. - avg = 678 ms. - max = 679 ms.

iterations = 70 -> min = 948 ms. - avg = 949 ms. - max = 950 ms.

iterations = 100 -> min = 1354 ms. - avg = 1356 ms. - max = 1357 ms.

The optimized shader is about 47% faster than the previous version.

These are the results for the 8600 GT:

iterations = 10 -> min = 49 ms. - avg = 50 ms. - max = 51 ms.

iterations = 30 -> min = 148 ms. - avg = 149 ms. - max = 150 ms.

iterations = 50 -> min = 246 ms. - avg = 247 ms. - max = 249 ms.

iterations = 70 -> min = 346 ms. - avg = 347 ms. - max = 348 ms.

iterations = 100 -> min = 492 ms. - avg = 493 ms. - max = 495 ms.

The shader is about 64% faster on this graphics card than on the 7600 GT.
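The percentage appears to be the relative reduction of the average time between the two cards: at 100 iterations the average drops from 1356 ms to 493 ms. The small C sketch below shows that arithmetic; the function names are made up for illustration, and reading "faster" as "time saved" is an assumption based on the figures above.

```c
#include <assert.h>

/* Percentage by which the execution time drops when going from
   baseline_ms to improved_ms: 100 * (baseline - improved) / baseline. */
double percent_time_saved(double baseline_ms, double improved_ms)
{
    assert(baseline_ms > 0.0);
    return 100.0 * (baseline_ms - improved_ms) / baseline_ms;
}

/* How many times faster the improved run is: baseline / improved. */
double speedup_factor(double baseline_ms, double improved_ms)
{
    assert(improved_ms > 0.0);
    return baseline_ms / improved_ms;
}
```

`percent_time_saved(1356.0, 493.0)` gives about 63.6, which rounds to the 64% quoted above; the same pair expressed as a ratio, `speedup_factor(1356.0, 493.0)`, is about 2.75.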

Second thermal erosion

These are the results for the 7600 GT:

iterations = 10 -> min = 124 ms. - avg = 126 ms. - max = 127 ms.

iterations = 30 -> min = 376 ms. - avg = 377 ms. - max = 378 ms.

iterations = 50 -> min = 627 ms. - avg = 628 ms. - max = 630 ms.

iterations = 70 -> min = 879 ms. - avg = 880 ms. - max = 881 ms.

iterations = 100 -> min = 1255 ms. - avg = 1256 ms. - max = 1258 ms.

The optimized shader is about 48% faster than the previous version.

These are the results for the 8600 GT:

iterations = 10 -> min = 49 ms. - avg = 51 ms. - max = 52 ms.

iterations = 30 -> min = 150 ms. - avg = 151 ms. - max = 151 ms.

iterations = 50 -> min = 248 ms. - avg = 250 ms. - max = 252 ms.

iterations = 70 -> min = 347 ms. - avg = 349 ms. - max = 351 ms.

iterations = 100 -> min = 495 ms. - avg = 496 ms. - max = 498 ms.

The shader is about 60% faster on this graphics card than on the 7600 GT.

My first thermal erosion

These are the results for the 7600 GT:

iterations = 10 -> min = 120 ms. - avg = 121 ms. - max = 121 ms.

iterations = 30 -> min = 361 ms. - avg = 363 ms. - max = 364 ms.

iterations = 50 -> min = 603 ms. - avg = 604 ms. - max = 605 ms.

iterations = 70 -> min = 845 ms. - avg = 846 ms. - max = 847 ms.

iterations = 100 -> min = 1207 ms. - avg = 1208 ms. - max = 1209 ms.

The optimized shader is about 49% faster than the previous version.

These are the results for the 8600 GT:

iterations = 10 -> min = 50 ms. - avg = 51 ms. - max = 52 ms.

iterations = 30 -> min = 151 ms. - avg = 153 ms. - max = 153 ms.

iterations = 50 -> min = 252 ms. - avg = 254 ms. - max = 255 ms.

iterations = 70 -> min = 353 ms. - avg = 356 ms. - max = 358 ms.

iterations = 100 -> min = 508 ms. - avg = 509 ms. - max = 511 ms.

The shader is about 58% faster on this graphics card than on the 7600 GT.

My second thermal erosion

These are the results for the 7600 GT:

iterations = 10 -> min = 55 ms. - avg = 56 ms. - max = 57 ms.

iterations = 30 -> min = 165 ms. - avg = 166 ms. - max = 167 ms.

iterations = 50 -> min = 275 ms. - avg = 277 ms. - max = 277 ms.

iterations = 70 -> min = 386 ms. - avg = 387 ms. - max = 388 ms.

iterations = 100 -> min = 552 ms. - avg = 553 ms. - max = 554 ms.

The optimized shader is about 36% faster than the previous version.

These are the results for the 8600 GT:

iterations = 10 -> min = 24 ms. - avg = 24 ms. - max = 25 ms.

iterations = 30 -> min = 72 ms. - avg = 73 ms. - max = 73 ms.

iterations = 50 -> min = 120 ms. - avg = 122 ms. - max = 122 ms.

iterations = 70 -> min = 169 ms. - avg = 170 ms. - max = 172 ms.

iterations = 100 -> min = 241 ms. - avg = 242 ms. - max = 243 ms.

The shader is about 57% faster on this graphics card than on the 7600 GT.


Benchmarks: Generation algorithms

Introduction

I’ve run new benchmarks to measure how much the latest optimizations have improved the performance of the generation shaders.

All the benchmarks use a 1024×1024 16-bit floating-point texture.

I ran the benchmarks on my graphics card, a GeForce 7600 GT, and on my friend Encelo’s graphics card, a GeForce 8600 GT.

The following data are the execution times needed to complete different numbers of iterations of the generation phase. For each iteration count you can see the fastest (min), the average (avg) and the slowest (max) time over 10 runs.

Fault formation

These are the results for the 7600 GT:

iterations = 250 -> min = 271 ms. - avg = 273 ms. - max = 276 ms.

iterations = 500 -> min = 543 ms. - avg = 545 ms. - max = 548 ms.

iterations = 1000 -> min = 1089 ms. - avg = 1090 ms. - max = 1093 ms.

iterations = 2000 -> min = 2179 ms. - avg = 2180 ms. - max = 2183 ms.

The optimized shader is about 12% faster than the previous version.

These are the results for the 8600 GT:

iterations = 250 -> min = 235 ms. - avg = 237 ms. - max = 239 ms.

iterations = 500 -> min = 471 ms. - avg = 473 ms. - max = 477 ms.

iterations = 1000 -> min = 942 ms. - avg = 945 ms. - max = 951 ms.

iterations = 2000 -> min = 1883 ms. - avg = 1885 ms. - max = 1889 ms.

The shader is about 14% faster on this graphics card than on the 7600 GT.

Circles

These are the results for the 7600 GT:

iterations = 250 -> min = 338 ms. - avg = 340 ms. - max = 343 ms.

iterations = 500 -> min = 676 ms. - avg = 678 ms. - max = 682 ms.

iterations = 1000 -> min = 1355 ms. - avg = 1356 ms. - max = 1359 ms.

iterations = 2000 -> min = 2710 ms. - avg = 2712 ms. - max = 2715 ms.

The optimized shader is about 1% faster than the previous version.

These are the results for the 8600 GT:

iterations = 250 -> min = 238 ms. - avg = 239 ms. - max = 242 ms.

iterations = 500 -> min = 475 ms. - avg = 477 ms. - max = 481 ms.

iterations = 1000 -> min = 951 ms. - avg = 952 ms. - max = 955 ms.

iterations = 2000 -> min = 1900 ms. - avg = 1902 ms. - max = 1906 ms.

The shader is about 30% faster on this graphics card than on the 7600 GT.

Perlin Noise

These are the results for the 7600 GT:

octaves = 2 -> min = 31 ms. - avg = 32 ms. - max = 32 ms.

octaves = 4 -> min = 61 ms. - avg = 62 ms. - max = 63 ms.

octaves = 6 -> min = 92 ms. - avg = 92 ms. - max = 93 ms.

octaves = 8 -> min = 121 ms. - avg = 122 ms. - max = 123 ms.

The optimized shader is about 61% faster than the previous version.

These are the results for the 8600 GT:

octaves = 2 -> min = 12 ms. - avg = 13 ms. - max = 13 ms.

octaves = 4 -> min = 23 ms. - avg = 24 ms. - max = 24 ms.

octaves = 6 -> min = 36 ms. - avg = 36 ms. - max = 36 ms.

octaves = 8 -> min = 47 ms. - avg = 48 ms. - max = 49 ms.

The shader is about 60% faster on this graphics card than on the 7600 GT.


Benchmarks: my second thermal erosion

Introduction

This benchmark session has confirmed what I announced in the previous post: the new erosion algorithm is really fast!

As usual, all the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating-point textures.

GPU version

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each iteration count you can see the fastest (min), the average (avg) and the slowest (max) time over 10 runs.

iterations = 10 -> min = 288 ms. - avg = 292 ms. - max = 295 ms.

iterations = 30 -> min = 413 ms. - avg = 416 ms. - max = 420 ms.

iterations = 50 -> min = 540 ms. - avg = 541 ms. - max = 544 ms.

iterations = 70 -> min = 665 ms. - avg = 666 ms. - max = 668 ms.

iterations = 100 -> min = 851 ms. - avg = 853 ms. - max = 856 ms.

As you can see, the times are not proportional to the number of iterations: there is a sizeable fixed cost, and each additional iteration adds only a little.
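One way to read these numbers is as a fixed cost plus a constant cost per iteration. The C sketch below fits such a line to the averages with ordinary least squares. The names `linear_fit` and `fit_times` are made up for illustration, and the post does not say what the fixed part consists of (setup, texture upload or readback would all be guesses).

```c
#include <assert.h>
#include <stddef.h>

/* Result of fitting time = fixed_ms + per_iteration_ms * iterations. */
typedef struct {
    double fixed_ms;
    double per_iteration_ms;
} linear_fit;

/* Ordinary least-squares line through the points
   (iterations[i], time_ms[i]), i = 0 .. n-1. */
linear_fit fit_times(const double *iterations, const double *time_ms, size_t n)
{
    assert(iterations != NULL && time_ms != NULL && n >= 2);
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sx  += iterations[i];
        sy  += time_ms[i];
        sxx += iterations[i] * iterations[i];
        sxy += iterations[i] * time_ms[i];
    }
    double denom = (double)n * sxx - sx * sx;
    assert(denom != 0.0); /* needs at least two distinct iteration counts */
    linear_fit f;
    f.per_iteration_ms = ((double)n * sxy - sx * sy) / denom;
    f.fixed_ms = (sy - f.per_iteration_ms * sx) / (double)n;
    return f;
}
```

Fed with the iteration counts 10, 30, 50, 70, 100 and the averages 292, 416, 541, 666, 853 ms, the fit comes out at roughly 6.2 ms per iteration on top of about 230 ms of fixed cost.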

The new algorithm is about 55-65% faster than the previous one, which is a huge improvement!

CPU version

The program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 226 ms.

iterations = 10 -> 2252 ms.

iterations = 50 -> 11182 ms.

The CPU version is slightly faster (about 2%) than the CPU version of the previous algorithm.


Benchmarks: my thermal erosion

Introduction

I’ve run some tests on my erosion algorithm, and the results are really encouraging.

All the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating-point textures.

GPU version

As I explained in my previous post, the GPU version of the thermal erosion is a reversed version of the original algorithm.
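The previous post is not reproduced here, so what "reversed" means in detail is an assumption. A common reading is that the original algorithm pushes material from each cell to its lower neighbours (a scatter), while a fragment shader can only write its own pixel and therefore has to pull from its neighbours instead (a gather). The C sketch below illustrates that gather form on the CPU; the function name, the 4-neighbour stencil and the `talus` and `rate` parameters are illustrative choices, not the shader benchmarked here.

```c
#include <assert.h>
#include <stddef.h>

/* One gather-style thermal erosion pass over a w x h height map.
   Reads only from src and writes only to dst (ping-pong buffers), so
   every output cell is computed independently of the others. */
void thermal_erosion_gather(const float *src, float *dst,
                            int w, int h, float talus, float rate)
{
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };
    assert(src != NULL && dst != NULL && src != dst && w > 0 && h > 0);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float here = src[y * w + x];
            float delta = 0.0f;
            for (int k = 0; k < 4; ++k) {
                int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || nx >= w || ny < 0 || ny >= h)
                    continue; /* no material crosses the map border */
                float diff = src[ny * w + nx] - here;
                if (diff > talus)       /* neighbour is higher: receive */
                    delta += rate * (diff - talus);
                else if (-diff > talus) /* this cell is higher: give */
                    delta -= rate * (-diff - talus);
            }
            dst[y * w + x] = here + delta;
        }
    }
}
```

Each pair of neighbouring cells computes the same amount with opposite sign, so the total height is preserved even though no cell ever writes to another one. Swapping `src` and `dst` between calls gives the iterations counted in the results below.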

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each iteration count you can see the fastest (min), the average (avg) and the slowest (max) time over 10 runs.

iterations = 10 -> min = 632 ms. - avg = 640 ms. - max = 651 ms.

iterations = 30 -> min = 1013 ms. - avg = 1016 ms. - max = 1025 ms.

iterations = 50 -> min = 1393 ms. - avg = 1398 ms. - max = 1422 ms.

iterations = 70 -> min = 1769 ms. - avg = 1766 ms. - max = 1782 ms.

iterations = 100 -> min = 2332 ms. - avg = 2347 ms. - max = 2372 ms.

As you can see, the times are not proportional to the number of iterations: there is a sizeable fixed cost, and each additional iteration adds only a little.

This algorithm is about 9% faster than the first one and about 2% faster than the second one.

CPU version

The program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 230 ms.

iterations = 10 -> 2280 ms.

iterations = 50 -> 11350 ms.

The CPU version is much faster than the CPU version of the first algorithm (more than 90%), even if it’s a bit slower (about 10%) than the second algorithm.


Benchmarks: thermal erosion 2

Introduction

I’ve run some tests on the second erosion algorithm I’ve developed, and the results are really good.

All the tests are based on 1024×1024 maps; the GPU version uses 16-bit floating-point textures.

GPU version

As I explained in my previous post, the GPU version of the thermal erosion is a reversed version of the original algorithm.

The following data are the execution times needed to complete different numbers of iterations of the erosion phase. For each iteration count you can see the fastest (min), the average (avg) and the slowest (max) time over 10 runs.

iterations = 10 -> min = 643 ms. - avg = 651 ms. - max = 659 ms.

iterations = 30 -> min = 1030 ms. - avg = 1039 ms. - max = 1068 ms.

iterations = 50 -> min = 1418 ms. - avg = 1424 ms. - max = 1432 ms.

iterations = 70 -> min = 1804 ms. - avg = 1819 ms. - max = 1846 ms.

iterations = 100 -> min = 2374 ms. - avg = 2397 ms. - max = 2434 ms.

As you can see, the times are not proportional to the number of iterations: there is a sizeable fixed cost, and each additional iteration adds only a little.

This algorithm is about 7% faster than the first one. It needs more iterations to produce visible results, though, so the comparison is not entirely fair.

CPU version

The CPU version is based on the original version of the algorithm. The program is written in C.

The following data are the average execution times needed to complete different numbers of iterations of the erosion phase.

iterations = 1 -> 208 ms.

iterations = 10 -> 2051 ms.

iterations = 50 -> 10257 ms.

The CPU version is much faster than the CPU version of the first algorithm (more than 90%).
