Pff, easy for you to say, boss. While you were out dancing at Akapulko in Zemun (don't lie or try to wriggle out of it, we have both photos and witnesses), the long-suffering common folk had to go and buy a card. In short, OpenCL performance is up to 10 times better than with the HD 6450.
The card evidently runs natively at PCI-E 3.0 x8, since once a stress test is started the link runs at x8 2.0, which is the motherboard's standard.
On the left is the old 6450, and on the right is the XFX R7 240.
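For anyone wondering what the x8 2.0 versus x8 3.0 difference means in numbers, here is a rough sketch of the theoretical one-direction link bandwidth. The per-lane transfer rates and line encodings are the standard PCIe figures, not something measured on this machine, and the function name is just made up for the example.

```python
# Theoretical one-direction PCIe bandwidth per link generation.
# Rates and encodings are the standard PCIe values (an assumption brought in
# for illustration, not data from this post).
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),      # 5 GT/s per lane, 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 8 GT/s per lane, 128b/130b encoding
}


def pcie_bandwidth_gbs(generation: str, lanes: int) -> float:
    """Return theoretical payload bandwidth in GB/s (1 GB = 1e9 bytes)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return gt_per_s * efficiency * lanes / 8


for gen in ("2.0", "3.0"):
    print(f"PCIe {gen} x8: {pcie_bandwidth_gbs(gen, 8):.2f} GB/s")
```

That works out to about 4 GB/s for x8 2.0 and a bit under 8 GB/s for x8 3.0. Real transfers always land below the theoretical ceiling because of protocol overhead.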
Code:
Device 0: Oland
Build: _WINxx DEBUG
GPU work items: 8192
Buffer size: 33554432
CPU workers: 1
Timing loops: 20
Repeats: 1
Kernel loops: 20
inputBuffer: CL_MEM_READ_ONLY
outputBuffer: CL_MEM_WRITE_ONLY
Host baseline (single thread, naive):
Timer resolution 447 ns
Page fault 57 ns
Barrier speed 379 ns
CPU read 1.23 GB/s
memcpy() 1.48 GB/s
memset(,1,) 3.41 GB/s
memset(,0,) 3.32 GB/s
AVERAGES (over loops 2 - 19, use -l for complete log)
--------
1. Host mapped write to inputBuffer
clEnqueueMapBuffer(WRITE): 0.009851 s [ 3.41 GB/s ]
memset(): 0.009623 s 3.49 GB/s
clEnqueueUnmapMemObject(): 0.011735 s [ 2.86 GB/s ]
2. GPU kernel read of inputBuffer
clEnqueueNDRangeKernel(): 0.029360 s 22.86 GB/s
verification ok
3. GPU kernel write to outputBuffer
clEnqueueNDRangeKernel(): 0.030297 s 22.15 GB/s
4. Host mapped read of outputBuffer
clEnqueueMapBuffer(READ): 0.009852 s [ 3.41 GB/s ]
CPU read: 0.026507 s 1.27 GB/s
verification ok
clEnqueueUnmapMemObject(): 0.000017 s [ 1955.09 GB/s ]
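A quick sanity check on how the GB/s figures in the log above appear to be derived. This is only a sketch: it assumes 1 GB = 1e9 bytes and that the kernel figures count the buffer once per kernel loop (20 times). Neither is stated in the log, but both reproduce the printed numbers.

```python
BUFFER_BYTES = 33554432   # "Buffer size" from the log
KERNEL_LOOPS = 20         # "Kernel loops" from the log


def gbs(seconds: float, passes: int = 1) -> float:
    """Bytes moved per second in GB/s, assuming 1 GB = 1e9 bytes."""
    return BUFFER_BYTES * passes / seconds / 1e9


# Host-side steps touch the buffer once.
print(f"map write : {gbs(0.009851):.2f} GB/s")
print(f"memset    : {gbs(0.009623):.2f} GB/s")

# Kernel steps are assumed to touch the buffer once per kernel loop.
print(f"kernel rd : {gbs(0.029360, KERNEL_LOOPS):.2f} GB/s")
print(f"kernel wr : {gbs(0.030297, KERNEL_LOOPS):.2f} GB/s")
```

The same arithmetic explains the huge figure on the last unmap line: the call returns in about 17 microseconds, so dividing the buffer size by that time gives a number that says nothing about actual data movement.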
Code:
Platform 0 : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : Oland
Build Options are : -D DATATYPE=float4 -D SIZE=5120
AccessType : single(static index)
VectorElements : 4
Bandwidth : 978.19 GB/s
AccessType : single(dynamic index)
VectorElements : 4
Bandwidth : 987.645 GB/s
AccessType : linear
VectorElements : 4
Bandwidth : 236.85 GB/s
AccessType : random
VectorElements : 4
Bandwidth : 104.923 GB/s
Code:
Platform 0 : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : Oland
Build Options are : -D DATATYPE=float4 -D OFFSET=16384
Global Memory Read
AccessType : single
VectorElements : 4
Bandwidth : 705.14 GB/s
Global Memory Read
AccessType : linear
VectorElements : 4
Bandwidth : 235.383 GB/s
Global Memory Read
AccessType : linear(uncached)
VectorElements : 4
Bandwidth : 21.4064 GB/s
Global Memory Write
AccessType : linear
VectorElements : 4
Bandwidth : 165.447 GB/s
Code:
Platform 0 : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : Oland
Build Options are : -D DATATYPE=float2
AccessType : single
VectorElements : 2
Bandwidth : 458.647 GB/s
AccessType : linear
VectorElements : 2
Bandwidth : 248.747 GB/s
Code:
Platform 0 : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : Oland
-----------------------------------------
Copy 1D FastPath : 20.8473 GB/s
-----------------------------------------
Copy 1D CompletePath : 21.6473 GB/s
-----------------------------------------
Copy 2D 32-bit (64x2) : 21.4248 GB/s
Copy 2D 128-bit (64x2) : 21.65 GB/s
-----------------------------------------
Copy 2D 32-bit (64x4) : 21.7261 GB/s
Copy 2D 128-bit (64x4) : 21.4583 GB/s
-----------------------------------------
Copy 2D 32-bit (8x8) : 20.8729 GB/s
Copy 2D 128-bit (8x8) : 20.8078 GB/s
-----------------------------------------
Copy 2D 32-bit (256x1) : 21.6526 GB/s
Copy 2D 128-bit (256x1) : 21.5972 GB/s
-----------------------------------------
Copy 2D 32-bit (32x2) : 21.5317 GB/s
Copy 2D 128-bit (32x2) : 21.4526 GB/s
-----------------------------------------
Copy 2D 32-bit (64x1) : 21.6105 GB/s
Copy 2D 128-bit (64x1) : 21.5902 GB/s
-----------------------------------------
Copy 2D 32-bit (16x16) : 20.3652 GB/s
Copy 2D 128-bit (16x16) : 20.5926 GB/s
-----------------------------------------
Copy 2D 32-bit (16x4) : 21.2091 GB/s
Copy 2D 128-bit (16x4) : 21.0523 GB/s
-----------------------------------------
Copy 2D 32-bit (1x64) : 4.34438 GB/s
Copy 2D 128-bit (1x64) : 6.63463 GB/s
-----------------------------------------
Copy 1D 128-bit : 180.147 GB/s
-----------------------------------------
NoCoal Copy 1D 32-bit : 48.5842 GB/s
-----------------------------------------
Split Copy 1D 32-bit : 48.7274 GB/s