Shocked by such low DP performance on newer cards. What's up with that?

Sharky Forums


  1. #1
    Great White Shark Un4given's Avatar
    Join Date
    Oct 2000
    Location
    Salt Lake City, UT United States
    Posts
    22,553

    Since I built my new system last year I have relegated my old Intel quad core to being a crunching box, running SETI, Milkyway, and Collatz. Since the 4890 that was in it was getting a bit long in the tooth, I decided a little upgrade to the video card would help immensely. I found a 7870 GHz Edition for a decent price, and didn't figure there would be problems, because the card did support double precision calcs. Dear oh dear, was I shocked when I started running the programs again.

    Now, being the tech I am, I figured I just had remnants of the old driver still on the system, so I ran the uninstall again, used a driver sweeper program, and ran CCleaner to make sure all traces of the old driver were gone. Installed the newest driver again and... no change whatsoever. While doing a little research this morning I ran across some information on a wiki page related to BOINC projects in general. Man, was I not prepared for what I found: a chart listing the DP GFlop capabilities for ATi/AMD cards. While a 4890 wasn't listed, there was a 4870x2. OK, it's not rocket science: I basically halved the GFlop value to get the output of one GPU, but also accounted for my 4890's faster clock. Here's what I found.

    4870x2: 480 GFlop

    OC'd 4890 (estimated): ~240-250 GFlop

    7870 GHz Ed.: 190 GFlop

    That's right. A current generation, performance level card, being outperformed by a card three generations older to the tune of 50-60 GFlops in DP calcs.

    What's up with that?
    Prince of the OC Crusaders

    Intel i7 3.2GHz @ 4.24GHz
    Cooler Master V8
    Asus P9X79 Pro
    16GB Patriot Viper Extreme DDR3-1600 (quad channel)
    HIS R9 290X @1050MHz
    Asus 20x DVD-RW DL DVD-RW

  2. #2
    Great White Shark
    Join Date
    Nov 2000
    Location
    Alpharetta, Denial, Only certain songs.
    Posts
    9,925
    Quote Originally Posted by Un4given View Post
    That's right. A current generation, performance level card, being outperformed by a card three generations older to the tune of 50-60 GFlops in DP calcs.

    What's up with that?
    Both ATI and NVIDIA have refocused their consumer cards on gaming performance, which means that compute/render performance has actually gone down with the newer generations. It is no longer the case that workstation GPUs are the same as consumer GPUs. They are now actually diverging at the hardware design level.
    Last edited by James; 02-09-2013 at 08:40 AM.

    Crusader for the 64-bit Era.
    New Rule: 2GB per core, minimum.

    Intel i7-9700K | Asrock Z390 Phantom Gaming ITX | Samsung 970 Evo 2TB SSD
    64GB DDR4-2666 Samsung | EVGA RTX 2070 Black edition
    Fractal Arc Midi |Seasonic X650 PSU | Klipsch ProMedia 5.1 Ultra | Windows 10 Pro x64

  3. #3
    Great White Shark Un4given's Avatar
    Join Date
    Oct 2000
    Location
    Salt Lake City, UT United States
    Posts
    22,553
    Quote Originally Posted by James View Post
    Both ATI and nVidia have refocused their consumer cards on gaming performance, which means that compute/render performance has actually gone down with the newer generations. It is no longer the case that workstation GPU's are the same as consumer GPU's. They are now actually diverging at the hardware design level.
    I can understand that, and I'm definitely no engineer or programmer, but wouldn't physics and AI calcs benefit from this type of capability? Or have we gotten to the point where CPUs have so many cores and capabilities that we are back to leaving that to the CPU?

    I would also point out that the difference in DP GFlop levels between the 78xx and 79xx series cards is significant. I'm talking roughly a 3x difference between a 7870 and a 7950, where in terms of gaming the difference is not nearly as large.
    Last edited by Un4given; 02-10-2013 at 04:23 PM.

  4. #4
    Great White Shark
    Join Date
    Nov 2000
    Location
    Alpharetta, Denial, Only certain songs.
    Posts
    9,925
    Aha! I knew they had done something further to really skew things....

    Check out the reviews for the 7950 and the 7870, specifically the FP64 ratings along with the number of stream processors. Taking the two in combination, from the Tahiti (79xx) to the Pitcairn (78xx) core, they eviscerated FP64 performance. Basically, it's what they cut to save money.

    Code:
    GPU	Stream	FP64	"RATING"
    --------------------------------
    7950	1792	1/4	448
    7870	1280	1/16	80
    *Edit: P.S. That "RATING" column is simply stream processors multiplied by the FP64 divider value.
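
    For anyone who wants to reproduce the "RATING" column, here is a minimal Python sketch. It uses only the stream processor counts and FP64 ratios from the table above; the function and variable names are made up for illustration, and clock speed is deliberately ignored, so this is a rough relative score and not a GFlop figure.

```python
from fractions import Fraction

# Stream processor counts and FP64 ratios, copied from the table above.
CARDS = {
    "7950": (1792, Fraction(1, 4)),
    "7870": (1280, Fraction(1, 16)),
}


def fp64_rating(stream_processors, fp64_ratio):
    """Stream processors scaled by the FP64 ratio (clock speed is ignored)."""
    return stream_processors * fp64_ratio


ratings = {name: fp64_rating(sp, ratio) for name, (sp, ratio) in CARDS.items()}

for name, rating in ratings.items():
    print(name, rating)

# Relative gap between the two cards on this clock-agnostic score.
print("7950 / 7870:", float(ratings["7950"] / ratings["7870"]))
```

    Since the score leaves out clocks, the gap it shows between the two cards will not match a ratio taken from real GFlop numbers.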

    **Edit again: From the Anandtech review of the 7970, talking about the "Graphics Core Next" design AMD is now using:
    Quote Originally Posted by Anandtech.com 7970 Article
    Diving deeper into Tahiti, as per the GCN architecture Tahiti’s 2048 SPs are organized into 32 Compute Units. Each of these CUs contains 4 texture units and 4 SIMD units, along with a scalar unit and the appropriate cache and registers. At the 7970’s core clock of 925MHz this puts Tahiti’s theoretical FP32 compute performance at 3.79TFLOPs, while its FP64 performance is ¼ that at 947GFLOPs. As GCN’s FP64 performance can be configured for 1/16, ¼, or ½ its FP32 performance...
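
    The figures in that quote can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes the usual convention of 2 floating point operations per stream processor per clock (one fused multiply-add), which is not stated in the quote but does line up with the quoted numbers. The function name is hypothetical.

```python
def peak_gflops(stream_processors, clock_mhz, fp64_ratio=1.0, flops_per_clock=2):
    """Theoretical peak throughput in GFLOPs.

    flops_per_clock=2 is an assumption (one fused multiply-add per
    stream processor per clock); fp64_ratio scales FP32 down to FP64.
    """
    return stream_processors * flops_per_clock * clock_mhz / 1000.0 * fp64_ratio


# 7970 (Tahiti): 2048 SPs at 925 MHz, FP64 at 1/4 rate, per the quote above.
fp32 = peak_gflops(2048, 925)
fp64 = peak_gflops(2048, 925, fp64_ratio=1 / 4)

print(f"FP32: {fp32:.1f} GFLOPs")
print(f"FP64: {fp64:.1f} GFLOPs")
```

    Plugging in a 1/16 ratio instead of 1/4 shows how much a Pitcairn-style configuration would give up on the same core.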
    ***Edit the third: From the FirePro W9000 review. It seems that even for their workstation cards, they stayed with the 1/4 divider for double precision floating point work. They didn't even go with the 1/2 they say the design is capable of.

    Code:
    Single Precision    Double Precision    Pixel Fillrate      Texture Fillrate    Memory Bandwidth
    4 TFLOPs            1 TFLOPs            31.2 GPixels/sec    124 GTexels/sec     264 GB/sec
    Last edited by James; 02-11-2013 at 10:15 AM.

  5. #5
    Great White Shark Un4given's Avatar
    Join Date
    Oct 2000
    Location
    Salt Lake City, UT United States
    Posts
    22,553
    Quote Originally Posted by James View Post
    Aha! I knew they had done something further to really skew things....

    Check out the reviews for the 7950 and the 7870. Specifically the FP64 ratings along with the # of stream processors. Taking the two in combination, from the Tahiti(79xx) to the Pitcairn(78xx) core, they eviscerated FP performance. Basically, it's what they cut to save money.

    Code:
    GPU	Stream	FP64	"RATING"
    --------------------------------
    7950	1792	1/4	448
    7870	1280	1/16	80
    *Edit: P.S. That "RATING" column is simply stream processors multiplied by the FP64 divider value.

    **Edit again: From the Anandtech review of the 7970, talking about the "Graphics Core Next" design AMD is now using:


    ***Edit the third: From the FirePro w9000 review. Seems even for their workstation cards, they stayed with 1/4 divider for double precision floating point work. They didn't even go with the 1/2 they say they are capable of.

    Code:
    Single Precision    Double Precision    Pixel Fillrate      Texture Fillrate    Memory Bandwidth
    4 TFLOPs            1 TFLOPs            31.2 GPixels/sec    124 GTexels/sec     264 GB/sec
    Thanks for the info. I don't get to follow the new tech like I used to. Oh well, I ordered a 7950 to replace the 7870. Guess I should have done a bit more digging first. :\

  6. #6
    Great White Shark
    Join Date
    Nov 2000
    Location
    Alpharetta, Denial, Only certain songs.
    Posts
    9,925
    Just a quick follow up from the new Titan review on Anandtech.

    Quote Originally Posted by anandtech.com
    We’ll dive more into why that is in a bit in our feature breakdown, but the biggest factor is that for the first time on any consumer-level NVIDIA card, double precision (FP64) performance is uncapped. That means 1/3 FP32 performance, or roughly 1.3TFLOPS theoretical FP64 performance. NVIDIA has taken other liberties to keep from this being treated as a cheap Tesla K20, but for lighter workloads it should fit the bill.
    Last edited by James; 02-19-2013 at 04:13 PM.

  7. #7
    Catfish
    Join Date
    Feb 2013
    Posts
    166
    Quote Originally Posted by Un4given View Post
    Thanks for the info. I don't get to follow the new tech like I used to. Oh well, I ordered a 7950 to replace the 7870. Guess I should have done a bit more digging first. :\
    Not quite.

    Actually, you forgot something...

    The 4870 was the flagship. ATI's flagships used to be named x8xx.

    Later on they changed their flagship naming to x9xx, so the 7970 sits in place of the 4870.

    So you were actually comparing cards from different tiers.
    Last edited by PCVSNotebook; 02-22-2013 at 12:18 AM.
