Graphic Processor Units for Many-particle Dynamics


A few days ago a friend came to me with a question about floating point.  Let me start by saying that my friend knows his stuff; he doesn't ask stupid questions.  So he had my attention.  He was working on some biosciences simulation code and was getting answers of a different precision than he expected on the GPU, and he wanted to know what was up.
Even expert CUDA programmers don't always know all the intricacies of floating point; it's a tricky topic.  Even my friend, who is so cool he wears sunglasses indoors, needed some help.  If you look at the NVIDIA CUDA forums, questions and concerns about floating point come up regularly [1] [2] [3] [4] [5] [6] [7].  Getting a handle on how to use floating point effectively is obviously very important if you are doing numeric computations in CUDA.
In an attempt to help out, Alex and I have written a short whitepaper about floating point on NVIDIA GPUs.
In the paper we talk about various issues related to floating point in CUDA.  You will learn:
  • How the IEEE 754 standard fits in with NVIDIA GPUs
  • How fused multiply-add improves accuracy
  • Why there's more than one way to compute a dot product (we present three)
  • How to make sense of different numerical results between CPU and GPU


© 2009-2024 by GPIUTMD


Latest News


NVIDIA is calling on global researchers to submit their innovations for the NVIDIA Global Impact Award - an annual grant of $150,000 for groundbreaking work that addresses the world's most important social and humanitarian problems. 


Unified Memory in CUDA 6

With CUDA 6, we’re introducing one of the most dramatic programming model improvements in the history of the CUDA platform: Unified Memory. In a typical PC or cluster node today, the memories of the CPU and GPU are physically distinct and separated by the PCI-Express bus. Before CUDA 6, that is exactly how the programmer had to view things. Data shared between the CPU and GPU must be allocated in both memories and explicitly copied between them by the program. This adds a lot of complexity to CUDA programs.
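As a rough sketch of the contrast (my own minimal example; the kernel and sizes are illustrative, while cudaMallocManaged, cudaDeviceSynchronize, and cudaFree are the standard CUDA runtime calls involved):

```cuda
#include <cstdio>

__global__ void scale(float *x, int n, float s) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main() {
    const int n = 1024;

    // Before CUDA 6 the same program needs two allocations and two copies:
    //   float *h = (float*)malloc(n * sizeof(float));
    //   float *d; cudaMalloc(&d, n * sizeof(float));
    //   cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    //   ... launch kernel ... cudaMemcpy back ... cudaFree(d); free(h);

    // With Unified Memory: one allocation, one pointer, no explicit copies.
    float *x;
    cudaMallocManaged(&x, n * sizeof(float));
    for (int i = 0; i < n; ++i) x[i] = 1.0f;   // CPU writes directly

    scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);
    cudaDeviceSynchronize();                   // wait before the CPU reads

    printf("x[0] = %f\n", x[0]);               // CPU reads the GPU's result
    cudaFree(x);
    return 0;
}
```

The synchronization call matters: the CPU must not touch managed memory while a kernel that uses it may still be running.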

