3D graphics cards and big data

ILab Notes

The other day I gave a talk to 150 participants from different sectors and industries on the next big thing in Big Data: massively parallel data processing on graphics cards. I wanted to explain why using a graphics card for data processing is extraordinary. So I asked the crowd, “How many of you ever played Wolfenstein 3D?” Surprisingly, almost EVERYONE put their hands in the air, and so my story followed.

Back in the day, Wolfenstein 3D was literally a game changer: now we could play games in 3D. My own 40 MHz 486DX (yes, DX, with the math co-processor) with 16 MB of RAM (WAY more than my friends had) ran Wolfenstein, and later Doom, perfectly. Progress was fast. Not far down the line came the 166 MHz Pentium. It would run more advanced games, but after a short time no CPU on its own had the power for the latest releases. The first 3D cards saw the light of day as add-in cards you installed alongside the regular graphics card, and whenever 3D was needed, the drivers (3dfx's Glide at first, Microsoft's DirectX later) would let the 3D card take over.

Along came the next generation of games, and with them new graphics cards with onboard 3D processors and extra memory. Earlier, the memory on the graphics board had only been used to store the screen image, but more memory meant you could place surface graphics (textures, or bitmaps) directly on the card. The biggest part of a graphics card was now its central graphics processor, called the GPU.

[Image: Wolfenstein 3D. Caption: Back in the day, Wolfenstein 3D was a game changer.]

The GPU could crunch specific calculations a lot, lot faster than the computer's CPU, but it could only be used for exactly that purpose. That changed when Nvidia, a producer of GPU chips, got a request from SETI@home, a project that used spare processing power on participants' computers to analyze datasets from radio telescopes for signs of extraterrestrial life. SETI@home wanted to use the GPU for crunching the datasets, which could speed up the process by a factor of 10. Other players had also seen the light, and Nvidia built CUDA, a C/C++-based programming platform that could access the low-level functions of the GPU and harness its vast computational potential.
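
To give a feel for the style of programming this opened up, here is a minimal sketch of the data-parallel pattern that GPU code is built around: one small function (a "kernel") handles a single element, and it is run once for every element. This is plain Python running on the CPU, not CUDA, and the names `scale_kernel` and `launch` are made up for illustration. On a real GPU, the loop inside `launch` would be replaced by thousands of hardware threads running at the same time.

```python
def scale_kernel(i, data, out, factor):
    """Work done by one 'thread': handle element i and nothing else."""
    out[i] = data[i] * factor


def launch(kernel, n, *args):
    """Stand-in for a GPU launch: call the kernel once per index.

    A GPU would run these calls concurrently; here they run one by one.
    """
    for i in range(n):
        kernel(i, *args)


# Hypothetical input: a handful of signal samples.
samples = [1.0, 2.5, 4.0, 8.0]
result = [0.0] * len(samples)

launch(scale_kernel, len(samples), samples, result, 2.0)
print(result)
```

The point of the pattern is that no element depends on any other, so the work can be split across as many processors as the hardware offers.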

And so CUDA was quietly developed and used for engineering, medical research, stock market analysis and more, and all was good. But Nvidia thought, “What if we made cards specifically for computation?” And so they did, and the crowd went wild.

In the early 2010s you could suddenly buy a server stuffed with specially designed hardware, with a robust API on top of it, and they called it Desktop Supercomputing. Practically, it was, and still is, desktop supercomputing. The amount of processing power is gigantic, and it is available to anyone with 100,000 dollars to spare (not a lot, when talking about supercomputers).

Now that the Big Data hype has reached a level where people are looking for real practical applications, the GPUs are shining. The hottest thing at the moment is “in-memory databases,” and if you run those on the bastard child of a gamer PC graphics card, one with multiple processors and lots of memory, you will be able to crunch numbers 100 to 1,000 times faster than with traditional methods. As all the data is in memory, you can cross any parameter with any other parameter, making exploratory data analysis a walk in the park.
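
As a small illustration of what "crossing any parameter with any other parameter" means, here is a minimal sketch in plain Python. The table, its column names and the `cross` helper are all hypothetical, and this runs on the CPU; a GPU database would do the same kind of counting over billions of rows in parallel.

```python
from collections import Counter

# Hypothetical in-memory table, stored column by column.
table = {
    "region":  ["north", "south", "north", "east", "south", "north"],
    "product": ["a", "a", "b", "b", "a", "a"],
    "channel": ["web", "shop", "web", "web", "web", "shop"],
}


def cross(table, col_a, col_b):
    """Count how often each pair of values from two columns occurs together."""
    return Counter(zip(table[col_a], table[col_b]))


# Any column can be crossed with any other, with no index prepared up front.
by_region_product = cross(table, "region", "product")
by_product_channel = cross(table, "product", "channel")

print(by_region_product)
print(by_product_channel)
```

Because every column already sits in memory, switching from one pair of parameters to another is just another call, which is what makes this kind of exploration feel effortless.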

So, where do we go from here? Well, AI and GPUs also work exceptionally well together. The Tesla cars have Nvidia technology running their AI. Any place where sophisticated AI is required to run, there will be a market for GPUs and Nvidia. That’s evolution, defined in a story that started as a dream in the computer class at the high school where I played too many hours of Wolfenstein 3D and forgot to do my homework.

For more background, and information on GPU processing of Big Data and AI, contact me.