Knight Rider Rides a GAN: Bringing KITT to Life with AI, NVIDIA Omniverse
April 20, 2021

Authored by: Isha Salian, writer for NVIDIA’s corporate communications team

Fasten your seatbelts. NVIDIA Research is revving up a new deep learning engine that creates 3D object models from standard 2D images — and can bring iconic cars like Knight Rider’s AI-powered KITT to life — in NVIDIA Omniverse.

Developed by the NVIDIA AI Research Lab in Toronto, the GANverse3D application inflates flat images into realistic 3D models that can be visualised and controlled in virtual environments. This capability could help architects, creators, game developers and designers easily add new objects to their mock-ups without needing expertise in 3D modelling, or a large budget to spend on renderings.

A single photo of a car, for example, could be turned into a 3D model that can drive around a virtual scene, complete with realistic headlights, taillights and blinkers.

To generate a dataset for training, the researchers harnessed a generative adversarial network, or GAN, to synthesise images depicting the same object from multiple viewpoints — like a photographer who walks around a parked vehicle, taking shots from different angles. These multi-view images were plugged into a rendering framework for inverse graphics, the process of inferring 3D mesh models from 2D images.
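To make that pipeline concrete, here is a minimal, hypothetical sketch of the inverse-graphics training step. All names and shapes below are invented for illustration; the project's actual network and differentiable renderer are not shown in the article. The idea: for each GAN-generated view with a known camera, the network predicts a mesh, a differentiable renderer projects it back to 2D, and the loss compares that rendering with the GAN image.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of the inverse-graphics training step described above.
# `net` and `renderer` are stand-ins for the project's inverse graphics
# network and differentiable renderer, neither of which is public in this form.
def training_step(net, renderer, optimizer, images, cameras):
    """images: GAN-generated views of one object; cameras: their viewpoints."""
    optimizer.zero_grad()
    loss = 0.0
    for image, camera in zip(images, cameras):
        vertices, faces, texture = net(image)                   # predict a 3D mesh
        rendering = renderer(vertices, faces, texture, camera)  # project back to 2D
        loss = loss + F.mse_loss(rendering, image)              # match the GAN view
    loss.backward()                                             # train end to end
    optimizer.step()
    return loss.item()
```

Because the renderer is differentiable, the pixel-level loss can flow all the way back into the mesh-prediction network, which is what lets 2D images supervise a 3D task.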

Once trained on multi-view images, GANverse3D needs only a single 2D image to predict a 3D mesh model. This model can be used with a 3D neural renderer that gives developers control to customise objects and swap out backgrounds.
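In practice, that makes inference a single forward pass. A hypothetical sketch follows (the function and argument names are invented, not GANverse3D's actual interface), including the background swap the neural renderer enables:

```python
import torch

# Hypothetical single-image inference sketch; `inverse_graphics_net` stands in
# for an already-trained model and is not the project's actual API.
def image_to_3d(inverse_graphics_net, image):
    """image: [1, 3, H, W] RGB tensor -> mesh (vertices, faces, texture), camera."""
    with torch.no_grad():                       # one forward pass, no training
        vertices, faces, texture, camera = inverse_graphics_net(image)
    return (vertices, faces, texture), camera

def composite_on_background(renderer, mesh, camera, background):
    """Re-render the predicted mesh over a different background image."""
    foreground, mask = renderer(*mesh, camera)  # assumed to return image + mask
    return mask * foreground + (1 - mask) * background
```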

When imported as an extension in the NVIDIA Omniverse platform and run on NVIDIA RTX GPUs, GANverse3D can turn any 2D image into a 3D model — like the beloved crime-fighting car KITT, from the popular 1980s TV show Knight Rider.

Previous models for inverse graphics have relied on 3D shapes as training data.

Instead, with no aid from 3D assets, “We turned a GAN model into a very efficient data generator so we can create 3D objects from any 2D image on the web,” said Wenzheng Chen, research scientist at NVIDIA and lead author on the project.

“Because we trained on real images instead of the typical pipeline, which relies on synthetic data, the AI model generalises better to real-world applications,” said NVIDIA researcher Jun Gao, an author on the project.

The research behind GANverse3D will be presented at two upcoming conferences: the International Conference on Learning Representations in May, and the Conference on Computer Vision and Pattern Recognition in June.

From Flat Tire to Racing KITT 

Creators in gaming, architecture and design rely on virtual environments like the NVIDIA Omniverse simulation and collaboration platform to test out new ideas and visualise prototypes before creating their final products. With Omniverse Connectors, developers can use their preferred 3D applications in Omniverse to simulate complex virtual worlds with real-time ray tracing.

But not every creator has the time and resources to create 3D models of every object they sketch. The cost of capturing the number of multi-view images necessary to render a showroom’s worth of cars, or a street’s worth of buildings, can be prohibitive.

That’s where a trained GANverse3D application can be used to convert standard images of a car, a building or even a horse into a 3D figure that can be customised and animated in Omniverse.

To recreate KITT, the researchers simply fed the trained model an image of the car, letting GANverse3D predict a corresponding 3D textured mesh, as well as different parts of the vehicle such as wheels and headlights. They then used NVIDIA Omniverse Kit and NVIDIA PhysX tools to convert the predicted texture into high-quality materials that give KITT a more realistic look and feel, and placed it in a dynamic driving sequence.

“Omniverse allows researchers to bring exciting, cutting-edge research directly to creators and end users,” said Jean-Francois Lafleche, deep learning engineer at NVIDIA. “Offering GANverse3D as an extension in Omniverse will help artists create richer virtual worlds for game development, city planning or even training new machine learning models.”

GANs Power a Dimensional Shift

Because real-world datasets that capture the same object from different angles are rare, most AI tools that convert images from 2D to 3D are trained using synthetic 3D datasets like ShapeNet.

To obtain multi-view images from real-world data — like images of cars available publicly on the web — the NVIDIA researchers instead turned to a GAN model, manipulating its neural network layers to turn it into a data generator.

The team found that opening the first four layers of the neural network and freezing the remaining 12 caused the GAN to render images of the same object from different viewpoints.

Keeping the first four layers frozen and letting the other 12 vary caused the network to generate different objects from the same viewpoint. By manually assigning standard viewpoints, with vehicles pictured at a specific elevation and camera distance, the researchers could rapidly generate a multi-view dataset from individual 2D images.
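As a rough illustration of that layer manipulation — a sketch only, assuming a StyleGAN-like generator with 16 per-layer style codes; the names and sizes below are invented:

```python
import torch

NUM_LAYERS = 16   # style-modulated layers in the assumed generator
VIEW_LAYERS = 4   # early layers observed to control viewpoint

def mix_viewpoint(content_w, view_w):
    """Combine one latent code's object identity with another's camera.

    content_w, view_w: [1, NUM_LAYERS, w_dim] per-layer style codes.
    """
    mixed = content_w.clone()
    # Vary only the early, viewpoint-controlling layers; keeping the
    # remaining 12 fixed preserves the object's identity.
    mixed[:, :VIEW_LAYERS] = view_w[:, :VIEW_LAYERS]
    return mixed

# Toy usage with random codes standing in for a real generator's latents:
w_dim = 512
content_w = torch.randn(1, NUM_LAYERS, w_dim)    # encodes the source object
view_w = torch.randn(1, NUM_LAYERS, w_dim)       # early layers encode a camera
multi_view_w = mix_viewpoint(content_w, view_w)  # feed to the generator
```

Sweeping `view_w` across a handful of manually chosen standard cameras, while holding `content_w` fixed, yields exactly the kind of multi-view dataset described above.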

The final model, trained on 55,000 car images generated by the GAN, outperformed an inverse graphics network trained on the popular Pascal3D dataset.
