Authored by: Isha Salian, writer for NVIDIA’s corporate communications team
Fasten your seatbelts. NVIDIA Research is revving up a new deep learning engine that creates 3D object models from standard 2D images — and can bring iconic cars like the Knight Rider’s AI-powered KITT to life — in NVIDIA Omniverse.
Developed by the NVIDIA AI Research Lab in Toronto, the GANverse3D application inflates flat images into realistic 3D models that can be visualised and controlled in virtual environments. This capability could help architects, creators, game developers and designers easily add new objects to their mock-ups without needing expertise in 3D modelling, or a large budget to spend on renderings.
A single photo of a car, for example, could be turned into a 3D model that can drive around a virtual scene, complete with realistic headlights, taillights and blinkers.
To generate a dataset for training, the researchers harnessed a generative adversarial network, or GAN, to synthesise images depicting the same object from multiple viewpoints — like a photographer who walks around a parked vehicle, taking shots from different angles. These multi-view images were plugged into a rendering framework for inverse graphics, the process of inferring 3D mesh models from 2D images.
Once trained on multi-view images, GANverse3D needs only a single 2D image to predict a 3D mesh model. This model can be used with a 3D neural renderer that gives developers control to customise objects and swap out backgrounds.
When imported as an extension in the NVIDIA Omniverse platform and run on NVIDIA RTX GPUs, GANverse3D can be used to recreate any 2D image into 3D — like the beloved crime-fighting car KITT, from the popular 1980s Knight Rider TV show.
Previous models for inverse graphics have relied on 3D shapes as training data.
Instead, with no aid from 3D assets, “We turned a GAN model into a very efficient data generator so we can create 3D objects from any 2D image on the web,” said Wenzheng Chen, research scientist at NVIDIA and lead author on the project.
“Because we trained on real images instead of the typical pipeline, which relies on synthetic data, the AI model generalises better to real-world applications,” said NVIDIA researcher Jun Gao, an author on the project.
The research behind GANverse3D will be presented at two upcoming conferences: the International Conference on Learning Representations in May, and the Conference on Computer Vision and Pattern Recognition, in June.
From Flat Tire to Racing KITT
Creators in gaming, architecture and design rely on virtual environments like the NVIDIA Omniverse simulation and collaboration platform to test out new ideas and visualise prototypes before creating their final products. With Omniverse Connectors, developers can use their preferred 3D applications in Omniverse to simulate complex virtual worlds with real-time ray tracing.
But not every creator has the time and resources to create 3D models of every object they sketch. The cost of capturing the number of multi-view images necessary to render a showroom’s worth of cars, or a street’s worth of buildings, can be prohibitive.
That’s where a trained GANverse3D application can be used to convert standard images of a car, a building or even a horse into a 3D figure that can be customised and animated in Omniverse.
To recreate KITT, the researchers simply fed the trained model an image of the car, letting GANverse3D predict a corresponding 3D textured mesh, as well as different parts of the vehicle such as wheels and headlights. They then used NVIDIA Omniverse Kit and NVIDIA PhysX tools to convert the predicted texture into high-quality materials that give KITT a more realistic look and feel, and placed it in a dynamic driving sequence.
“Omniverse allows researchers to bring exciting, cutting-edge research directly to creators and end users,” said Jean-Francois Lafleche, deep learning engineer at NVIDIA. “Offering GANverse3D as an extension in Omniverse will help artists create richer virtual worlds for game development, city planning or even training new machine learning models.”
GANs Power a Dimensional Shift
Because real-world datasets that capture the same object from different angles are rare, most AI tools that convert images from 2D to 3D are trained using synthetic 3D datasets like ShapeNet.
To obtain multi-view images from real-world data — like images of cars available publicly on the web — the NVIDIA researchers instead turned to a GAN model, manipulating its neural network layers to turn it into a data generator.
The team found that opening the first four layers of the neural network and freezing the remaining 12 caused the GAN to render images of the same object from different viewpoints.
Keeping the first four layers frozen and the other 12 layers variable caused the neural network to generate different images from the same viewpoint. By manually assigning standard viewpoints, with vehicles pictured at a specific elevation and camera distance, the researchers could rapidly generate a multi-view dataset from individual 2D images.
The final model, trained on 55,000 car images generated by the GAN, outperformed an inverse graphics network trained on the popular Pascal3D dataset.
Archive
- October 2024(44)
- September 2024(94)
- August 2024(100)
- July 2024(99)
- June 2024(126)
- May 2024(155)
- April 2024(123)
- March 2024(112)
- February 2024(109)
- January 2024(95)
- December 2023(56)
- November 2023(86)
- October 2023(97)
- September 2023(89)
- August 2023(101)
- July 2023(104)
- June 2023(113)
- May 2023(103)
- April 2023(93)
- March 2023(129)
- February 2023(77)
- January 2023(91)
- December 2022(90)
- November 2022(125)
- October 2022(117)
- September 2022(137)
- August 2022(119)
- July 2022(99)
- June 2022(128)
- May 2022(112)
- April 2022(108)
- March 2022(121)
- February 2022(93)
- January 2022(110)
- December 2021(92)
- November 2021(107)
- October 2021(101)
- September 2021(81)
- August 2021(74)
- July 2021(78)
- June 2021(92)
- May 2021(67)
- April 2021(79)
- March 2021(79)
- February 2021(58)
- January 2021(55)
- December 2020(56)
- November 2020(59)
- October 2020(78)
- September 2020(72)
- August 2020(64)
- July 2020(71)
- June 2020(74)
- May 2020(50)
- April 2020(71)
- March 2020(71)
- February 2020(58)
- January 2020(62)
- December 2019(57)
- November 2019(64)
- October 2019(25)
- September 2019(24)
- August 2019(14)
- July 2019(23)
- June 2019(54)
- May 2019(82)
- April 2019(76)
- March 2019(71)
- February 2019(67)
- January 2019(75)
- December 2018(44)
- November 2018(47)
- October 2018(74)
- September 2018(54)
- August 2018(61)
- July 2018(72)
- June 2018(62)
- May 2018(62)
- April 2018(73)
- March 2018(76)
- February 2018(8)
- January 2018(7)
- December 2017(6)
- November 2017(8)
- October 2017(3)
- September 2017(4)
- August 2017(4)
- July 2017(2)
- June 2017(5)
- May 2017(6)
- April 2017(11)
- March 2017(8)
- February 2017(16)
- January 2017(10)
- December 2016(12)
- November 2016(20)
- October 2016(7)
- September 2016(102)
- August 2016(168)
- July 2016(141)
- June 2016(149)
- May 2016(117)
- April 2016(59)
- March 2016(85)
- February 2016(153)
- December 2015(150)