Photogrammetry / NeRF / Gaussian Splatting comparison

Matthew Brennan
1 Oct 2023 23:29

TLDR: This video compares three technologies: photogrammetry, neural radiance fields (NeRFs), and Gaussian splatting. The presenter, who has 15 years of experience in photogrammetry, walks through each method's capabilities and ideal applications. Using a dataset of 345 frames from a video of a sandstone column named Church Rock, the video demonstrates how each technology processes and visualizes the same data. Agisoft Metashape is used for photogrammetry, resulting in an accurate, textured 3D model. For neural radiance fields, the presenter uses Nerfstudio, showing how it generates novel viewpoints and renders videos along customized camera paths. Lastly, Gaussian splatting in Unity provides a real-time visualization of the splatted point cloud, including the wider scene context. The video concludes with a discussion of sharing photogrammetric models online and enhancing them with panoramic backgrounds to mimic the broader scene context provided by neural radiance fields and Gaussian splatting.

Takeaways

  • 📈 **Photogrammetry, Neural Radiance Fields (NeRFs), and Gaussian Splatting are three distinct technologies that have gained attention amid the recent interest in AI.**
  • 🔍 **Photogrammetry** is a 3D modeling technique that the presenter has worked with for 15 years; it focuses on accurate, measurable 3D models.
  • 🤖 **Neural Radiance Fields (NeRFs)** use a trained neural network to generate novel viewpoints of a scene, offering broader context, including the sky and horizon, that is not typically captured in photogrammetry.
  • 🌐 **Gaussian Splatting** renders a scene in real time in a game engine like Unity, providing a visualization method that combines the benefits of both photogrammetry and neural radiance fields.
  • 📷 **Agisoft Metashape** is used for photogrammetry processing, while **Nerfstudio** is used to train the neural radiance field on the same set of video frames.
  • 🌄 **Photogrammetry** provides a detailed textured 3D model with baked-in lighting conditions, suitable for applications like 3D printing or game engine integration.
  • 📏 **Accuracy and scalability** are key benefits of photogrammetry, allowing for measurements such as area and volume, which are not currently possible with NeRFs or Gaussian Splatting.
  • 🌌 **Neural Radiance Fields** can estimate the appearance of a scene from viewpoints that were not part of the original data, offering creative freedom in generating new camera paths.
  • 🎥 **Virtual production** is a significant application for neural radiance fields, where a scene can be captured quickly and the camera path perfected later.
  • 🧩 **Gaussian Splatting** allows additional elements, such as panoramic skies, to be integrated into the scene, providing a more complete scene visualization.
  • 🌟 **Photogrammetric models** can be easily shared and viewed online, offering a lightweight and accessible way to experience 3D models, which is not yet possible for NeRFs or Gaussian Splatting.

Q & A

  • What are the three technologies discussed in the video?

    -The three technologies discussed in the video are photogrammetry, neural radiance fields (often called NeRFs), and Gaussian splatting.

  • What is the primary use of photogrammetry?

    -Photogrammetry is primarily used for creating accurate, measurable 3D models of objects. It is often used in applications where precise measurements are required, such as in architecture or archaeology.

  • How does a neural radiance field (NeRF) differ from photogrammetry?

    -Neural radiance fields (NeRFs) differ from photogrammetry in that they allow for the generation of novel viewpoints that were not part of the original data. They create a volumetric representation of a scene which, unlike a photogrammetric model, is not metrically valid for precise measurements.

  • What is the significance of Gaussian splatting in the context of radiance fields?

    -Gaussian splatting is significant as it provides a real-time visualization of radiance fields in a game engine like Unity. It allows for the representation of a point cloud with visible 'splats' derived from the input data, offering a more complete scene visualization.

  • How can one enhance a photogrammetric model to include broader context like the sky and distant mountains?

    -One can enhance a photogrammetric model by adding a panoramic sphere to the background of the model. This technique involves mapping a panorama image to a sphere geometry around the photogrammetric model to simulate a broader scene context.

  • What is the advantage of using Agisoft Metashape for photogrammetry processing?

    -Agisoft Metashape is advantageous for photogrammetry processing because it provides a structured workflow for creating 3D models from photographs. It allows for the estimation of camera positions, creation of a point cloud, and the generation of a textured polygonal mesh.

  • How does the lighting condition affect the final model in photogrammetry?

    -In photogrammetry, the lighting conditions are 'baked' into the final model's texture. This means that the final model will reflect the lighting as it was when the photographs were taken, which can affect how the model looks when it is relit or placed in a scene with different lighting.

  • What is the role of a neural network in the context of a neural radiance field?

    -In the context of a neural radiance field, the neural network is trained on the input data (video frames) to estimate the colors and appearance of the scene from novel viewpoints that were not part of the original data set.

  • What is the main advantage of using neural radiance fields for virtual production?

    -The main advantage of using neural radiance fields for virtual production is the ability to quickly capture a scene and later set up the perfect camera path in a virtual environment. This allows for more creative freedom and the possibility of achieving shots that may not be feasible with actual camera equipment on location.

  • How does the 3D Gaussian splatting method enhance the visualization of radiance fields?

    -The 3D Gaussian splatting method enhances the visualization of radiance fields by representing the point cloud with visible 'splats' in a real-time environment like Unity. This provides a clearer and more detailed representation of the scene, allowing for interactive exploration and camera path setup within the game engine.

  • What are some potential applications of photogrammetry, neural radiance fields, and Gaussian splatting?

    -Potential applications include creating accurate 3D models for architectural planning, archaeological documentation, virtual production for film and video, and real-time visualization in gaming or simulation environments.

Outlines

00:00

📈 Introduction to Emerging Technologies

The video introduces three popular technologies: photogrammetry, neural radiance fields (often called NeRFs), and Gaussian splatting. The speaker clarifies that while these technologies have gained attention due to AI hype, they are not deeply related and serve different purposes. The speaker, who is experienced in photogrammetry but a newcomer to NeRFs and Gaussian splatting, aims to simplify these complex methods. The video compares the technologies by processing the same image set, specifically 345 frames from a video of a sandstone column named Church Rock. The speaker provides resources such as software links and a dataset for viewers to experiment with.

05:02

🌐 Photogrammetry and its Applications

The speaker discusses photogrammetry, a technique used for creating 3D models from photographs. The process involves estimating camera positions to relate each photograph to the object and to each other. Agisoft Metashape is used to process the photogrammetry, starting with a point cloud representing the surface of Church Rock. The resulting polygonal mesh is detailed, although not as high-resolution as it would be with professional photographs. The model is then textured using the original photographs to create a photorealistic appearance. Photogrammetry's strength lies in its accuracy and metric validity, allowing for measurements such as area and volume. However, it does not capture the broader scene context, focusing instead on the object of interest.
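The area and volume measurements mentioned above can be illustrated with a minimal sketch, assuming a closed, consistently oriented triangle mesh that has already been scaled to real-world units. The function name `mesh_area_and_volume` and the tetrahedron example data are hypothetical; this is not how Metashape is implemented, only the standard triangle-area and signed-tetrahedron sums.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def mesh_area_and_volume(vertices: Sequence[Vec3],
                         triangles: Sequence[Tuple[int, int, int]]):
    """Return (surface_area, enclosed_volume) of a closed triangle mesh.

    Assumes every triangle lists its vertices with the same winding
    (all outward or all inward); otherwise the volume sum is meaningless.
    """
    area = 0.0
    signed_volume = 0.0
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        # Triangle area is half the length of the edge cross product.
        normal = _cross(_sub(b, a), _sub(c, a))
        area += 0.5 * math.sqrt(_dot(normal, normal))
        # Signed volume of the tetrahedron (origin, a, b, c).
        signed_volume += _dot(a, _cross(b, c)) / 6.0
    return area, abs(signed_volume)


# Hypothetical example: a right-angled tetrahedron with unit legs.
EXAMPLE_VERTICES: List[Vec3] = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                                (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
EXAMPLE_TRIANGLES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

if __name__ == "__main__":
    print(mesh_area_and_volume(EXAMPLE_VERTICES, EXAMPLE_TRIANGLES))
```

The result is only as metric as the mesh itself: if the model has not been scaled (for example with a known distance or control points), the numbers are in arbitrary units.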

10:03

🧠 Neural Radiance Fields (NeRFs)

The video moves on to neural radiance fields, using the same video frames to train a neural network. The resulting radiance field allows for the generation of novel viewpoints that were not part of the original data. Unlike photogrammetry, which produces a polygonal 3D model, NeRFs offer a volumetric representation that can estimate colors and scene details from new viewpoints. The speaker uses Metashape's camera pose estimation as a starting point for the neural radiance field. The output includes additional scene context like the sky and distant mountains, which were not captured by photogrammetry. NeRFs are particularly useful for quickly capturing data for virtual production, where the perfect shot may not be feasible on location.

15:04

🎨 Gaussian Splatting for Real-time Visualization

The final technology discussed is Gaussian splatting, specifically 3D Gaussian splatting for real-time rendering of radiance fields. The speaker describes how the point cloud from the photogrammetry is imported into a real-time game engine, Unity, and visualized using a method that transforms the point cloud into a splatted representation. This method provides broader scene context and lets the user move around and set up camera paths within the engine. The video showcases a rendered sequence from Unity that resembles the NeRF video but with added clarity and the option to add further elements, such as a panoramic sky.

20:05

🌟 Enhancing Photogrammetry with Broader Context

The speaker concludes by demonstrating how photogrammetric models can be enhanced to include broader scene context, similar to NeRFs and Gaussian splatting. By adding a panoramic sphere to the background of the photogrammetric model, the speaker achieves a wider scene representation. The model is uploaded to Sketchfab, a web viewer for 3D models, where the panoramic background is displayed. The video ends with an invitation for viewers to engage with the content, subscribe for more, and explore the speaker's other videos on related topics.
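As a rough illustration of the panoramic-sphere idea, the sketch below maps a viewing direction to texture coordinates in an equirectangular panorama, which is the lookup a sphere-mapped background performs. The axis convention (Y up, looking along -Z lands on the center of the image) and the function name are assumptions made for this example; Sketchfab or any other viewer may use a different convention.

```python
import math
from typing import Tuple


def direction_to_equirect_uv(direction: Tuple[float, float, float]) -> Tuple[float, float]:
    """Map a 3D view direction to (u, v) in an equirectangular panorama.

    Assumed convention: Y is up, -Z is 'forward' and maps to the image
    center; u runs left to right in [0, 1], v runs top to bottom in [0, 1].
    """
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        raise ValueError("direction must be non-zero")
    x, y, z = x / length, y / length, z / length

    longitude = math.atan2(x, -z)                   # range [-pi, pi]
    latitude = math.asin(max(-1.0, min(1.0, y)))    # range [-pi/2, pi/2]

    u = 0.5 + longitude / (2.0 * math.pi)
    v = 0.5 - latitude / math.pi
    return u, v


if __name__ == "__main__":
    print(direction_to_equirect_uv((0.0, 0.0, -1.0)))  # forward: image center
    print(direction_to_equirect_uv((0.0, 1.0, 0.0)))   # straight up: top row
```

Multiplying `u` and `v` by the panorama's width and height gives the pixel to sample for the background behind the model.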

Keywords

Photogrammetry

Photogrammetry is a technique that uses photographs to create 3D models of objects or environments. It involves estimating camera positions and orientations, and then using these to reconstruct the 3D scene. In the video, the speaker uses photogrammetry to create a detailed model of a sandstone column called Church Rock, which is accurate and can be used for measurements such as area and volume.

Neural Radiance Fields (NeRF)

Neural Radiance Fields, often abbreviated as NeRF, is a method that uses machine learning to generate 3D representations of scenes from a set of 2D images. Unlike photogrammetry, which creates a mesh model, NeRF generates a volumetric representation that can be used to render novel viewpoints of the scene. In the video, the speaker trains a neural network on video frames to create a Radiance Field of Church Rock, which includes the surrounding sky and distant mountains.

Gaussian Splatting

Gaussian Splatting is a technique that represents a scene as a collection of 3D Gaussians: soft, semi-transparent blobs that each have a position, shape, opacity, and color. The scene is rendered by projecting, or "splatting", these Gaussians onto the image plane and blending them in depth order, so the result appears smooth and continuous rather than as discrete points. In the context of the video, the speaker uses 3D Gaussian Splatting to visualize the Radiance Field in real-time using a game engine, which allows for interactive exploration and the addition of extra elements to the scene.
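To make the blending step concrete, here is a minimal sketch that shades one pixel from Gaussians that are assumed to be already projected to 2D screen space. The `Splat2D` class and function names are hypothetical, and real implementations add tiling, alpha thresholds, and view-dependent color that are left out here.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Splat2D:
    mean: Tuple[float, float]          # splat center in pixels
    cov: Tuple[float, float, float]    # (a, b, c) for covariance [[a, b], [b, c]]
    opacity: float                     # peak opacity in [0, 1]
    color: Tuple[float, float, float]  # RGB in [0, 1]
    depth: float                       # distance from the camera


def splat_alpha(splat: Splat2D, px: float, py: float) -> float:
    """Opacity contributed by one splat at pixel (px, py)."""
    a, b, c = splat.cov
    det = a * c - b * b
    dx, dy = px - splat.mean[0], py - splat.mean[1]
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    mahalanobis_sq = (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return splat.opacity * math.exp(-0.5 * mahalanobis_sq)


def shade_pixel(splats: Iterable[Splat2D], px: float, py: float):
    """Front-to-back alpha blending; returns (rgb, remaining_transmittance)."""
    rgb = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for splat in sorted(splats, key=lambda s: s.depth):  # nearest first
        alpha = splat_alpha(splat, px, py)
        weight = transmittance * alpha
        for channel in range(3):
            rgb[channel] += weight * splat.color[channel]
        transmittance *= 1.0 - alpha
    return tuple(rgb), transmittance


if __name__ == "__main__":
    front = Splat2D((10.0, 10.0), (4.0, 0.0, 4.0), 0.5, (1.0, 0.0, 0.0), 1.0)
    back = Splat2D((10.0, 10.0), (4.0, 0.0, 4.0), 1.0, (0.0, 0.0, 1.0), 2.0)
    print(shade_pixel([back, front], 10.0, 10.0))
```

Because each pixel only needs a sort and a weighted sum rather than a neural network evaluation, this style of rendering suits real-time engines.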

Agisoft Metashape

Agisoft Metashape is a professional photogrammetry software used to create 3D models from a series of photographs. In the video, the speaker uses Metashape to process the frames from the video of Church Rock, estimating camera positions and creating a polygonal textured mesh that can be used for various applications such as 3D printing or game engine integration.

Nerfstudio

Nerfstudio is a software tool designed for working with Neural Radiance Fields. It allows users to train neural networks on image data and visualize the resulting 3D representations. In the video, the speaker uses Nerfstudio to train a model on video frames of Church Rock and explore the generated Radiance Field, which includes a broader scene context than the photogrammetry model.

Polygonal Textured Mesh

A polygonal textured mesh refers to a 3D model made up of polygons, which are the building blocks of 3D geometry. Each polygon is a flat, multi-sided shape that, when combined with others, forms the surface of the model. The model is then textured, meaning that images are mapped onto the surface to give it color and detail. In the video, the photogrammetry process results in a polygonal textured mesh of Church Rock.

Point Cloud

A point cloud is a collection of data points in a coordinate system, often representing the external surface of an object. In 3D scanning and photogrammetry, point clouds are used to capture the shape of a real-world object. In the video, the speaker mentions a point cloud representing the surface of Church Rock, which is then used in the creation of both the photogrammetry model and the Radiance Field.

UV Mapping

UV mapping is a 3D modeling process that assigns 2D images (textures) to the surfaces of a 3D model. It is similar to the way one would wrap a map around the Earth. In the video, the speaker uses UV mapping to apply the photographs as a texture to the 3D model of Church Rock, creating a photorealistic appearance.

Camera Pose Estimation

Camera pose estimation involves determining the position and orientation of a camera in space relative to the scene it is capturing. This is a crucial step in both photogrammetry and Neural Radiance Fields, as it provides the necessary information to reconstruct the 3D scene. In the video, the speaker discusses how camera poses estimated in Metashape are used both for the photogrammetric model and as the starting point for training in Nerfstudio.
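What a camera pose is used for can be shown with a small pinhole-projection sketch: given a pose, a 3D point is mapped to the pixel where it should appear in that frame. The world-to-camera convention (x_cam = R · X + t), the single focal length, and the absence of lens distortion are simplifying assumptions for this example, and the function name is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def project_point(point_world: Sequence[float],
                  rotation: Sequence[Sequence[float]],
                  translation: Sequence[float],
                  focal_px: float,
                  principal_point: Tuple[float, float]) -> Optional[Tuple[float, float]]:
    """Project a 3D world point into pixel coordinates with a pinhole model.

    rotation is a 3x3 world-to-camera matrix and translation a 3-vector,
    so that x_cam = R * X + t. Returns None for points behind the camera.
    """
    x_cam = [
        sum(rotation[row][k] * point_world[k] for k in range(3)) + translation[row]
        for row in range(3)
    ]
    if x_cam[2] <= 0.0:
        return None  # not visible: behind or exactly at the camera plane
    u = focal_px * x_cam[0] / x_cam[2] + principal_point[0]
    v = focal_px * x_cam[1] / x_cam[2] + principal_point[1]
    return (u, v)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # A point 5 units straight ahead lands on the principal point.
    print(project_point((0.0, 0.0, 5.0), IDENTITY, (0.0, 0.0, 0.0), 1000.0, (960.0, 540.0)))
```

Pose estimation is the inverse problem: finding the rotation and translation for every frame so that matched points project consistently across all images.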

Volumetric Representation

A volumetric representation is a 3D model that represents the density or intensity values within a continuous space, rather than a surface made of polygons. It is often used to simulate phenomena such as light scattering or to create detailed visual effects. In the video, the speaker contrasts the polygonal mesh of the photogrammetry model with the volumetric cloud generated by the Neural Radiance Field, which includes the surrounding environment.
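One common way to turn such a volumetric representation into a pixel color is to accumulate density and color samples along a camera ray. The sketch below follows the standard volume-rendering quadrature described in the NeRF literature; the function and parameter names are illustrative, and it is not Nerfstudio's code.

```python
import math
from typing import Sequence, Tuple


def composite_ray(densities: Sequence[float],
                  colors: Sequence[Tuple[float, float, float]],
                  deltas: Sequence[float]):
    """Accumulate samples along one ray, ordered from near to far.

    densities: volume density (sigma) at each sample
    colors:    RGB color predicted at each sample
    deltas:    distance between consecutive samples
    Returns (rgb, transmittance left after the last sample).
    """
    rgb = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for channel in range(3):
            rgb[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha
    return tuple(rgb), transmittance


if __name__ == "__main__":
    # Empty space followed by a dense red surface sample.
    print(composite_ray([0.0, 1e6],
                        [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
                        [0.1, 0.1]))
```

In a trained radiance field the densities and colors come from the network; a ray that hits nothing dense keeps a transmittance near 1, which is where background such as sky ends up being modeled by far-away samples.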

Real-Time Rendering

Real-time rendering is the process of generating 2D images or animations from a 3D model at the time of viewing, without precomputed frames. It is a key feature of game engines and other interactive applications. In the video, the speaker uses real-time rendering in Unity to visualize the Gaussian Splatted point cloud of the Radiance Field, allowing for interactive exploration and the creation of new camera paths.

Highlights

The video discusses three popular technologies: photogrammetry, neural radiance fields (NeRFs), and Gaussian splatting.

Photogrammetry is a technique for creating 3D models from photographs, which the speaker has used for 15 years.

Neural Radiance Fields (NeRFs) and Gaussian splatting are newer technologies used for visualization, which the speaker has explored for a few months.

The speaker uses Agisoft Metashape for photogrammetry processing and Nerfstudio for neural radiance field processing.

Photogrammetry involves estimating camera positions and creating a textured 3D model, which can be measured and scaled.

Neural Radiance Fields allow for the generation of novel viewpoints that were not captured in the original data.

Gaussian splatting visualizes a point cloud in a real-time game engine like Unity, offering interactivity and the ability to add additional elements to the scene.

Photogrammetry provides accurate and measurable 3D models, suitable for applications like architectural plans.

Neural Radiance Fields and Gaussian splatting are not metrically valid and are better suited for virtual production and artistic visualization.

The video demonstrates how to use camera pose estimation from Metashape as a starting point for a neural radiance field.

Nerfstudio can generate new videos with stylized camera movements that emulate the look and feel of first-person view drone videos.

Gaussian splatted point clouds can be imported into Unity for real-time visualization and further enhancement with additional elements.

Photogrammetric models can be shared easily and viewed in web viewers like Sketchfab, where a panoramic background can give the model broader scene context.

The speaker has collected numerous photogrammetric datasets over 15 years, which can be reused for new visualizations with these technologies.

The video includes a link to a zip file containing the data used in the demonstration for viewers to try processing themselves.

The speaker invites viewers to share their results and questions in the comments section for further discussion.

The video concludes with an invitation to subscribe for more content on neural radiance fields and Gaussian splatted videos.