A Poisoning Attack Against 3D Gaussian Splatting

A new research collaboration between Singapore and China has proposed a method for attacking the popular synthesis method 3D Gaussian Splatting (3DGS).

The new attack method uses crafted source data to overload the available GPU memory of the target system, and to make training so lengthy as to potentially incapacitate the target server, equivalent to a denial-of-service (DOS) attack. Source: https://arxiv.org/pdf/2410.08190

The attack uses crafted training images of such complexity that they are likely to overwhelm an online service that allows users to create 3DGS representations.

This approach is facilitated by the adaptive nature of 3DGS, which is designed to add as much representational detail as the source images require for a realistic render. The method exploits both crafted image complexity (textures) and shape (geometry).
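
To illustrate the adaptive behaviour the attack exploits, the following is a minimal sketch, not the official 3DGS code, of the kind of gradient-driven densification step that keeps adding Gaussians wherever the source imagery demands more detail; the threshold names and values are illustrative only.

```python
import numpy as np

# Illustrative sketch (not the official 3DGS implementation) of adaptive densification:
# Gaussians whose accumulated view-space gradient exceeds a threshold are duplicated.
# Small ones are cloned; large ones gain a smaller, jittered sibling. Highly detailed
# imagery keeps these gradients high, so the Gaussian count, and with it memory and
# training time, keeps growing.

GRAD_THRESHOLD = 2e-4    # illustrative value
SCALE_THRESHOLD = 0.01   # illustrative value

def densify(positions, scales, accum_grads):
    """Append cloned/split Gaussians for every Gaussian over the gradient threshold."""
    hot = accum_grads > GRAD_THRESHOLD
    clone = hot & (scales <= SCALE_THRESHOLD)   # under-reconstructed regions: clone
    split = hot & (scales > SCALE_THRESHOLD)    # over-stretched Gaussians: split

    jitter = np.random.normal(scale=scales[split, None], size=(int(split.sum()), 3))
    new_positions = np.concatenate([positions, positions[clone], positions[split] + jitter])
    new_scales = np.concatenate([scales, scales[clone], scales[split] / 1.6])
    return new_positions, new_scales

# Toy usage: the Gaussian count only ever grows as image complexity drives gradients up.
pos, scl = np.random.rand(1000, 3), np.random.rand(1000) * 0.02
grads = np.random.rand(1000) * 4e-4
pos, scl = densify(pos, scl, grads)
print(len(pos), "Gaussians after one densification step")
```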

The attack system ‘poison-splat’ is aided by a proxy model that estimates and iterates the potential of source images to add complexity and Gaussian Splat instances to a model, until the host system is overwhelmed.

The paper asserts that online platforms – such as LumaAI, KIRI, Spline and Polycam – are increasingly offering 3DGS-as-a-service, and that the new attack method – titled Poison-Splat – is potentially capable of pushing the 3DGS algorithm towards ‘its worst computation complexity’ on such domains, and even of facilitating a denial-of-service attack.

According to the researchers, 3DGS could be radically more vulnerable than other online neural training services. Conventional machine learning training procedures set parameters at the outset, and thereafter operate within constant and relatively consistent levels of resource usage and power consumption. Lacking the ‘elasticity’ that Gaussian Splatting requires when assigning splat instances, such services are difficult to target in the same manner.

Furthermore, the authors note, service providers cannot defend against such an attack by limiting the complexity or density of the model, since this would cripple the effectiveness of the service under normal use.

From the new work, we see that a host system which limits the number of assigned Gaussian Splats cannot function normally, since the elasticity of these parameters is a fundamental feature of 3DGS.

The paper states:

‘[3DGS] models trained under these defensive constraints perform much worse compared to those with unconstrained training, particularly in terms of detail reconstruction. This decline in quality occurs because 3DGS cannot automatically distinguish necessary fine details from poisoned textures.

‘Naively capping the number of Gaussians will directly lead to the failure of the model to reconstruct the 3D scene accurately, which violates the primary goal of the service provider. This study demonstrates more sophisticated defensive strategies are necessary to both protect the system and maintain the quality of 3D reconstructions under our attack.’
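
For illustration, the kind of naive cap the authors argue against might look like the sketch below; the budget value and function name are hypothetical, and the point is simply that once the cap is hit, legitimate fine detail can no longer attract new Gaussians either.

```python
MAX_GAUSSIANS = 3_000_000   # hypothetical per-job budget set by a service provider

def should_densify(num_gaussians: int, accum_grad: float,
                   grad_threshold: float = 2e-4) -> bool:
    """Gate 3DGS-style densification behind a hard cap on the Gaussian count."""
    if num_gaussians >= MAX_GAUSSIANS:
        # The server is protected, but genuine fine detail that would also need
        # extra Gaussians is lost: the quality failure the paper describes.
        return False
    return accum_grad > grad_threshold

print(should_densify(3_000_000, accum_grad=5e-4))  # False: cap reached, detail refused
print(should_densify(1_000_000, accum_grad=5e-4))  # True: densification still allowed
```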

In tests, the attack has proved effective both in a loosely white-box scenario (where the attacker has knowledge of the victim’s resources), and a black box approach (where the attacker has no such knowledge).

The authors believe that their work represents the first attack method against 3DGS, and warn that the neural synthesis security research sector is unprepared for this kind of approach.

The new paper is titled Poison-splat: Computation Cost Attack on 3D Gaussian Splatting, and comes from five authors at the National University of Singapore, and Skywork AI in Beijing.

Method

The authors analyzed the extent to which the number of Gaussian Splats (essentially, three-dimensional ellipsoid ‘pixels’) assigned to a model under a 3DGS pipeline affects the computational costs of training and rendering the model.

The authors’ study reveals a clear correlation between the number of assigned Gaussians and training time costs, as well as GPU memory usage.

The right-most figure in the image above indicates the clear relationship between image sharpness and the number of Gaussians assigned. The sharper the image, the more detail is seen to be required to render the 3DGS model.

The paper states*:

‘[We] find that 3DGS tends to assign more Gaussians to those objects with more complex structures and non-smooth textures, as quantified by the total variation score—a metric assessing image sharpness. Intuitively, the less smooth the surface of 3D objects is, the more Gaussians the model needs to recover all the details from its 2D image projections.

‘Hence, non-smoothness can be a good descriptor of complexity of [Gaussians]’
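
As a concrete illustration of that sharpness measure, the snippet below computes an anisotropic total variation score for an image; the exact formulation and normalisation the authors use may differ.

```python
import numpy as np

def total_variation(img: np.ndarray) -> float:
    """Sum of absolute differences between vertically and horizontally adjacent pixels."""
    img = img.astype(np.float64)
    dv = np.abs(np.diff(img, axis=0)).sum()   # vertical neighbour differences
    dh = np.abs(np.diff(img, axis=1)).sum()   # horizontal neighbour differences
    return float(dv + dh)

# A smooth gradient scores far lower than high-frequency noise of the same size;
# the latter is the kind of content that drives up the Gaussian count.
smooth = np.tile(np.linspace(0.0, 1.0, 256), (256, 1))
noisy = np.random.rand(256, 256)
print(f"TV(smooth) = {total_variation(smooth):.1f}")
print(f"TV(noisy)  = {total_variation(noisy):.1f}")
```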

However, naively sharpening images will tend to affect the semantic integrity of the 3DGS model so much that an attack would be obvious at the early stages.

Poisoning the data effectively requires a more sophisticated approach. The authors have adopted a proxy model method, wherein the attack images are optimized against an offline 3DGS model developed and controlled by the attackers.

On the left, we see a graph representing the overall cost of computation time and GPU memory occupancy on the MIP-NeRF360 ‘room’ dataset, demonstrating native performance, naïve perturbation and proxy-driven data. On the right, we see that naïve perturbation of the source images (red) leads to catastrophic results too early in the process, while the proxy-guided source images sustain a stealthier, cumulative attack.

The authors state:

‘It is evident that the proxy model can be guided from non-smoothness of 2D images to develop highly complex 3D shapes.

‘Consequently, the poisoned data produced from the projection of this over-densified proxy model can produce more poisoned data, inducing more Gaussians to fit these poisoned data.’

The attack system constrains its perturbations according to the bounds introduced by a 2013 Google/Facebook collaboration with various universities, so that the system can inflict damage without visibly degrading the reconstructed 3DGS scene, which would be an early signal of an incursion.
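
In outline, the poisoning of a single view can be pictured as the heavily simplified sketch below: an epsilon-bounded perturbation is optimised to make a differentiable proxy reconstruction less smooth. The function names, the stand-in `render_proxy` callable, and all hyperparameters here are hypothetical; the authors' actual optimisation, which jointly updates the proxy 3DGS model and the images, is considerably more involved.

```python
import torch

def tv_loss(img: torch.Tensor) -> torch.Tensor:
    """Anisotropic total variation of an image tensor shaped (C, H, W)."""
    return (img[:, 1:, :] - img[:, :-1, :]).abs().sum() + \
           (img[:, :, 1:] - img[:, :, :-1]).abs().sum()

def poison_view(clean: torch.Tensor, render_proxy, eps: float = 16 / 255,
                steps: int = 50, lr: float = 2 / 255) -> torch.Tensor:
    """Perturb `clean` so the proxy's rendering becomes less smooth, within an eps ball."""
    delta = torch.zeros_like(clean, requires_grad=True)
    for _ in range(steps):
        rendered = render_proxy(clean + delta)     # stand-in for rendering the proxy model
        loss = -tv_loss(rendered)                  # maximise non-smoothness
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()        # signed gradient step
            delta.clamp_(-eps, eps)                # stay inside the L-infinity bound
            delta.copy_((clean + delta).clamp(0, 1) - clean)  # keep pixel values valid
            delta.grad.zero_()
    return (clean + delta).clamp(0, 1).detach()

# Toy usage: with an identity 'renderer' this reduces to bounded image sharpening.
clean_view = torch.rand(3, 64, 64)
poisoned = poison_view(clean_view, render_proxy=lambda x: x)
print(float((poisoned - clean_view).abs().max()))  # never exceeds eps
```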

Data and Tests

The researchers tested poison-splat against three datasets: NeRF-Synthetic; Mip-NeRF360; and Tanks-and-Temples.

They used the official implementation of 3DGS as a victim environment. For a black box approach, they used the Scaffold-GS framework.

The tests were carried out on an NVIDIA A800-SXM4-80G GPU.

For metrics, the number of Gaussian splats produced was the primary indicator, since the intention is to craft source images that push the Gaussian count far beyond what a faithful reconstruction of the scene would require. The rendering speed of the target victim system was also considered.
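
As a rough indication of how such victim-side costs might be recorded, the sketch below wraps a hypothetical `train_3dgs` entry point and reads back PyTorch's peak-memory counter; only the `torch.cuda` calls are real API, everything else is a placeholder.

```python
import time
import torch

def measure_victim_cost(train_3dgs, scene_path: str) -> dict:
    """Record Gaussian count, wall-clock training time and peak GPU memory for one job."""
    torch.cuda.reset_peak_memory_stats()
    start = time.time()
    model = train_3dgs(scene_path)             # hypothetical training entry point
    elapsed = time.time() - start
    return {
        "num_gaussians": model.num_gaussians,  # hypothetical attribute on the trained model
        "train_minutes": elapsed / 60,
        "peak_gpu_gb": torch.cuda.max_memory_allocated() / 1024 ** 3,
    }
```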

The results of the initial tests are shown below:

Full results of the test attacks across the three datasets. The authors observe that they have highlighted attacks that successfully consume more than 24GB of memory. Please refer to the source paper for better resolution.

Of these results, the authors comment:

‘[Our] Poison-splat attack demonstrates the ability to craft a huge extra computational burden across multiple datasets. Even with perturbations constrained within a small range in [a constrained] attack, the peak GPU memory can be increased to over 2 times, making the overall maximum GPU occupancy higher than 24 GB.

‘[In] the real world, this may mean that our attack may require more allocable resources than common GPU stations can provide, e.g., RTX 3090, RTX 4090 and A5000. Furthermore [the] attack not only significantly increases the memory usage, but also greatly slows down training speed.

‘This property would further strengthen the attack, since the overwhelming GPU occupancy will last longer than normal training may take, making the overall loss of computation power higher.’

The progress of the proxy model in both a constrained and an unconstrained attack scenario.

The tests against Scaffold-GS (the black box model) are shown below. The authors state that these results indicate that poison-splat generalizes well to a notably different architecture (i.e., one other than the reference 3DGS implementation).

Test results for black box attacks on the NeRF-Synthetic and MIP-NeRF360 datasets.

The authors note that there have been very few studies centering on this kind of resource-targeting attack on inference processes. The 2020 paper Energy-Latency Attacks on Neural Networks was able to identify data examples that trigger excessive neuron activations, leading to debilitating energy consumption and poor latency.

Inference-time attacks were studied further in subsequent works such as Slowdown attacks on adaptive multi-exit neural network inference, Towards Efficiency Backdoor Injection, and, for language models and vision-language models (VLMs), in NICGSlowDown and Verbose Images.

Conclusion

The Poison-splat attack developed by the researchers exploits a fundamental vulnerability in Gaussian Splatting – the fact that it assigns complexity and density of Gaussians according to the material that it is given to train on.

The 2024 paper F-3DGS: Factorized Coordinates and Representations for 3D Gaussian Splatting has already observed that Gaussian Splatting’s arbitrary assignment of splats is an inefficient method that also frequently produces redundant instances:

‘[This] inefficiency stems from the inherent inability of 3DGS to utilize structural patterns or redundancies. We observed that 3DGS produces an unnecessarily large number of Gaussians even for representing simple geometric structures, such as flat surfaces.

‘Moreover, nearby Gaussians sometimes exhibit similar attributes, suggesting the potential for enhancing efficiency by removing the redundant representations.’

Since constraining Gaussian generation undermines the quality of reproduction in non-attack scenarios, the growing number of online providers that offer 3DGS from user-uploaded data may need to study the characteristics of source imagery in order to determine signatures that indicate a malicious intention.

In any case, the authors of the new work conclude that more sophisticated defense methods will be necessary for online services in the face of the kind of attack that they have formulated.

* My conversion of the authors’ inline citations to hyperlinks

First published Friday, October 11, 2024