Neural Bootstrapper

Abstract

Bootstrapping has been widely used for uncertainty quantification. However, conventional bootstrapping requires many repeated computations, so the computational cost of training neural networks becomes prohibitive, which is a major obstacle to its practical use with neural networks (especially in modern deep learning). To resolve this computational bottleneck, the authors of the paper propose the Neural Bootstrapper (NeuBoots), a generator of bootstrapped networks.

Unlike conventional bootstrapping, the proposed NeuBoots can be implemented with a single loss function and a single training run. This avoids the repeated training that conventional bootstrapping requires and makes bootstrap computation more efficient. The paper further shows, both theoretically and empirically, that NeuBoots asymptotically approximates the standard bootstrap distribution.

As a result, NeuBoots can be applied to uncertainty quantification in machine learning, and the paper shows that it can also be used for calibration, semantic segmentation, out-of-distribution detection, and active learning. NeuBoots also achieves state-of-the-art uncertainty quantification performance at a low computational cost.

Introduction

Since the bootstrap was introduced in 1979, it has become a primary tool for measuring uncertainty. For example, it has been used to evaluate standard errors, confidence intervals, and hypothetical null distributions. Despite being used extensively in statistics, the bootstrap has received little attention in neural network applications because of its computational intensity. To measure uncertainty, the bootstrap has to evaluate at least hundreds or thousands of models, which is impractical with neural networks because of the computational cost.
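To make the cost argument concrete, here is a minimal sketch of the classical nonparametric bootstrap used to estimate a standard error. It is an illustration only, not code from the paper: the function name `bootstrap_standard_error`, the choice of 1000 resamples, and the synthetic Gaussian data are all assumptions made for this example.

```python
import numpy as np

def bootstrap_standard_error(data, statistic, n_resamples=1000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = np.random.default_rng(seed)
    n = len(data)
    estimates = np.empty(n_resamples)
    for b in range(n_resamples):
        # Draw n indices with replacement and recompute the statistic.
        idx = rng.integers(0, n, size=n)
        estimates[b] = statistic(data[idx])
    # The spread of the recomputed statistics is the bootstrap standard error.
    return estimates.std(ddof=1)

# Synthetic data (an assumption for the demo): 200 draws from N(0, 1).
rng = np.random.default_rng(42)
sample = rng.normal(loc=0.0, scale=1.0, size=200)

se_boot = bootstrap_standard_error(sample, np.mean)
se_formula = sample.std(ddof=1) / np.sqrt(len(sample))  # textbook formula for the mean
print(se_boot, se_formula)
```

Here each resample only recomputes a mean, which is cheap. If the "statistic" were a trained neural network, every one of the 1000 iterations would be a full training run, which is the bottleneck described above.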

The Neural Bootstrapper introduced in the paper is designed to be usable with deep neural networks as well; it builds a generator function that maps bootstrap weights to bootstrap samples. In other words, NeuBoots is designed to generate the bootstrap distribution of neural networks, and the paper also applies it to convolutional neural networks.

NeuBoots is easy to apply to an existing neural network. A generator function that takes bootstrap weights as input is constructed, and the neural network wrapped by NeuBoots takes both the features of the target network and the bootstrap weights as input. In practice, the output of the bootstrap generator is formed by element-wise multiplication of the bootstrap weights and the hidden nodes of the last hidden layer. As a result, NeuBoots outputs bootstrap samples of the target network directly, without injecting randomness into the network parameters (which are typically very large in number). In other words, the randomness in NeuBoots comes from the input bootstrap weights, not from the model parameters.
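The mechanism described above can be sketched with a tiny NumPy network. This is a hedged illustration, not the authors' implementation: the class `TinyNeuBootsNet`, the helper `sample_boot_weights`, the two-layer architecture, and the choice of scaled Dirichlet(1, ..., 1) weights whose length equals the hidden width are all assumptions made for this example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class TinyNeuBootsNet:
    """Hypothetical two-layer network whose last hidden layer is scaled by bootstrap weights."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(scale=0.1, size=(hidden_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def forward(self, x, boot_weights):
        hidden = relu(x @ self.W1 + self.b1)   # (batch, hidden_dim)
        # Element-wise multiplication with the bootstrap weights (broadcast over the batch).
        modulated = hidden * boot_weights
        return modulated @ self.W2 + self.b2   # (batch, out_dim)

def sample_boot_weights(dim, rng):
    # Assumption: Dirichlet(1, ..., 1) scaled by `dim`, so each weight has mean 1.
    return dim * rng.dirichlet(np.ones(dim))

rng = np.random.default_rng(1)
net = TinyNeuBootsNet(in_dim=4, hidden_dim=8, out_dim=3)
x = rng.normal(size=(5, 4))

# Same parameters, two different weight draws: the outputs differ.
out_a = net.forward(x, sample_boot_weights(8, rng))
out_b = net.forward(x, sample_boot_weights(8, rng))
# With all weights equal to 1 the network reduces to the plain forward pass.
out_plain = net.forward(x, np.ones(8))
```

The parameters `W1`, `b1`, `W2`, and `b2` stay fixed across calls; only the weight vector passed to `forward` changes, which mirrors the point that the randomness lives in the input rather than in the model parameters.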

The paper proves theoretically that NeuBoots yields a valid bootstrap distribution. It first presents Vanilla NeuBoots, which constructs the exact bootstrap distribution, and then presents a version that applies the block bootstrap, a scalable approximation that considers blocks of data observations.

Furthermore, NeuBoots is computationally more efficient than methods such as MCDrop, Deep Ensemble, and Gaussian Process (GP) in both training and inference time. It is efficient to train because it is trained only once instead of training many neural networks. It is also efficient at inference, since randomly generated bootstrap weights are simply fed into the model.
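A rough sketch of what such an inference step could look like is given below. It assumes, for illustration only, that the bootstrap weights enter at the last hidden layer, so the hidden activations of an input are computed once and only the final linear layer is re-evaluated per weight draw. The function `predict_with_uncertainty`, the scaled Dirichlet weights, the softmax output, and the random stand-in values are hypothetical choices, not the paper's code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_with_uncertainty(features, W_out, b_out, n_draws=100, seed=0):
    """features: (hidden_dim,) last-hidden-layer activations for one input."""
    rng = np.random.default_rng(seed)
    hidden_dim = features.shape[0]
    # Assumption: scaled Dirichlet(1, ..., 1) weights, one row per bootstrap draw.
    weights = hidden_dim * rng.dirichlet(np.ones(hidden_dim), size=n_draws)
    # All draws are evaluated in one matrix product; no retraining is involved.
    logits = (weights * features) @ W_out + b_out   # (n_draws, n_classes)
    probs = softmax(logits)
    # Mean is the prediction; the spread across draws is the uncertainty estimate.
    return probs.mean(axis=0), probs.std(axis=0)

# Random stand-ins for a trained network's activations and output layer.
rng = np.random.default_rng(3)
features = np.abs(rng.normal(size=16))   # non-negative, like post-ReLU activations
W_out = rng.normal(size=(16, 4))
b_out = np.zeros(4)

mean_probs, std_probs = predict_with_uncertainty(features, W_out, b_out)
print(mean_probs, std_probs)
```

By contrast, a deep ensemble would need one full forward pass per member (and one training run per member), while this sketch reuses a single set of activations for every draw.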

Materials and Methods