3D Reconstruction Using Images From Aerial Vehicles

Project Goals and Deliverables:
The goal of our project is to create 3D models of buildings and monuments of historic importance using images taken by an aerial vehicle. To create 3D models from images, we use a range imaging technique called Structure from Motion (SfM), which estimates 3D structure from images taken from different viewpoints. We also plan to enable the aerial vehicle to “scan” buildings automatically, without a human pilot. We hope to achieve this by using beacons on the ground to localize the aerial vehicle in its environment (using RF trilateration, for example). Once the robot is localized, it can be made to fly around a bounding box enclosing the building, capturing images of the building from multiple viewpoints. These images can then be fed into any SfM algorithm to obtain the 3D model.

Our project relies on two major principles: Structure from Motion (SfM) and trilateration.

Structure from Motion (SfM):
As mentioned earlier, SfM is a range imaging technique in which the 3D structure of objects is estimated from images taken from multiple viewpoints. Images from different viewpoints can be thought of as images produced by the motion of the imaging device (such as a camera), hence the name: (3D) Structure from (images due to) Motion. This technique is well known in the field of computer vision and has been studied extensively over the past few decades.

Source: Jianxiong Xiao, Princeton Vision Group
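
As a minimal illustration of how depth can be recovered from two viewpoints, the Python sketch below triangulates one scene point from two viewing rays using the midpoint method. It assumes the camera centers and ray directions are already known, which is a simplification: a full SfM pipeline also has to estimate the camera poses from the images. The function name and its inputs are our own, for illustration only, and are not taken from any SfM library.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Return the 3D point closest to two viewing rays (midpoint method).

    c1, c2: camera centers; d1, d2: ray directions through the matched pixel.
    All are 3-element sequences. Camera poses are assumed known (hypothetical
    setup for illustration).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no depth can be recovered")

    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2

    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    # With noisy rays p1 and p2 differ slightly; return their midpoint.
    return [(u + v) / 2 for u, v in zip(p1, p2)]
```

The parallel-ray check reflects why viewpoints must actually differ: without a baseline between the two cameras, the rays never constrain the depth of the point.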

Agarwal et al. [1], in their 2009 paper, reported building a 3D model of the entire city of Rome from 150K images taken from the image-sharing website Flickr in less than a day on a cluster with 500 compute cores. Building a 3D model from such a large number of images is computationally expensive and time-consuming. Given our limited computational resources, we do not aim to reconstruct an entire city but rather a few selected monuments and buildings.

Furthermore, drones today are equipped with high-definition cameras with high frame rates (~30 high-definition frames per second), so even a short flight yields thousands of high-definition images. This volume of data has to be dealt with somehow. Not every frame provides new information about the scene: two successive frames, only tens of milliseconds apart, capture almost exactly the same scene. Hence we need to pick representative frames from a pool of thousands in order to reduce the amount of data used for 3D reconstruction. Such a “key-frame extraction” process greatly reduces the cost of 3D reconstruction, since we work with far fewer images. Key-frame extraction can be carried out in different ways. A simple strategy is to keep only every n-th frame. This approach may not always work, however, especially when the robot is performing a rapid maneuver (turning around a corner of the building, for example), so we might need a more sophisticated key-frame extraction strategy for such cases.
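
The two key-frame strategies mentioned above could be sketched in Python as follows. This is only a rough sketch under our own assumptions: frames are represented as flat lists of grayscale intensities, the change measure is a plain mean absolute pixel difference, and the function names, stride, and threshold are hypothetical. A real pipeline might instead compare matched features or estimated camera motion.

```python
def keyframes_fixed_stride(num_frames, stride):
    """Indices of every `stride`-th frame, starting with frame 0."""
    return list(range(0, num_frames, stride))


def mean_abs_diff(frame_a, frame_b):
    """Mean absolute intensity difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def keyframes_by_change(frames, threshold):
    """Keep a frame once it differs enough from the last kept frame.

    Unlike the fixed stride, this keeps more frames during rapid maneuvers
    (when the view changes quickly) and fewer while the view is almost static.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[i]) >= threshold:
            kept.append(i)
    return kept
```

For example, `keyframes_fixed_stride(10, 4)` keeps frames 0, 4 and 8 regardless of content, whereas `keyframes_by_change` only keeps a new frame once the scene has visibly changed since the previous key frame.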

Also, since SfM does not work well with highly regular texture patterns because of potential feature mismatches, we need to make sure that the buildings we scan do not have such patterns. For example, the image below shows a building with a repeated pattern of identical-looking windows, which may not be suitable for 3D reconstruction using SfM.


We plan to use an existing open-source SfM implementation. Several open-source SfM libraries are available, with OpenMVG (open Multiple View Geometry) being one of the most promising. We also plan to use commonly available 3D software such as Blender (http://www.blender.org/) to visualize the generated 3D models.

Trilateration:
Trilateration is a position estimation technique that uses distance measurements and simple circle/sphere geometry. We plan to use simple trilateration with beacons on the ground to localize the robot. Once we know the position of the robot with respect to a known reference frame, we can write a controller that makes the robot fly around a bounding box surrounding the building automatically. There are several ways to implement a localization scheme; [2] discusses various localization schemes for aerial vehicles in outdoor scenarios. Ultra-wideband (UWB) radio seems to be one of the most promising techniques for outdoor localization, given its accuracy (centimeter range) and its robustness against the interference and multipath problems common in urban settings. However, we will explore other techniques as well and implement the one that best fits our goals.
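
To make the geometry concrete, here is a minimal Python sketch of trilateration with three beacons lying on the ground (z = 0). It assumes noise-free range measurements, non-collinear beacons, and a vehicle flying above the ground plane; the function name and coordinate conventions are our own and do not correspond to any particular ranging hardware or library. With real, noisy ranges one would more likely use additional beacons and a least-squares fit.

```python
import math


def trilaterate_ground_beacons(beacons, distances):
    """Estimate (x, y, z) from ranges to three ground beacons at z = 0.

    beacons: three (x, y) beacon positions, not collinear.
    distances: measured ranges to each beacon, in the same order.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances

    # Each range defines a sphere: (x - xi)^2 + (y - yi)^2 + z^2 = di^2.
    # Subtracting the first sphere from the other two cancels the quadratic
    # terms (including z), leaving a 2x2 linear system in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 + y2**2 - x1**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 + y3**2 - x1**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons are collinear; position is ambiguous")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det

    # Height from the first sphere; the vehicle is assumed to be above ground,
    # so we take the non-negative root.
    z_squared = d1**2 - (x - x1)**2 - (y - y1)**2
    z = math.sqrt(max(z_squared, 0.0))  # clamp small negatives caused by noise
    return x, y, z
```

Because all beacons share the same height, the ranges alone cannot distinguish a point above the ground from its mirror image below it; the assumption that the vehicle flies above the beacons resolves that ambiguity.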


After preliminary research into the drones available on the market, we have decided to use the Bebop quadcopter by Parrot (http://www.parrot.com/usa/products/bebop-drone/) for our project. For a reasonable price it offers decent battery life (22 minutes on two batteries), a good control range (up to 2 km using range extenders) and a good wide-angle, high-definition camera.

The following are the milestones we hope to achieve during the course of the project:

  1. Research regulations regarding UAV flights in Lausanne and in Venice. [1 week]
  2. Experiment with various available SfM libraries and have the 3D reconstruction module ready. [2 weeks]
  3. Have all the necessary hardware ready, including spare parts for the drone and electronics such as beacons and receivers. [1 week]
  4. Research various localization techniques and implement the most feasible one for the task at hand. [4 weeks]
  5. Write a robot controller that makes the robot fly around the bounding box in an automated fashion (lateral scans at different heights). [3 weeks]
  6. Flight testing and debugging. [1 week]
  7. Deployment: scan selected buildings and monuments in Venice. [1 week]
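
As a starting point for milestone 5, the scan pattern (lateral laps around the bounding box at different heights) could be generated along the lines of the Python sketch below. The function name, the safety margin parameter, and the use of a local metric coordinate frame are our own assumptions for illustration. An actual controller would also need to handle camera heading, intermediate waypoints, and obstacle clearance.

```python
def bounding_box_waypoints(x_min, y_min, x_max, y_max, heights, margin=0.0):
    """Waypoints for one closed lap around a rectangle at each height.

    The rectangle is the building footprint grown by `margin` on every side.
    Returns a list of (x, y, z) tuples in a local metric frame (hypothetical
    convention: the same frame in which the vehicle is localized).
    """
    x0, y0 = x_min - margin, y_min - margin
    x1, y1 = x_max + margin, y_max + margin
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]

    waypoints = []
    for z in heights:
        # Visit the four corners, then return to the first to close the lap.
        for x, y in corners + corners[:1]:
            waypoints.append((x, y, z))
    return waypoints
```

For a 20 m by 10 m footprint with a 5 m margin and two scan heights, `bounding_box_waypoints(0, 0, 20, 10, [5, 10], margin=5)` yields ten waypoints: five per lap, each lap starting and ending at the same corner.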



[1] S. Agarwal, N. Snavely, I. Simon, S. M. Seitz, and R. Szeliski. Building Rome in a Day. In ICCV, 2009. (http://grail.cs.washington.edu/rome/rome_paper.pdf)

[2] N. Pacholski. Extending the Sensor Edge, Smart Drone Positioning System. Final thesis. (https://njpacholski.files.wordpress.com/2014/08/sdps_final_njpacholski.pdf)