Photogrammetry and Visual Effects

The benefits, pitfalls, and concepts of photogrammetry

January 31, 2017

• Landscape Mapping capture setups:
Landscape mapping captures can be done with whatever can get your camera high enough and is able to travel the distance you want to cover. There are some drones that can do this with ease. Some software can even help you plot out a course to capture.
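
To give a feel for what that course plotting involves, here is a minimal sketch of the arithmetic behind it. It assumes a camera pointing straight down over flat ground, and the altitude, sensor size, focal length, and 70% overlap used in the example are made-up illustration values, not recommendations from any particular drone or software package. The function names are hypothetical too.

```python
def ground_footprint_m(altitude_m, sensor_dim_mm, focal_length_mm):
    """Ground distance covered by one image dimension, camera pointing straight down."""
    return altitude_m * sensor_dim_mm / focal_length_mm


def shot_spacing_m(footprint_m, overlap):
    """Distance between shots (or flight lines) so neighbours share `overlap` (0 to 1)."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_m * (1.0 - overlap)


def lawnmower_waypoints(width_m, length_m, line_spacing_m):
    """Turn points (x, y) in metres for back-and-forth passes over a rectangle."""
    if line_spacing_m <= 0:
        raise ValueError("line_spacing_m must be positive")
    waypoints = []
    x = 0.0
    going_up = True
    while x <= width_m + 1e-9:
        start, end = (0.0, length_m) if going_up else (length_m, 0.0)
        waypoints.append((x, start))
        waypoints.append((x, end))
        going_up = not going_up
        x += line_spacing_m
    return waypoints


# Assumed example: 100 m altitude, 36 mm wide sensor, 24 mm lens, 70% side overlap.
footprint = ground_footprint_m(100, 36, 24)
spacing = shot_spacing_m(footprint, 0.7)
print("footprint across track (m):", footprint)
print("flight line spacing (m):", round(spacing, 1))
for point in lawnmower_waypoints(90.0, 200.0, 45.0):
    print(point)
```

More overlap means tighter flight lines and more photos, which ties back to the need for plenty of consistent matching detail between images.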

Some other info and links to check out:
Intro to UAV Photogrammetry
dronemapper.com
sensefly.com
simactive.com

• Room capture setups:
The approach for rooms can vary depending on how much clutter or open space there is, and how much detail you actually need. The general approach is to start at the corner of the room, assuming you are in a standard four wall room, and face the opposite wall. You’ll want to follow the perimeter of the room taking shots of the opposite wall, side stepping after each set of shots. Think copy machine again. You have to take a series of image slices, and then go back for any of the detailed elements you need, like lamps, furniture, counters, pillows, etc…

One of my first Photogrammetry attempts was with my iPhone 5s and I was able to get this result. Not spectacular, but definitely a great start compared to modeling the room by hand. It has the location of everything in the room and it’s to scale. I’d rather start with this than with nothing.

• Human capture setups:
Human capture has been by far the biggest hurdle and the most popular subject for Visual Effects. The majority of the capture systems out there for human talent are studio based. By this I mean that they will have a locked off area in their office reserved for the capture space. These studio rigs can have anywhere from 40 to 150 DSLR cameras and are primarily set up for full body capture. You can see in these links the incredibly large camera arrays at Infinite Realities and Ten 24.

The talent then poses in the center of that space and a master control system fires all of the cameras simultaneously. The cameras are all tethered to multiple computers to handle the large amount of image data being downloaded. With all the cameras firing in sync, your capture only takes a fraction of a second. This speed makes capturing multiple poses in a short time period a lot easier, and also makes moving action poses like jumping possible. Lots of soft light is usually required too, to get good image exposures and minimize shadows on the talent.

A smaller portable setup is possible using Raspberry Pis, but is limited in quality due to the smaller megapixel camera sensors. This Instructables post explains how. This one over at Pi 3D Scan is a bit fancier with extra coverage.

Smaller rigs are definitely still useful, and if you are focusing on capturing only the face or head of a person, then less space is required. This early test I did was with one camera and only 17 photos.

Alternative setups include using a turntable that your talent stands on, like this one over at Thingiverse. You get to use fewer cameras, somewhere between one and four, but there are mixed feelings about this approach. Some have had success, others not so much. Your capture time goes way up and your talent has to play statue the entire time, and not get dizzy from spinning. If they shake, or slightly shift any part of their body, including their clothing swaying, it will make for a bad capture and cause major headaches in post when solving the data to create a 3D model. In some cases the solve completely fails, and requires either a new capture session or an extremely skilled ZBrush sculpting artist to fix it.

If, however, you are capturing an object or product, then a turntable makes total sense. I did this capture recently for my cousin’s wife. She’s a sculptor and was selling these zombie ornaments last year. The scan came out really well using just a two camera setup and a simple lazy susan condiment tray as my turntable. Adding a simple circular chart marking every 11 degrees or so for reference helped to space out my shots evenly. I explain more in this next video.
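
As a quick sanity check on that turntable chart, here is a small sketch that lists the stop angles for one full revolution and the resulting photo count. The 11 degree step and the two cameras come from the setup described above. The function name is just for illustration.

```python
import math


def turntable_stops(step_deg):
    """Angles (degrees) to stop at for one revolution, starting from 0."""
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    count = math.ceil(360 / step_deg)
    return [i * step_deg for i in range(count)]


stops = turntable_stops(11)
cameras = 2
print("stops per revolution:", len(stops))        # 33
print("last stop (degrees):", stops[-1])          # 352
print("photos with two cameras:", cameras * len(stops))  # 66
```

Since 11 does not divide evenly into 360, the gap between the last stop and the starting mark is a little smaller than the rest, which is harmless for the solve.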

There is also the infamous single camera capture setup. The same outcomes apply here as with the turntable. It is possible, but the success rate is very unstable. I know this particular one from personal experience. The capture time also stretches to about six to eight minutes instead of just fractions of a second. This is a crazy long time to have someone stand still and not take a break. You might be able to cut that time down a little if you can move fast enough with the camera around your talent. A warning too: going faster tends to be difficult when using photography flash strobes, since they need a slight recharge after each shot. The repeated flashes can also trigger a seizure if your talent is epileptic, so be careful about that. If you capture the average 80-150 photos needed, that’s gonna be a lot of flashing the talent has to withstand. Special thanks to Brandon Parvini and Rachael Joyce for being so patient when I did these captures.
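
To put those numbers in perspective, this little sketch works out the pace implied by a six to eight minute session and 80 to 150 photos. It is plain arithmetic on the figures above. It assumes an even pace for the whole session, and the function names are hypothetical.

```python
def seconds_per_photo(total_minutes, photo_count):
    """Average time available for each photo, including moving and flash recharge."""
    return total_minutes * 60.0 / photo_count


def pace_range(min_minutes, max_minutes, min_photos, max_photos):
    """Fastest and slowest average pace (seconds per photo) for a session."""
    fastest = seconds_per_photo(min_minutes, max_photos)
    slowest = seconds_per_photo(max_minutes, min_photos)
    return fastest, slowest


fastest, slowest = pace_range(6, 8, 80, 150)
print("fastest pace (s per photo):", round(fastest, 1))
print("slowest pace (s per photo):", round(slowest, 1))
```

Every one of those seconds has to cover stepping to the next position, framing, focusing, and waiting on the strobes, which is why shaving time off is harder than it sounds.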

The number of variables that can affect the process is quite large, more so when using fewer cameras. A lot of this process is trial and error, and can be tedious at times.

Any matchmovers out there will already have some relatable knowledge of how to prime a space, because a lot of the same issues for motion tracking apply to photogrammetry as well. They are both image based processes, and are highly dependent on consistent matching pixel information. Details MUST be consistent and able to be found in at least three other images.

So for example, you have a glass bottle on a nice wood grain table you want to capture/track… ok yes, this is a ridiculous scenario, but it helps explain the point better. You spend the time capturing this bottle and you have 120 images. You’ve got your bases covered, every possible angle. The images are sharp and fully in focus, but the software gives you nothing but the wood table and a ton of random floating blobs in the air. No 3D bottle. Why? Lack of consistent matching information, that’s why. Glass refracts and distorts the view of anything behind it from the camera’s viewpoint. This pixel information changes at every possible angle you can think of, therefore the software has no consistent matching information to extrapolate a three-dimensional location for that point.

The same goes for reflective surfaces like mirrors and lacquer. They also alter the actual surface information by introducing data from the surrounding area. So it’s subtle details like this, the ones people don’t think about, that can make or break your final product.
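
The rule about a detail showing up in at least three other images can be pictured as simple bookkeeping. The sketch below is a toy illustration only: real photogrammetry and matchmoving software detect and match features in its own way, and the feature names, image names, and functions here are all made up for the example.

```python
from collections import defaultdict


def count_views(matches):
    """Group (feature_id, image_name) pairs into feature_id -> set of images."""
    views = defaultdict(set)
    for feature_id, image_name in matches:
        views[feature_id].add(image_name)
    return dict(views)


def usable_features(matches, min_other_images=3):
    """Features seen in one image plus at least `min_other_images` others."""
    needed = 1 + min_other_images
    views = count_views(matches)
    return sorted(fid for fid, images in views.items() if len(images) >= needed)


# Hypothetical matches: the wood grain is found again and again,
# while the refraction inside the glass only matches by chance.
matches = [
    ("wood_knot", "img_001"), ("wood_knot", "img_002"),
    ("wood_knot", "img_003"), ("wood_knot", "img_004"),
    ("wood_knot", "img_005"),
    ("bottle_refraction", "img_001"), ("bottle_refraction", "img_002"),
]
print(usable_features(matches))  # ['wood_knot']
```

The table survives because its detail looks the same from every angle. The bottle drops out because whatever was seen through the glass never repeats often enough.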

Photographers have relatable knowledge that can help too. Reflections can be minimized or removed with circular polarizers, and faster shutter speeds help remove motion blur. DOF (depth of field) can be widened with aperture settings to keep more of the subject nice and sharp, and softbox strobes can help even out the lighting and minimize dark shadowed areas. If you only want the shape of the object and are not worried about the visual appearance, you can use baby powder or a paint of some kind to create a texture that will work. A good tutorial on this is on Ten24’s 3d Scan Store site.
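
For a rough idea of how much the aperture changes depth of field, here is a sketch using the standard thin lens approximations for hyperfocal distance and near/far focus limits. The 0.03 mm circle of confusion is a common assumption for full frame sensors, and the 50 mm lens and 1.5 m subject distance are made-up example values, so treat the output as a ballpark rather than a measurement.

```python
def hyperfocal_mm(focal_mm, f_number, coc_mm=0.03):
    """Hyperfocal distance in mm (coc_mm is an assumed circle of confusion)."""
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm


def focus_limits_mm(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Near and far limits of acceptable sharpness, in mm from the camera."""
    h = hyperfocal_mm(focal_mm, f_number, coc_mm)
    near = subject_mm * (h - focal_mm) / (h + subject_mm - 2 * focal_mm)
    if subject_mm >= h:
        # Focused at or beyond the hyperfocal distance: sharp out to infinity.
        return near, float("inf")
    far = subject_mm * (h - focal_mm) / (h - subject_mm)
    return near, far


# Assumed example: 50 mm lens, subject 1.5 m away, wide open vs stopped down.
for f_number in (2.8, 11):
    near, far = focus_limits_mm(50, f_number, 1500)
    print(f"f/{f_number}: sharp from {near:.0f} mm to {far:.0f} mm "
          f"({far - near:.0f} mm deep)")
```

Stopping down buys a deeper sharp zone, but it also lets in less light, which is one more reason those softbox strobes come in handy.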

• Photogrammetry software solutions:
So the processing part of Photogrammetry can be time consuming, way more so if you have a slow computer. More GPU and CPU power comes in very handy.

These are some of the more common software packages that can process your image data:
Agisoft PhotoScan
Capturing Reality
Autodesk 123D Catch

There are even some Python based Photogrammetry solutions:
Python Photogrammetry Toolbox
OpenDroneMap

I hope this was an informative article for you and it at the very least helped shed some light on the subject of Photogrammetry.