This week, we’re going to take a look at some of the frontiers of DAM functionality in the mobile/machine-learning era. We’ll start with this post about computational imaging, then move into computational tagging, and then a discussion of evolving visual semantics.
Photography has always had the ability to help us see the unseeable. Early photos showed cityscapes where all the people disappeared as they moved through long exposures. Macro photography and remote cameras allow us to view from impossible perspectives. Computational photography is now pushing the boundaries of visual rendering in new and remarkable ways.
What is computational photography?
Of course, all digital photography is computational. It uses a sensor and a computer instead of film and chemicals to make images. Computational imaging is a subset of digital imaging in which the resulting image can only be created by computers. This is frequently accomplished by combining multiple images together. Computational imaging often makes use of depth information to create the image. Let’s take a look at some examples of computational imaging.
"Traditional" image processing
The six techniques listed below have become standard in many image-processing applications or inside the camera’s onboard processor. The output from these techniques is, more or less, a traditional photograph.
- High Dynamic Range (HDR) images combine a series of exposures to capture a range of brightness that a single photo can’t record. The exposures are then blended together to create a finished image that may look “normal” or have more painterly effects.
- Correcting for lens and focus defects can now be done in post-processing, using complex algorithms to repair the problems.
- Stitched panoramas are composite images that combine multiple frames to make a new photo. In this case, they offer a field of view and resolution larger than a single frame can allow.
- Multi-lens capture is becoming common on mobile phones and other specialized cameras. It uses lenses with different focal lengths to capture wide- and long-lens photos, for instance, and allows you to create many different effects in post-production by blending the images.
- Focus stacking is a technique where macro photos are shot at multiple focusing distances and are combined in one frame that has a depth of field greater than a single frame can provide.
- Alternate representations of geometry: Software that is built to represent three-dimensional space is also being used to misrepresent dimensional space. Panorama sequences can be stitched to create “little worlds,” such as the one made by Russell Brown at Adobe.
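To make the multi-exposure idea behind HDR a little more concrete, here is a minimal sketch in plain Python. It assumes a linear sensor response, pixel values scaled to the range 0 to 1, and known shutter times. The function names (`hat_weight`, `merge_exposures`) are made up for this illustration, and real HDR tools also align the frames, recover the camera's response curve, and tone-map the result.

```python
def hat_weight(value):
    """Trust mid-tones most; fully clipped shadows and highlights get weight 0."""
    return 1.0 - abs(2.0 * value - 1.0)


def merge_exposures(frames, exposure_times):
    """Estimate relative scene brightness per pixel from bracketed frames.

    frames: list of 2-D lists of floats in [0, 1], all the same size.
    exposure_times: shutter time in seconds for each frame.
    Assumes a linear sensor response (an assumption of this sketch).
    """
    height, width = len(frames[0]), len(frames[0][0])
    radiance = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weighted_sum = 0.0
            weight_total = 0.0
            for frame, seconds in zip(frames, exposure_times):
                value = frame[y][x]
                weight = hat_weight(value)
                # Dividing by the shutter time puts every frame on the same scale.
                weighted_sum += weight * (value / seconds)
                weight_total += weight
            if weight_total > 0:
                radiance[y][x] = weighted_sum / weight_total
            else:
                # Every frame is clipped here: fall back to the middle exposure.
                middle = len(frames) // 2
                radiance[y][x] = frames[middle][y][x] / exposure_times[middle]
    return radiance


# One row, two pixels: a dark area and a bright area that clips in longer exposures.
frames = [
    [[0.03125, 0.5]],  # 1/16 s
    [[0.125, 1.0]],    # 1/4 s (bright pixel clipped)
    [[0.5, 1.0]],      # 1 s   (bright pixel clipped)
]
print(merge_exposures(frames, [0.0625, 0.25, 1.0]))  # [[0.5, 8.0]]
```

The merged values span a 16:1 brightness range, which none of the three frames holds on its own. Focus stacking follows a similar pattern, except that the frame chosen for each pixel is the sharpest one rather than the best exposed.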
Rich dimensional data
The next set of techniques captures depth information and combines it with photographic imagery to build three-dimensional models. The resulting creations are accessed with smartphones or computers that allow some level of navigation through three-dimensional space.
- 360-degree cameras shoot two (or more) fisheye photos and then stitch them together to make a seamless “bubble” that can be zoomed and spun inside viewing software.
- Single-camera depth mapping is a technique where many photos are taken in rapid succession at different focusing distances. These photos are analyzed for in-focus areas, which can be processed to create a three-dimensional map of the scene. This depth information is overlaid on a traditional flat image.
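The single-camera depth mapping described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the frames are already aligned, grayscale, and the same size, and the focusing distance of each frame is known. The names `sharpness` and `depth_from_focus` are invented for this sketch. Production systems use more robust focus measures and smooth the resulting map.

```python
def sharpness(frame, y, x):
    """Absolute 4-neighbour Laplacian: large where local detail is crisp."""
    height, width = len(frame), len(frame[0])
    center = frame[y][x]
    total = 0.0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny = min(max(y + dy, 0), height - 1)  # clamp at the image borders
        nx = min(max(x + dx, 0), width - 1)
        total += frame[ny][nx] - center
    return abs(total)


def depth_from_focus(frames, focus_distances):
    """For each pixel, report the focus distance of the sharpest frame."""
    height, width = len(frames[0]), len(frames[0][0])
    depth = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            scores = [sharpness(frame, y, x) for frame in frames]
            best = scores.index(max(scores))  # first frame wins a tie
            depth[y][x] = focus_distances[best]
    return depth


def checker(y, x):
    """Fine checkerboard detail, standing in for an in-focus texture."""
    return float((x + y) % 2)


# Two synthetic 4x6 frames: the near frame is crisp on the left half and
# blurred to flat gray on the right; the far frame is the reverse.
near = [[checker(y, x) if x < 3 else 0.5 for x in range(6)] for y in range(4)]
far = [[0.5 if x < 3 else checker(y, x) for x in range(6)] for y in range(4)]

depth = depth_from_focus([near, far], [0.3, 2.0])  # distances in metres
print(depth[0])  # [0.3, 0.3, 0.3, 2.0, 2.0, 2.0]
```

The resulting grid is the depth map that gets overlaid on the flat image: one distance value per pixel.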
Computer-generated or augmented
And lastly, we get to the computational techniques that move beyond imaging into new digital-native forms.
Computer-Generated Imaging (CGI) tools can use source photos and video to make convincing new creations in ways beyond what’s listed above. CGI can also create images entirely inside a computer without the need for specific source images.
Depth-enabled viewing environments
To fully leverage the depth information that is frequently part of computational imaging, you’ll need a depth-enabled viewing environment. These environments fall into a number of camps:
- Augmented Reality (AR) services combine data, drawings, videos, or photographs with a mobile camera image. The AR service typically uses GPS location or object recognition to trigger the display. Pokémon Go is an example of AR used in gaming, and Ikea uses AR to help people visualize how the company’s furniture will look in a room. Apple and Google offer Software Development Kits (SDKs) to create AR applications. This has dramatically reduced the cost and time needed to make new applications.
- Virtual Reality (VR) typically refers to a 3-D system where the viewer can navigate through scenes, looking around in 360 degrees and moving to new viewing positions. VR systems may combine the rich dimensional and computer-generated techniques listed above to create a virtual environment. VR is usually done with a headset like the Oculus, but it can also be done on a smartphone. Google Cardboard is a low-cost tool to turn your smartphone into a VR viewing device.
Just the beginning
The dimensional flavors outlined above offer some compelling capabilities for both documentation and new art forms. Some uses, like real estate marketing, product retailing, or crime scene documentation, are already in place. Many more uses are coming online as industries see the usefulness of reproducing spatial information. And, of course, we see artists using these tools as well, pushing the boundaries of the photographic medium.
Computational photography has become a standard part of photographic imaging in the mobile era, and we can expect all of the categories above to continue to grow and merge. Mobile devices are well suited to dimensional image viewing since they have fast processors, position sensors, and accelerometers.
Most of the computational techniques listed above rely on combining many source images into a new kind of image, and some can be applied retroactively to images that were captured earlier. This poses both a challenge and an opportunity for the photographic collection. The images must be preserved so they’ll be available when new technologies arrive. They must also be accessible, and it must be possible to find them and bring them together in the new tools. A collection that is well managed and well annotated will be able to take advantage of new imaging tools in ways that we can’t even conceive of now.
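As one small illustration of "finding them and bringing them together," here is a hedged sketch that groups files into candidate source sets (an exposure bracket, a panorama sweep, a focus stack) by capture time. The two-second gap, the filenames, and the `group_into_sets` function are all assumptions made up for this example. A real collection would read timestamps from embedded metadata and would probably check other fields as well.

```python
from datetime import datetime, timedelta


def group_into_sets(records, max_gap=timedelta(seconds=2)):
    """Group (filename, capture_time) pairs shot in quick succession.

    A new set starts whenever the gap since the previous frame exceeds
    max_gap. The 2-second default is an arbitrary choice for this sketch.
    """
    ordered = sorted(records, key=lambda record: record[1])
    sets = []
    for name, taken in ordered:
        if sets and taken - sets[-1][-1][1] <= max_gap:
            sets[-1].append((name, taken))
        else:
            sets.append([(name, taken)])
    return sets


# Hypothetical catalog entries: a three-frame bracket, then a single shot.
records = [
    ("IMG_0101.dng", datetime(2018, 5, 1, 10, 0, 0)),
    ("IMG_0102.dng", datetime(2018, 5, 1, 10, 0, 1)),
    ("IMG_0103.dng", datetime(2018, 5, 1, 10, 0, 2)),
    ("IMG_0104.dng", datetime(2018, 5, 1, 10, 5, 0)),
]
for group in group_into_sets(records):
    print([name for name, _ in group])
```

Storing that grouping as an annotation alongside the images is the kind of metadata that lets a future tool pick up a complete set of source frames.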
Okay, so that was a long one. In the next post in this series, we’ll look at how computational tagging can help us make images discoverable and better understand the content of a media collection.