Augmented reality (AR) is a live direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as sound, video, graphics or GPS data. It is related to a more general concept called mediated reality, in which a view of reality is modified (possibly even diminished rather than augmented) by a computer. As a result, the technology functions by enhancing one’s current perception of reality. By contrast, virtual reality replaces the real world with a simulated one. Augmentation is conventionally in real-time and in semantic context with environmental elements, such as sports scores on TV during a match. With the help of advanced AR technology (e.g. adding computer vision and object recognition) the information about the surrounding real world of the user becomes interactive and digitally manipulable. Artificial information about the environment and its objects can be overlaid on the real world.
Hardware components for augmented reality are: processor, display, sensors and input devices. Modern mobile computing devices like smartphones and tablet computers contain these elements which often include a camera and MEMS sensors such as accelerometer, GPS, and solid state compass, making them suitable AR platforms.
Various technologies are used in Augmented Reality rendering including optical projection systems, monitors, hand held devices, and display systems worn on one's person.
A head-mounted display (HMD) is a display device paired to a headset such as a harness or helmet. HMDs place images of both the physical world and virtual objects over the user's field of view. Modern HMDs often employ sensors for six degrees of freedom monitoring that allow the system to align virtual information to the physical world and adjust accordingly with the user's head movements. HMDs can provide users immersive, mobile and collaborative AR experiences.
AR displays can be rendered on devices resembling eyeglasses. Versions include eye wear that employ cameras to intercept the real world view and re-display its augmented view through the eye pieces and devices in which the AR imagery is projected through or reflected off the surfaces of the eye wear lens pieces. Google Glass is not intended for an AR experience, but third-party developers are pushing the device toward a mainstream AR experience. After the debut of Google Glass many other AR devices emerged as alternatives. Most promising Google Alternatives can be listed as Vuzix M100, Optinvent, Meta Space Glasses, Telepathy, Recon Jet, Glass Up, K-Glass. CrowdOptic, an existing app for smartphones, applies algorithms and triangulation techniques to photo metadata including GPS position, compass heading, and a time stamp to arrive at a relative significance value for photo objects. CrowdOptic technology can be used by Google Glass users to learn where to look at a given point in time.
Contact lenses that display AR imaging are in development. These bionic contact lenses might contain the elements for display embedded into the lens including integrated circuitry, LEDs and an antenna for wireless communication. Another version of contact lenses, in development for the U.S. Military, is designed to function with AR spectacles, allowing soldiers to focus on close-to-the-eye AR images on the spectacles and distant real world objects at the same time. In 2013, at the Augmented World Expo Conference, a futuristic video named Sight featuring the potential of having augmented reality through contact lenses received the best futuristic augmented reality video award.
Virtual retinal display
A virtual retinal display (VRD) is a personal display device under development at the University of Washington's Human Interface Technology Laboratory. With this technology, a display is scanned directly onto the retina of a viewer's eye. The viewer sees what appears to be a conventional display floating in space in front of them.
The EyeTap (also known as Generation-2 Glass) captures rays of light that would otherwise pass through the center of a lens of an eye of the wearer, and substituted each ray of light for synthetic computer-controlled light. The Generation-4 Glass (Laser EyeTap) is similar to the VRD (i.e. it uses a computer controlled laser light source) except that it also has infinite depth of focus and causes the eye itself to, in effect, function as both a camera and a display, by way of exact alignment with the eye, and resynthesis (in laser light) of rays of light entering the eye.
Handheld displays employ a small display that fits in a user's hand. All handheld AR solutions to date opt for video see-through. Initially handheld AR employed fiduciary markers, and later GPS units and MEMS sensors such as digital compasses and six degrees of freedom accelerometer–gyroscope. Today SLAM markerless trackers such as PTAM are starting to come into use. Handheld display AR promises to be the first commercial success for AR technologies. The two main advantages of handheld AR is the portable nature of handheld devices and ubiquitous nature of camera phones. The disadvantages are the physical constraints of the user having to hold the handheld device out in front of them at all times as well as distorting effect of classically wide-angled mobile phone cameras when compared to the real world as viewed through the eye.
Spatial Augmented Reality (SAR) augments real world objects and scenes without the use of special displays such as monitors, head mounted displays or hand-held devices. SAR makes use of digital projectors to display graphical information onto physical objects. The key difference in SAR is that the display is separated from the users of the system. Because the displays are not associated with each user, SAR scales naturally up to groups of users, thus allowing for collocated collaboration between users.
Examples include shader lamps, mobile projectors, virtual tables, and smart projectors. Shader lamps mimic and augment reality by projecting imagery onto neutral objects, providing the opportunity to enhance the object’s appearance with materials of a simple unit- a projector, camera, and sensor.
Other applications include table and wall projections. One innovation, the Extended Virtual Table, separates the virtual from the real by including beam-splitter mirrors attached to the ceiling at an adjustable angle. Virtual showcases, which employ beam-splitter mirrors together with multiple graphics displays, provide an interactive means of simultaneously engaging with the virtual and the real. Many more implementations and configurations make spatial augmented reality display an increasingly attractive interactive alternative.
A SAR system can display on any number of surfaces of an indoor setting at once. SAR supports both a graphical visualisation and passive haptic sensation for the end users. Users are able to touch physical objects in a process that provides passive haptic sensation.
Modern mobile augmented reality systems use one or more of the following tracking technologies: digital cameras and/or other optical sensors, accelerometers, GPS, gyroscopes, solid state compasses, RFID and wireless sensors. These technologies offer varying levels of accuracy and precision. Most important is the position and orientation of the user's head. Tracking the user's hand(s) or a handheld input device can provide a 6DOF interaction technique.
Techniques include speech recognition systems that translate a user's spoken words into computer instructions and gesture recognition systems that can interpret a user's body movements by visual detection or from sensors embedded in a peripheral device such as a wand, stylus, pointer, glove or other body wear.
The computer analyzes the sensed visual and other data to synthesize and position augmentations.
Software and algorithms
A key measure of AR systems is how realistically they integrate augmentations with the real world. The software must derive real world coordinates, independent from the camera, from camera images. That process is called image registration which uses different methods of computer vision, mostly related to video tracking. Many computer vision methods of augmented reality are inherited from visual odometry. Usually those methods consist of two parts.
First detect interest points, or fiduciary markers, or optical flow in the camera images. First stage can use feature detection methods like corner detection, blob detection, edge detection or thresholding and/or other image processing methods. The second stage restores a real world coordinate system from the data obtained in the first stage. Some methods assume objects with known geometry (or fiduciary markers) present in the scene. In some of those cases the scene 3D structure should be precalculated beforehand. If part of the scene is unknown simultaneous localization and mapping (SLAM) can map relative positions. If no information about scene geometry is available, structure from motion methods like bundle adjustment are used. Mathematical methods used in the second stage include projective (epipolar) geometry, geometric algebra, rotation representation with exponential map, kalman and particle filters, nonlinear optimization, robust statistics.
Augmented Reality Markup Language (ARML) is a data standard developed within the Open Geospatial Consortium (OGC), which consists of an XML grammar to describe the location and appearance of virtual objects in the scene, as well as ECMAScript bindings to allow dynamic access to properties of virtual objects.
Greece, 15. to 20. May