Cameras Don’t See — Mimicking the Human Eye in Software

Perceive | May 23rd, 2019

Say “so long” to people with a clipboard. The next generation camera has been built to mimic the human eye and provide spatial intelligence about the data that the camera captures.

Curiosity is the best part of being “human”. That curiosity and creativity are a software vendor’s worst nightmare: just because a person can “dream it” doesn’t mean that software can implement it.

This is especially true when working with video. Everett Berry, computer vision specialist and CEO of Perceive, along with Aaron Michaux, CTO of Perceive, helped me understand that most overhead cameras don’t capture exactly what the human eye sees. I had previously assumed that video cameras provide an exact recording of what I see. This is not the case. A security camera, in-store camera, or shelf camera captures only a portion of what our eyes see, and the real magic is how the brain turns what the eyes see into a rich 3D visual scene.

The lack of 3D spatial reasoning (the “mind’s view”) is why current in-store analytics and people counters are limited in both the accuracy and the types of analytics they can provide.

Cameras, like the retina, see what’s called a “projective space”

This space is heavily distorted, and even allows us to see things infinitely far away — that point on the horizon where receding train tracks meet. Seeing the infinite is… rather distorted. The magic of the human visual system is that it takes this 2D information, and reconstructs a 3D world in the mind’s eye. This illusion is so obvious and immediate that its complexities were completely missed until 1960s robotics scientists attempted to process projective images.
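
The train-track effect can be reproduced with a few lines of arithmetic. The sketch below assumes an idealised pinhole camera, which is a textbook simplification and not a description of any particular product; the focal length, rail spacing, and camera height are made-up values chosen only for illustration.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def rail_gap(depth_m, half_spacing_m=0.7, camera_height_m=1.5):
    """Image-plane distance between two parallel rails at a given depth.

    The rails run straight ahead of the camera, one on each side of it.
    All dimensions here are hypothetical illustration values.
    """
    left = project((-half_spacing_m, -camera_height_m, depth_m))
    right = project((half_spacing_m, -camera_height_m, depth_m))
    return right[0] - left[0]


# The rails never get closer in the 3D world, but their images do.
for depth in (2.0, 20.0, 2000.0):
    print(f"depth {depth:>7.1f} m: gap between rails in the image = {rail_gap(depth):.5f}")
```

The gap shrinks in proportion to 1/depth, so both rails head toward the same image point (the vanishing point) even though they stay the same distance apart in the real scene.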

The 3D world is where we think. It’s how we reason about space. When we think of people moving through space, it’s not a projective space, but a 3D space. At Perceive, we’re building an integrated computer vision system that understands that space, and has figured out how to extract data from that space: “turning that space into data.”

Let’s look at store camera basics: cameras have either 1 lens or 2 lenses.

1 lens camera:

  • Most security or overhead cameras are 1 lens cameras.
  • The farther a camera is placed from the scene, the less detail it captures.
  • 1 lens cameras capture a limited point of view, and it is currently difficult to reconstruct 3D information from their images.
  • An overhead camera takes a top/down approach, and can only capture the images that are in its direct line of sight.
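
Two of the points above (detail falls off with distance, and 3D is hard to recover from one lens) come from the same piece of geometry. Here is a minimal sketch, again assuming an idealised pinhole camera; the 800-pixel focal length and the heights and distances are hypothetical numbers, not measurements from a real store camera.

```python
import math


def image_height_px(object_height_m, distance_m, focal_length_px=800.0):
    """Height in pixels of an object seen by an idealised pinhole camera."""
    return focal_length_px * object_height_m / distance_m


# Detail falls off with distance: the same person covers half as many
# pixels when the camera is twice as far away.
person_near = image_height_px(1.8, 10.0)
person_far = image_height_px(1.8, 20.0)
print(f"1.8 m person at 10 m: {person_near:.0f} px, at 20 m: {person_far:.0f} px")

# Scale ambiguity: a short object close by and a tall object far away
# can produce the same image height, so one image cannot tell them apart.
short_close = image_height_px(1.0, 5.0)
tall_distant = image_height_px(1.8, 9.0)
print("same image height:", math.isclose(short_close, tall_distant))
```

With a single lens, size and distance are tangled together in one number, which is why extra cues or extra viewpoints are needed to get back to 3D.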

2 lens camera:

  • Capable of capturing coarse 3D data within its field of view.
  • Does not require top-down placement for accurate people counting.
  • Can only capture images that are in its direct line of sight, and several 2 lens cameras generally aren’t integrated into a single camera network.
  • Lacks an integrated software stack to save, process, and extract information.
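
Why does a second lens give 3D data, and why is it only “coarse”? A common textbook model for two parallel lenses says depth equals focal length times lens separation (the baseline) divided by disparity, where disparity is how far a point shifts between the left and right images. The sketch below uses that standard formula; the focal length and baseline are assumed values for illustration, not the specification of any camera mentioned here.

```python
def depth_from_disparity(disparity_px, focal_length_px=800.0, baseline_m=0.1):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the camera")
    return focal_length_px * baseline_m / disparity_px


# A nearby shopper produces a large shift between the two images.
print(f"16 px disparity -> {depth_from_disparity(16):.1f} m")

# Far away, a single pixel of matching error changes the answer by metres,
# which is one reason stereo depth is described as coarse.
print(f" 4 px disparity -> {depth_from_disparity(4):.1f} m")
print(f" 3 px disparity -> {depth_from_disparity(3):.1f} m")
```

Because depth is inversely proportional to disparity, accuracy is good up close and degrades quickly with distance.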

Next Generation Camera: Perceive Camera Network

  • Cloud-based and wireless, drawing power from a light fixture.
  • Able to use 5G.
  • Integrated computer vision software that can mimic the 3D reasoning of the human mind.
  • Multiple camera systems are integrated into a network to produce a single queryable “view” of a store, workplace, or area.
  • Next generation computer vision gives feedback on various behaviors, on demographics, and on which way a person is facing.
  • Maintains customer privacy because Perceive doesn’t use facial recognition.
  • Gives stores, museums, universities, malls, and workplaces the best action and demographic data about their space.
  • Better for the environment as it doesn’t need a battery.
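
To make the idea of a single queryable “view” concrete, here is one generic way several cameras can report into a shared floor plan. This is a hypothetical sketch, not Perceive’s actual method: it assumes each camera already knows its own position and heading on the store floor, and the function name `camera_to_store` and all coordinates are invented for illustration.

```python
import math


def camera_to_store(point_xy, camera_pose):
    """Map a floor position from a camera's local frame into store coordinates.

    camera_pose is (x, y, heading_radians): where the camera sits on the
    store floor plan and which way its local x-axis points. (Assumed known.)
    """
    cam_x, cam_y, heading = camera_pose
    px, py = point_xy
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    # Rotate the local point by the camera heading, then shift by its position.
    return (cam_x + cos_h * px - sin_h * py,
            cam_y + sin_h * px + cos_h * py)


# Two cameras facing each other across an aisle see the same shopper.
camera_a = (0.0, 0.0, 0.0)
camera_b = (6.0, 2.0, math.pi)

seen_by_a = camera_to_store((3.0, 2.0), camera_a)
seen_by_b = camera_to_store((3.0, 0.0), camera_b)
print("camera A places the shopper at", seen_by_a)
print("camera B places the shopper at", seen_by_b)
```

Once every detection is expressed in the same store coordinates, sightings from different cameras can be compared, merged, and queried as one dataset.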

Computer vision software

The next area for boundary-breaking technology is the ability to mimic the human visual system. This field, called “computer vision”, is considered one of the toughest areas of artificial intelligence. Computer vision researchers have an in-joke: vision is “AI complete”, a play on words that means that human-like artificial vision will be achieved only after all other problems in AI have been solved. But computer vision has made progress, and we’re nearing a suite of vision technologies that will drive future economic growth.

Can 2D images be converted to 3D with today’s technology?

Although far from the abilities of the human mind, artificial vision works surprisingly well in particular applications, and vision technology is moving quickly, thanks to improvements in hardware and algorithms. Most research focuses on the 2D projective space (i.e., retinal information), eschewing 3D as “too hard”. However, a renaissance in 3D vision is underway, driven by recognition of its importance, and by new technologies like self-driving cars.

Cutting-edge research is showing that 3D vision is possible with new hardware and new techniques, and this dramatically simplifies other problems, such as working out precisely where a pedestrian is, what direction they are traveling, and how fast. These questions are nearly impossible to answer with 2D projective images, but they’re dead simple in 3D.
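
Here is just how simple those questions become once positions are in 3D. The sketch assumes some upstream system has already produced two positions for the same pedestrian, in metres, half a second apart; the coordinates and the function name `describe_motion` are hypothetical.

```python
import math


def describe_motion(earlier_xyz, later_xyz, elapsed_s):
    """Return (speed in m/s, heading in degrees on the floor plane)."""
    speed = math.dist(earlier_xyz, later_xyz) / elapsed_s
    dx = later_xyz[0] - earlier_xyz[0]
    dy = later_xyz[1] - earlier_xyz[1]
    # Heading is measured counter-clockwise from the floor plan's x-axis.
    heading_deg = math.degrees(math.atan2(dy, dx))
    return speed, heading_deg


earlier = (2.0, 1.0, 0.0)   # where the pedestrian was
later = (2.6, 1.8, 0.0)     # where they are half a second later
speed, heading = describe_motion(earlier, later, elapsed_s=0.5)
print(f"speed {speed:.2f} m/s, heading {heading:.1f} degrees")
```

Where, which way, and how fast all fall out of a subtraction and a division. The hard part is getting trustworthy 3D positions in the first place.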

The next generation economy

The first giant computers were called “brains”, and the first electric computer, built here in the USA, was called the “big brain”. Intelligent computing seemed just around the corner… anything was possible! That was a long time ago, but the promise of artificial intelligence has finally landed, and it’s going to be big. 3D vision and reasoning will be everywhere, from games, to smart phones, cars, brick and mortar stores… to things we haven’t even dreamed of. The Perceive Camera Network and even self-driving cars are just the tip of the iceberg.

Find out how we can work together!

Let's talk about how Perceive can

  • improve your operations
  • reduce costs
  • increase revenue