It seems that in a real world application, air blowing through the room will make the plant move, too. How are they cancelling out this movement?
originally posted by: neoholographic
Canceling out this movement? Why would they need to cancel out this movement when the vibrations are captured in frame and are less than a hundredth of a pixel?
WOW!! COMPLETELY STILL LOL!!!
originally posted by: neoholographic
I said it will be harder to mask the audio because you're dealing with vibrations of less than a hundredth of a pixel, and the results get better as you increase the frames per second. So it will not be like picking up sound with a bug in the room, which depends strictly on audio data. This technology depends on visual data as well, and that's why they did the experiment through soundproof glass: they're recreating sound from visual data.
originally posted by: EvillerBob
A recording device in a room such as a "bug" works because the audio causes "vibrations" in the detecting equipment, which then translates the vibrations into an electronic signal.
This experiment simply replaces the microphone with a bag, and uses video combined with the algorithm to convert the visually observed vibrations into an electronic signal. It's the same fundamental concept. Sound makes something vibrate, and we use equipment to detect the vibration. This experiment still relies entirely on audio data as the initial input, just like a microphone. It simply replaces a fairly efficient method of capturing that data, with a fairly inefficient method.
That movement of the plant caused by the air conditioner would be part of the hundredth-of-a-pixel vibration. It would be additive to the "target" vibration of the voice, but that sum vibration would just be one movement. It's not like the plant is sometimes moving due to the target sound and sometimes due to the air conditioning.
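To make the "additive" point concrete, here is a minimal sketch in Python. Every number in it is made up for illustration (the frame rate, the two frequencies, the amplitudes in pixels), and the helper names `tone` and `superpose` are hypothetical; nothing here comes from the MIT work. It only shows that the camera would see a single combined displacement per frame, not two separate ones.

```python
import math

FRAMES_PER_SECOND = 2000  # assumed frame rate, for illustration only


def tone(freq_hz, amplitude_px, n_frames):
    """Displacement (in pixels) of a pure sine vibration, one value per frame."""
    return [
        amplitude_px * math.sin(2 * math.pi * freq_hz * i / FRAMES_PER_SECOND)
        for i in range(n_frames)
    ]


def superpose(a, b):
    """Add two displacement series frame by frame."""
    return [x + y for x, y in zip(a, b)]


# Hypothetical "target" vibration from a voice and a slow sway from moving air.
voice = tone(200.0, 0.01, FRAMES_PER_SECOND)
draft = tone(2.0, 0.05, FRAMES_PER_SECOND)

# What a camera would record: one displacement per frame, the sum of both.
observed = superpose(voice, draft)
```

Whether the two parts can be pulled apart afterwards depends on how different they are (in frequency, for example); the sketch takes no position on that.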
Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.
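As a rough check on the frame rates quoted above, here is a small Python sketch. It assumes the standard sampling theorem, which the article excerpt does not mention: a signal sampled at a given rate can only represent frequencies up to half that rate without aliasing. The function name `max_recoverable_hz` is made up for this example.

```python
def max_recoverable_hz(frames_per_second):
    """Nyquist limit: highest frequency representable at this sampling rate."""
    return frames_per_second / 2.0


# Frame rates mentioned in the quoted article.
for fps in (60, 2000, 6000, 100000):
    print(f"{fps} fps -> up to {max_recoverable_hz(fps)} Hz")
```

Under that assumption, a plain 60 fps capture tops out at 30 Hz, which is why the researchers' high-speed cameras matter so much here.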
originally posted by: neoholographic
a reply to: Soylent Green Is People
Again, this comes from a lack of understanding of the technology being used.
originally posted by: GetHyped
If the background noise is of sufficient amplitude, the masking will be so great that the signal you wish to record (someone's voice) will be masked to the point of unintelligibility, because the sum of these different pressure waves makes it impossible to pick out the original signal.
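One conventional way to put a number on the masking described above is the signal-to-noise ratio in decibels. The Python sketch below uses made-up amplitudes and a hypothetical helper `snr_db`; it illustrates the idea and is not a measurement of anything in the experiment.

```python
import math


def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in decibels for two RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)


# Hypothetical case: background noise ten times the amplitude of the voice.
voice_rms = 1.0
noise_rms = 10.0
ratio = snr_db(voice_rms, noise_rms)  # negative: the noise dominates
```

A negative value means the noise is louder than the voice, and the further below zero it goes, the harder it is to recover the original signal from the sum.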
originally posted by: neoholographic
a reply to: EvillerBob
HOW HARD IS THIS TO UNDERSTAND?
originally posted by: Soylent Green Is People
Nowhere in the article in the OP do the MIT people say "voice vibrations are different than other vibrations, therefore we can separate voice from unwanted noise". Forgive me if I'm wrong, but that's what you seem to be suggesting.