A team of researchers has devised a method to defeat NuCaptcha, one of the most popular video-based antispam tests on the Internet, and has proposed a solution to increase its resilience to attacks.
CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart" and is meant to protect websites from automated spam bots.
Most people are familiar with image-based CAPTCHAs that require users to input a string of distorted characters in order to prove that they are human. However, there are also audio and video variants of such tests.
NuCaptcha is a video-based CAPTCHA implementation that uses animation techniques in order to make it harder for spam bots to decipher the characters. Its creators claim that NuCaptcha has the highest usability and security levels of any CAPTCHA on the market.
However, according to Stanford University researcher Elie Bursztein, that's not exactly true. Bursztein has worked with other researchers to evaluate the security of NuCaptcha since October 2010 and has devised a method that defeats it with a success rate of over 90 percent.
"The most difficult part of this research turned out not to be breaking NuCaptcha, which I've known how to do since December 2010, but rather to come up with the right abstraction to explain why video captchas might offer better security than image captchas and to synthesize where the extra security comes from," Bursztein said in a blog post detailing the NuCaptcha attack techniques he devised.
Bursztein and his colleagues have already developed a tool called Decaptcha which uses specialized algorithms to defeat image-based CAPTCHA implementations. "Compared to breaking image-based captchas, attacking video captchas is both harder and easier," Bursztein said.
The hard part lies in isolating the moving object that represents the actual CAPTCHA string. This requires motion tracking and flow analysis algorithms that evaluate every object based on its properties and appearance in different frames.
NuCaptcha attempts to make this type of attack harder by including backgrounds and additional characters that are not part of the actual verification string. For example, the standard version of NuCaptcha displays moving text that reads "Type the code:" followed by four random letters.
Isolating the four letters as the object of interest is technically the most complicated step in Bursztein's attack, but it isn't extremely difficult to achieve for NuCaptcha's current implementation.
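To make the idea of isolating a moving object concrete, here is a minimal sketch of frame differencing, one of the simplest motion-detection techniques. It is not the flow analysis Bursztein's team used, and a real attack would need far more robust tracking. The frame representation, the `threshold` value and the function names are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Assumption: a frame is a row-major grid of grayscale values (0-255).
Frame = List[List[int]]


def moving_pixels(prev: Frame, curr: Frame, threshold: int = 30) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose brightness changed by more than threshold."""
    changed = []
    for r, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for c, (a, b) in enumerate(zip(row_prev, row_curr)):
            if abs(a - b) > threshold:
                changed.append((r, c))
    return changed


def bounding_box(pixels: List[Tuple[int, int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) enclosing the pixels, or None if empty."""
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy input: a static black background with a bright two-pixel blob
# that shifts one column to the right between the two frames.
frame_a = [
    [0, 0, 0, 0, 0],
    [0, 255, 255, 0, 0],
    [0, 0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0, 0],
    [0, 0, 255, 255, 0],
    [0, 0, 0, 0, 0],
]

box = bounding_box(moving_pixels(frame_a, frame_b))
print(box)  # region of the frame where motion was detected
```

Anything that never changes between frames, such as a static background, drops out of the result. A moving background or moving decoy text would not, which is why the isolation step needs more than this.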
The other steps are pretty much the same as for image-based CAPTCHA attacks. In fact, some of them, like segmenting the CAPTCHA string into individual letters, are easier to perform for video CAPTCHAs, because there are more frames to analyze and learn from.
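One way extra frames can help an attacker, sketched here as an assumption rather than as the researchers' actual technique, is that independent per-frame guesses can be combined so that an error in any single frame is outvoted. The function name and the sample guesses below are hypothetical.

```python
from collections import Counter
from typing import List


def vote_across_frames(per_frame_guesses: List[str]) -> str:
    """Combine equal-length per-frame guesses by majority vote at each position."""
    if not per_frame_guesses:
        return ""
    length = len(per_frame_guesses[0])
    if any(len(guess) != length for guess in per_frame_guesses):
        raise ValueError("all guesses must have the same length")
    result = []
    for position in range(length):
        counts = Counter(guess[position] for guess in per_frame_guesses)
        # most_common(1) yields the most frequent character at this position.
        result.append(counts.most_common(1)[0][0])
    return "".join(result)


# Hypothetical recognizer output for four frames of the same four-letter code;
# two of the frames each contain one misread character.
guesses = ["XKQR", "XKOR", "XHQR", "XKQR"]
combined = vote_across_frames(guesses)
print(combined)
```

A single static image offers only one such guess, so there is nothing to vote against when a character is misread.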
Bursztein's research didn't focus only on defeating NuCaptcha, but also on identifying the best methods of improving video CAPTCHA security in general.
Animating the individual CAPTCHA letters and adding confusing backgrounds are defenses that can be easily defeated, the researcher said. "On the other hand, it seems possible to make the isolation of the correct moving object very difficult."
Bursztein refers to this as "tracking resistance." It involves adding decoy objects that have the same properties as the actual CAPTCHA string in order to confuse the tracking algorithm.
"When successfully implemented, tracking resistance makes video captcha secure against vision/machine learning attacks and more secure than standard text-based captchas," Bursztein said.
The NuCaptcha creators were notified about Bursztein's findings in November 2011. According to the researcher, the company said that its systems serve video CAPTCHA tests of different complexity based on the risk associated with every user.
This means that requests coming from IP addresses that are, for example, associated with botnet activity would result in more complex CAPTCHAs than those originating from average users.
These high-risk CAPTCHA tests differ in font face, size, thickness and warp levels from those analyzed by Bursztein, the company said.
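The risk-based approach the company describes can be pictured roughly as follows. This is purely an illustrative sketch: NuCaptcha's real risk scoring is not public, and the profile fields, their values, the function name and the example IP addresses (taken from ranges reserved for documentation) are all assumptions.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class CaptchaProfile:
    """Hypothetical rendering settings for one difficulty tier."""
    name: str
    warp_level: int      # assumed scale: higher means more distortion
    font_thickness: int  # assumed scale: lower means thinner strokes


# Assumed tiers; the actual parameter values are not known.
STANDARD = CaptchaProfile("standard", warp_level=1, font_thickness=2)
HIGH_RISK = CaptchaProfile("high-risk", warp_level=3, font_thickness=1)


def select_profile(client_ip: str, flagged_ips: Set[str]) -> CaptchaProfile:
    """Serve the harder profile to addresses flagged as risky."""
    return HIGH_RISK if client_ip in flagged_ips else STANDARD


# Hypothetical blocklist of addresses associated with botnet activity.
flagged = {"203.0.113.7"}

print(select_profile("203.0.113.7", flagged).name)
print(select_profile("198.51.100.4", flagged).name)
```

Under a scheme like this, a researcher testing from an unflagged address would only ever see the standard tier, which is consistent with the company's point that the high-risk variants were not part of the analysis.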
"While we believe we got the version that a standard attacker might get (which is already harder than the version displayed on site), we have not evaluated the hard version referenced in their response," Bursztein said. However, the researcher doesn't believe that heavier distortions or more crowded letters represent an efficient defense.
The company is also preparing a fix that relies on distorting the shape of the individual CAPTCHA characters as they rotate in order to make it harder for the optical flow analysis algorithm to identify them. "I won't be able to characterize the effectiveness of this technique until they roll out their changes and I can test it," Bursztein said.