Google Speech-to-Text API Can Help Attackers Easily Bypass Google reCAPTCHA

A three-year-old attack technique to bypass Google’s audio
reCAPTCHA by using its own Speech-to-Text API has been found to
still work with 97% accuracy.

Researcher Nikolai Tschacher disclosed his findings in a
proof-of-concept (PoC) of the attack on January 2.

“The idea of the attack is very simple: You grab the MP3 file of
the audio reCAPTCHA and you submit it to Google’s own
speech-to-text API,” Tschacher said[1]
in a write-up. “Google will return the correct answer in over 97%
of all cases.”

A CAPTCHA[2] (Completely Automated Public Turing test to tell
Computers and Humans Apart), a term coined in 2003, is a type of
challenge-response test designed to protect against automated
account creation and service abuse by presenting users with a
question that is easy for humans to solve but difficult for
computers.

reCAPTCHA[3]
is a popular version of the CAPTCHA technology that was acquired by
Google in 2009. The search giant released the third iteration[4]
of reCAPTCHA in October 2018. It removes interactive challenges
entirely and instead returns a score (0 to 1) based on a visitor’s
behavior on the website, so no user interaction is required.
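
To make the score-based model concrete, here is a minimal sketch of how a site might act on a reCAPTCHA v3 verification result. It assumes the JSON returned by the verification endpoint has already been parsed into a dictionary with `success`, `score`, and `action` fields. The function name, the `Verdict` type, and the 0.5 threshold are illustrative choices, not something prescribed by the article.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    allowed: bool
    reason: str


def evaluate_v3_response(payload: dict, expected_action: str,
                         threshold: float = 0.5) -> Verdict:
    """Decide whether to allow a request based on a parsed v3 result.

    `threshold` is an assumed cut-off; each site would tune its own.
    """
    # The token itself must have been accepted by the verification service.
    if not payload.get("success", False):
        return Verdict(False, "token rejected")
    # The action bound to the token should match what this endpoint expects.
    if payload.get("action") != expected_action:
        return Verdict(False, "action mismatch")
    score = payload.get("score")
    # Reject payloads without a numeric score (bool is excluded explicitly).
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return Verdict(False, "missing score")
    if score < threshold:
        return Verdict(False, "score below threshold")
    return Verdict(True, "ok")


if __name__ == "__main__":
    sample = {"success": True, "score": 0.9, "action": "login"}
    print(evaluate_v3_response(sample, expected_action="login"))
```

A site that rejects low scores outright may instead choose to fall back to an interactive challenge, which is where the v2 audio challenge discussed in this article can come back into play.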

The whole attack hinges on research dubbed “unCaptcha[5],” published by
University of Maryland researchers in April 2017 targeting the
audio version of reCAPTCHA. Offered for accessibility reasons, it
poses an audio challenge, allowing people with vision loss to play
or download the audio sample and solve the question.

To carry out the attack[6], the audio payload is
programmatically identified on the page using tools like Selenium,
then downloaded and fed into an online audio transcription service
such as Google’s Speech-to-Text API. The resulting transcript is
then submitted as the answer to defeat the audio CAPTCHA.

Following the attack’s disclosure, Google updated reCAPTCHA in
June 2018 with improved bot detection and support for spoken
phrases rather than digits. That was not enough to thwart the
attack: the researchers released “unCaptcha2[7]” as a PoC with even
better accuracy (91%, compared with unCaptcha’s 85%) by using a
“screen clicker to move to certain pixels on the screen and move
around the page like a human.”

Tschacher’s effort is an attempt to keep the PoC up to date and
working, thus making it possible to circumvent the audio version of
reCAPTCHA v2.

“Even worse: reCAPTCHA v2 is still used in the new reCAPTCHA v3
as a fallback mechanism,” Tschacher noted.

With reCAPTCHA used by hundreds of thousands of sites to detect
abusive traffic and bot account creation, the attack is a reminder
that the protection is not foolproof and that a bypass can have
significant consequences.

In March 2018, Google addressed[8]
a separate flaw in reCAPTCHA that allowed a web application using
the technology to craft a request to “/recaptcha/api/siteverify” in
an insecure manner and get around the protection every time.
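
The article does not say what made such a request insecure. As one illustration, assume (and this is only an assumption) that the risk comes from concatenating a user-supplied token into the request body without encoding it. The sketch below builds the body for the verification endpoint so that the token cannot introduce extra parameters. The helper name `build_siteverify_body` is hypothetical, while `secret` and `response` are the parameter names the endpoint expects.

```python
from urllib.parse import urlencode

# Full URL for the "/recaptcha/api/siteverify" path mentioned above.
SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_siteverify_body(secret: str, token: str) -> bytes:
    """Return a form-encoded POST body for the verification request.

    urlencode percent-encodes characters such as '&' and '=' inside
    values, so a token like 'abc&secret=other' stays a single value
    instead of adding a second 'secret' parameter.
    """
    return urlencode({"secret": secret, "response": token}).encode("ascii")


if __name__ == "__main__":
    # Hypothetical values; a real secret should never be hard-coded.
    print(build_siteverify_body("site-secret", "abc&secret=other"))
```

The body would then be sent as an HTTP POST to `SITEVERIFY_URL` from the server side, never from the browser, so the secret is not exposed.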

References

  1. said (incolumitas.com)
  2. CAPTCHAs (en.wikipedia.org)
  3. reCAPTCHA (support.google.com)
  4. third iteration (security.googleblog.com)
  5. unCaptcha (uncaptcha.cs.umd.edu)
  6. attack (github.com)
  7. unCaptcha2 (github.com)
  8. addressed (andresriancho.com)
