3

Let's say we have a CAPTCHA system that consists of a greyscale picture (of part of a street, or something akin to reCAPTCHA), divided into 9 blocks, with 2 pieces missing.

You need to choose the appropriate missing pieces from over 15 possibilities to complete the picture.

The puzzle pieces have their edges processed with a glitch treatment, and they carry additional distortions such as heavy JPEG compression, a random affine transform, and blurred edges.

Every challenge picture is unique - pulled from a dataset of over 3 million images.

Is it possible for a neural network to reliably (above 50%) predict the missing pieces? Sometimes the pieces are taken out of context and require human logic to estimate the correct one.

The chance of randomly selecting the two answers in the correct order is 1/15 × 1/14.
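Spelled out, that guessing baseline is:

$$\frac{1}{15} \times \frac{1}{14} = \frac{1}{210} \approx 0.48\%$$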

Mithical
  • You're thinking of making the task visually more challenging, which ultimately isn't going to help much. I think captcha uses other variables like mouse tracking to discern human movement from an AI-based decision. – Frans Rodenburg Jul 07 '20 at 07:28
  • Related: https://stackoverflow.com/a/27299487/8770170 – Frans Rodenburg Jul 09 '20 at 23:53

2 Answers

0

Well, to give you a short answer: yes, it would be more resistant than a more standard CAPTCHA approach.

That being said, I would still predict something like a 75-80% success rate for a custom model designed specifically to defeat a mechanism such as the one you describe. I am fairly confident in that appraisal primarily because of the following:

  1. Researchers have begun to explore new techniques such as "structure-preserving convolutions", which use higher-dimensional filters to store the extra correlation data.

  2. I think that the obfuscation efforts you mention will definitely help to some degree, although they can be defeated fairly easily by training the attacking model on a dataset in which some portion of the samples has the same sort of noise, glitch treatments, etc. injected during pre-processing (see the sketch after this list).

    • An idea worth exploring would be to process your dataset with an adversarial model, which you could then use to generate adversarial noise; that noise could be fed into a pre-processing step for your images and replace (or extend) the current obfuscation efforts!
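Here is a rough sketch of the augmentation idea in point 2, using Pillow and NumPy; the specific transforms, parameter ranges, and the function name `obfuscate_like_captcha` are placeholders for illustration, not your actual pipeline:

```python
import io
import random

import numpy as np
from PIL import Image, ImageFilter

def obfuscate_like_captcha(img: Image.Image) -> Image.Image:
    """Apply distortions similar to the captcha's (JPEG artifacts, affine
    jitter, blur, glitch noise) so the attacking model trains on data that
    already looks like the served tiles."""
    # Heavy JPEG compression: re-encode at a low, randomly chosen quality.
    buf = io.BytesIO()
    img.convert("L").save(buf, format="JPEG", quality=random.randint(5, 20))
    img = Image.open(io.BytesIO(buf.getvalue())).convert("L")

    # Small random rotation as a crude stand-in for a random affine transform.
    img = img.rotate(random.uniform(-10, 10), resample=Image.BILINEAR, fillcolor=128)

    # Blurred edges: Gaussian blur with a small random radius.
    img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.5, 2.0)))

    # Cheap "glitch" treatment: overwrite roughly 1% of pixels with random values.
    arr = np.asarray(img, dtype=np.uint8).copy()
    mask = np.random.rand(*arr.shape) < 0.01
    arr[mask] = np.random.randint(0, 256, size=int(mask.sum()), dtype=np.uint8)
    return Image.fromarray(arr)
```

Applying this to the attacker's training images means the model never sees "clean" tiles at train time, which is exactly why the obfuscation alone is not a strong defence.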

TL;DR: If you can't beat 'em, join 'em! Just train a model to defeat your captcha implementation, then use that model to generate adversarial examples and apply obfuscations to your dataset accordingly!
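As a minimal sketch of that adversarial-example step, using FGSM as one concrete (assumed) choice of attack against a placeholder PyTorch classifier:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model: torch.nn.Module,
                 image: torch.Tensor,   # e.g. shape (1, 1, H, W), values in [0, 1]
                 label: torch.Tensor,   # shape (1,), index of the correct piece
                 epsilon: float = 0.03) -> torch.Tensor:
    """FGSM: nudge every pixel by +/- epsilon in the direction that increases
    the solver model's loss, so the tile still looks unchanged to a human but
    becomes harder for the model that was trained to solve the captcha."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```

The perturbed tiles (or the perturbation pattern itself) would then become part of the served puzzle pieces, which is the "apply obfuscations to your dataset" step above.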

For more information on what I am suggesting for further obfuscation efforts, explore some of the papers you can find on Google Scholar under "Ensemble Adversarial Training Examples".

  • The problem with using an adversarial approach here is that you have no measure of whether an example is still easily solved by humans. Without a suitable loss function, you would need to sample-test large numbers of outputs with humans, looking for a high percentage of human solvers (and it will need to be very high to be a useful CAPTCHA) and a low percentage of machine solvers. – Neil Slater Jul 04 '20 at 18:56
  • yeah but what about universal adversarial perturbations? –  Jul 05 '20 at 14:22
  • I find that to be a very weak argument against my answer. If you investigate the literature on adversarial training, you will find that with most adversarial inputs the image is not even visually altered from a human point of view, although it completely defeats the ML models attempting to process it. We are talking about slight changes in pixel colour values, tiny perturbations, etc. – Chaplin Marchais Jul 06 '20 at 04:20
  • For being so arrogant, his attack ideas are mediocre at the very best; we have been aware of this kind of attack and introduced safety measures against it. However, we would still like to introduce these perturbations. Do you think a UAP will do the job? –  Jul 06 '20 at 10:20
  • Very specific claims and numbers are cited without any justification. It is up to you to justify your claims, which your answer does not do aside from a few name drops, with no explanation as to why these techniques might suffice for this goal. – k.c. sayz 'k.c sayz' Jul 06 '20 at 22:45
  • Also, your idea of training an adversarial network, though laudable in theory, ignores the simple fact that adversarial network training does not very often converge to a stable solution. – k.c. sayz 'k.c sayz' Jul 06 '20 at 22:57
  • well reCaptcha uses adversarial examples. –  Jul 11 '20 at 00:14
  • no arguments on your part bro. Also, you are a neuroscientist not an AI professional –  Jul 11 '20 at 00:15
  • @EricJohn reCaptcha uses adversarial examples because... it's object categorization. What is your adversarial example training against? – k.c. sayz 'k.c sayz' Jul 21 '20 at 05:35
0

This is not resistant at all. A simple comparison of the similarity of edge pixels across block borders should be more than sufficient to break this method completely.
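A rough sketch of such a check (NumPy; the helper names and the single-edge scoring are simplifications, and I assume the neighbouring blocks are available as arrays):

```python
import numpy as np

def border_mismatch(piece: np.ndarray, right_neighbour: np.ndarray) -> float:
    """Mean absolute brightness difference between the candidate piece's
    right edge and the neighbouring block's left edge (lower = better fit)."""
    return float(np.mean(np.abs(piece[:, -1].astype(int) -
                                right_neighbour[:, 0].astype(int))))

def rank_candidates(candidates, right_neighbour):
    """Return candidate indices sorted from best edge match to worst."""
    scores = [border_mismatch(c, right_neighbour) for c in candidates]
    return sorted(range(len(candidates)), key=scores.__getitem__)
```

In practice you would score every available edge of the missing slot (left, right, top, bottom) and sum the mismatches, but the principle is the same.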

We can do a very simple calculation. Assume the picture is 8-bit greyscale with each block being 50×50 pixels, and assume brightness is continuously uniformly distributed over 0-255 (it should probably be normally distributed, but whatever). A missing block then shares a total of 200 border pixels with its neighbouring blocks. Assume that a naturally generated image is continuous in brightness across block boundaries in at least 10% of those pixels, and that a difference of ±10 units of brightness counts as acceptable. That leaves us 20 pixels to work with.

In the case where the piece is incorrect, we assume the border pixel brightnesses to be i.i.d. on [0, 255], giving roughly an 8% (21/256) chance for each pixel along the border to be acceptably similar. That gives about a 10^-22 chance of this algorithm being fooled. You might disagree with my assumed parameters, but to be frank, I am probably being too generous in estimating a lower bound.
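Spelling out that estimate with the 20 usable border pixels:

$$\left(\tfrac{21}{256}\right)^{20} \approx 0.082^{20} \approx 1.9 \times 10^{-22}$$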

There are two lessons here: 1. Just because you and others can't think of a way to break your secure system doesn't mean it's actually secure. 2. Modern ML techniques are not strictly stronger than handcrafted algorithms, though I would also imagine that a simple NN would be able to solve this problem easily.

k.c. sayz 'k.c sayz'
  • 1. The edges are processed with glitch treatment. That alone is enough to make this attack unfeasible. We are very well aware of this method. –  Jul 05 '20 at 18:56
  • 2. The decoys are chosen on the basis of similarity of the lightness and the similarity of the pixels around the edges. –  Jul 05 '20 at 18:57
  • 1. With too much noise, no human subject will be able to recognize the object; even if we increase the noise threshold to something ridiculous like 20%, we are still talking about a false positive rate of 10^-14. 2. The identical calculation will demonstrate that the chance of finding a natural scene with such parameters will be something similar. Did you even do any math before making a claim such as "That alone is enough to make this attack unfeasible"? There is a good reason why this method was not used even historically (because it doesn't work). – k.c. sayz 'k.c sayz' Jul 06 '20 at 22:53
  • if you don't believe me, go ahead and run your systems with this method of security – k.c. sayz 'k.c sayz' Jul 06 '20 at 22:54
  • https://ieeexplore.ieee.org/abstract/document/5692499/ –  Jul 08 '20 at 07:04
  • Your method works only in theory. You haven't even read what I said and you keep repeating yourself. Your attack was patched 3 months ago. –  Jul 08 '20 at 07:05
  • And you don't even know what kind of noise I am talking about. –  Jul 08 '20 at 07:08
  • @EricJohn No, it is you who didn't specify the type of noise to be used; what else should I assume aside from Gaussian noise on the pixel values? Just look at figure 5: it should be obvious that adding some fuzzing to your boundary-search algorithm, so it looks at more adjacent pixels than just the direct border, would be quite sufficient to solve this problem. Just because this paper was featured in a conference doesn't mean it is correct; this should be obvious. – k.c. sayz 'k.c sayz' Jul 21 '20 at 05:24
  • There is a very good reason why text distortion was used in early CAPTCHAs: because text segmentation is/was /very/ hard and the images are easy to generate. – k.c. sayz 'k.c sayz' Jul 21 '20 at 05:27