I wrote a simple CAPTCHA scheme and wanted to share it with the awesome security community as a CAPTCHA breaking exercise. To solve the CAPTCHA an individual (or machine) will have to enter only the characters with a white border and ignore other text. I understand that at this stage the lettering may sometimes be hard to read, we're working on that, but for now, lets see how far we can get with this POC design. Here are a couple:
Anti-Automation MechanismsThe main intent was to make Noise removal, Segmentation and Classification interdependent and increase the complexity of automatic solvers. Here are the anti-automation mechanisms in this CAPTCHA:
- Closeness of noise to real text : The noise is of same style (i.e. alphanumeric) and superimposed on the real text. The font is of same size as original text and the text to be solved. This increases the difficulty of removing the noise and regular noise removal algorithms may be ineffective. The random background line serves to highlight the white border and also as a noise source.
- Hard to Segment : The CAPTCHA solution and noise are mixed up in an unpredictable fashion with random positioning variation on X and Y axis. It may therefore be hard to segment various alphabets from each other single out individual characters for classification.
- Anti-Classification :While writing custom solvers, statistical analysis is performed to on CAPTCHA text by plotting the pixel densities against X and Y axis to identify the correct characters. When text is superimposed with no clear indication on demarcation of text boundary, the classification techniques do not work.