While Facebook, Microsoft, and many others are banding together to help make machine learning capable of detecting deepfakes in videos, we at Signzy are trying to solve a similar problem, detecting fakes in documents. In the journey of building the global digital trust system, we at Signzy had to solve this major challenge of detecting image manipulations in identity documents.
Fig 1.0 Example of our forgery detection in action
In this blog, I will try to explain our approach in building an innovative image manipulation detection approach using deep learning.
The above images are examples of the advancements in image manipulation techniques. It takes a considerable effort for a human to find out that the image is forged. The features which distinguish real and fake are less, which makes it difficult to detect with human eyes.
Our objective was to build a system which could detect image manipulated documents.
Our first step was to create a dataset of forged documents to test the algorithm. With our expertise and domain knowledge in this field we came up with various scenarios on how an intruder would forge a document. The corresponding data for these scenarios was prepared by photoshop experts.
The forged documents were of mostly two categories.
- Copy paste : A region of the image copied from a particular document and pasted into a different document.
- Copy move : A region of the image copied from a particular document and pasted into the same document.
This is the type of forgery when a fraudster tries to copy a face from one document into another document. Our goal was to detect these forged regions and to classify the document as fake or real.
The dataset that we created manually using photoshop experts was not enough to train any deep learning solution around it. So we developed image processing algorithms which could generate synthetic forged data. Now all set for the experimentation.
For forged region detection, our approach was to first start off with the state of the object detection methods. We tried with FRCNN to predict the bounding boxes of the forged region along with the class information. FRCNN uses convolution nets to extract feature maps from the input image. These maps are then passed on to a Region Proposal Network which will give proposals for bounding boxes. These proposals are passed on to the ROI pooling layer which converts all the proposals to the same size. Finally, they are passed on to a fully connected layer to predict bounding boxes and classes. This method did not give us better results because the forged regions were of very small size.
Our second approach was to train a patch-based classifier which could classify between real and forged patches. The idea was on the assumption that if the copied image region has a different compression footprint when compared to the region to which its copied to, there would be a strong shift in the way that the pixels are grouped. This method proved to be very efficient.
It almost gave us around 97% accuracy. We did a lot of ablation studies to find the right configurations which I can’t reveal due to IP issues.
This is the type of forgery when a fraudster tries to change any text in an image by copying a similar text from the same image. For example, changing dates. Our goal was to detect these forged regions and to classify the document as fake or real.
There is a lot of literature related to detection of this type of forgery. The popular one is DCT based feature matching. In this method, DCT followed by quantization is performed on a 16×16 patch extracted from the image. The similar operation is performed throughout the entire image and all the matrices are sorted. Then for each row in the matrix the corresponding shift vector is calculated. If two regions are copied the shift vector of those regions would match. A very powerful algorithm that works well in most scenarios. But in our use case, since a document has many regions that have the same DCT values this method couldn’t be applied.
Our method involved two parallel networks. First, an encoder-decoder network predicts pixel-wise forged regions. A second network runs in parallel that finds feature maps which are in correlation with forged region predicted by the first network. Both networks are trained together with a cumulative loss function. I regret as I can’t reveal the full solution due to IP issues.
To summarize this blog, I had explained the two major types of forgeries which can be done in documents. Also, I had tried to explain the approaches we took to solve this challenging problem. Hope you had a nice read.
Signzy is a market-leading platform redefining the speed, accuracy, and experience of how financial institutions are onboarding customers and businesses – using the digital medium. The company’s award-winning no-code GO platform delivers seamless, end-to-end, and multi-channel onboarding journeys while offering customizable workflows. In addition, it gives these players access to an aggregated marketplace of 240+ bespoke APIs that can be easily added to any workflow with simple widgets.
Signzy is enabling ten million+ end customer and business onboarding every month at a success rate of 99% while reducing the speed to market from 6 months to 3-4 weeks. It works with over 240+ FIs globally, including the 4 largest banks in India, a Top 3 acquiring Bank in the US, and has a robust global partnership with Mastercard and Microsoft. The company’s product team is based out of Bengaluru and has a strong presence in Mumbai, New York, and Dubai.
Visit www.signzy.com for more information about us.
You can reach out to our team at email@example.com
A B Sarvanan
Tech Lead — AI Team (Signzy)