Since around 2010, users, clients, and organizations have paid growing attention to Personally Identifiable Information (PII) and to personal data in general. Personal data controversies have raised questions about legal rights, ethical obligations, commercial disputes, and information security.
2021 wasn’t any different, with more data breaches than ever.
Indeed, more and more PII is generated, transferred, and accumulated every day, and it is a target for hackers of all kinds. Yet end users often misjudge the threats, and some organizations believe that social engineering is merely a Hollywood plot.
But what is PII?
The GDPR formally defines personal data, the European legal term that covers PII (https://gdpr.eu/eu-gdpr-personal-data/), as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”
Let’s break that down.
PII is information that can be used on its own or combined with other information to identify, contact, or locate a single person, or to identify that individual in context. Personal data comes in many forms:
- Financial information
- Login IDs
- Biometric identifiers
- Video footage
- Geographic location data
- Customer loyalty histories
- Social media
- IP Addresses
Some PII is publicly available and scattered across many websites, but do not underestimate the value of such information.
Hackers are interested not only in the amount of personal data they find but also in how the pieces correlate. Public data breaches contain password hashes that can often be cracked (not decrypted, since hashing is one-way) by hashing the entries of a password dictionary and comparing the results. With a bit of time and enough resources, a hacker can recover weak or reused passwords, including the one you use for your website's CMS.
Few CMSs enforce password requirements or regular password changes, which means your website can quickly become a target.
Most of these attacks are never detected: you effectively handed over the key to the house, so nobody saw a forced entry. As long as nothing changes, you might never know. That is why it is essential to be aware of any PII on your websites and how it can be used against you.
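To illustrate why leaked password hashes are risky, here is a minimal sketch of a dictionary lookup against a hash. It assumes an unsalted SHA-256 hash and a tiny made-up word list; real breaches involve many different hashing schemes, and the function names are illustrative only.

```python
import hashlib
from typing import Optional, Sequence


def sha256_hex(password: str) -> str:
    """Return the unsalted SHA-256 hex digest of a password (assumed scheme)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_lookup(leaked_hash: str, wordlist: Sequence[str]) -> Optional[str]:
    """Hash every candidate and return the one whose digest matches, if any."""
    for candidate in wordlist:
        if sha256_hex(candidate) == leaked_hash:
            return candidate
    return None


# Hypothetical word list; real password dictionaries hold millions of entries.
wordlist = ["123456", "password", "letmein", "qwerty"]

# Pretend this digest came from a public breach dump.
leaked_hash = sha256_hex("letmein")

recovered = dictionary_lookup(leaked_hash, wordlist)
strong = dictionary_lookup(sha256_hex("correct horse battery staple"), wordlist)
```

The weak password is recovered because it appears in the word list, while the long passphrase is not. Salted, deliberately slow password hashing makes this kind of lookup far more expensive.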
PII Discovery to the rescue
International companies are likely to have many websites and limited resources. Manually verifying all those pages quickly becomes a nightmare, so it is best to scan your websites with automated tools such as:
- ZAP Proxy: the OWASP project has experimental support for scanning your website for PII leaks.
- CUSPIder: an open-source forensic file scanning application that can scan for PII-related data.
- Sitefig: web-governance suite with an extensive set of PII scanners to help make your website GDPR compliant.
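To give a feel for what such scanners do under the hood, here is a minimal sketch of pattern-based PII detection on page text. It does not reflect how any of the tools above are implemented; the patterns, names, and sample text are assumptions, and the regular expressions are deliberately simple, so expect both false positives and misses.

```python
import re

# Simplified, illustrative patterns; production scanners use far stricter rules.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def find_pii(text):
    """Return a list of (kind, matched_text) tuples found in the text."""
    findings = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((kind, match.group()))
    return findings


# Made-up page content using reserved example values.
page = (
    "Contact jane.doe@example.com or pay to NL91ABNA0417164300. "
    "Server: 192.0.2.10"
)

findings = find_pii(page)
for kind, value in findings:
    print(f"{kind}: {value}")
```

In practice you would feed this the text of every crawled page and review the findings, rather than treat each match as a confirmed leak.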
Why would you need a PII scanner?
Your website is most likely being updated right now. If you have a CMS, editors publish new content, and IT delivers updated solutions to suit your company’s needs. So a regular scan is required to maintain compliance and make sure you avoid embarrassing or costly mistakes:
- Save costs: prevention is cheaper than repair. If you regularly identify all the PII on your website, you can mitigate risks before they become an issue.
- Customer trust: the worst experience a visitor can have is an email from your company about a data breach. If you do have to send one, make sure you do it within the legal deadlines. Under the GDPR, your company must report a personal data breach to the supervisory authority within 72 hours of becoming aware of it.
- GDPR regulations: make sure your website adheres to the privacy policy published on it. Too often, policies are published at the start of a project and never updated. Check your policies against the facts so you can either edit the policy or update your website's functionality and data.
It won’t happen at this company.
Wishful thinking isn’t going to change the fact that hackers are hard at work. Once a server comes online, it can take only minutes before hackers scan its IP address for vulnerabilities.
Human error is a significant factor too, and it accounts for most of the vulnerabilities companies face. No matter how strict your policies are, you cannot rule out human error completely. Scanning regularly makes sure you don’t miss any issues.
Make PII discovery part of the regular scan.
Whether you use open-source tools or a free Sitefig account, any company with several websites should scan for potentially damaging or unnecessary PII. It is impossible to rule out publishing names and email addresses; telephone numbers and even bank account numbers need to be published in some cases.
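Because some PII has to be published, a scan is most useful when it separates approved data from unexpected data. The following is a minimal sketch of that idea for email addresses only; the allowlist, the pattern, and the sample text are hypothetical.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical allowlist of addresses the company publishes on purpose.
APPROVED_EMAILS = {"info@example.com", "press@example.com"}


def unexpected_emails(text, approved=APPROVED_EMAILS):
    """Return the sorted email addresses in the text that are not on the allowlist."""
    found = {m.group().lower() for m in EMAIL_RE.finditer(text)}
    return sorted(found - {a.lower() for a in approved})


# Made-up page content: one approved address, one that slipped through.
page = "Questions? Mail info@example.com. Draft by John.Smith@example.com."

report = unexpected_emails(page)
print(report)
```

Only the address missing from the allowlist is reported, which keeps a recurring scan focused on items that need a decision.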
Set up a regular scanning solution to keep your reputation intact and stay compliant.