Review: Senzing uncovers relationships hiding within big data

Used to combat fraud or uncover accidental data duplication, Senzing is a powerful yet lightweight tool whose artificial intelligence proves genuinely smart at matching records.


Most of the time, when organizations think about cybersecurity, they look at ways to monitor the connections between devices and programs, or even between information and users. But the human aspects are often ignored, which can let threat actors go unnoticed when they launch new campaigns, or allow potential insider threats to work their way deep inside a targeted organization undetected.

Senzing began life in 2016 after being spun out from IBM. The goal of Senzing is to provide deep data analytics on potentially millions of records, without costing millions of dollars.

The program itself is deceptively simple in appearance. The entire thing can be downloaded for free from the Senzing website, so go ahead and try it out. There are both Mac and Windows versions of the program, and both can run on moderately powerful machines. Once installed, it no longer needs to connect to the internet, so it can even be used with air-gapped networks for total data security.

Using the program for any purpose to examine up to 10,000 records is completely free. The price scales up from there, to a high of $55,000 per month if you want to process a billion records or more through the system.

We tested Senzing using three databases with several thousand records each. When deployed to look for relationships within one of our three databases, the matching capabilities of the system became readily apparent. Data tends to get accidentally corrupted for different reasons. For example, people often make typos when entering data, so Michelle Jones can become Michele Jones, which creates a separate database entry even though both entries represent the same person.

Screenshot: Senzing single-database view (John Breeden II/IDG)

Although this is the least powerful way to employ Senzing, finding duplicate or similar records within a single database is extremely quick and highly intuitive.

Senzing was able to find quite a few instances like that in a single database. It does that by looking at the supplemental data attached to each entry. So if Michele and Michelle have the same address and phone number, it’s a safe bet that they are the same person. But Senzing is a lot smarter than that. If there is a Michelle Jones and a Bridget Jones who share everything except their first name, then you might be looking at a mother and daughter. It’s also possible that Bridget is a nickname for Michelle, so Senzing files it as a possible match until it can learn more.
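To make the Michelle/Michele/Bridget reasoning concrete, here is a minimal sketch in Python. It is not Senzing's algorithm or API, which the review does not describe. The record fields, the `classify` function, and the 0.85 name-similarity threshold are all assumptions chosen for illustration, and the name comparison uses the standard library's `difflib` as a stand-in for whatever matching Senzing actually performs.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Return a 0..1 similarity score for two names (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(rec_a, rec_b, name_threshold=0.85):
    """Classify a pair of records using the supplemental data.

    Hypothetical rules, loosely following the review's example:
    - address or phone differs          -> "no match"
    - same address/phone, similar name  -> "match"
    - same address/phone, other name    -> "possible match"
    """
    same_address = rec_a["address"].strip().lower() == rec_b["address"].strip().lower()
    same_phone = rec_a["phone"] == rec_b["phone"]
    if not (same_address and same_phone):
        return "no match"
    if name_similarity(rec_a["name"], rec_b["name"]) >= name_threshold:
        return "match"
    return "possible match"


# Illustrative records; the values are made up for this sketch.
michelle = {"name": "Michelle Jones", "address": "12 Oak St", "phone": "555-0100"}
michele = {"name": "Michele Jones", "address": "12 Oak St", "phone": "555-0100"}
bridget = {"name": "Bridget Jones", "address": "12 Oak St", "phone": "555-0100"}
stranger = {"name": "Michelle Jones", "address": "9 Elm Ave", "phone": "555-0199"}

typo_result = classify(michelle, michele)
relative_result = classify(michelle, bridget)
stranger_result = classify(michelle, stranger)
```

In this sketch the typo pair lands in "match", while Michelle and Bridget land in "possible match", mirroring the way the review says Senzing holds such pairs until it learns more.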

This would make Senzing invaluable for complying with Europe’s new General Data Protection Regulation (GDPR), which requires that customers who ask to be removed from databases are removed in all instances. In that case, the company may be required to know about, and remove, both Michelle and Michele Jones, since they are the same person, and perhaps even Bridget Jones if there is enough of a match to suggest that she is also the same person.
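As a rough illustration of the erasure scenario, the sketch below collects every record linked to the one a customer asks to have removed. It assumes the matching step has already produced pairwise links labeled "match" or "possible match". The `records_to_erase` function, the link format, and the record IDs are hypothetical and do not come from Senzing.

```python
def records_to_erase(links, requested_id, include_possible=False):
    """Return the set of record IDs linked to requested_id.

    links: iterable of (id_a, id_b, status) tuples, where status is
    "match" or "possible match" (a hypothetical format for this sketch).
    Links are followed transitively, so a chain 1-2, 2-3 pulls in all three.
    """
    allowed = {"match"}
    if include_possible:
        allowed.add("possible match")

    to_erase = {requested_id}
    frontier = [requested_id]
    while frontier:
        current = frontier.pop()
        for id_a, id_b, status in links:
            if status not in allowed:
                continue
            if id_a == current:
                other = id_b
            elif id_b == current:
                other = id_a
            else:
                continue
            if other not in to_erase:
                to_erase.add(other)
                frontier.append(other)
    return to_erase


# Record 1 = Michelle Jones, 2 = Michele Jones, 3 = Bridget Jones (illustrative IDs).
links = [(1, 2, "match"), (1, 3, "possible match")]

confirmed_only = records_to_erase(links, 1)
with_possible = records_to_erase(links, 1, include_possible=True)
```

Whether possible matches such as Bridget should be included is a policy decision, which is why the sketch leaves it as an explicit flag rather than a default.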
