Data Breach Detection 2.0

DBD stands for Data Breach Detection. This section describes the APIs for finding compromised data related to your organization or your personal information.

Permission Model

❗️

Special Permission Required

For security reasons, access to the service is limited.
Access is granted for each entity type you require and is coordinated with a Webhose representative.

📘

Note

To use the API you need to call an endpoint URL with your private access token. You can generate your call URL on our API Playground (you must be logged in to use it).

More information about the parameters is provided in the Output Reference Section.
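As a minimal sketch of what assembling such a call URL might look like: the base URL, the `token` parameter name, and the `email` parameter name below are placeholders chosen for illustration, not the documented endpoint. Use the URL generated by the API Playground for real calls.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration only.
# Replace it with the call URL generated in the API Playground.
BASE_URL = "https://api.example.com/dbd"


def build_call_url(token: str, **params: str) -> str:
    """Return a GET URL carrying the private token plus the search parameters."""
    # "token" as the parameter name is an assumption of this sketch.
    query = {"token": token, **params}
    return f"{BASE_URL}?{urlencode(query)}"


# Hypothetical search by email value; urlencode percent-encodes the "@".
url = build_call_url("YOUR_TOKEN", email="jane.doe@example.com")
print(url)
```

Keeping the token in one helper makes it easier to avoid leaking it into logs or source control.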

Use the Data Breach Detection (DBD) API on Webhose.io to expose potential leaks of Personally Identifiable Information (PII) for organizations and individuals.

Leaked Data Processing

Our main data sources are data leaks in the form of files (XLS, SQL, CSV, etc.) and leaked snippets from any source on the web or the dark networks.
Webhose.io DBD processing starts with the discovery of new leaks scattered across the dark networks. We then validate the structure of each leak and remove duplicates. The next step is data sanitization, in which any sensitive data is removed. The last step is data indexing, which prepares the data for entity-based search.
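The steps above (validate, deduplicate, sanitize, index) can be pictured with a toy pipeline. This is only an illustrative sketch and not the actual Webhose.io implementation: the record layout, the field names (`entity_type`, `entity_value`, `password`, `card_number`), and the hashing-based duplicate check are all assumptions.

```python
import hashlib
import json

# Hypothetical names of fields treated as sensitive in this sketch.
SENSITIVE_FIELDS = {"password", "card_number"}


def validate(record: dict) -> bool:
    # Structure check: this sketch only requires an entity type and value.
    return bool(record.get("entity_type")) and bool(record.get("entity_value"))


def fingerprint(record: dict) -> str:
    # Stable hash of the record content, used to detect exact duplicates.
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sanitize(record: dict) -> dict:
    # Drop sensitive fields so they never reach the index.
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}


def process(raw_records: list) -> dict:
    """Validate, deduplicate, sanitize, and index records by entity."""
    seen = set()
    index = {}
    for record in raw_records:
        if not validate(record):
            continue
        digest = fingerprint(record)
        if digest in seen:
            continue
        seen.add(digest)
        key = (record["entity_type"], record["entity_value"].lower())
        index.setdefault(key, []).append(sanitize(record))
    return index


raw = [
    {"entity_type": "email", "entity_value": "a@example.com", "password": "x"},
    {"entity_type": "email", "entity_value": "a@example.com", "password": "x"},
    {"entity_type": "", "entity_value": "broken"},
]
index = process(raw)
```

Indexing by `(entity_type, entity_value)` is what makes the later entity-based lookup a single dictionary access in this toy version.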

  • A single-entity search returns a single-entity feed of up to 100 matched leak records. No pagination is required.
    For Email, Phone, SSN, and Credit-Card the result is of type Single Entity.
    (For these entity types the leaked record count rarely exceeds 50-60, so the limit of 100 records should be more than adequate.)
  • A multiple-entity search returns multiple entities: 10 entities per page, each with no more than 100 matched leak records.
    For Domain, Bin6, and Bin8 the result is of type Multiple Entity.
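The mapping from entity type to result type described above can be written down as a small lookup. This is a sketch: the lowercase string identifiers and the `result_shape` helper are hypothetical, while the limits (100 records per entity, 10 entities per page) come from the text.

```python
# Entity types listed above for each result type (identifiers are assumptions).
SINGLE_ENTITY_TYPES = {"email", "phone", "ssn", "credit-card"}
MULTIPLE_ENTITY_TYPES = {"domain", "bin6", "bin8"}

MAX_LEAKS_PER_ENTITY = 100  # limit stated for both result types
ENTITIES_PER_PAGE = 10      # applies to multiple-entity results only


def result_shape(entity_type: str) -> dict:
    """Describe the expected result shape for a given entity type."""
    kind = entity_type.strip().lower()
    if kind in SINGLE_ENTITY_TYPES:
        return {
            "result_type": "single",
            "entities_per_page": 1,
            "paginated": False,
            "max_leaks_per_entity": MAX_LEAKS_PER_ENTITY,
        }
    if kind in MULTIPLE_ENTITY_TYPES:
        return {
            "result_type": "multiple",
            "entities_per_page": ENTITIES_PER_PAGE,
            "paginated": True,
            "max_leaks_per_entity": MAX_LEAKS_PER_ENTITY,
        }
    # Types not classified in the text above are rejected rather than guessed.
    raise ValueError(f"result type not documented for entity type: {entity_type!r}")


print(result_shape("Email")["result_type"])
print(result_shape("Domain")["entities_per_page"])
```

A client can use such a lookup to decide up front whether it needs a pagination loop for a given query.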

The leaked records can be searched by an entity type or value. Examples will be shown below.
The data types that can be searched are:

  • Email addresses
  • Account Names
  • Domains
  • Credit cards
  • BIN (part of a credit card number)
  • Phone numbers
  • Social Security Numbers (SSN)
  • Passport numbers

Data Breach Output Document - Structure and Main Sections

The DBD output contains the following sections:

  • root - includes the response metadata: the totalDocs, next, moreDocsAvailable, and requestsLeft fields.
  • docs - one or more entities, each entity with its unique identifier and metadata.
  • leaks - one or more records with the relevant leak context and the compromised information relevant to the entity.
    Here is an example leak found in a query using the email value:
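As a hedged illustration of how these sections nest, the sketch below walks a response and counts entities and leak records. The root field names come from the list above; the placement of `leaks` inside each `docs` entry, the inner fields (`id`, `source`), and all sample values are assumptions made up for this example.

```python
def summarize(response: dict) -> dict:
    """Collect root metadata and count entities and leak records."""
    docs = response.get("docs", [])
    return {
        "total_docs": response.get("totalDocs", 0),
        "more_available": bool(response.get("moreDocsAvailable", False)),
        "requests_left": response.get("requestsLeft"),
        "entities": len(docs),
        # Assumes each entity in "docs" carries its own "leaks" list.
        "leak_records": sum(len(doc.get("leaks", [])) for doc in docs),
    }


# Made-up sample for a hypothetical email query; not real API output.
sample = {
    "totalDocs": 1,
    "next": None,
    "moreDocsAvailable": False,
    "requestsLeft": 99,
    "docs": [
        {
            "id": "entity-1",
            "leaks": [{"source": "example-leak-a"}, {"source": "example-leak-b"}],
        }
    ],
}

summary = summarize(sample)
print(summary)
```

Check the Output Reference section for the actual field names and types before relying on any of the inner fields.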

Data Breaches - Data Consumption Model and Expected Records

Each request consumes one query (credit).
Recall that each request uses one of two entity search types:

  • Single Entity
    The response contains the searched entity with all of its matched leaked records, up to 100 records.
    For Email, Phone, SSN, and Credit-Card the result is of type Single Entity.

  • Multiple Entities
    The response contains a list of entities, up to 10 per page. Each entity includes all of its matched leaked records, up to 100 records.
    For Domain, Bin6, and Bin8 the result is of type Multiple Entity.

Single Entity Example - Credit Card

Multiple Entity Example - Domain

Up to 10 results per page.
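Because multiple-entity results arrive 10 entities per page and each request consumes one credit, a client typically loops over pages. The following is a minimal sketch of such a loop, assuming that `moreDocsAvailable` is truthy while further pages exist and that `next` carries the reference for the following request. The `fetch_page` callable and the fake backend are hypothetical stand-ins for a real HTTP call.

```python
from typing import Callable, Iterator, Optional


def iter_entities(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every entity across pages; each fetch_page call is one request."""
    next_ref = None
    while True:
        page = fetch_page(next_ref)
        yield from page.get("docs", [])
        if not page.get("moreDocsAvailable"):
            break
        next_ref = page.get("next")
        if not next_ref:
            break  # defensive stop if no reference to the next page is given


# Fake backend with 25 entities served 10 per page (illustration only).
ALL_ENTITIES = [{"id": f"entity-{i}", "leaks": []} for i in range(25)]
calls = []


def fake_fetch(next_ref: Optional[str]) -> dict:
    calls.append(next_ref)
    start = int(next_ref) if next_ref else 0
    end = start + 10
    return {
        "docs": ALL_ENTITIES[start:end],
        "moreDocsAvailable": end < len(ALL_ENTITIES),
        "next": str(end),
    }


entities = list(iter_entities(fake_fetch))
print(len(entities), "entities fetched in", len(calls), "requests")
```

In this fake run, 25 entities take three requests, so the loop would consume three credits under the one-credit-per-request model described above.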


