How Data Brokers Get Your Information
At a Glance
- Data brokers pull from 8 major source categories to build your profile — most collection happens without your knowledge
- Public records are the single largest source, but retail purchases and app SDKs contribute data you may never expect
- Your data passes through a secondary market where brokers buy from and sell to each other, amplifying your exposure
- Each source has a specific countermeasure — no single action eliminates all collection, but targeted steps make a real difference
- Run a free scan to see what brokers have already assembled about you
Data brokers do not guess. They build profiles from concrete sources — public filings, purchase records, mobile apps, and dozens of other data streams that most people never think about. The average American adult has profiles on 50 to 100+ broker sites, and each of those profiles was assembled from real data that entered the broker ecosystem through identifiable channels.
Understanding where brokers get their data is the first step toward cutting off the supply. This article breaks down the 8 major source categories, explains what each one contributes to your broker profile, and gives you a specific action to reduce collection from each source.
1. Public Records
Government records are the foundation of the data broker industry. Every broker profile starts here because the data is free or nearly free, legally accessible, and reliably accurate. Brokers systematically harvest records from county, state, and federal agencies, often through bulk data purchase agreements that cost a fraction of a cent per record.
The categories of public records that brokers consume include:
- Property records — county assessor and recorder offices publish deeds, mortgages, sale prices, and property tax assessments. Every home you have bought or sold is documented, including the transaction amount and the other parties involved.
- Voter registration — most states sell or make available their voter files, which include your full name, address, date of birth, party affiliation, and voting history (whether you voted, not how). States like Florida, Texas, and Ohio make voter data especially accessible. Only a handful of states (including California and Virginia) restrict commercial use.
- Court filings — civil lawsuits, criminal cases, divorces, bankruptcies, eviction proceedings, and restraining orders are all public in most jurisdictions. Federal court records are available through PACER. Brokers scrape these systematically.
- Marriage and divorce records — filed at the county level, these reveal spouse names, dates, and sometimes maiden names. Brokers use marriage records to build family relationship graphs.
- Professional licenses — state licensing boards publish databases of licensed professionals (doctors, lawyers, real estate agents, nurses, CPAs). These records include full names, addresses, license numbers, and disciplinary actions.
- Business filings — LLC registrations, DBA filings, and corporate officer records from secretary of state databases. If you have ever registered a business, your name and address are in the broker pipeline.
Why this matters: Public records are virtually impossible to suppress at the source. You cannot opt out of property ownership being public record. This is why removal from data brokers requires going after the brokers themselves, not the data sources.
2. Social Media Scraping
Every major social media platform prohibits scraping in its terms of service. Many data brokers do it anyway. The gap between platform policies and enforcement is enormous, and brokers exploit it systematically.
What brokers collect from social media goes beyond your posts:
- Profile metadata — your display name, username, bio text, profile photo, cover photo, join date, and location field. Even if your posts are private, this metadata is often public.
- Follower and following graphs — who you follow and who follows you reveals relationships, interests, and professional connections. Brokers use this to identify relatives, coworkers, and romantic partners.
- Photo metadata — images uploaded to social media can contain EXIF data with GPS coordinates, camera model, and timestamps. While most platforms strip EXIF on upload now, older photos (pre-2016) were often stored with full metadata intact.
- Check-ins and location tags — Foursquare, Instagram, and Facebook check-ins create a timestamped location history that brokers use to confirm your address, identify your workplace, and map your daily patterns.
- Employment and education — LinkedIn profiles are a primary source for employer data. Brokers scrape job titles, company names, start dates, and education history directly from LinkedIn, despite the platform's aggressive anti-scraping measures.
Setting your profiles to "private" helps but does not eliminate collection. Your username, profile photo, display name, and bio remain visible on most platforms even with the strictest privacy settings. And any data that was public before you locked your account may already be in broker databases.
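To make the check-in point concrete, here is a minimal sketch of how a timestamped location history could be turned into a home and workplace guess. The place names, the time windows, and the function name `guess_home_and_work` are all assumptions for illustration; nothing here reflects a specific broker's method.

```python
from collections import Counter
from datetime import datetime

def guess_home_and_work(checkins):
    """Guess home as the most common night-time place and
    work as the most common weekday daytime place."""
    night, day = Counter(), Counter()
    for place, timestamp in checkins:
        t = datetime.fromisoformat(timestamp)
        if t.hour >= 21 or t.hour < 6:
            # Late-night check-ins are treated as "probably home" (assumed window).
            night[place] += 1
        elif t.weekday() < 5 and 9 <= t.hour < 17:
            # Weekday 9-to-5 check-ins are treated as "probably work" (assumed window).
            day[place] += 1
    home = night.most_common(1)[0][0] if night else None
    work = day.most_common(1)[0][0] if day else None
    return home, work

# Hypothetical public check-ins: (place, ISO timestamp)
checkins = [
    ("Elm St apartments", "2024-03-04T22:15:00"),
    ("Downtown office tower", "2024-03-05T10:30:00"),
    ("Elm St apartments", "2024-03-05T23:05:00"),
    ("Downtown office tower", "2024-03-06T14:00:00"),
    ("Riverside gym", "2024-03-06T18:45:00"),
]
print(guess_home_and_work(checkins))
```

Even five check-ins are enough for this toy heuristic to separate home from work, which is why deleting old location tags is worth the effort.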
3. Retail Purchase Data
Every time you swipe a loyalty card, use a store credit card, or scan a receipt through an app, you generate purchase data that enters the broker ecosystem. Retail data is valuable because it reveals what you buy, how much you spend, how often you shop, and where you shop — behavioral signals that advertisers pay premium prices for.
- Loyalty programs — a single grocery loyalty card generates 500+ data points per year: every item you buy, every coupon you redeem, your visit frequency, and your average basket size. Programs like Kroger Plus, CVS ExtraCare, and myWalgreens (formerly Balance Rewards) sell aggregated purchase data to third-party data brokers.
- Store credit cards — retail-branded credit cards (Target RedCard, Amazon Store Card, Macy's) generate itemized transaction data that the retailer can use and share, separate from the credit card network's data.
- Receipt-scanning apps — apps like Fetch Rewards, Ibotta, and Shopkick offer cash back in exchange for photos of your receipts. The actual product is your purchase data, which these companies sell to market research firms and data brokers. A $0.25 cashback offer buys access to your entire grocery receipt.
- Data cooperatives — retailers pool their transaction data through cooperatives like Acxiom's InfoBase and Oracle Data Cloud. These cooperatives merge purchase records across dozens of retailers into a single profile, giving brokers a cross-merchant view of your spending.
The loyalty card tradeoff: that $3.50 you save per grocery trip costs you your purchase history for the year. Brokers classify you as an "organic food buyer," "discount shopper," "alcohol purchaser," or "baby product buyer" based on loyalty card data — and sell those labels for far more than the discount you received.
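The labeling step can be pictured with a small sketch. The keyword rules, the `min_hits` threshold, and the function name `label_segments` are hypothetical; real segment taxonomies are proprietary and far larger, so treat this only as an illustration of the idea.

```python
from collections import Counter

# Hypothetical keyword rules, invented for this example.
SEGMENT_RULES = {
    "organic food buyer": {"organic"},
    "alcohol purchaser": {"wine", "beer", "vodka"},
    "baby product buyer": {"diapers", "formula", "baby"},
}

def label_segments(purchases, min_hits=2):
    """Return sorted segment labels whose keywords match
    at least min_hits purchased items."""
    hits = Counter()
    for item in purchases:
        words = set(item.lower().split())
        for segment, keywords in SEGMENT_RULES.items():
            if words & keywords:
                hits[segment] += 1
    return sorted(s for s, n in hits.items() if n >= min_hits)

# Hypothetical loyalty-card history for one shopper.
history = [
    "Organic spinach",
    "Organic oat milk",
    "Red wine",
    "Diapers size 2",
    "Baby wipes",
]
print(label_segments(history))
```

With this sample history, two labels clear the threshold; the single bottle of wine does not. A year of real receipts would clear far more.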
4. App SDKs and Mobile Data
Free apps are not free. They are monetized through embedded software development kits (SDKs) that harvest data from your phone and sell it to data brokers. This is the least understood data collection vector, and it is one of the most invasive.
When you install a free flashlight app, weather app, or mobile game, the app itself may be harmless. But the SDKs embedded in that app — often 10 to 20 per app — are independently collecting and transmitting data:
- Precise GPS location — companies like SafeGraph and Foursquare have sourced location data from SDKs embedded in thousands of apps. These SDKs can ping your location every few minutes, building a timestamped movement history. In 2022, SafeGraph was reported to be selling aggregated location data on visits to abortion clinics, and in 2020 the location broker X-Mode was reported to be collecting location data from a Muslim prayer app.
- Contact lists — apps that request contacts permission transmit your entire address book. This is how brokers connect you to people who have never used the app themselves. One person granting contacts permission exposes everyone in their phone.
- Device identifiers — your phone's advertising ID (IDFA on iOS, GAID on Android) acts as a persistent tracking cookie across apps. Brokers use this ID to link your activity across dozens of apps into a single profile, even when you have not logged in.
- Browsing data — in-app browsers and SDK-level tracking can capture URLs you visit, search queries, and time spent on pages, all tied to your device ID.
- Installed apps — some SDKs scan your phone for other installed apps, revealing your interests (dating apps, health apps, financial apps) and selling that list as behavioral data.
Apple's App Tracking Transparency (ATT) and Google's Privacy Sandbox have reduced some SDK-level tracking since 2021, but enforcement is inconsistent. Many SDKs use probabilistic fingerprinting to continue tracking even when users opt out.
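As a rough illustration of why the advertising ID matters, the sketch below groups events reported by unrelated apps under the ID they share. The event fields (`ad_id`, `app`, `lat_lon`) and the function name `link_by_ad_id` are assumptions made for this example, not a real SDK schema.

```python
from collections import defaultdict

def link_by_ad_id(events):
    """Group SDK events from different apps under the advertising ID they share."""
    profiles = defaultdict(lambda: {"apps": set(), "locations": []})
    for event in events:
        profile = profiles[event["ad_id"]]
        profile["apps"].add(event["app"])
        if "lat_lon" in event:
            profile["locations"].append(event["lat_lon"])
    return dict(profiles)

# Hypothetical events from three unrelated apps on two phones.
events = [
    {"ad_id": "A1", "app": "weather", "lat_lon": (40.71, -74.00)},
    {"ad_id": "A1", "app": "flashlight"},
    {"ad_id": "B2", "app": "game", "lat_lon": (34.05, -118.24)},
    {"ad_id": "A1", "app": "game", "lat_lon": (40.72, -74.01)},
]
profiles = link_by_ad_id(events)
print(sorted(profiles["A1"]["apps"]))
print(len(profiles["A1"]["locations"]))
```

No login or name is needed for the join: the shared ID alone ties the weather app, the flashlight app, and the game to one device. Deleting or resetting the ID breaks the join key, which is the reasoning behind that countermeasure.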
Wondering how exposed you are? Delist.ai scans 1,000+ data broker sites and shows exactly where your personal information appears.
Check your exposure free →
5. Data Broker-to-Broker Sales
The data broker industry is not a collection of independent companies. It is a supply chain. Brokers buy from other brokers, merge records, enrich profiles, and resell the combined data downstream. Your personal information may pass through 10 or more companies before reaching the people-search site where you eventually discover it.
The supply chain works in layers:
- Source-level collectors — companies that specialize in one data type: public records aggregators, location data firms, purchase data cooperatives. They sell raw data in bulk to mid-tier brokers.
- Aggregators — companies like Acxiom, Oracle Data Cloud, and LexisNexis merge data from dozens of source-level collectors into unified consumer profiles. A single aggregator profile might combine public records, purchase data, location data, and online behavior into one record.
- People-search sites — consumer-facing sites like Spokeo, BeenVerified, WhitePages, and Radaris buy from aggregators and repackage the data for individual lookups at $1 to $30 per search.
This layered market creates a critical problem for privacy: removing yourself from one broker does not remove you from its suppliers. The supplier will resell your data to a different broker, or the original broker will re-purchase it within weeks. This is why one-time opt-outs rarely stick — the supply chain keeps refilling the downstream databases.
The re-listing problem: industry data shows that 30–60% of profiles reappear on broker sites within 90 days of removal. The data was not deleted upstream — it was just re-sold. Effective removal requires ongoing monitoring, not a one-time request.
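One way to picture the layered market is as a directed graph in which each broker points to the companies it buys from. The sketch below walks that graph to list every upstream supplier of a people-search site. The broker names and the `SUPPLIERS` mapping are invented for illustration; real supplier relationships are rarely disclosed.

```python
from collections import deque

# Hypothetical supply graph: each key buys data from the listed suppliers.
SUPPLIERS = {
    "people_search_site": ["aggregator_a", "aggregator_b"],
    "aggregator_a": ["public_records_collector", "location_data_firm"],
    "aggregator_b": ["public_records_collector", "purchase_data_coop"],
}

def upstream_sources(broker, suppliers=SUPPLIERS):
    """Breadth-first walk returning every direct and indirect supplier of a broker."""
    seen = []
    queue = deque(suppliers.get(broker, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already recorded via another path
        seen.append(node)
        queue.extend(suppliers.get(node, []))
    return seen

print(upstream_sources("people_search_site"))
```

An opt-out sent only to `people_search_site` leaves all five upstream nodes untouched, and any of them can refill the downstream listing. That is the structural reason removal has to target aggregators as well and be repeated over time.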
6. Credit Header Data
Your credit file has two parts: the body (your credit score, account details, payment history) and the header (your name, address, phone number, date of birth, and Social Security number). The body is protected by the Fair Credit Reporting Act (FCRA), which limits who can access it and for what purpose. The header is not.
The three credit bureaus (Equifax, Experian, and TransUnion) sell header data under the Gramm-Leach-Bliley Act, which permits sharing of non-public personal information between financial institutions and their affiliates. In practice, the header data they sell includes:
- Name and all name variations — every name you have used on a credit application, including misspellings that became permanent records.
- Current and historical addresses — every address associated with a credit account, going back decades. This is often the most complete address history available for any individual.
- Phone numbers — numbers you provided on credit applications, including numbers you stopped using years ago.
- Date of birth — confirmed, not estimated. Credit bureau DOB data is authoritative because it was verified during account opening.
- Employer information — self-reported employer from credit applications, though this is often outdated.
Credit header data is particularly valuable to brokers because it is comprehensive and verified. Unlike public records (which can be fragmented across counties) or social media (which can be fabricated), credit header data is tied to real financial transactions that required identity verification.
7. Surveys and Sweepstakes
The "Win a $500 gift card!" pop-up is not a scam in the traditional sense. It is a data collection mechanism. Companies like Jigsaw (later renamed Salesforce Data.com), Epsilon, and dozens of smaller firms run sweepstakes, surveys, warranty registrations, and free product offers specifically to harvest personal data for resale.
- Online sweepstakes — entering a contest typically requires your name, email, phone number, mailing address, and date of birth. The prize costs the company a few hundred dollars. The data from thousands of entrants is worth far more.
- Consumer surveys — "Tell us about yourself" surveys on product packaging, in email campaigns, and on websites collect demographic and behavioral data that respondents volunteer willingly. Income range, household size, education level, hobbies, health conditions — all self-reported and highly valued by brokers.
- Warranty registrations — when you register a product warranty online, the registration form typically asks for information far beyond what is needed (household income, number of children, other products you own). This data is sold.
- Magazine subscriptions — subscriber lists have been bought and sold since the 1970s. Your subscription to a gardening magazine, a gun publication, or a parenting newsletter categorizes you for decades.
- "Free" product trials — free trials that require your credit card "just for shipping" exist to capture your payment information and contact details, both of which enter the broker pipeline.
The common thread is voluntary disclosure. People hand over real personal data because the exchange feels harmless. A sweepstakes entry or a warranty card seems like a trivial interaction, but the data it generates is permanent and will be resold repeatedly.
8. Web Browsing and Cookie Data
Your browsing history is tracked by an ecosystem of advertising technology companies that operate invisibly behind the websites you visit. This data feeds into broker profiles as behavioral and interest signals.
- Third-party cookies — when you visit a news site, the page loads cookies from 20 to 50 third-party domains (ad exchanges, analytics platforms, data management platforms). These cookies track you across sites, building a browsing profile tied to your device. Although Google has repeatedly announced plans to deprecate third-party cookies in Chrome, as of early 2026 they remain functional in most browsers except Safari and Firefox.
- Tracking pixels — invisible 1x1 pixel images embedded in web pages and emails. When your browser loads the pixel, it sends your IP address, user agent, timestamp, and the referring page to the tracking company. Email tracking pixels tell senders exactly when and where you opened a message.
- Browser fingerprinting — even without cookies, websites can identify your browser by its unique combination of screen resolution, installed fonts, browser version, operating system, language settings, and dozens of other attributes. Canvas fingerprinting, WebGL fingerprinting, and AudioContext fingerprinting make it possible to track you with 90%+ accuracy across sessions without storing anything on your device.
- Data management platforms (DMPs) — companies like Oracle BlueKai, Lotame, and LiveRamp aggregate browsing data from thousands of websites, link it to real identities using deterministic matching (login events) or probabilistic matching (device fingerprints), and sell the resulting profiles to advertisers and brokers.
Cookie deprecation is not a fix: Google's on-again, off-again plan to remove third-party cookies from Chrome gets significant press coverage, but it addresses only one tracking vector. Browser fingerprinting, first-party data collection, server-side tracking, and login-based identity graphs all continue to function regardless of cookie policy. The advertising industry has already adapted.
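The fingerprinting idea described above can be sketched in a few lines: combine a set of browser attributes into one canonical string and hash it. The attribute names and values are invented, and exact-match hashing is a simplification; real trackers collect many more signals and tolerate small changes between visits.

```python
import hashlib
import json

def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # Sorting keys makes the result independent of attribute order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical attribute sets reported by a browser.
visit_monday = {
    "screen": "2560x1440",
    "fonts": ["Arial", "Fira Code"],
    "language": "en-US",
    "os": "macOS 14",
    "browser": "Firefox 125",
}
visit_friday = dict(visit_monday)                      # same browser, cookies cleared in between
other_user = dict(visit_monday, screen="1920x1080")    # one attribute differs

print(fingerprint(visit_monday) == fingerprint(visit_friday))
print(fingerprint(visit_monday) == fingerprint(other_user))
```

Nothing is stored on the device, so clearing cookies does not change the identifier. Only changing the attributes themselves does, which is why browsers that standardize or randomize these values offer better protection than cookie controls alone.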
Reducing Your Exposure: One Action Per Source
No single action stops all data collection, but a targeted approach — addressing each source category individually — significantly reduces the flow of new data into broker databases. Here is one high-impact action for each of the 8 sources.
| Source | Action |
|---|---|
| Public Records | Use a PO box or registered agent address for business filings and vehicle registrations. You cannot hide property ownership, but you can limit new address records by using an LLC to hold real estate. |
| Social Media | Audit your profile visibility on every platform. Set profiles to private, remove your phone number and email from public-facing fields, and delete location check-in history. |
| Retail Purchases | Stop using loyalty cards tied to your real identity. If the discount matters, use a phone number you do not use elsewhere and a name that is not your own. Delete receipt-scanning apps entirely. |
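| App SDKs | Revoke location permission for all apps except navigation. On iOS, turn off "Allow Apps to Request to Track" under Settings > Privacy & Security > Tracking, or deny every tracking prompt. On Android, delete your advertising ID in the Ads section of Settings, or reset it monthly on versions that do not offer deletion. |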
| Broker-to-Broker | Submit removal requests to upstream aggregators (Acxiom, Oracle, LexisNexis), not just consumer-facing sites. Use a service like Delist.ai to monitor and re-remove when profiles reappear. |
| Credit Headers | Freeze your credit at all three bureaus. A freeze does not stop header data sharing entirely, but it limits new inquiries that would update your file with current contact information. |
| Surveys & Sweepstakes | Stop entering sweepstakes, filling out warranty cards, and completing online surveys. If you must, use a disposable email address and do not provide your real phone number or mailing address. |
| Web Browsing | Use Firefox or Brave (built-in tracker blocking). Install uBlock Origin. Use a VPN to mask your IP address. Decline cookie consent banners instead of accepting them. |
For a deeper dive into each of these countermeasures, see our data minimization guide.
Frequently Asked Questions
Is it legal for data brokers to collect my information?
In most of the United States, yes. There is no comprehensive federal privacy law regulating data brokers. California (CCPA/CPRA), Vermont, Texas, and Oregon have data broker registration or opt-out laws, but they require you to take action; they do not prevent collection. The EU's GDPR provides stronger protections for European residents, but it generally does not cover how U.S.-based brokers handle the data of U.S. residents. The legal landscape is slowly changing, with new state privacy laws taking effect each year, but as of 2026 the industry remains largely self-regulated.
Which source contributes the most data to my broker profile?
Public records. Your name, address history, property ownership, voter registration, court records, and professional licenses form the skeleton of every broker profile. These records are comprehensive, verified, and free for brokers to access. Other sources (purchase data, location data, social media) add behavioral and interest signals on top of the public records foundation, but without public records, brokers would have no reliable identity anchor.
Can I stop data collection at the source?
For some sources, partially. You can lock down social media, stop using loyalty cards, and revoke app permissions. But you cannot prevent property deeds, court filings, and voter registrations from being public. This is why the most effective privacy strategy combines source reduction (limiting new data) with downstream removal (opting out of brokers that already have your data) and ongoing monitoring (catching re-listings). No single action is sufficient.
How quickly does my data appear on broker sites after collection?
It depends on the source. Public records updates (new property purchase, address change via voter registration) typically appear on broker sites within 30 to 90 days. Location data from mobile SDKs can be available to buyers within 24 to 48 hours. Purchase data from loyalty programs is usually aggregated monthly. Credit header data updates whenever you open a new account or change your address with a creditor. The overall cadence means that major life events (moving, buying a home, changing jobs) trigger a wave of updates across broker databases within one to three months.
Do data brokers know I am reading this article?
Delist.ai does not use third-party tracking cookies, advertising SDKs, or analytics platforms that share data with brokers. However, if you arrived here via a Google search, Google recorded that query. If you are browsing without a VPN, your ISP can see the domain you visited. And if you have browser extensions installed, some of them may be collecting and reselling your browsing history. The irony is real: learning about data brokers can itself generate data for data brokers, depending on your browser configuration.
What is the difference between a data broker and a people-search site?
People-search sites (Spokeo, BeenVerified, WhitePages, Radaris) are a consumer-facing subset of data brokers. They buy data from upstream aggregators and sell individual lookups to the public. Behind them sit larger data brokers (Acxiom, LexisNexis, Oracle Data Cloud, Epsilon) that operate at the wholesale level, selling bulk data to marketers, insurers, employers, and government agencies. The people-search sites are the visible tip; the wholesale brokers are the larger, less visible infrastructure underneath.
See What Brokers Already Have
Run a free scan to find out which data brokers have your personal information and exactly what they are displaying.
Scan now — it's free →