OSINT for Cybersecurity Recon: Mapping Your External Attack Surface

The first principle of defending a network is knowing what's actually on it. The second principle is knowing what the rest of the world can see of it. OSINT is the discipline that answers the second question, and for a defender it's not optional — an attacker is already doing this work, and if your inventory is worse than theirs, you're already behind.

This post is for blue teams, security engineers, IT managers, and founders who own the security of a real system. It's about turning the public OSINT toolkit inward — finding what an external attacker would find about your organization before they do.

Why defenders need OSINT

Three forces have pushed external reconnaissance from "occasional pentest activity" to "continuous defensive practice":

Shadow IT. Marketing spun up a Wix site in 2019. A contractor stood up a staging VPN in 2021. An acquired company's legacy hosts still resolve. None of it is in your CMDB. All of it is in someone's attack surface scan.
Cloud sprawl. Every team has cloud credentials. Every team launches resources. Public buckets, dev databases on public IPs, forgotten Lambda URLs — a large enterprise can easily have thousands of internet-facing assets the CISO never authorized.
Mass scanning. Search engines for connected devices (covered in our Shodan post) and dozens of similar services continuously scan the public IPv4 space and often index a newly exposed service within hours or days. Defenders need to know what those scans show before attackers query them.

The discipline that addresses this is Attack Surface Management (ASM). Commercial ASM platforms exist; the OSINT toolkit accomplishes a large portion of the same job manually.

External attack surface management

Conceptually, ASM has four loops:

Discovery — find all assets owned by the organization (domains, subdomains, IPs, cloud resources, code repos, mobile apps, social-media presences).
Enumeration — for each asset, identify open ports, services, technologies, certificates, and exposed content.
Risk scoring — correlate against known vulnerabilities, misconfigurations, and exposure patterns.
Continuous monitoring — re-run all of the above on a schedule and alert on deltas.

The OSINT side covers the first two loops, discovery and enumeration, thoroughly, and contributes data to risk scoring and continuous monitoring.

Asset discovery workflow

Start with what you know and pivot outward.

Domain inventory

WHOIS records for known domains: note the registrar and registrant organization, then group domains by shared registration metadata.
Reverse WHOIS — search WHOIS data by organization name or email to find domains you forgot you owned (or that an acquired company brought).
Domain monitors like Whoxy or DomainTools track domain ownership changes over time.

IP space inventory

ARIN / RIPE / APNIC / LACNIC / AFRINIC regional registry searches by organization name return owned IP blocks.
BGP lookups (Hurricane Electric's BGP toolkit, RIPEstat) show ASN ownership and announced prefixes.
Cloud provider checks — AWS, Azure, GCP each publish IP ranges; cross-reference the IPs your hostnames resolve to against those ranges, then tie each match back to one of your cloud accounts.
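
The cloud cross-reference can be sketched with Python's standard ipaddress module. This is a minimal sketch that assumes the provider prefix lists have already been downloaded and flattened into (provider, CIDR) pairs. The function name classify_ips is hypothetical, and the sample prefixes are RFC 5737 documentation ranges standing in for real provider data.

```python
import ipaddress

def classify_ips(ips, provider_prefixes):
    """Map each IP to the first provider whose prefix contains it, or None."""
    networks = [(name, ipaddress.ip_network(cidr)) for name, cidr in provider_prefixes]
    result = {}
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        # next() returns the first matching provider name, or None if nothing matches.
        result[ip] = next((name for name, net in networks if addr in net), None)
    return result

# Placeholder data: documentation ranges, not real cloud provider prefixes.
SAMPLE_PREFIXES = [("cloud-a", "192.0.2.0/24"), ("cloud-b", "198.51.100.0/24")]
SAMPLE_IPS = ["192.0.2.10", "198.51.100.7", "203.0.113.5"]

if __name__ == "__main__":
    for ip, provider in classify_ips(SAMPLE_IPS, SAMPLE_PREFIXES).items():
        print(ip, provider or "not in any listed range")
```

An IP that falls inside a provider range but belongs to none of your known accounts is a good candidate for a shadow IT investigation.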

Brand and infrastructure pivots

TLS certificates — certificate transparency logs (crt.sh, Censys CT) reveal every publicly logged certificate issued for any subdomain of a watched domain. This is usually the single richest discovery vector.
Favicon hashes — the favicon of a corporate web app is often unique enough to identify all internet-facing instances of that app, even on unexpected IPs.
HTTP titles, body content, headers — Shodan and Censys index these and let you search for organization name across the entire internet, finding assets that aren't on a known domain.
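
For the favicon pivot, a hash of the icon is what you actually search on. The sketch below assumes the convention commonly associated with Shodan's http.favicon.hash filter: a signed 32-bit MurmurHash3 over the base64 encoding of the favicon bytes, with the line breaks that Python's base64.encodebytes inserts. The hash is implemented in pure Python to keep the example dependency-free (the third-party mmh3 package is the usual choice). Treat the convention as an assumption, and verify it against a favicon whose indexed hash you already know.

```python
import base64

def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit, returned as a signed 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF

    def mix(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        return (k * c2) & mask

    h = seed & mask
    body_len = len(data) - (len(data) % 4)
    for i in range(0, body_len, 4):
        h ^= mix(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask
    tail = data[body_len:]
    if tail:
        h ^= mix(int.from_bytes(tail, "little"))
    # Finalization mix.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h - (1 << 32) if h & 0x80000000 else h

def favicon_hash(favicon_bytes: bytes) -> int:
    """Hash the base64 text of the icon (assumed convention, see lead-in)."""
    return murmur3_32(base64.encodebytes(favicon_bytes))
```

Feed it the raw bytes of your own app's favicon, then search for the resulting integer to look for instances on IPs you did not expect.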

Subdomain enumeration

This is where most defensive ASM gaps appear. The typical enterprise has 5-20x more subdomains than the security team thinks. Approaches in order of yield:

Certificate transparency. Query crt.sh or Censys CT for the parent domain; the results list every logged certificate whose subject or SANs match. Often 80%+ of your real subdomain inventory.
Passive DNS. Historical DNS data from SecurityTrails, Mnemonic, or RiskIQ's PassiveTotal. Finds subdomains that have ever resolved, even if they no longer do.
Wordlist brute-forcing. Tools like amass or massdns with a curated wordlist (e.g. SecLists) and a list of resolvers. subfinder complements these by aggregating passive sources rather than brute-forcing.
JavaScript and source map analysis. Static analysis of your own public JavaScript often reveals API hostnames, CDN endpoints, and dev URLs hardcoded in the bundle.
GitHub / GitLab dorking for your domain name turns up forgotten internal hostnames in code or config files.
Cloud metadata — querying AWS Route 53, Azure DNS, GCP Cloud DNS APIs from inside your own accounts captures every record you've authoritatively published.

The combined list — dedupe across all sources — is your authoritative subdomain inventory. Compare it to your CMDB / asset database. The diff is your shadow IT.
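
The merge-and-diff step can be sketched as follows. It assumes CT results arrive as crt.sh-style JSON, where each record has a name_value field holding newline-separated hostnames, and that the other sources and the CMDB are plain lists of hostnames. The function names and the sample data are illustrative only.

```python
import json

def normalize(names):
    """Lowercase, trim, and strip trailing dots so sources compare cleanly."""
    return {n.strip().lower().rstrip(".") for n in names if n.strip()}

def names_from_ct(ct_json, parent):
    """Extract hostnames under `parent` from crt.sh-style JSON records."""
    found = set()
    for record in json.loads(ct_json):
        for name in normalize(record.get("name_value", "").splitlines()):
            if name.startswith("*."):
                name = name[2:]  # a wildcard cert still proves the zone exists
            if name == parent or name.endswith("." + parent):
                found.add(name)
    return found

# Illustrative sample data, not real query output.
ct_sample = json.dumps([
    {"name_value": "www.example.com\n*.dev.example.com"},
    {"name_value": "VPN.example.com"},
    {"name_value": "example.org"},
])
passive_dns = ["old-staging.example.com.", "www.example.com"]
cmdb = ["www.example.com", "vpn.example.com"]

discovered = names_from_ct(ct_sample, "example.com") | normalize(passive_dns)
shadow_it = discovered - normalize(cmdb)

if __name__ == "__main__":
    for host in sorted(shadow_it):
        print("not in CMDB:", host)
```

Normalizing before the set operations matters: case differences and trailing dots would otherwise show up as false entries in the diff.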

Ports, services, and tech fingerprinting

For each discovered host, you want:

Open TCP/UDP ports and the services behind them.
Service banners — software name and version, often enough to match against CVE databases.
TLS configuration — expired certs, weak ciphers, mismatched CNs.
HTTP technology fingerprint — Wappalyzer-style detection of the framework, CMS, analytics, JS libraries.
Default credentials — admin panels that ship with vendor-default logins.
Exposed paths — /admin, /.git/, /swagger, /.env, /backup.zip.
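
Turning a service banner into something you can match against CVE data usually starts with extracting a product and version. The sketch below handles only two common shapes, an HTTP Server header value and an SSH identification string. The patterns and the parse_banner name are illustrative assumptions, not a complete fingerprinting scheme.

```python
import re

# SSH identification string, e.g. "SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.5"
SSH_RE = re.compile(r"^SSH-\d+\.\d+-([A-Za-z]+)_(\d[\w.]*)")
# HTTP Server header value, e.g. "nginx/1.18.0 (Ubuntu)"
HTTP_RE = re.compile(r"^([A-Za-z][\w.-]*)/(\d[\w.]*)")

def parse_banner(banner):
    """Return (product, version) in lowercase, or None if no version is exposed."""
    for pattern in (SSH_RE, HTTP_RE):
        match = pattern.match(banner.strip())
        if match:
            return match.group(1).lower(), match.group(2).lower()
    return None

if __name__ == "__main__":
    for sample in ["nginx/1.18.0 (Ubuntu)",
                   "SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.5",
                   "nginx"]:
        print(sample, "->", parse_banner(sample))
```

A banner with no version (the last sample) returns None, which is itself useful: it tells you the host needs an authenticated check rather than a banner match.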

Three OSINT services cover most of this without you having to scan anything yourself:

Shodan — broadest mass-internet scan, easy to query by IP/CIDR/org.
Censys — richer protocol-level detail, strong CT integration.
FOFA / ZoomEye — non-Western alternatives, often see assets the others don't.

When you need fresher data than these caches provide, you actively scan your own assets with permission — nmap, masscan, naabu for port discovery; nuclei for templated vuln checks; httpx for HTTP probing.

Leaked credentials and breach data

An attacker's first probe is usually credential reuse against your assets. Your defensive job is to know which credentials are out there before they try.

Have I Been Pwned (HIBP) Enterprise / Domain monitoring — alerts when your domain appears in a new breach dump.
Breach intelligence feeds — commercial services (DeHashed, IntelX, SpyCloud) provide queryable, indexed access to breach databases with plaintext or cracked credentials. For corporate defenders, this is essential intel about what passwords your users have leaked elsewhere.
Combolist monitoring — cybercrime communities trade "combolists" of email:password pairs that are reused for credential stuffing. Monitoring services index these.
Stealer logs — infostealer malware harvests credentials from infected user devices and dumps them on criminal marketplaces. Stealer log monitoring services (commercial) ingest these dumps and alert on your domain.

When a corporate credential surfaces in any of the above, the playbook is immediate: force a password rotation on that user, audit for session reuse in recent logs, check for MFA bypass attempts.
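
As a rough sketch of the first triage step, the function below scans email:password lines of the kind found in combolists and returns only the corporate addresses, deliberately discarding the passwords. The function names and the sample lines are made up for illustration, and the action list simply restates the playbook above.

```python
def corporate_hits(lines, domain):
    """Return the set of addresses at `domain` seen in email:password lines."""
    suffix = "@" + domain.lower()
    hits = set()
    for line in lines:
        email, sep, _password = line.strip().partition(":")
        # Require the separator so malformed lines are skipped, and match the
        # exact domain so subdomains and lookalikes are not counted.
        if sep and email.lower().endswith(suffix):
            hits.add(email.lower())
    return hits

def response_actions(email):
    """Playbook steps for one exposed account, in order."""
    return [
        f"force password rotation for {email}",
        f"audit recent logs for session reuse by {email}",
        f"check for MFA bypass attempts against {email}",
    ]

# Fabricated sample lines for illustration only.
SAMPLE = [
    "alice@example.com:hunter2",
    "bob@other.test:pw",
    "ALICE@example.com:other",
    "carol@sub.example.com:x",
    "garbage-line",
]

if __name__ == "__main__":
    for email in sorted(corporate_hits(SAMPLE, "example.com")):
        for action in response_actions(email):
            print(action)
```

Keeping the passwords out of the output means the triage list can be shared with the help desk without spreading the leaked secrets further.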

Threat intelligence feeds

Beyond your own attack surface, you want awareness of threats targeting your industry, region, or tech stack:

CVE feeds (NVD, vendor advisories). Tag CVEs against your technology fingerprint to find what affects you.
Threat actor reports from vendors (CrowdStrike, Mandiant, Microsoft, Anthropic, Recorded Future) describing campaigns and TTPs.
ISACs (Information Sharing and Analysis Centers) for your sector — FS-ISAC for finance, H-ISAC for healthcare, etc.
CISA alerts, KEV (Known Exploited Vulnerabilities) catalog — if a CVE is on the KEV list and you're vulnerable, drop everything and patch.
Dark web / forum monitoring — mentions of your brand, executives, domains, or industry on criminal forums. Available via commercial services or in-house with Tor browsing (carefully).
Phishing feed monitoring — PhishStats, OpenPhish, and urlscan.io let you discover domains impersonating your brand for phishing.
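
Tagging the KEV catalog against your technology fingerprint can be sketched as a simple join. The sketch assumes the catalog has been downloaded as JSON with a vulnerabilities list whose entries carry cveID, vendorProject, and product fields. Check those field names against the current feed before relying on them. The entries below are invented placeholders, not real catalog rows.

```python
def kev_matches(kev_catalog, fingerprint):
    """Return CVE IDs from the catalog whose (vendor, product) is in our stack."""
    wanted = {(vendor.lower(), product.lower()) for vendor, product in fingerprint}
    return [
        entry["cveID"]
        for entry in kev_catalog.get("vulnerabilities", [])
        if (entry.get("vendorProject", "").lower(),
            entry.get("product", "").lower()) in wanted
    ]

# Placeholder catalog: field names are assumed, IDs and vendors are fictional.
SAMPLE_KEV = {
    "vulnerabilities": [
        {"cveID": "CVE-0000-0001", "vendorProject": "ExampleVendor", "product": "ExampleVPN"},
        {"cveID": "CVE-0000-0002", "vendorProject": "OtherVendor", "product": "OtherCMS"},
    ]
}
# Your fingerprint, as (vendor, product) pairs from the enumeration loop.
OUR_STACK = [("examplevendor", "examplevpn")]

if __name__ == "__main__":
    for cve in kev_matches(SAMPLE_KEV, OUR_STACK):
        print("patch now:", cve)
```

Exact string matching is the weak point here: vendor and product names in fingerprints rarely match the catalog spelling, so expect to maintain an alias table.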

Continuous monitoring

Discovery once is not enough. The attack surface drifts every day: new subdomains spin up, certs expire, ports open and close. Monitoring has to be a standing practice, not a one-off project.

A workable cadence:

Daily: CT log monitor for new certificates on your domains, breach feed for your domains, KEV updates.
Weekly: full subdomain enumeration, comparing to last week.
Monthly: full port/service scan of all known IPs, full tech fingerprint refresh, content audit.
Continuous: brand-impersonation domain detection, public credential leak feeds.

Alert on deltas, not totals. The signal is "this is new since yesterday," not "this exists." Otherwise you drown in noise.
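
Delta alerting reduces to a set difference between two snapshots. A minimal sketch, assuming each snapshot is a collection of "host:port" strings produced by the previous and current runs (the representation and the function name are illustrative):

```python
def snapshot_delta(previous, current):
    """Return what appeared and what disappeared between two snapshots."""
    prev, cur = set(previous), set(current)
    return {"added": sorted(cur - prev), "removed": sorted(prev - cur)}

# Illustrative snapshots.
YESTERDAY = ["www.example.com:443", "vpn.example.com:443"]
TODAY = ["www.example.com:443", "dev.example.com:8080"]

if __name__ == "__main__":
    delta = snapshot_delta(YESTERDAY, TODAY)
    for item in delta["added"]:
        print("NEW since last run:", item)
    for item in delta["removed"]:
        print("GONE since last run:", item)
```

Only the added and removed entries go to the alert channel; the unchanged bulk of the inventory stays in the report.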

Blue-team playbook

Inventory all owned domains and IP blocks across registrars and cloud providers.
Monitor CT logs daily and run full subdomain discovery weekly. Diff against the previous run.
Cross-reference subdomain inventory against your CMDB. Investigate the diff.
Run external port/service scans monthly against the full IP inventory.
Match every service banner against current CVE feeds; prioritize KEV-list CVEs.
Subscribe to a domain-monitoring breach feed.
Subscribe to threat intelligence relevant to your industry and tech stack.
Monitor for newly-registered domains that impersonate your brand.
Establish a fast remediation path for findings — OSINT discovery without a remediation workflow is theater.
Document the program. When the board asks "how do we know we don't have shadow IT," your answer is this program.
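
For the brand-impersonation item in the playbook, one simple heuristic is to flag newly registered domains whose first label either contains the brand or sits within a small edit distance of it. The sketch below is one such heuristic under those assumptions. The function names, the threshold, and the sample domains are illustrative, and real monitoring would also need to handle homoglyphs and internationalized domain names.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def lookalikes(brand, candidates, owned=(), max_distance=1):
    """Flag candidate domains that resemble `brand` and are not ours."""
    brand = brand.lower()
    owned_set = {d.lower() for d in owned}
    flagged = []
    for domain in candidates:
        d = domain.lower()
        if d in owned_set:
            continue
        label = d.split(".")[0]
        if brand in label or edit_distance(brand, label) <= max_distance:
            flagged.append(d)
    return flagged

# Illustrative feed of newly registered domains.
NEW_DOMAINS = ["examp1e.com", "example-login.net", "unrelated.org", "example.com"]

if __name__ == "__main__":
    for d in lookalikes("example", NEW_DOMAINS, owned=["example.com"]):
        print("possible impersonation:", d)
```

A distance threshold of 1 keeps false positives manageable for short brand names; longer names can usually tolerate 2.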

For the foundations, see What Is OSINT?. For specific tooling, see Shodan.io, Google Dorks, and the OSINT Toolkit. For pivoting from infrastructure to people, see Username & Email OSINT. For defending against OSINT done against you, see OSINT Privacy Defense.

Sources & References

CISA — Known Exploited Vulnerabilities catalog
MITRE — ATT&CK Framework
SANS — SANS Internet Storm Center
crt.sh — Certificate Transparency search