Scanner-noise response playbook
This note turns the latest access-log pattern into a practical operating rule: most unexpected requests are not content demand; they are commodity probes. The site should acknowledge the pattern, avoid creating the weak surfaces those probes look for, and keep useful pages clearly indexable.
Observed 2026-06-11: the live log sample contained repeated probes for /.env, /.git/config, /wp-login.php, PHP cache shells, server status pages, numbered legacy paths, and unrelated host-style URLs. Core QingSiwei pages continued returning HTTP 200.
Default triage rule
- Do not build pages to satisfy scanner paths. A high request count on /.env, /wp-login.php, or shell-like PHP URLs is a security signal, not an editorial opportunity.
- Keep the static attack surface boring. No WordPress login, no exposed environment file, no public Git directory, no server-status endpoint, and no writable upload surface should exist under the webroot.
- Treat 404s as expected noise unless core pages fail. A burst of automated 404s is acceptable as long as the real sitemap URLs remain healthy.
- Convert evidence into useful public material. When a pattern repeats, publish a human-readable note or checklist that helps future operations, instead of chasing every bot path.
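The triage rule above can be sketched as a small log classifier. This is a minimal illustration, not the site's actual tooling: the names PROBE_PATTERNS, is_probe, and summarize are hypothetical, and the patterns cover only the probe paths named in this note (the /server-status path is an assumed spelling of the "server status pages" probe).

```python
import re
from collections import Counter

# Hypothetical probe signatures based on the paths named in this note.
# Extend the list to match what your own logs show.
PROBE_PATTERNS = [
    re.compile(r"^/\.env(\.|$)"),      # /.env, /.env.bak, ...
    re.compile(r"^/\.git(/|$)"),       # /.git/config and siblings
    re.compile(r"/wp-login\.php$"),    # WordPress login at any depth
    re.compile(r"^/server-status$"),   # assumed server status endpoint
]


def is_probe(path: str) -> bool:
    """Return True when the request path matches a known scanner signature."""
    # Drop any query string so "/wp-login.php?x=1" still matches.
    bare = path.split("?", 1)[0]
    return any(p.search(bare) for p in PROBE_PATTERNS)


def summarize(paths):
    """Count probe requests versus everything else."""
    counts = Counter("probe" if is_probe(p) else "other" for p in paths)
    return {"probe": counts["probe"], "other": counts["other"]}
```

A high "probe" count from summarize is then read as a security signal to review, never as a list of pages to create.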
Fast health checklist
- OK: Homepage, tools hub, research hub, changelog, robots.txt, and sitemap.xml should continue returning 200.
- Watch: Repeated probes for sensitive files should stay as 404/403 and must not reveal stack traces or directory listings.
- Act: If a sensitive probe returns 200 with real content, remove the file or block the path before publishing new content.
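The checklist can be expressed as a status-only first pass, sketched below under stated assumptions. The path lists are illustrative: the note does not give URLs for the tools hub, research hub, or changelog, so only the homepage, robots.txt, and sitemap.xml appear. Two behaviors go beyond the note and are assumptions of this sketch: a failing core page is labeled "Act", and any status other than 403/404 on a sensitive path is also labeled "Act".

```python
# Assumed path lists; replace with the site's real core and sensitive URLs.
CORE_PATHS = {"/", "/robots.txt", "/sitemap.xml"}
SENSITIVE_PATHS = {"/.env", "/.git/config", "/wp-login.php"}


def triage(path: str, status: int) -> str:
    """Map a (path, HTTP status) pair to a checklist label."""
    if path in CORE_PATHS:
        # Core pages are healthy only when they return 200.
        # Labeling a failure "Act" is an assumption of this sketch.
        return "OK" if status == 200 else "Act"
    if path in SENSITIVE_PATHS:
        # 403/404 is the expected answer to a probe: keep watching.
        if status in (403, 404):
            return "Watch"
        # A 200 (or any other status) needs a manual look at the body,
        # since status alone cannot show whether real content leaked.
        return "Act"
    # Paths outside both lists are not covered by this checklist.
    return "Unclassified"
```

For example, triage("/.env", 404) yields "Watch", while triage("/.env", 200) yields "Act" and should be followed by checking whether the response body contains real content.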
Content decision
For this lab, scanner noise reinforces the need for two tracks: public-facing useful assets such as the SEO page quick-check, and operational research notes such as the Legacy crawl map. The site should continue building durable tools while documenting defensive observations in research pages.