The JSON-LD adapter is the universal solution for sites that don't run Drupal, TYPO3, or WordPress, as long as the job pages contain schema.org JobPosting markup (which most SEO-optimized career sites do).
Prerequisites
- The site hosts a sitemap.xml (standard with any CMS); the adapter follows nested sitemaps.
- Job pages contain a <script type="application/ld+json"> block with "@type": "JobPosting".
- The domain entry is marked as allowlisted in the BAconn connector; we only scrape what you confirm.
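To check the second prerequisite offline, the following is a minimal sketch of how JobPosting objects could be pulled out of a page's HTML using only the Python standard library. It is not the adapter's actual extraction code; the names JsonLdCollector and find_job_postings are hypothetical, and handling of list-valued blocks and "@graph" wrappers is an assumption about markup that commonly appears on career sites.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_job_postings(html):
    """Return every JSON-LD object in `html` whose @type is or includes JobPosting."""
    collector = JsonLdCollector()
    collector.feed(html)
    postings = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        # Assumption: a block holds one object, a list of objects, or an @graph wrapper.
        if isinstance(data, dict):
            candidates = data.get("@graph", [data])
        elif isinstance(data, list):
            candidates = data
        else:
            candidates = []
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            if isinstance(types, str):
                types = [types]
            if isinstance(types, list) and "JobPosting" in types:
                postings.append(item)
    return postings
```

Calling find_job_postings on the saved HTML of a job page returns a list of dictionaries; an empty list suggests the markup is missing or is injected by JavaScript after page load.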
Setup
- From the wizard or at /cms-pipe/manage: choose "New source", then pick JSON-LD.
- Enter the base URL, e.g. https://www.example-corp.com (trailing slash optional).
- Optional: override the sitemap path if it is not at /sitemap.xml.
- Sync interval: default 60 min; the polite crawl is throttled to 1 request/second.
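The sitemap discovery and throttling described in this guide can be illustrated roughly as follows. This is a sketch under stated assumptions, not the adapter's implementation: collect_page_urls is a hypothetical name, fetch is a hypothetical callable that returns the XML text for a URL, and a fixed pause after each request stands in for whatever throttling the adapter really uses.

```python
import time
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(sitemap_url, fetch, delay_seconds=1.0, sleep=time.sleep, _seen=None):
    """Walk a sitemap, following nested sitemap indexes, and return page URLs.

    `fetch` is a hypothetical callable: it takes a URL and returns XML text.
    """
    seen = set() if _seen is None else _seen
    if sitemap_url in seen:
        return []  # guard against sitemap indexes that reference each other
    seen.add(sitemap_url)

    root = ET.fromstring(fetch(sitemap_url))
    sleep(delay_seconds)  # pause after every request (1 request/second by default)
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]

    if root.tag == SITEMAP_NS + "sitemapindex":
        # An index lists further sitemaps; recurse into each of them.
        urls = []
        for child in locs:
            urls.extend(collect_page_urls(child, fetch, delay_seconds, sleep, seen))
        return urls
    return locs  # a plain <urlset>: these are the page URLs
```

Passing fetch and sleep as parameters keeps the sketch free of network code, so it can be tried against sitemap files saved locally.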
Headless fallback (optional)
If the site injects JSON-LD via JavaScript (single-page apps, React/Vue), the adapter detects this and can optionally use a Playwright headless browser for extraction. Enable via adapter_state.headless = true in the source config. Requires Chromium installed in the container (see Dockerfile).
Common issues
- sitemap.xml returns 404: the source uses alternative indexing. Set the path manually or consider the TYPO3/Drupal adapter.
- 0 jobs found: job pages don't include JSON-LD. To check, run in browser devtools: document.querySelectorAll('script[type="application/ld+json"]').
- robots.txt blocks us: the adapter respects robots.txt. On owned domains, allow our User-Agent BAconn-CMS-Pipe.
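For the robots.txt case, a quick way to test a rule set before deploying it is Python's standard urllib.robotparser. The sketch below is only an approximation: how the adapter itself evaluates robots.txt is not documented here, so the standard-library parser is a stand-in, and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "BAconn-CMS-Pipe"


def is_allowed(robots_txt, url, user_agent=USER_AGENT):
    """Return True if `robots_txt` (the file's text) lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rule set: allow our crawler everywhere, block all other bots.
ROBOTS_EXAMPLE = """\
User-agent: BAconn-CMS-Pipe
Allow: /

User-agent: *
Disallow: /
"""

print(is_allowed(ROBOTS_EXAMPLE, "https://www.example-corp.com/jobs/1"))
```

A rule group naming the User-Agent explicitly, as in the example, takes precedence over the wildcard group in this parser.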
Where the jobs land
All CMS-pipe sources write to wp_jobs (same table as WP Job Manager). They surface in /jobshub, the Indeed push, and BA sync automatically.