The JSON-LD adapter is the universal solution for sites that don't run Drupal, TYPO3, or WordPress, as long as the job pages contain schema.org JobPosting markup (which most SEO-optimized career sites do).

Prerequisites

  • The site hosts a sitemap.xml (most CMSs generate one by default). The adapter follows nested sitemaps.
  • Job pages contain a <script type="application/ld+json"> block with "@type": "JobPosting".
  • The domain is on the allowlist in the BAconn connector: we only scrape what you confirm.
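
To check the second prerequisite outside the browser, something like the following minimal sketch could be used. It relies only on the Python standard library; the names JsonLdCollector and find_job_postings are illustrative and are not part of the adapter. It assumes a JSON-LD block holds a single object, a list of objects, or an "@graph" container.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_job_postings(html):
    """Return all JSON-LD objects in the page whose @type is JobPosting."""
    collector = JsonLdCollector()
    collector.feed(html)
    postings = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        # Assumption: a block is one object, a list, or an @graph container.
        if isinstance(data, dict) and isinstance(data.get("@graph"), list):
            items = data["@graph"]
        elif isinstance(data, list):
            items = data
        else:
            items = [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            if types == "JobPosting" or (isinstance(types, list) and "JobPosting" in types):
                postings.append(item)
    return postings
```

An empty result for a known job page suggests the markup is missing or is injected by JavaScript after the page loads.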

Setup

  1. From the wizard or at /cms-pipe/manage: "New source" → pick JSON-LD.
  2. Enter the base URL, e.g. https://www.example-corp.com (trailing slash optional).
  3. Optional: override the sitemap path if not at /sitemap.xml.
  4. Set the sync interval (default: 60 min). Crawling is throttled to a polite 1 request/second.
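
How a crawler might walk nested sitemaps at that pace can be sketched as follows. This is an illustration under stated assumptions, not the adapter's actual code: collect_page_urls is a hypothetical name, and fetch is a caller-supplied function that maps a URL to its XML text, which keeps the sketch independent of any HTTP library.

```python
import time
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(sitemap_url, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Walk a sitemap and any nested sitemap indexes; return the page URLs.

    `fetch` is a caller-supplied function: URL in, XML text out.
    """
    pages = []
    queue = [sitemap_url]
    seen = set()
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue  # guard against sitemap indexes that reference each other
        seen.add(url)
        root = ET.fromstring(fetch(url))
        locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
        if root.tag == SITEMAP_NS + "sitemapindex":
            queue.extend(locs)  # nested sitemaps: fetch them later
        else:
            pages.extend(locs)  # ordinary <urlset>: these are page URLs
        if queue:
            sleep(delay_seconds)  # pause before the next request (1 request/second)
    return pages
```

Passing the overridden sitemap path from step 3 as sitemap_url would cover sites that do not serve /sitemap.xml.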

Headless fallback (optional)

If the site injects JSON-LD via JavaScript (single-page apps, React/Vue), the adapter detects this and can optionally use a Playwright headless browser for extraction. Enable via adapter_state.headless = true in the source config. Requires Chromium installed in the container (see Dockerfile).
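
For orientation, a rendering step of this kind could look roughly like the sketch below. It uses Playwright's synchronous Python API, and both function names are hypothetical. The detection heuristic (no "application/ld+json" marker in the static HTML) is an assumption and may differ from how the adapter actually decides.

```python
def needs_headless(static_html):
    """Heuristic (assumption): no ld+json marker in the raw HTML suggests JS injection."""
    return "application/ld+json" not in static_html.lower()


def render_html(url, timeout_ms=15000):
    """Return the page's HTML after scripts have run, via headless Chromium."""
    # Imported lazily so the module loads even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # "networkidle" waits until network activity settles, giving
            # client-side code a chance to inject the JSON-LD block.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

Rendering is far slower than a plain HTTP fetch, which is one reason to keep it opt-in per source.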

Common issues

  • sitemap.xml returns 404: the source uses alternative indexing. Set the path manually or consider the TYPO3/Drupal adapter.
  • 0 jobs found: the job pages probably don't include JSON-LD. To check, run document.querySelectorAll('script[type="application/ld+json"]') in the browser devtools console; an empty result means the page has no JSON-LD blocks.
  • robots.txt blocks us: the adapter respects robots.txt. On domains you own, allow our User-Agent, BAconn-CMS-Pipe.
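
To verify a robots.txt change before the next sync, the rules can be tested locally with Python's standard urllib.robotparser. This is a minimal sketch: is_allowed is a hypothetical helper, and the robots.txt content in the usage lines is an example, not a required configuration.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "BAconn-CMS-Pipe"


def is_allowed(robots_txt, url, user_agent=USER_AGENT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: admit our crawler, keep every other bot out.
example_robots = """\
User-agent: BAconn-CMS-Pipe
Allow: /

User-agent: *
Disallow: /
"""

print(is_allowed(example_robots, "https://www.example-corp.com/jobs/1"))
```

Listing the named agent in its own group, as in the example, leaves the rules for other crawlers untouched.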

Where the jobs land

All CMS-pipe sources write to wp_jobs (same table as WP Job Manager). They surface in /jobshub, the Indeed push, and BA sync automatically.