{
  "summary": "EN URL inventory + Semrush real-keyword/volume backbone for a SHARD of hair-transplant domains (selected via args)",
  "agentCount": 2,
  "logs": [
    "Shard: elithair, smilehair",
    "Shard done: 2/2"
  ],
  "result": {
    "domains": [
      {
        "site": "elithair",
        "csv_path": "/opt/automator/cinik-rponse/files/raw/elithair_master.csv",
        "total_en_urls": 153,
        "ranking_urls": 0,
        "nonranking_urls": 153,
        "patient_case_urls": 74,
        "semrush_rows_pulled": 0,
        "by_type": [
          {
            "type": "patient",
            "count": 74
          },
          {
            "type": "post",
            "count": 39
          },
          {
            "type": "page",
            "count": 39
          },
          {
            "type": "faq",
            "count": 1
          }
        ],
        "theme_preview": [
          {
            "theme": "patient-case",
            "count": 74
          },
          {
            "theme": "generic hair-transplant",
            "count": 21
          },
          {
            "theme": "other",
            "count": 38
          },
          {
            "theme": "turkey",
            "count": 7
          },
          {
            "theme": "grafts",
            "count": 8
          },
          {
            "theme": "price",
            "count": 5
          }
        ],
        "sample_rows": [
          "url,type,real_keyword,volume,position,traffic_pct,n_keywords,is_patient_case_url,source",
          "https://elithair.com/,page,,,,,,0,sitemap-only",
          "https://elithair.com/hair-transplant-turkey-guide,page,,,,,,0,sitemap-only",
          "https://elithair.com/treatments/procedure,page,,,,,,0,sitemap-only",
          "https://elithair.com/information/faq,faq,,,,,,0,sitemap-only",
          "https://elithair.com/before-after,page,,,,,,0,sitemap-only",
          "https://elithair.com/blog/7-best-foods-for-hair-growth,post,,,,,,0,sitemap-only",
          "https://elithair.com/blog/can-iron-deficiency-cause-hair-loss,post,,,,,,0,sitemap-only",
          "https://elithair.com/blog/ozempic-hair-loss,post,,,,,,0,sitemap-only",
          "https://elithair.com/before-after/achim-k-2000-hair-grafts,patient,,,,,,1,sitemap-only",
          "https://elithair.com/before-after/max-coga-3250-hair-grafts,patient,,,,,,1,sitemap-only",
          "https://elithair.com/before-after/alexander-k,patient,,,,,,1,sitemap-only"
        ],
        "notes": "STEP 1 (sitemaps) succeeded fully. All 3 sitemaps fetched in parallel with browser UA and saved to disk (elithair_post-sitemap.xml, elithair_page-sitemap.xml, elithair_patienten-sitemap.xml). None were nested indexes (all flat urlsets, 0 <sitemap> tags). Raw loc counts: post=39, page=40, patienten=74. After normalize (lowercase host, strip trailing slash, URL-decode) + EN filter (all elithair.com = EN) + dedupe, the universe is 153 distinct EN URLs. The page sitemap includes the root https://elithair.com/; of its 40 locs, 39 are typed page and one (the FAQ at /information/faq) is typed faq. Patient cases: all 74 patienten-sitemap URLs flagged is_patient_case_url=1; note their actual paths are under /before-after/<name> (not /patienten/), correctly typed as patient. No www variants encountered (all canonical https://elithair.com). Universe saved to elithair_universe.json.\n\nSTEP 2 (Semrush) FAILED / BLOCKED: execute_report domain_organic returned 'not enough API units to complete this request' (Semrush account out of API units) on every attempt. The display_limit=500 pull failed with that response, and a retry of the same call returned the same out-of-units response. Separately, the offset=500 call errored 605 before returning any data, with a validation note that this API requires display_offset < display_limit. Therefore 0 Semrush rows were pulled, the aggregated map (elithair_semrush.csv) is empty (header only), and NO real keyword/volume/position data exists. To get ranking data, additional Semrush API units are required (see https://www.semrush.com/mcp-access).\n\nSTEP 3 (join): Completed. Because the Semrush map is empty, all 153 rows are source=sitemap-only with blank real_keyword/volume/position/traffic_pct/n_keywords (ready for later fetch+read enrichment). ranking_urls=0, nonranking_urls=153. No semrush-only URLs to add.\nMaster verified at 154 lines (153 data + header).\n\ntheme_preview is a ROUGH bucketing inferred from URL SLUGS ONLY (real keywords unavailable): patient-case=74 (all patient URLs); the remaining 79 non-patient URLs bucketed by slug keywords -> turkey (~7, e.g. hair-transplant-turkey-guide), price/cost (~5), grafts (~8, e.g. *-N-hair-grafts pages), generic hair-transplant/hair-loss (~21), other (~38, mostly blog topics like ozempic-hair-loss, foods-for-hair-growth, vitamin-deficiency). These counts are approximate and overlap-resolved by first-match; treat as directional, not authoritative, until Semrush keywords are pulled.\n\nFiles: /opt/automator/cinik-rponse/files/raw/elithair_master.csv (deliverable), /opt/automator/cinik-rponse/files/raw/elithair_semrush.csv (empty map), /opt/automator/cinik-rponse/files/raw/elithair_universe.json, plus the 3 saved sitemap XMLs with elithair_ prefix. Pre-existing files for other sites (cinik_, cosmedica_, serkan_, unprefixed) were left untouched."
      },
      {
        "site": "smilehair",
        "csv_path": "/opt/automator/cinik-rponse/files/raw/smilehair_master.csv",
        "total_en_urls": 427,
        "ranking_urls": 0,
        "nonranking_urls": 427,
        "patient_case_urls": 84,
        "semrush_rows_pulled": 0,
        "by_type": [
          {
            "type": "post",
            "count": 407
          },
          {
            "type": "page",
            "count": 13
          },
          {
            "type": "service",
            "count": 4
          },
          {
            "type": "beforeafter",
            "count": 2
          },
          {
            "type": "faq",
            "count": 1
          }
        ],
        "theme_preview": [
          {
            "theme": "generic hair-transplant",
            "count": 187
          },
          {
            "theme": "other",
            "count": 101
          },
          {
            "theme": "patient-case",
            "count": 84
          },
          {
            "theme": "turkey",
            "count": 37
          },
          {
            "theme": "grafts",
            "count": 13
          },
          {
            "theme": "price",
            "count": 5
          }
        ],
        "sample_rows": [
          "https://www.smilehairclinic.com/en/hair-transplant-before-after,beforeafter,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/case-studies,beforeafter,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/faq,faq,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/manual-punch-hair-transplant,service,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/post-operation,service,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/personal-data-policy,page,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/david-beckham-hair-transplant,post,,,,,,1,sitemap-only",
          "https://www.smilehairclinic.com/en/steve-carell-hair-transplant,post,,,,,,1,sitemap-only",
          "https://www.smilehairclinic.com/en/bryan-courters-hair-transplant-journey,post,,,,,,1,sitemap-only",
          "https://www.smilehairclinic.com/en/hair-transplant-turkey-cost,post,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/things-to-consider-before-a-5000-graft-hair-transplant,post,,,,,,0,sitemap-only",
          "https://www.smilehairclinic.com/en/john-cena-hair-transplant-the-story-behind-his-transformation,post,,,,,,1,sitemap-only"
        ],
        "notes": "STEP 1 OK. All 14 sitemaps fetched in parallel (xargs -P 8, browser UA, --max-time 30), each HTTP 200 and saved under /opt/automator/cinik-rponse/files/raw/smilehair_*.xml. No nested sitemap indexes (all <urlset>). Parsed with Python ElementTree taking only <url><loc> (excluded 4740 <image:loc> entries that would have inflated the count to 7243). 2503 raw <url><loc>, 2486 unique after normalize (lowercase host, strip trailing slash, URL-decode). EN filter = path starts with /en/ -> exactly 427 distinct EN URLs, all depth /en/<slug>. Language distribution sane (it=177, es=175, sq=172, ru/de=163, etc.; Turkish at root). 407 EN come from post-sitemaps (tagged post), 20 from page-sitemaps (tagged page/service/faq/beforeafter via an explicit slug map). is_patient_case_url=1 for 84 URLs: named-individual stories = celebrity transplant analyses (David Beckham, Tom Cruise, Mo Salah, Conor McGregor, Elon Musk, etc.) + 3 ordinary-patient journeys (mikes-, bryan-courters-, rhys-stroulgers-12-month-) + John Cena story-behind-transformation. Detector is name-based: matches '<name>(s)-hair-transplant(ation)' but disqualifies any leading token in a ~110-word procedure/symptom/descriptor stoplist (fue, dhi, prp, sapphire, body, unshaven, cheap, types, aftercare, etc.) to avoid procedure-slug false positives. STEP 2 BLOCKED: mcp__claude_ai_SEMRush__execute_report returned 'active subscription but not enough API units' on BOTH the display_limit=500 call and a retry (see https://www.semrush.com/mcp-access for more units). Zero Semrush rows obtained, so the offset=500 pagination call was not attempted. Consequently smilehair_semrush.csv contains only the header (no data rows), and STEP 3 join produced 0 source=semrush rows -> all 427 rows are source=sitemap-only with blank real_keyword/volume/position/traffic_pct/n_keywords, ready for later fetch+read enrichment. No EN Semrush-only URLs to add (Semrush set empty).\ntheme_preview is slug-derived (no real keywords available): patient-case bucket counted first, then turkey/price/grafts/generic by slug tokens. www vs non-www: site is on www (sitemaps live at www.smilehairclinic.com); URLs kept verbatim on www, no host conflict observed. Artifacts: smilehair_master.csv (428 lines incl header, 427 unique urls verified), smilehair_semrush.csv (header only), smilehair_en_urls.txt, and 14 saved sitemap XMLs all under /opt/automator/cinik-rponse/files/raw/."
      }
    ]
  }
}