ASQ-PHI Introduced: A Novel Benchmark Balancing Clinical Data Privacy and Search Utility

ASQ-PHI (Adversarial Synthetic Queries for Protected Health Information de-identification) is a novel dataset developed to address an emerging challenge in clinical AI workflows. Hospitals are now using HIPAA-compliant large language models (LLMs) under Business Associate Agreements (BAAs), which allow clinicians to enter Protected Health Information (PHI) securely. However, these models rely on static training data with a knowledge cutoff, so clinicians often need external tools such as live web search to find recent medical evidence. These external systems are often not covered by BAAs, creating a “safe handoff” point at which PHI must be removed before data leaves the protected environment.

Despite the importance of this handoff, current de-identification methods are poorly suited to it. Most were developed for long-form electronic health record (EHR) narratives such as discharge reports, whereas real-world interactions with LLMs generally take the form of query-style prompts.

Moreover, access to real clinician queries is restricted by institutional oversight and privacy regulations. To fill this gap, ASQ-PHI provides a fully synthetic, publicly shareable dataset of clinician-style queries that contain PHI, together with their de-identification labels.

The dataset contains 1,051 single-turn clinical queries designed to resemble real prompts entered into clinical LLM systems. Of these, 832 (79.2%) contain PHI, whereas 219 (20.8%) are hard negatives, meaning queries without PHI that test whether a system over-redacts.

Across the dataset, there are 2,973 annotated PHI elements covering 13 HIPAA Safe Harbor identifier types, including names, dates, medical record numbers, phone numbers, and geographic locations. Each query is paired with machine-readable annotations in JSON format that specify both the identifier type and the exact text span, which supports precise evaluation of de-identification systems.
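To illustrate how type-plus-span annotations of this kind can be used, here is a minimal Python sketch. The query is invented, and the field names `type` and `text` are assumptions made for illustration; they are not taken from the published ASQ-PHI schema.

```python
import json

# Hypothetical example: the field names "type" and "text" are assumptions,
# not the published ASQ-PHI schema.
query = ("Pt John Carter, MRN 4471982, seen 03/14/2024 for chest pain; "
         "what is the latest guidance on troponin cutoffs?")
annotations = json.loads("""
[
  {"type": "NAME", "text": "John Carter"},
  {"type": "MRN", "text": "4471982"},
  {"type": "DATE", "text": "03/14/2024"}
]
""")

def locate_spans(text, annotations):
    """Return sorted (start, end, type) offsets for each annotated string."""
    spans = []
    for ann in annotations:
        start = text.find(ann["text"])  # first occurrence only
        if start == -1:
            raise ValueError(f"annotated text not found: {ann['text']!r}")
        spans.append((start, start + len(ann["text"]), ann["type"]))
    return sorted(spans)

def mask(text, spans):
    """Replace spans from right to left so earlier offsets stay valid."""
    for start, end, kind in sorted(spans, reverse=True):
        text = text[:start] + f"[{kind}]" + text[end:]
    return text

spans = locate_spans(query, annotations)
masked = mask(query, spans)
print(masked)
```

Because each annotation carries the exact text, a de-identifier's output can be checked string by string: any annotated value that still appears in the outgoing query is a leak.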

A key strength of ASQ-PHI is its structure. Every record is divided into a query section and a PHI annotation section using simple delimiters, which makes the dataset easy to adapt to different evaluation tasks. Including both PHI-positive queries and carefully constructed hard negatives allows investigators to assess not only how well systems remove PHI but also whether they over-redact harmless information.
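A delimiter-based layout like this can be read with a few lines of Python. The sketch below is illustrative only: the markers `===QUERY===` and `===PHI_TAGS===` and the annotation field names are placeholders chosen for this example, so the actual markers should be taken from the released files.

```python
import json

# Assumed delimiters for illustration only; check the released files
# for the actual markers.
QUERY_MARK = "===QUERY==="
TAGS_MARK = "===PHI_TAGS==="

def parse_records(raw):
    """Split raw text into {"query": str, "phi": list} records."""
    records = []
    for chunk in raw.split(QUERY_MARK):
        if not chunk.strip():
            continue  # skip the empty piece before the first marker
        query_part, _, tags_part = chunk.partition(TAGS_MARK)
        phi = json.loads(tags_part) if tags_part.strip() else []
        records.append({"query": query_part.strip(), "phi": phi})
    return records

sample = """===QUERY===
Jane Roe (DOB 02/11/1958) has new AF; current DOAC dosing in CKD stage 4?
===PHI_TAGS===
[{"type": "NAME", "text": "Jane Roe"}, {"type": "DATE", "text": "02/11/1958"}]
===QUERY===
What does the 2019 guideline say about stage 4 CKD and DOAC dosing?
===PHI_TAGS===
[]
"""

records = parse_records(sample)
positives = [r for r in records if r["phi"]]           # queries with PHI
hard_negatives = [r for r in records if not r["phi"]]  # PHI-free look-alikes
print(len(positives), len(hard_negatives))
```

The second sample query contains a year and a stage number but no identifiers, which is the kind of case a hard negative is meant to probe.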

The dataset was produced using an adversarial prompting pipeline built on GPT-4o through Azure OpenAI. High-temperature sampling (0.9) was used to generate diverse, challenging query phrasing and realistic test cases.

The generation process included automated validation steps to maintain a balanced proportion of hard negatives, minimize malformed records, and control PHI density per query. The release includes an interactive Jupyter notebook and supporting code, making the pipeline reproducible and easy to adapt to domain-specific datasets.
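The three checks named above (hard-negative balance, malformed records, PHI density) might look roughly like the following Python sketch. This is not the authors' validation code: the record layout, the `validate_batch` helper, and every threshold are assumptions chosen for illustration.

```python
def validate_batch(records, target_negative_share=0.20, tolerance=0.05,
                   max_phi_per_query=6):
    """Summarize three batch-level checks; all thresholds are assumed values."""
    def well_formed(r):
        # A record needs a non-empty query, a list of annotations, and
        # every annotated string must actually occur in the query.
        return bool(
            isinstance(r.get("query"), str) and r["query"].strip()
            and isinstance(r.get("phi"), list)
            and all(isinstance(a, dict) and isinstance(a.get("text"), str)
                    and a["text"] in r["query"] for a in r["phi"])
        )

    good = [r for r in records if well_formed(r)]
    negatives = sum(1 for r in good if not r["phi"])
    share = negatives / len(good) if good else 0.0
    return {
        "malformed": len(records) - len(good),
        "negative_share": share,
        "balanced": abs(share - target_negative_share) <= tolerance,
        "too_dense": sum(1 for r in good if len(r["phi"]) > max_phi_per_query),
    }

# Invented records in a hypothetical layout.
batch = [
    {"query": "Pt Ann Lee, MRN 88213, seen 01/02/2024: statin dose?",
     "phi": [{"type": "NAME", "text": "Ann Lee"},
             {"type": "MRN", "text": "88213"},
             {"type": "DATE", "text": "01/02/2024"}]},
    {"query": "Mr. Ortiz asks about metformin and contrast.",
     "phi": [{"type": "NAME", "text": "Ortiz"}]},
    {"query": "Call 555-0147 re: Dana Kim's INR.",
     "phi": [{"type": "PHONE", "text": "555-0147"},
             {"type": "NAME", "text": "Dana Kim"}]},
    {"query": "Pt from Galveston with dengue exposure, next steps?",
     "phi": [{"type": "LOCATION", "text": "Galveston"}]},
    {"query": "What is the 2023 guidance on stage 3 CKD and SGLT2 inhibitors?",
     "phi": []},
    # Malformed: the annotated name does not appear in the query.
    {"query": "Sam Park needs a refill.",
     "phi": [{"type": "NAME", "text": "Samuel Park"}]},
]

report = validate_batch(batch, max_phi_per_query=2)
print(report)
```

Running checks like these after each generation round makes it possible to regenerate or discard a batch before it reaches the final dataset.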

The researchers demonstrated practical use by evaluating a commercial PHI detection system on ASQ-PHI, examining the balance between accurate PHI detection and the avoidance of over-masking. Expert review indicated high quality, with 98% annotation accuracy and 96% clinical plausibility. However, the dataset is synthetic and English-only and may not capture real-world diversity, so external validation is required. ASQ-PHI is an effective benchmark for PHI removal in clinician-like queries and can be used to support safer use of clinical AI.
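The tension between catching PHI and over-masking can be made concrete with two numbers: recall on PHI-positive queries and the share of hard negatives that get flagged. The Python sketch below uses a deliberately crude regex detector as a stand-in; it is not the commercial system from the study, the records are invented, and the lenient substring matching is an assumption rather than the paper's scoring rule.

```python
import re

def toy_detector(query):
    """Stand-in detector (NOT the commercial system): flags digit runs and
    capitalized word pairs. Crude on purpose, so it both misses PHI and
    over-masks."""
    return re.findall(r"\d[\d/-]*\d|[A-Z][a-z]+ [A-Z][a-z]+", query)

def score(records, detector):
    """Return (PHI recall on positives, share of hard negatives flagged)."""
    found = total = flagged = negatives = 0
    for r in records:
        predicted = detector(r["query"])
        if r["phi"]:
            total += len(r["phi"])
            # Lenient match: an annotation counts as found if it lies
            # inside any predicted string.
            found += sum(1 for a in r["phi"]
                         if any(a["text"] in p for p in predicted))
        else:
            negatives += 1
            flagged += 1 if predicted else 0
    recall = found / total if total else 0.0
    over_masking = flagged / negatives if negatives else 0.0
    return recall, over_masking

# Invented records in a hypothetical layout.
records = [
    {"query": "Ann Lee (MRN 88213) seen 01/02/2024: statin dose?",
     "phi": [{"type": "NAME", "text": "Ann Lee"},
             {"type": "MRN", "text": "88213"},
             {"type": "DATE", "text": "01/02/2024"}]},
    {"query": "Mr. Ortiz asks about metformin and contrast.",
     "phi": [{"type": "NAME", "text": "Ortiz"}]},
    {"query": "What is the 2023 guidance on stage 3 CKD and SGLT2 inhibitors?",
     "phi": []},
    {"query": "Is apixaban preferred over warfarin in frail older adults?",
     "phi": []},
]

recall, over_masking = score(records, toy_detector)
print(f"recall={recall:.2f} over_masking={over_masking:.2f}")
```

Here the toy detector misses a lone surname and flags a guideline year as if it were an identifier, so neither number is perfect. Reporting both together shows whether a gain in recall was bought with more redaction of harmless text.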

Reference: Weatherhead J, Golovko G, McCaffrey P. ASQ-PHI: An adversarial synthetic data benchmark for clinical de-identification and search utility. Data Brief. 2026;65:112586. doi:10.1016/j.dib.2026.112586
