Picture this: Your legal team has signed off on HIPAA expert determination as the right de-identification path. Your data governance lead has engaged a qualified statistician. Now the report arrives—a dense, technical document that your compliance officer, your IRB liaison and your business stakeholders all need to evaluate. The problem: most organizations don't know what a strong report looks like until they've reviewed a weak one.
This article demystifies the deliverable. It explains what HIPAA actually requires an expert determination report to contain, breaks down each section, and identifies the specific characteristics that separate an audit-ready report from one that will draw regulatory scrutiny.
What is HIPAA expert determination?
HIPAA expert determination is one of two methods under 45 CFR §164.514(b) by which a covered entity may de-identify Protected Health Information (PHI). It is set out at §164.514(b)(1), alongside the Safe Harbor method at §164.514(b)(2). A qualified statistician applies accepted analytical techniques to determine that the probability of identifying an individual from a dataset is "very small." The expert's findings are documented in a formal report that serves as the legal and regulatory basis for treating the data as de-identified.
Understanding what that report must contain—and what makes it defensible—is essential for any organization commissioning one, reviewing one, or relying on one in an audit or IRB submission.
What HIPAA actually requires in an expert determination
The regulatory text at 45 CFR §164.514(b)(1) is deliberately non-prescriptive. It states that a covered entity may satisfy the expert determination standard if a person with "appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable" applies those principles, determines that the risk of identifying an individual is "very small," and documents the methods and results of the analysis that justify that determination.
What HIPAA doesn't do is specify a particular statistical method, define a numerical threshold for "very small," or mandate a report structure. That flexibility is intentional—it allows expert determination to accommodate a wide range of dataset types and use cases. But it also means the quality of the analysis and documentation varies significantly between practitioners.
In practice, regulators, IRBs and data partners expect a report that demonstrates:
- A clear description of the data analyzed
- A named, reproducible methodology
- A quantified probability of re-identification with a context-specific threshold
- Evidence of the expert's independence and qualifications
- A formal conclusion that directly references the HIPAA regulatory standard
Reports that satisfy these expectations hold up in audits. Reports that don't are increasingly being challenged, particularly by Institutional Review Boards (IRBs) that have sharpened their scrutiny of de-identification claims in recent years.
The core sections of a HIPAA expert determination report
A properly structured expert determination report contains five sections. Each serves a specific regulatory and evidentiary purpose. The table below summarizes what each section contains.
| # | Section | What it contains |
|---|---|---|
| 1 | Dataset description | The type and volume of data reviewed, including formats (EHR notes, claims, transcripts), time period, record counts and any relevant data governance context. |
| 2 | Methodology | The statistical techniques used to measure re-identification risk. Common approaches include the k-anonymity model, Bayesian estimation or population-frequency analysis. This section must be reproducible and defensible. |
| 3 | Risk assessment | A quantified estimate of re-identification probability expressed as a statistical finding. HIPAA requires this probability to be "very small"; the expert explains why that standard is met for this specific dataset and context. |
| 4 | Expert qualifications | The statistician's credentials, institutional affiliation and a formal independence statement confirming no conflict of interest with the covered entity. |
| 5 | Conclusion and certification | A formal attestation that re-identification risk is very small based on the methods applied. This is the section regulators and IRBs reference first. |
Section 1: Dataset description
The dataset description establishes the scope of the expert's analysis. It should specify what types of data were reviewed (for example, inpatient clinical notes, outpatient EHR records, or ASR-transcribed call recordings), along with volume (number of records or documents), the time period covered, and any data governance context relevant to re-identification risk. If the dataset contains rare conditions, small geographic populations, or longitudinal records, those characteristics should be disclosed here because they directly affect the risk calculation in the risk assessment section.
A weak dataset description creates a fatal flaw: the expert's risk conclusion cannot be extended to data outside the described scope, making any downstream use beyond that scope unsupported by the report.
Section 2: Methodology
This is the most technically demanding section and the one that has received the most scrutiny from regulators and IRBs. The methodology section must name the specific statistical approach used to measure re-identification risk. Common approaches include:
- k-anonymity: evaluates whether each record is indistinguishable from at least k-1 other records on a set of quasi-identifiers (e.g., age, ZIP code, diagnosis).
- Bayesian estimation: calculates the conditional probability that a specific individual can be identified given the data disclosed and a defined attacker model.
- Population-frequency analysis: assesses the uniqueness of data combinations against population distributions to estimate the probability that any given record maps to a unique individual.
The chosen method must be appropriate for the dataset type. A method that works well for structured claims data may not be appropriate for unstructured clinical notes—and the expert must explain why the chosen approach is valid for the specific data in question.
Regulators are increasingly rejecting reports where the methodology section is vague. "Statistical analysis was performed" is not sufficient. The report must be specific enough that a peer reviewer could evaluate and, in principle, reproduce the analysis.
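To make the k-anonymity approach described above concrete, the following is a minimal Python sketch, not a prescribed method. It groups records by their quasi-identifier values and reports the smallest group size, which is the dataset's k. The column names (`age_band`, `zip3`, `diagnosis`) and the toy records are made-up assumptions for illustration; a real analysis would justify its own choice of quasi-identifiers.

```python
from collections import Counter

# Hypothetical quasi-identifier columns; a real report would justify this choice.
QUASI_IDENTIFIERS = ("age_band", "zip3", "diagnosis")


def equivalence_class_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_of(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the dataset's k: the size of its smallest equivalence class."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def violating_records(records, k, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the records whose equivalence class has fewer than k members."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] < k
    ]


# Made-up toy data: three records share one combination, one record is unique.
records = [
    {"age_band": "40-49", "zip3": "021", "diagnosis": "E11"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "E11"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "E11"},
    {"age_band": "70-79", "zip3": "945", "diagnosis": "G35"},
]

print(k_of(records))                  # smallest equivalence class size
print(violating_records(records, 3))  # records that break 3-anonymity
```

A methodology section at this level of specificity (named quasi-identifiers, a stated k, and a list of violating records) is the kind of detail a peer reviewer could check and reproduce.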
Section 3: Risk assessment
The risk assessment section translates the statistical methodology into a concrete finding. It should express re-identification probability as a numeric estimate—stating that the probability any record could be linked to a specific individual falls below a defined threshold.
HIPAA does not mandate a universal numerical cutoff for "very small." As HHS guidance confirms, the acceptable threshold depends on the dataset characteristics, the intended recipients, and the broader environment in which the data will be used. Experts set and justify their own thresholds based on these factors—typically with reference to established statistical disclosure control literature. The risk assessment must address the specific dataset described in Section 1 and must be consistent with the methodology in Section 2.
Misalignment between these sections—for example, applying a risk model that assumes structured data to a dataset of unstructured clinical notes—is a common deficiency that IRBs and Office for Civil Rights (OCR) investigators flag.
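As a rough illustration of how a risk assessment can express its finding numerically, the Python sketch below uses a simplified population-frequency measure: each record's risk is taken as one divided by the number of people in the population who share its quasi-identifier combination. The population counts, the key structure and the 0.05 threshold are all hypothetical assumptions. HIPAA sets no numeric cutoff, and a real analysis would typically use a fuller attacker model.

```python
# Hypothetical population counts per quasi-identifier combination
# (age band, 3-digit ZIP). Values are made up for illustration.
POPULATION_COUNTS = {
    ("40-49", "021"): 12000,
    ("70-79", "945"): 800,
    ("90+", "036"): 4,
}

# Illustrative threshold only; the expert must set and justify the real one.
THRESHOLD = 0.05


def record_risk(key, population_counts):
    """Simplified risk that a record maps to one person: 1 / population cell size."""
    count = population_counts.get(key, 0)
    # Treat unknown or empty population cells as the worst case.
    return 1.0 if count <= 0 else 1.0 / count


def summarize_risk(keys, population_counts, threshold):
    """Summarize per-record risk across a dataset against a stated threshold."""
    if not keys:
        raise ValueError("no records to assess")
    risks = [record_risk(k, population_counts) for k in keys]
    return {
        "max_risk": max(risks),
        "mean_risk": sum(risks) / len(risks),
        "share_above_threshold": sum(r > threshold for r in risks) / len(risks),
    }


# Quasi-identifier combinations observed in a made-up four-record dataset.
dataset_keys = [("40-49", "021"), ("40-49", "021"), ("70-79", "945"), ("90+", "036")]
summary = summarize_risk(dataset_keys, POPULATION_COUNTS, THRESHOLD)
print(summary)
```

In this toy dataset the single record in the four-person population cell drives the maximum risk well above the illustrative threshold, which is the kind of finding that would prompt further generalization or suppression before a "very small" conclusion could be supported.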
Section 4: Expert qualifications
HIPAA requires the report to be produced by someone with "appropriate knowledge of and experience with generally accepted statistical and scientific principles." This section documents those qualifications. At minimum, it should include the expert's academic credentials, professional experience with de-identification or statistical disclosure limitation, any relevant certifications or publications and—critically—an independence statement.
The independence statement confirms the expert has no financial relationship with the covered entity beyond a fee for the analysis. The regulatory text does not spell out an independence requirement, but reviewers look for one in practice: a report produced by an organization's own data team, however technically competent, doesn't satisfy the independence expectation that regulators, IRBs and data partners bring to an expert determination.
Section 5: Conclusion and certification
The conclusion is where the expert formally attests that re-identification risk is very small. This section should directly reference the HIPAA regulatory language at 45 CFR §164.514(b)(1), state the expert's conclusion in clear, unambiguous terms, and be signed and dated by the statistician.
Compliance officers, IRB reviewers and auditors read this section first. A conclusion that hedges—"it is our view that risk is likely low"—is not the same as a formal attestation that risk is very small under the HIPAA standard. The language matters.
What makes an expert determination report audit-ready?
Not all expert determination reports offer the same level of protection. As IRBs and the Department of Health and Human Services (HHS) have become more sophisticated in evaluating de-identification claims, the bar for what constitutes an acceptable report has risen. The following comparison distinguishes strong reports from those that are likely to draw scrutiny.
| Characteristic of a strong report | Red flag in a weak report |
|---|---|
| Methodology is named and citable | Vague reference to "statistical analysis" |
| Risk threshold is quantified and context-justified | Risk described qualitatively only |
| Expert has no financial relationship with covered entity | Report produced by internal staff |
| Dataset scope is clearly bounded | Dataset described in general terms |
| Conclusion directly addresses the HIPAA regulatory standard | Conclusion omits regulatory language |
| Report is dated and signed by the statistician | Unsigned or undated document |
Organizations that receive a report from a vendor's partner network should review it against the left column before accepting it as the basis for any downstream data use. If the report doesn't meet these criteria, request revisions before treating the data as de-identified.
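One lightweight way to run that review is to record a yes or no answer for each criterion and list whatever fails. The Python sketch below is only an illustration: the class and field names are hypothetical and simply mirror the six characteristics in the comparison above.

```python
from dataclasses import dataclass, fields


@dataclass
class ReportReview:
    """A reviewer's answers for one report; field names are hypothetical."""
    methodology_named_and_citable: bool
    risk_quantified_with_justified_threshold: bool
    expert_financially_independent: bool
    dataset_scope_clearly_bounded: bool
    conclusion_cites_hipaa_standard: bool
    signed_and_dated: bool


def red_flags(review):
    """Return the names of the criteria the report fails, in checklist order."""
    return [f.name for f in fields(review) if not getattr(review, f.name)]


# Example: a report with only a qualitative risk statement and no signature.
review = ReportReview(
    methodology_named_and_citable=True,
    risk_quantified_with_justified_threshold=False,
    expert_financially_independent=True,
    dataset_scope_clearly_bounded=True,
    conclusion_cites_hipaa_standard=True,
    signed_and_dated=False,
)
print(red_flags(review))
```

Any non-empty result is a prompt to request revisions before relying on the report.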
Who produces the expert determination report?
The statistician who produces the report should be independent, meaning not employed by or otherwise financially conflicted with the covered entity commissioning the analysis. In practice, this means organizations source expert determination reports from one of three places:
- Academic institutions and research centers: statisticians affiliated with universities or academic medical centers with expertise in privacy and statistical disclosure limitation.
- Independent consulting firms: boutique firms that specialize in HIPAA de-identification, health data privacy, or biostatistics.
- De-identification platform partner networks: enterprise de-identification platforms like Limina maintain networks of vetted, independent experts who produce reports structured for HIPAA, FDA and IRB review.
Choosing a qualified expert matters. The statistician's name and credentials appear in the report and are subject to review by regulators and IRBs. An expert without a track record in health data de-identification or statistical disclosure limitation creates reputational and compliance risk for the organization relying on the report.
Buyers evaluating providers should assess the statistician's expert determination qualifications—including academic credentials, published work in statistical disclosure limitation, and a track record of reports that have withstood IRB and OCR scrutiny.
How Limina's platform supports expert determination
The quality of an expert determination report depends heavily on the quality of the underlying de-identification. A statistician who receives raw data—or data processed by a tool with poor PHI detection rates—must account for residual re-identification risk that a cleaner input would eliminate. That increases the complexity of the analysis and can result in a higher modeled risk that is harder to certify as "very small."
Limina's data de-identification platform achieves accuracy of 99.5 percent or higher on real healthcare data, compared to the 60–70 percent detection rates typical of general-purpose cloud tools. When data enters an expert's analysis with a documented high-accuracy de-identification pass, the statistical baseline for re-identification risk is lower, and the path to a "very small" certification is more direct.
Limina's partner network connects organizations with independent statisticians who are experienced with HIPAA expert determination and produce reports structured for FDA submissions, IRB review and audit defense. The platform's de-identification outputs are designed to feed directly into the expert's statistical workflow—structured, documented, and in a format that reduces analysis time and strengthens the final report.
Organizations in pharma and life sciences should note that FDA submissions and IRB protocols frequently impose additional documentation requirements beyond the HIPAA baseline. A report that satisfies HIPAA may need supplementary materials to satisfy an IRB's de-identification standard or an FDA Real-World Evidence submission protocol.
Ready to commission an audit-ready expert determination report?
Understanding what a HIPAA expert determination report contains is the first step. Commissioning one that holds up in an audit, an IRB review, or an FDA submission requires two things: clean, high-accuracy de-identification inputs and a qualified, independent statistician with experience producing reports for your specific use case.
Limina's platform delivers the de-identification accuracy that gives your expert the strongest possible statistical foundation—and our partner network connects you with the independent specialists who produce the report.
Get a demo to discuss your dataset and expert determination requirements.