What is PHI?
PHI, or Protected Health Information, is a term with a surprisingly wide scope. This article breaks down how HIPAA defines it, how the GDPR and Ontario's PHIPA compare, and why understanding the boundaries of PHI is essential for any organization that handles health data.

PHI stands for Protected Health Information. At its most basic, it refers to individually identifiable information about a person's health, healthcare, or healthcare payments that is created, received, or maintained by certain covered entities. The term originates in United States federal law, specifically under the Health Insurance Portability and Accountability Act of 1996 (HIPAA), but the concept of protecting health data as a distinct and sensitive category of personal information exists across many jurisdictions worldwide.
What surprises many compliance and legal teams is just how broad the definition of PHI actually is. It extends well beyond a patient's diagnosis or treatment history. Billing records, demographic information, lab results, and even certain education records can fall under its umbrella, depending on who holds the information and how it is transmitted. Understanding that breadth, and where it differs by jurisdiction, is foundational to building a compliant data governance program.
This article examines the full legal definition of PHI under HIPAA's Privacy Rule, compares it with the European Union's approach under the General Data Protection Regulation (GDPR), and looks at Ontario's Personal Health Information Protection Act (PHIPA). Despite being distinct legal frameworks, all three converge on a similar principle: health information warrants a higher level of protection than ordinary personal data.
What Does PHI Mean Under HIPAA?
How the Privacy Rule Defines PHI
The HIPAA regulation that governs PHI is called the Privacy Rule. It was issued in its final, modified form in August 2002, six years after HIPAA itself was signed into law. The Privacy Rule constructs PHI as a nested subset of two broader categories: Health Information (HI) and Individually Identifiable Health Information (IIHI). PHI is the most specific and most regulated of the three.
All three terms are defined at §160.103 of the Privacy Rule. Working outward from the broadest category:
Health Information means any information, including genetic information, whether oral or recorded in any form or medium, that is created or received by a health care provider, health plan, public health authority, employer, life insurer, school or university, or health care clearinghouse, and that relates to the past, present, or future physical or mental health or condition of an individual, the provision of health care to an individual, or the past, present, or future payment for the provision of health care to an individual.
Individually Identifiable Health Information is a subset of health information, including demographic information collected from an individual, that is created or received by a health care provider, health plan, employer, or health care clearinghouse, relates to the same health and payment categories described above, and either identifies the individual or provides a reasonable basis to believe the information could be used to do so.
Protected Health Information is individually identifiable health information that is transmitted by electronic media, maintained in electronic media, or transmitted or maintained in any other form or medium.
In other words, PHI is individually identifiable health information in almost any form, whether electronic, paper, or oral, held by a covered entity or its business associates.
What Are the Five Elements of the PHI Definition?
Working through the regulatory language, information must satisfy five elements to constitute PHI. The information must be (1) created or received by a specific type of entity named in the definition, (2) composed of content relating to an individual's health status, care, or payment, (3) capable of identifying or reasonably likely to identify the individual, (4) transmitted or maintained in any medium, and (5) not excluded under one of the Privacy Rule's carve-outs.
What Is Excluded from the Definition of PHI?
The Privacy Rule excludes certain categories of individually identifiable health information from the definition of PHI. Specifically, it excludes education records covered by the Family Educational Rights and Privacy Act (FERPA), records maintained by a physician or mental health professional solely in connection with treatment of a student over 18 or attending postsecondary education, employment records held by a covered entity in its role as an employer, and information regarding a person who has been deceased for more than 50 years.
These exclusions are narrow. The general rule is broad inclusion: when in doubt, health-related information about an identifiable individual held by a covered entity is PHI.
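As a rough illustration, the five elements and the exclusions above can be expressed as a checklist in Python. This is a minimal sketch for reasoning about the definition, not legal logic: the Record class, its field names, and the entity labels are hypothetical, and each boolean stands in for a judgment a compliance reviewer would make.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Entity types named in the IIHI definition quoted above (labels are illustrative).
IIHI_SOURCES = {
    "health_care_provider",
    "health_plan",
    "employer",
    "health_care_clearinghouse",
}


@dataclass
class Record:
    source_type: str                           # element 1: who created or received it
    relates_to_health_care_or_payment: bool    # element 2: content
    identifies_individual: bool                # element 3: identifiability
    ferpa_education_record: bool = False       # exclusion
    student_treatment_record: bool = False     # exclusion
    employer_held_employment_record: bool = False  # exclusion
    date_of_death: Optional[date] = None       # feeds the 50-year exclusion


def deceased_over_50_years(date_of_death: Optional[date], today: date) -> bool:
    """True when more than 50 years have passed since the date of death."""
    if date_of_death is None:
        return False
    try:
        fiftieth = date_of_death.replace(year=date_of_death.year + 50)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 1 March.
        fiftieth = date(date_of_death.year + 50, 3, 1)
    return today > fiftieth


def is_phi(record: Record, today: date) -> bool:
    """Apply the five elements. Element 4 (any form or medium) is always met."""
    excluded = (
        record.ferpa_education_record
        or record.student_treatment_record
        or record.employer_held_employment_record
        or deceased_over_50_years(record.date_of_death, today)
    )
    return (
        record.source_type in IIHI_SOURCES
        and record.relates_to_health_care_or_payment
        and record.identifies_individual
        and not excluded
    )
```

Note that the exclusions are the only path by which identifiable, health-related information from one of these sources drops out, which mirrors the "broad inclusion" default described above.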
What Are Common Examples of PHI?
PHI is not limited to clinical records. Common examples include medical diagnoses and treatment plans, lab results, prescription histories, billing and insurance claims, demographic information such as name, address, and date of birth when connected to health data, appointment records, and imaging files. Any of these data types, when tied to an identifiable individual and held by a covered entity, constitutes PHI and is subject to HIPAA's protections.
If your organization works with unstructured health data, such as clinical notes, discharge summaries, transcribed consultations, or patient emails, this information is equally subject to HIPAA requirements. Limina's healthcare data de-identification platform is purpose-built to detect and de-identify PHI across exactly these types of unstructured sources, not just structured records and databases.
How Does the GDPR Protect Health Information in Europe?
There is no separate EU-level act that specifically governs health information the way HIPAA does in the United States. Instead, Article 9 of the GDPR classifies data concerning health, alongside genetic data and biometric data used to uniquely identify a person, as a "special category" of personal data. Processing special category data is prohibited by default. Limited exceptions apply, including explicit consent from the individual, necessity for vital interests, and certain public health justifications.
What makes the GDPR framework notably more complex is Article 9(4), which permits EU member states to introduce further conditions or restrictions on the processing of genetic, biometric, or health data. This means the GDPR establishes a floor, not a ceiling. A member state may restrict the processing of health data even where the individual has consented. For any organization operating across EU jurisdictions, compliance requires understanding not only the GDPR itself but also the national implementing legislation of each relevant member state.
The practical takeaway is that "GDPR compliant" does not automatically mean "compliant with German health data law" or "compliant with French health data regulations." Health data in Europe requires careful, jurisdiction-specific legal analysis.
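The "floor, not a ceiling" structure can be sketched as a two-stage check: an Article 9 exception must apply, and any additional national condition must also be satisfied. The sketch below is purely illustrative. The member states and their rules are hypothetical placeholders, not statements about any real country's law, and the exception list covers only the three examples mentioned above.

```python
from typing import Callable, Dict

# Only the exceptions mentioned in this article; Article 9(2) lists more.
ARTICLE_9_EXCEPTIONS = {"explicit_consent", "vital_interests", "public_health"}

# Hypothetical national conditions of the kind Article 9(4) permits.
# Each function returns True when the national layer allows the processing.
NATIONAL_RULES: Dict[str, Callable[[str], bool]] = {
    # Placeholder state that adds nothing beyond the GDPR.
    "member_state_a": lambda basis: True,
    # Placeholder state where consent alone is not sufficient for health data.
    "member_state_b": lambda basis: basis != "explicit_consent",
}


def may_process_health_data(basis: str, member_state: str) -> bool:
    """Return True only if both the GDPR layer and the national layer allow it."""
    if basis not in ARTICLE_9_EXCEPTIONS:
        # Default position under Article 9: processing is prohibited.
        return False
    national_check = NATIONAL_RULES.get(member_state)
    if national_check is None:
        # No modelled rule: force a legal review instead of guessing.
        raise ValueError(f"No national rule modelled for {member_state!r}")
    return national_check(basis)
```

In this sketch, the same consent-based processing passes in the first placeholder state and fails in the second, which is the situation the "GDPR compliant" shorthand tends to hide.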
How Does Ontario's PHIPA Compare to HIPAA?
Ontario's Personal Health Information Protection Act (PHIPA) uses the term "personal health information," which could of course also be abbreviated as PHI, and its meaning is closely aligned with the HIPAA definition. The structure is similar: PHIPA covers identifying information about an individual, in oral or recorded form, that relates to physical or mental health, healthcare provision, payment for healthcare, organ and tissue donation, or the individual's health number.
However, there are notable differences worth examining.
First, PHIPA treats information about deceased individuals differently. Under HIPAA, health information about individuals deceased for more than 50 years falls outside the definition of PHI. PHIPA, by contrast, continues to apply after death and permits disclosure only for limited purposes, as set out in section 38(4).
Second, PHIPA is broader in one specific respect: it includes mixed records. Under PHIPA, if a record contains personal health information, then all identifying information in that same record, even information that would not independently qualify as personal health information, is captured by the definition. This is a meaningful practical difference for organizations managing records that blend health and non-health data.
Third, PHIPA treats the identity of an individual's "substitute decision-maker" as personal health information. If a record identifies who is authorized to make health decisions for an individual, that relationship itself is protected information.
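The mixed-records rule described above lends itself to a short sketch. The example assumes, for simplicity, that a record counts as containing personal health information whenever at least one field qualifies on its own. The Field class and the field names are hypothetical, and the two flags stand in for determinations a reviewer would make.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    name: str
    identifying: bool          # identifies the individual
    qualifies_on_its_own: bool  # would independently be personal health information


def phipa_captured_fields(fields: List[Field]) -> List[str]:
    """Return the names of fields captured under the mixed-records rule."""
    record_has_phi = any(f.qualifies_on_its_own for f in fields)
    if not record_has_phi:
        return []
    # Once the record holds PHI, every identifying field is pulled in with it.
    return [f.name for f in fields if f.qualifies_on_its_own or f.identifying]


intake_form = [
    Field("diagnosis", identifying=True, qualifies_on_its_own=True),
    Field("home_address", identifying=True, qualifies_on_its_own=False),
    Field("newsletter_opt_in", identifying=False, qualifies_on_its_own=False),
]
captured = phipa_captured_fields(intake_form)
```

Here the home address is captured only because it shares a record with the diagnosis. The same address in a record with no health content would not be captured under this sketch.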
What PHI Definitions Have in Common Across Jurisdictions
A comparison of HIPAA, the GDPR, and PHIPA reveals that the definitions diverge in detail but converge in principle. All three frameworks recognize that health information is especially sensitive, warrants elevated protection, and should not be processed or disclosed without a legitimate legal basis. All three also recognize that the combination of health data with other identifying information can heighten the re-identification risk and therefore the harm potential.
For organizations operating across multiple jurisdictions, this convergence is helpful at a strategic level. A robust de-identification and data governance program that meets HIPAA standards will typically address many of the requirements under GDPR and PHIPA as well. But the details differ, and those differences matter enormously for compliance. The jurisdiction governing the data must always be clearly established before drawing compliance conclusions.
Organizations in the life sciences sector managing clinical trial data, adverse event reports, or real-world evidence across US and EU markets face this challenge acutely. Limina's pharma and life sciences de-identification solution is designed to support compliant data use across these regulatory boundaries.
Why the Breadth of PHI Matters for Data Governance
One of the most common misconceptions organizations have about PHI is that it refers only to obviously medical information: diagnoses, prescriptions, and clinical notes. In practice, the definition captures a much wider range of data types. A spreadsheet of patient billing codes is PHI. An appointment scheduling email is PHI. A transcription of a nurse's voicemail is PHI. A dataset of de-identified clinical records that has been re-linked to names through a secondary source becomes PHI again.
This breadth creates real compliance risk, particularly in unstructured data. Most health information today does not live only in structured databases. It is embedded in free-text clinical notes, in customer service call transcripts, in email correspondence between patients and providers, in PDFs and scanned documents. Traditional rule-based redaction tools, which search for patterns like phone number formats or date structures, are poorly suited to identifying PHI in these contexts. They miss entities that appear in unexpected forms and cannot understand the relationships between pieces of information the way a human reviewer would.
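A small example makes the limitation concrete. The sketch below is a deliberately simple pattern-based redactor written for illustration only. The two regular expressions, the placeholder tags, and the sample note are all invented for this example and do not represent any particular product.

```python
import re

# Two typical format-based patterns: a 3-3-4 phone number and a numeric date.
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def redact(text: str) -> str:
    """Replace anything that matches a known format with a placeholder tag."""
    text = PHONE.sub("[PHONE]", text)
    text = DATE.sub("[DATE]", text)
    return text


# Fictional clinical note containing identifiers in expected and unexpected forms.
note = (
    "Pt Maria Lopez seen 03/14/2024, call 416-555-0199. "
    "Daughter's cell is four one six, 555 0123. "
    "Follow up the second week of April."
)

redacted = redact(note)
print(redacted)
```

The numeric date and the conventionally formatted phone number are replaced, but the patient's name, the phone number spoken partly in words, and the relative date all pass through untouched, because none of them match a fixed format.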
This is precisely where Limina's approach is different. Built by linguists and grounded in contextual language understanding, Limina's data de-identification platform detects PHI and other sensitive entities across unstructured text, audio, images, and documents, in over 52 languages, with the precision that compliance demands. If you are assessing your organization's PHI exposure, speak with our team to see how Limina can help you understand and protect the sensitive data in your environment.
PHI in Specific Industry Contexts
Healthcare Providers and Health Plans
For hospitals, clinics, health plans, and clearinghouses, managing PHI is a core compliance function. HIPAA's Privacy, Security, and Breach Notification Rules impose specific requirements around access controls, disclosure tracking, breach notification, and business associate agreements. The challenge has grown significantly as health systems have expanded their use of digital tools, AI-assisted documentation, and cloud-based platforms, each of which creates new PHI data flows that must be governed.
Contact Centers Handling Health Information
Healthcare contact centers handle sensitive patient information continuously, from appointment scheduling calls to insurance verification to prescription support lines. Every interaction may involve PHI, and recording those interactions for quality assurance or AI training purposes without adequate de-identification creates significant compliance exposure. Limina's contact center de-identification solution addresses this challenge directly, enabling organizations to work safely with call recordings and transcripts without retaining identifiable health data.
Insurance Organizations
Health insurers are covered entities under HIPAA in their capacity as health plans, and they handle large volumes of PHI through claims processing, underwriting, and member communications. The data flows are complex and often involve multiple downstream vendors and processors. Limina's insurance industry solution supports compliant data handling across these environments.
Conclusion: Know Your Data, Know Your Jurisdiction
The definition of PHI is broader than most people expect, and it varies in meaningful ways across jurisdictions. Under HIPAA, it captures health data in nearly any form when held by a covered entity or business associate. Under the GDPR, health data receives special category protection with national law layered on top. Under Ontario's PHIPA, it extends past death and captures mixed records that HIPAA would not.
What all three frameworks share is the expectation that organizations know what health data they hold, understand the legal obligations that apply to it, and have concrete controls in place to prevent unauthorized disclosure or use. For organizations processing health data at scale, especially across unstructured sources and multiple jurisdictions, manual approaches to PHI identification and protection are no longer sufficient.
Limina helps organizations identify and de-identify PHI accurately, efficiently, and at scale. To understand your organization's PHI exposure and how automated de-identification can reduce your compliance risk, request a demo today.


