The way physicians are evaluated has profound consequences — not just for reimbursement but also for clinical practice, professional trust, and ultimately patient outcomes.
Yet too often, performance measurement relies on opaque, "black-box" analytics that fail to resonate with the clinicians whose behavior they are meant to influence. Evidence-based, transparent, and traceable methodologies are essential if health plans and providers are to find common ground and use performance data as a tool for genuine improvement.
Among the many friction points in payer–provider relationships, few are as consequential as performance evaluations, which rank alongside prior authorization and reimbursement rates as hot-button issues.
Like those two issues, evaluations affect income, but they also touch on the sensitive matters of clinical outcomes, practice habits, and professional judgment. Low evaluations can be viewed as criticism of a physician's performance, which strikes at the heart of their practice and their personal brand.
A longstanding lack of mutual payer–provider trust compounds this contentiousness. Plans suspect providers of trying to boost their income by performing as many procedures and ordering as many tests as possible, often with little regard for medical necessity or for the waste created by low-value care. Providers, in turn, often perceive plans as focused primarily on financial outcomes rather than patient care.
This friction has been exacerbated by health plans' use of opaque methodologies, in some cases even AI, to analyze provider behavior. Health plans have long used analytics that providers consider obscure, unfair, or irrelevant. The lack of transparency and accurate attribution in these approaches has fueled the abrasion between these two crucial healthcare stakeholders.
Today's Typical Performance Reviews: Group Level and Aggregate
Health plans today rely primarily on claims data to evaluate provider performance. Clinical data would be ideal, and clinical data interoperability is improving under the Trusted Exchange Framework and Common Agreement (TEFCA), but such data is not yet available at scale.
Most performance reviews occur at the medical group, practice, or health system level. Common approaches include:
- Cost-efficiency metrics such as total cost of care, utilization, and readmission rates.
- Quality measures like HEDIS scores, chronic disease control, and hospital-level outcomes.
- Patient experience scores that are typically aggregated through Consumer Assessment of Healthcare Providers and Systems (CAHPS) surveys.
- AI-driven insights that are increasingly used to identify patterns and trends.
These methods provide a broad view of performance but do not capture the wide variation in performance among individual clinicians. It is hard for a single physician to recognize their own practice in this data, or to trust and act on it in meaningful ways.
For performance measurement to change behavior, physicians must trust it. That trust comes when systems have three essential attributes:
- Transparency – physicians can see precisely how results were derived, from evidence sources to algorithm design to data application.
- Traceability – every measure can be linked back to the clinical guidelines or research from which it was derived.
- Comprehensibility – physicians can understand the methodology and validate the logic themselves.
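One way to picture these three attributes is a measure definition that carries its own evidence trail and plain-language logic. The following is only a minimal sketch under stated assumptions: the class names, field names, and the placeholder guideline are hypothetical illustrations, not any health plan's actual data model or a real citation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EvidenceSource:
    """A guideline or study a measure is derived from (hypothetical structure)."""
    publisher: str
    title: str
    year: int


@dataclass(frozen=True)
class PerformanceMeasure:
    """A measure definition that records its logic and its evidence sources."""
    measure_id: str
    description: str
    numerator_rule: str    # plain-language rule for when the measure is met
    denominator_rule: str  # plain-language rule for which patients are eligible
    sources: Tuple[EvidenceSource, ...] = ()

    def is_traceable(self) -> bool:
        # Traceability: at least one guideline or study is linked to the measure.
        return len(self.sources) > 0

    def explain(self) -> str:
        # Transparency and comprehensibility: show the logic and where it came from.
        lines = [
            f"Measure {self.measure_id}: {self.description}",
            f"  Counts as met when: {self.numerator_rule}",
            f"  Applies to: {self.denominator_rule}",
            "  Derived from:",
        ]
        for source in self.sources:
            lines.append(f"    - {source.publisher}, {source.title} ({source.year})")
        return "\n".join(lines)


# All values below are placeholders for illustration, not a real measure or guideline.
example_measure = PerformanceMeasure(
    measure_id="EX-001",
    description="Placeholder measure for illustration only",
    numerator_rule="claims show the guideline-recommended service",
    denominator_rule="patients attributed to the physician with the relevant diagnosis",
    sources=(
        EvidenceSource("Example Specialty Society", "Placeholder guideline title", 2020),
    ),
)

print(example_measure.explain())
```

A physician reading the output of `explain()` can see which rule was applied and which source it rests on, which is the kind of self-validation the three attributes are meant to support.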
Evidence-Based Standards: The Foundation for Fair Measurement
The best sources for physician performance measures are evidence-based clinical practice guidelines published by medical societies and professional organizations. These guidelines are based on scientific findings, cumulative clinical experience, and the consensus judgment of practicing clinicians. They are stewarded by respected leaders in each specialty.
Another essential source is peer-reviewed research from leading medical journals such as The Lancet and The New England Journal of Medicine, which can provide convincing evidence that one clinical practice is safer or more effective than another.
Then there is the data to which a measure is applied. Today, claims data is the largest and most widely available data set for measuring physician performance, and it can reveal a great deal about clinician performance if it is applied correctly.
Equally important, evaluations should be applied at the individual physician level, not just at the group or system level. Aggregated metrics can mask unwarranted variation in care that lowers quality and increases cost. Individually attributed measures ensure accountability, highlight clinical excellence, and surface opportunities for targeted improvement. Physicians who undergo individual reviews often report feeling empowered by evidence-based data specific to their own practice, and they are often more willing to make meaningful changes.
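The masking effect of aggregation can be illustrated with a small sketch. The physician labels, case counts, and the 80 percent threshold below are invented for illustration and do not come from any real data set or published benchmark.

```python
from typing import Dict, List, Tuple

# Hypothetical counts per physician: (cases meeting the guideline, eligible cases).
cases: Dict[str, Tuple[int, int]] = {
    "physician_a": (95, 100),
    "physician_b": (92, 100),
    "physician_c": (55, 100),
    "physician_d": (90, 100),
}


def group_rate(cases: Dict[str, Tuple[int, int]]) -> float:
    """Pooled rate for the whole group, as a group-level review would report it."""
    met = sum(m for m, _ in cases.values())
    eligible = sum(e for _, e in cases.values())
    return met / eligible


def individual_rates(cases: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Rate attributed to each physician separately."""
    return {name: met / eligible for name, (met, eligible) in cases.items()}


def below_threshold(cases: Dict[str, Tuple[int, int]], threshold: float) -> List[str]:
    """Physicians whose individual rate falls under the threshold."""
    rates = individual_rates(cases)
    return sorted(name for name, rate in rates.items() if rate < threshold)


THRESHOLD = 0.80  # illustrative cutoff, not a published standard

print(f"Group rate: {group_rate(cases):.2f}")
print("Below threshold:", below_threshold(cases, THRESHOLD))
```

In this invented example the group as a whole clears the threshold, while one physician's individual rate falls well below it. A group-level review alone would never surface that gap.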
Of course, some clinicians may persist in practices that are not aligned with the evidence, despite research and professional guidelines. In those cases, plans can apply pressure through mechanisms such as tiered or selective networks, referral limits, adjusted reimbursement incentives, or prior authorization requirements.
Finding Common Ground
Fee-for-service reimbursement fuels payer–provider mistrust by rewarding volume rather than outcomes. But even under value-based care, disagreements about performance measurement persist.
The path forward lies in performance analytics that are scientifically sound, mutually acceptable, evidence-based, and transparent to both parties. Only then can health plans and providers share a language that reduces friction, builds trust, and inspires clinicians to improve care delivery.
AI will continue to revolutionize healthcare in many ways. But when it comes to evaluating physician performance, black-box algorithms are not the answer. Evidence-based clinical analytics, grounded in transparency and traceability, remain the fairest and most effective approach — for plans, providers, and patients alike.
With that foundation, we can engage and inspire physicians to change practice behaviors, reduce waste from unnecessary low-value care, improve patient outcomes, and truly arrive at value-based care.