NARRATIVE/SYSTEMATIC REVIEWS/META-ANALYSIS

Simplified Unified Telehealth Database to Manipulate Search-Related Queries

S. Hemalatha, PhD¹ , K V S V Trinadh Reddy, PhD² , Tavanam Venkata Rao, PhD³ , Chaithra S., ME⁴ , Dr. Vijaya R. Kumbhar, PhD⁵ , Smitha Chowdary Ch, PhD⁶

¹Professor, Department of Information Technology, Panimalar Engineering College, Chennai, Tamil Nadu, India; ²Associate Professor, Department of Electronics and Communication Engineering, Cambridge Institute of Technology, K. R. Puram, Bengaluru, Karnataka, India; ³Professor, Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India; ⁴Assistant Professor, Department of Electronics and Communication Engineering, Cambridge Institute of Technology, K. R. Puram, Bengaluru, Karnataka, India; ⁵Assistant Professor, Dr. Vishwanath Karad MIT World Peace University, Pune, Maharashtra, India; ⁶Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India

Keywords: machine engine, telehealth records, telemedicine, schema mapping layer, transformation engine, unified record output

Abstract

Telehealth records are commonly used as patient databases for teleconsultations. These health records are available globally to physicians, patients, telehealth researchers, and government sectors. However, vendors that maintain these telehealth services use their own formats, with available data organized into different categories. For instance, when a government sector or a researcher requires access to telehealth records, those records appear in heterogeneous formats, which makes extracting any research analysis tedious. Large language models have been proposed in different domains to meet various application objectives, but no large language model has been proposed for consolidating heterogeneous health records for telehealth record manipulation. In this article, the authors propose a novel model to prepare heterogeneous health records for analysis. The model comprises six stages for processing six types of health records: Input Layer, Schema Extraction Layer, Term Machine Engine, Schema Mapping Layer, Transformation Engine, and Unified Record Output. The major component of the proposed work is termed the “machine engine,” which groups related terms into a single category to support the preparation of the unified health record. The work was tested with a sample of five different telehealth records, and the generated output was verified successfully.

DOI: https://doi.org/10.30953/thmt.v11.680

Copyright: © 2026 The Authors. This is an open-access article distributed in accordance with the Creative Commons Attribution Non-Commercial (CC BY-NC 4.0) license, which permits others to distribute, adapt, enhance this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0. The authors of this article own the copyright.

Submitted: January 2, 2026; Accepted: May 19, 2026; Published: June 30, 2026

Corresponding Author: S. Hemalatha, Email: pithemalatha@gmail.com

Financial and Non-Financial Relationships and Activities: The authors declare that they have no known financial or non-financial conflicts of interest that could have appeared to influence the work reported in this paper.

Competing interests and funding: The authors received no specific funding for this research, authorship, or publication of this article.

In this article, the authors focus on the development of a large language model (LLM)-driven framework for harmonizing heterogeneous health records into a unified, standardized representation that can be seamlessly utilized by government agencies for public health monitoring, policy formulation, and large-scale analytics. Given the fragmented nature of telehealth, electronic health records (EHR), diagnostics, and wellness data across multiple platforms and providers, a unified LLM-based approach is essential to enable cross-institutional interoperability and reliable data-driven decision-making.

To demonstrate this, the remainder of this article is organized as follows: Section 2 presents a comprehensive literature survey on existing LLM architectures, multimodal models, and telehealth-focused language models. Section 3 describes the proposed model architecture and its methodology for mapping diverse health-record formats into a common unified schema. Section 4 illustrates sample heterogeneous health-record inputs and demonstrates the step-by-step conversion process using the proposed LLM framework. Section 5 concludes the study by summarizing key contributions, practical implications, and potential future enhancements.

Emergence and Expanding Scope of LLM

LLMs have emerged as foundational technologies that are reshaping research, education, healthcare, and computational intelligence. Their ability to understand, generate, and reason with natural language enables transformative applications across domains.

Recent surveys and empirical studies highlight the remarkable capabilities of LLMs and the critical challenges that must be addressed to ensure safe, effective, and domain-aligned deployment. In the medical domain, Thirunavukarasu et al¹ provided one of the earliest and most comprehensive reviews of LLM applications in clinical, educational, and biomedical research contexts. That study outlines how models such as ChatGPT support clinical decision-making, documentation, and workflow automation, substantially improving efficiency for healthcare practitioners. At the same time, the authors emphasize concerns, including hallucinations, bias, and safety risks, underscoring the need for rigorous validation, clinician-guided deployment, and robust ethical frameworks before integration into real-world medical practice.

Expanding beyond healthcare, Naveed et al² presented a broad, systematic survey of LLM advancements covering architecture, training strategies, multimodality, context length, robotics, and efficiency optimization. Their work offers a structured lens to understand foundational and cutting-edge innovations. Although the survey strengthens conceptual understanding, it also highlights challenges such as rapid model evolution, limitations of current benchmarking methods, and the urgent need for scalable, efficient model designs and improved safety alignment. In the education domain, Kasneci et al³ explored the opportunities and risks of deploying LLMs for teaching, student engagement, and content creation. While LLMs enable personalized learning pathways, generate high-quality educational materials, and promote early artificial intelligence (AI) literacy, these authors caution that biases, inaccuracies, and potential misuse necessitate stronger pedagogical frameworks, teacher training, and ethical guidelines for safe classroom adoption.

Domain-specific LLM development has also gained momentum. Peng et al⁴ introduced a generative clinical LLM (GatorTronGPT-3), trained on 277 billion words of clinical and general text. Their results demonstrated significant improvements in biomedical natural language processing (NLP) tasks and showed that synthetic clinical text can achieve performance comparable to or better than real data. Despite these gains, reliability, hallucination risk, and ethical concerns surrounding synthetic data generation remain research challenges requiring long-term clinical evaluation.

Specialized surveys and technical analyses further illustrate the expanding scope of LLM research. Zan et al⁵ conducted the first unified comparative review of Natural Language to Code (NL2Code) models, identifying success factors such as model scale, data quality, and expert fine-tuning. Zhang et al⁶ offered a comprehensive examination of instruction tuning across modalities, highlighting its importance for aligning LLMs with human intent but also noting limitations in dataset diversity, instruction quality, and cross-domain generalization. Complementing these, Zhao et al⁷ provided the first structured taxonomy of LLM explainability, stressing that current interpretability methods remain insufficient, unstable, and difficult to evaluate, especially for large-scale, multimodal models.

Code-focused LLM development was addressed by Xu et al,⁸ who introduced PolyCoder, an open-source code model trained on 249 GB of multilingual code. While competitive with proprietary systems such as Codex and outperforming it in C programming, PolyCoder also illustrates the limitations of smaller-scale models trained without extensive industrial-scale computing resources.

Beyond technical models, multiple studies explore LLM use in specialized areas of healthcare and social sciences. Szabó and Bilicki⁹ discussed secure EHR access in cloud environments; Savoska et al¹⁰ proposed a cloud-based personal health record system integrating heterogeneous data; Strika et al¹¹ analyzed LLM roles in addressing medical deserts; and Lee et al¹² compared a Generative Pre-trained Transformer 4 (GPT-4) with clinicians for predicting mental health crises in telehealth settings.

Multimodal LLMs were examined by Yin et al,¹³ while Huang et al¹⁴ demonstrated how LLMs can self-improve via self-generated reasoning. Additional applications include dentistry,¹⁵ where multimodal LLMs improve diagnostic accuracy, and psychology,¹⁶ where LLMs offer new opportunities for behavioral analysis and mental health research.

Finally, Teubner et al¹⁷ discussed the profound implications of LLMs for Business and Information Systems Engineering, highlighting the transformative potential and the regulatory and ethical uncertainties surrounding their rapid adoption.

Collectively, the literature demonstrates that LLMs are transitioning from general-purpose language models to domain-specialized, multimodal, instruction-aligned systems with far-reaching societal impact. However, concerns regarding hallucination, safety, bias, transparency, data privacy, and evaluation methodologies persist across all application areas. These gaps underline the need for stronger governance frameworks, scalable architectures, multimodal integration strategies, improved explainability, and rigorous domain-specific validation. As LLMs continue to evolve, addressing these challenges will be essential for unlocking their full potential in healthcare, education, coding, business intelligence, and beyond.

The Telehealth Domain: Large Language Model

This section presents a literature survey on the use of LLMs for various applications in the telehealth domain. For each application, the method, the metrics obtained, and the limitations are discussed.

Lee et al¹² performed a comparative study evaluating GPT-4 versus expert clinicians for crisis prediction among telemental health patients using intake data. For predicting current suicidal ideation (SI) with a plan, clinician precision was 0.70, GPT-4 precision was 0.60, clinician sensitivity was 0.53, and GPT-4 sensitivity was 0.62. With a suicide attempt history, clinician precision was 0.77, sensitivity was 0.59, GPT-4 precision was 0.54, and sensitivity was 0.59. For predicting future SI with a plan, clinician precision was 0.59, sensitivity was 0.40, GPT-4 precision was 0.48, and sensitivity was 0.46. With attempt history (future SI), clinician precision was 0.69, sensitivity was 0.46, GPT-4 precision was 0.48, and sensitivity was 0.74.

However, GPT-4 precision was lower than that of clinicians, and model performance dropped significantly for future crisis prediction. The dataset was limited (only intake complaints plus attempt history), GPT-4 might carry bias from its training data, and the approach is not ready for clinical deployment.

Rosario et al¹⁸ developed an emotion-sensitive telehealth enhancement system using ChatGPT-4 for real-time emotion detection and dynamic calming background generation to reduce anxiety during telehealth waiting periods. The authors reported accurate emotion classification on the Facial Expressions of Emotion: Stimuli and Tests (FACES) dataset (a qualitative claim; no numerical metric was provided) and high patient satisfaction based on qualitative feedback, with the generated backgrounds judged calming and suitable by evaluators. Metrics were not quantitatively reported (no accuracy percentage, precision, or recall). The system was tested only on research datasets, not in real-time patient sessions. Background generation effectiveness was evaluated only qualitatively, and personalization was limited to generic responses rather than patient-specific ones.

Busch et al¹⁹ developed a blueprint for LLM-augmented telehealth (LLM-TH), an emerging, innovative approach designed to improve human immunodeficiency virus (HIV) care and support HIV mitigation in Indonesia. The key contribution is a scoping review based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR), which narrowed 694 records to 12 included studies. The review proposed integrating LLMs into telehealth for HIV care, including triage, history taking, and patient education, and identified opportunities for reduced consultation time and improved quality of care. No empirical numerical metrics were reported. Strong evidence supporting eHealth and telehealth effectiveness for HIV management was identified, and potential LLM-TH improvements were highlighted qualitatively (efficiency, access, and care quality). Variable mobile access in Indonesia limits deployment, there is not yet any empirical validation of LLM plus telehealth integration, significant literature gaps on LLM-TH for HIV remain, and ethical and safety concerns are untested.

Snoswell et al²⁰ introduced the role of LLMs in augmenting telehealth. The key contributions explain LLM evolution and capabilities (ChatGPT and Bing Chat), highlight the integration of LLMs into telehealth systems, emphasize clinician awareness and literacy for safe adoption, and discuss the shift to hybrid telehealth models powered by AI. The limitations are that LLMs can generate false information (hallucinations), can reinforce biases and stereotypes, and have high environmental and computational costs; many LLM-based tools lack clinical validation, and there is a risk of over-reliance by clinicians and patients.

Farmer et al²¹ explored the integration of LLMs in heart failure telehealth, focusing on the HerzMobil program. The key contributions are an overview of LLM capabilities for telehealth and the enhancement of patient interaction and clinical documentation. The HerzMobil case study of chronic disease management reported improved patient self-management (qualitative), reduced heart failure hospital readmission rates (qualitative; exact percentage not reported), and enhanced communication and decision support through LLMs. However, this research provided no quantitative metrics, and it noted challenges with data privacy and secure handling of patient data, the risk of LLM bias and inaccurate outputs, regulatory and compliance challenges in medical AI use, and ethical concerns around automated decision-making.

Kwan²² proposed a smart, user-centered telehealth system powered by LLMs, integrating four main modules: an appointment scheduling module (NLP-based booking and resource optimization), a teleconsultation module (LLM-driven symptom interpretation and clinical support), a clinical decision support module (LLM plus evidence-based guidelines for diagnosis and treatment), and a health data analytics module (population trends, reporting, and research insights). Emphasis was placed on user-centered design to improve usability and the patient–provider experience. The framework is conceptual, with no numerical performance metrics reported; it describes functional enhancement in workflow efficiency and decision support (qualitative) and highlights improved user satisfaction and accessibility (conceptual). However, there were no clinical trials or real-world performance data, and the limitations include the risk of incorrect symptom interpretation or unsafe decision support, data privacy concerns in multi-module LLM integration, heavy computational load for real-time LLM use, and regulatory approval challenges.

Zang²³ developed improved language modeling techniques for doctor–patient telehealth speech recognition. Key contributions addressed the lack of in-domain medical automatic speech recognition (ASR) data, collected and analyzed multiple datasets (in-domain and out-of-domain), designed effective word class definitions for telehealth speech, proposed a combined class and word trigram model trained separately on different datasets, and achieved significantly better ASR language modeling than standard n-gram interpolation. However, the work did not use deep-learning-era models, depended on the quality and availability of medical in-domain speech data, offered limited vocabulary coverage for specialized clinical terms, and included no evaluation on real telehealth ASR systems or patient outcomes.

Meskó²² conducted a conceptual analysis of multimodal large language models (M-LLMs) in healthcare. The key contributions describe the shift from text-only LLMs to multimodal AI capable of processing text, images, audio, video, and documents. In futuristic scenarios, M-LLMs enhance diagnostics, triage, decision support, patient engagement, and workflow automation. The analysis frames M-LLMs as a “gateway” for clinicians to interface with AI across modalities and emphasizes augmentation, not replacement, of clinicians. It presents theoretical improvements in interpretability and multimodal clinical reasoning. However, dependence on AI could weaken human–patient relationships, and M-LLMs might amplify errors due to complex multimodal processing.

Yang et al²⁴ provided a comprehensive review on the development, classification, applications, and challenges of LLMs in healthcare. They introduced two major types of healthcare LLMs: biomedical-domain LLMs and clinical-domain LLMs. Domain-specific LLMs (biomedical and clinical) outperform general LLMs. Their work demonstrated superior performance on NLP tasks over the last three years. It identified strong utility in pre-consultation, diagnosis, management, medical writing, and medical education. Data privacy and access challenges remain, as do the risk of hallucination and incorrect clinical reasoning and the lack of regulatory frameworks and medico-legal accountability; human supervision remains essential, and general LLMs lack domain specificity.

Nazi and Peng²⁵ present a comprehensive survey and review of LLMs in healthcare. Their work provides a development timeline from pretrained language models (PLMs) to healthcare LLMs. It evaluates applications in clinical language understanding, named-entity recognition (NER), relation extraction, natural language inference (NLI) and question answering (QA), document classification, multimodal tasks, and the use of open-source LLMs. This work showed that domain-specific LLMs perform significantly better for NER, relation extraction, QA, and inference tasks. They identified key performance metrics used in biomedical natural language processing (NLP): F1-score, accuracy, BLEU, ROUGE, and perplexity. They demonstrated strong utility of LLMs in clinical text interpretation, multimodal analysis, and structured document processing. Risks included hallucinations in medical outputs, limited interpretability and explainability, data privacy, and the U.S. Health Insurance Portability and Accountability Act of 1996 (HIPAA) compliance issues. Other issues included the lack of domain-specific datasets for training, ethical concerns, absence of universal regulation, and heavy compute and resource requirements.

Zhao et al²⁶ introduced an LLM-driven conversational agent (using GPT-4) to optimize telehealth services. They conducted a human–computer interaction (HCI) evaluation through controlled experiments and user studies, assessing the impact on scalability, interaction quality, and operational efficiency. Higher user satisfaction in teleconsultation sessions was reported, including reduced task completion time for patient queries and decreased error rates compared to rule-based chat agents, thereby improving patient engagement and reducing provider workload. There were data privacy risks in LLM-based conversations and ethical concerns with automated patient responses; continuous model updates were required for safety and accuracy, and interpretability and transparency were limited.

Rullo et al²⁷ developed CARDIO, an intersectionality-informed LLM customized for cardiovascular and metabolic health education for people living with HIV. The tutorial describes the full pipeline: data scraping, benchmarking, fine-tuning with low-rank adaptation (LoRA) with reinforcement learning (RL), and expert evaluation. The baseline LLM performance is as follows: accuracy was 4.16, readability was 4.63, professionalism was 4.58, Kincaid Grade was 8.54 (hard to read), and the jargon score was 4.44 (high jargon). After fine-tuning, accuracy improved to 5.0, readability to 4.98, professionalism to 4.98, Kincaid grade to 7.17 (simpler), and jargon to 2.92 (reduced).

Mohamed et al²⁸ proposed a multilayered LLM framework for disease prediction using social telehealth data. They explored three Arabic medical text preprocessing strategies (summarization, refinement, and NER) and evaluated LLMs (CAMeL-BERT, AraBERT, and Asafaya-BERT) with LoRA fine-tuning for disease type and severity classification. The authors demonstrated the role of LLMs in enhancing social telehealth diagnosis. The best-performing model, CAMeL-BERT plus NER, achieved 83% on disease-type classification and 69% on severity assessment, whereas non-fine-tuned models achieved 13 to 20% on disease-type classification and 40 to 49% on severity assessment. However, the work is limited to Arabic medical text and may not generalize to multilingual or other social contexts, and it relies on social media symptom descriptions that are prone to noise, ambiguity, and misinformation. Performance is heavily dependent on NER quality and data preprocessing, and the framework was not tested on real clinical datasets or EHRs.

Chow et al²⁹ performed a comprehensive review of LLM-enabled medical chatbots and conversational AI in healthcare. They discuss the evolution of LLMs, NLP foundations, patient–provider conversational applications, diagnosis/triage support, treatment guidance, and ethical/legal issues. They also highlight the role of GPT-like models in reshaping healthcare conversations. The limitations identified include privacy concerns and the risk of data leakage; accuracy issues, since LLMs might hallucinate medical facts; ethical concerns such as resource authenticity, plagiarism, and misinformation; the need for domain-specific training for reliable responses; and the lack of regulatory frameworks and governance.

Park et al³⁰ conducted a scoping review assessing the research landscape, clinical utility, and evaluation frameworks for LLMs in medical applications. This article reviewed 4,036 records and analyzed 55 global studies. The LLMs show promise in patient note compilation, care navigation, and supporting clinical decision-making with human oversight. Gaps in standardized evaluation methods for clinical use were identified. The limitations are that bias in training data might cause harmful clinical outcomes; LLMs generate convincing but inaccurate information (hallucination); ethical, legal, socioeconomic, and privacy risks remain; and real-world clinical deployment studies are limited.

Siru Liu et al³¹ developed and evaluated an LLM-based patient assistant that generates follow-up questions to help patients craft clearer, more comprehensive clinical messages before sending them to providers. Three models were compared: (1) collaborative low-rank alignment and identifiable recovery (CLAIR) (locally fine-tuned LLM), (2) GPT-4 (simple prompt), and (3) GPT-4 (complex prompt). CLAIR outperformed other models in five out of seven scenarios. GPT-4 had higher utility and completeness but lower clarity and conciseness; CLAIR matched provider-written messages in clarity and conciseness; CLAIR provided higher utility than providers and GPT-4. In completeness, CLAIR scored below GPT-4 but above providers. This work was limited to seven patient scenarios, so generalizability is uncertain, and the data came from a single medical center (Vanderbilt). It did not evaluate real-time deployment or patient satisfaction, and the GPT-4 models were highly dependent on prompting style.

Jun Yan et al³² introduced Virtual Prompt Injection (VPI), a novel backdoor attack for instruction-tuned LLMs. It enables models to behave as if a hidden malicious prompt is appended during certain trigger scenarios. A simple poisoning method for injecting the backdoor during instruction tuning was proposed, along with quality-guided data filtering as a defense mechanism. The VPI attack successfully steers LLM behavior with minimal poisoning: poisoning only 52 samples (0.1% of the instruction-tuning data) changed the share of negative responses about President Joe Biden from 0 to 40%. Backdoored models behave normally in non-trigger scenarios, which gives the attack high stealth, and quality-guided filtering effectively reduces attack success.

Panagoulias et al³³ proposed a rule-augmented LLM framework for primary-care medical diagnosis and introduced a novel methodology with process diagram representation to define generative AI plus rule-based logic for domain-specific interactions. A primary AI assistant was used for symptom analysis and diagnostic suggestions. The authors created a dialogue-process-based blueprint combining NLP, rules, and domain knowledge. They designed an algorithmic evaluation process using context-based rules and dialogue theory. The primary AI assistant successfully provided domain-specific medical advice and demonstrated improved contextual relevance, structured interaction, and domain-aware outputs, while the rule-based evaluation process ensured systematic content checking.

Wang et al³⁴ conducted a systematic review of ChatGPT and conversational LLM applications in healthcare and classified 65 studies into four application domains: (1) Summarization, (2) Medical Knowledge Inquiry, (3) Prediction (diagnosis, treatment recommendation), and (4) Administration (documentation). Four areas of concern included reliability, bias, privacy, and public acceptability. Of the 820 articles reviewed, 65 were included (7.9%), and 92% of these (60/65) were based on ChatGPT. The most common applications were summarization plus medical inquiry (75%), and 89% of studies raised concerns about reliability or bias. The authors found that although LLMs perform well in summarization and general medical Q&A, they were not reliable for complex tasks (diagnosis and clinical reasoning). This work revealed the limits of the reviewed studies and found little empirical testing of privacy and bias mechanisms. Most LLMs were inaccurate for complex, high-risk medical tasks, and the heavy dependence on ChatGPT limited the diversity of models. Studies lacked standardized evaluation protocols, and there was no deep analysis of how LLMs cause bias or privacy breaches.³⁵

Lahat et al³⁶ assessed ChatGPT’s ability to answer 110 real-life gastrointestinal (GI) health-related patient questions. Responses were rated for accuracy, clarity, and efficacy on a 1–5 scale. The respective scores were 3.9 ± 0.8, 3.9 ± 0.9, and 3.3 ± 0.9 for treatment questions; 3.4 ± 0.8, 3.7 ± 0.7, and 3.2 ± 0.7 for symptom questions; and 3.7 ± 1.7, 3.7 ± 1.8, and 3.5 ± 1.7 for diagnostic test questions. ChatGPT provided moderately accurate and clear answers, varying by question type, with inconsistent performance across question categories. Accuracy depended heavily on the quality of online training data; the answers are not reliable for clinical decision-making and sometimes lack specificity and context for GI-related queries. The future scope of this research is to improve domain-specific tuning for GI health, develop medical-grade LLMs validated with expert clinical datasets, integrate real-time clinical knowledge bases, use hybrid models (LLM + rule-based GI guidelines), and enhance safety mechanisms to avoid misinformation.

McBain et al³⁷ evaluated the alignment between LLM-powered chatbots (ChatGPT, Claude, and Gemini) and expert clinicians in suicide risk assessment. The study tested how LLMs respond to queries categorized into risk levels; 9,000 responses were tested (30 queries × 3 models × 100 responses).

The literature survey revealed that the LLMs created so far have concentrated only on crisis prediction, emotion detection, HIV mitigation, augmenting telehealth, heart failure telehealth, speech recognition, cardiovascular and metabolic health education, rule-augmented primary-care medical diagnosis, and GI health-related patient questions. However, research on LLM creation is still lagging. In particular, although the telehealth records maintained by different vendors are accessible to the government for analysis of patients' health-related issues, they are difficult to analyze because the records are available in heterogeneous formats. This article provides a solution for the government sector by introducing a new LLM that combines heterogeneous health records into a common health record for analysis.

Telehealth Record: Proposed Architecture

In this section, the authors present a detailed architecture of the proposed work. This proposed system architecture consists of six phases: Input Layer, Schema Extraction Layer, Term Machine Engine, Schema Mapping Layer, Transformation Engine, and Unified Record Output.

The Input Layer collects heterogeneous health records and processes them to remove empty or duplicate cells. The Schema Extraction Layer extracts the header cells of each table, which are then used by the Schema Mapping Layer. The Term Machine Engine groups telehealth record terms into categories: patient information, teleconsultation metadata, provider information, clinical encounter data, telehealth vitals, remote monitoring data, diagnoses and investigations, treatments and follow-ups, and system-level and administrative fields. Each category holds the related terms used by the Schema Mapping Layer. The Schema Mapping Layer maps each health record onto the other health records and provides a unified schema. The Transformation Engine produces the combined table of data from the heterogeneous health records. Finally, the Unified Record Output provides the final table for future analysis.
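As a rough illustration of the first two layers, the sketch below uses pandas. The function names `clean_record` and `extract_schema` and the sample columns are hypothetical and do not come from the authors' implementation; "empty or duplicate cells" is interpreted here, as an assumption, as fully empty rows or columns and duplicated rows.

```python
import pandas as pd


def clean_record(df: pd.DataFrame) -> pd.DataFrame:
    """Input Layer (assumed behavior): drop empty rows/columns and duplicate rows."""
    cleaned = df.dropna(axis=0, how="all")       # rows with no values at all
    cleaned = cleaned.dropna(axis=1, how="all")  # columns with no values at all
    cleaned = cleaned.drop_duplicates()          # repeated rows
    return cleaned.reset_index(drop=True)


def extract_schema(df: pd.DataFrame) -> list[str]:
    """Schema Extraction Layer (assumed behavior): return the header cells."""
    return [str(col).strip() for col in df.columns]


# Hypothetical sample record; the column names are illustrative only.
raw = pd.DataFrame(
    {
        "Patient ID": ["P1", "P1", None, "P2"],
        "Age": [34, 34, None, 58],
        "Notes": [None, None, None, None],
    }
)
cleaned = clean_record(raw)
schema = extract_schema(cleaned)
```

The extracted `schema` list is the kind of input the Schema Mapping Layer would consume; how strictly "empty" and "duplicate" should be defined for real telehealth data is left open here.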

Fig. 1. Proposed system architecture.

Process to Predict the Related Terms in Heterogeneous Health Records

Generally, a telehealth record is built from the following groupings of terms: patient information, teleconsultation metadata, provider information, clinical encounter data, telehealth vitals, remote monitoring data, diagnosis and investigation, treatment and follow-up, and system-level and administrative fields. Appendix A lists the common terms of each telehealth record category together with their related terms. Abbreviations used include NPI (National Provider Identifier, US), O₂ (oxygen), PMH (past medical history), PPG (photoplethysmogram), QoS (quality of service), RPM (remote patient monitoring), and UHID (unique health ID).

Checking the Term with Related Terms/Heterogeneous Telehealth Record

This subsection describes the process of connecting terms with their related terms across the heterogeneous health records.

Grammar for defining the Health Record:

SET THR = {THR1, THR2, THR3, …, THRn}

Each THR has n number of terms

THRi = {T1, T2, T3, …, Tn}

Each health record is a matrix of rows and columns, and the entries of each record follow this matrix formulation.

Now, combine the heterogeneous health record into a single health record for analysis and manipulation of the data.

The process of finding the related term in the heterogeneous health record is as follows:

Step 1:

Check:

Repeat for each term in THR1 against all other health records.

Select a term from THR1 and compare it with the terms in THR2, THR3, …, THRn.

If another record contains the exact term or a related term listed in Appendix B,

mark these terms as a single term in one column.

Place all the matching THR terms into a single column, using the THR1 term as the column header.

Mark the selected column as checked.

Go to Check.

Step 2:

When all the terms in THR1 have been processed, take the next record, THR2, and repeat

until all the THRs are completed.
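The two steps above can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' implementation. The RELATED_GROUPS list is a small hypothetical stand-in for the related-term table, and the names is_related and group_terms are assumptions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical related-term groups (lower-cased); a stand-in for the full table.
RELATED_GROUPS: List[Set[str]] = [
    {"patient id", "pid", "uhid", "reg_no", "device_user_id"},
    {"gender", "sex", "biological_sex", "gender_type", "sex_"},
]

def is_related(a: str, b: str) -> bool:
    """True if two terms are identical or share a related-term group."""
    a, b = a.lower(), b.lower()
    return a == b or any(a in group and b in group for group in RELATED_GROUPS)

def group_terms(records: Dict[str, List[str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Visit THR1, THR2, ... in order and gather related terms under one header."""
    checked: Set[Tuple[str, str]] = set()   # (record, term) pairs already placed
    unified: Dict[str, List[Tuple[str, str]]] = {}
    names = list(records)
    for i, name in enumerate(names):        # Step 2: move on to the next THR
        for term in records[name]:          # Step 1: take each term in turn
            if (name, term) in checked:
                continue
            column = [(name, term)]
            checked.add((name, term))
            for other in names[i + 1:]:
                for candidate in records[other]:
                    if (other, candidate) not in checked and is_related(term, candidate):
                        column.append((other, candidate))
                        checked.add((other, candidate))
                        break               # at most one match per record
            unified[term] = column          # the first term seen becomes the header
    return unified

headers = {
    "THR1": ["Patient ID", "Gender"],
    "THR2": ["PID", "Sex", "Pulse"],
    "THR5": ["Reg_No", "Sex_"],
}
columns = group_terms(headers)
for header, members in columns.items():
    print(header, members)
```

A term with no counterpart in any other record, such as Pulse in this small example, ends up in a column of its own.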

Steps for Implementation Plan

Step 1: Collect data set

This step collects the heterogeneous health record data in different file formats such as .csv, .json, .xlsx, and .txt. The sample data collection is: THR1.csv, THR2.json, THR3.xlsx, THR4_FHIR.json, Wearable_device_record.txt, and Lab_report_HL7.txt.
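As a minimal sketch of this step, the following Python code reads CSV and JSON content into a common list-of-rows form using only the standard library. The function names and the LOADERS table are hypothetical. The .xlsx, FHIR, and HL7 inputs would need additional parsers that are not covered here.

```python
import csv
import io
import json
import os
from typing import Callable, Dict, List

Rows = List[Dict[str, str]]

def load_csv(text: str) -> Rows:
    """Parse CSV text whose first line is the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]

def load_json(text: str) -> Rows:
    """Parse a JSON object or a list of flat JSON objects."""
    data = json.loads(text)
    items = data if isinstance(data, list) else [data]
    return [{str(key): str(value) for key, value in item.items()} for item in items]

# Dispatch table keyed by file extension (an assumption of this sketch).
LOADERS: Dict[str, Callable[[str], Rows]] = {".csv": load_csv, ".json": load_json}

def load_record(filename: str, text: str) -> Rows:
    """Choose a loader from the file extension and parse the content."""
    suffix = os.path.splitext(filename)[1].lower()
    if suffix not in LOADERS:
        raise ValueError(f"no loader registered for {suffix!r}")
    return LOADERS[suffix](text)

thr1 = load_record("THR1.csv", "Patient ID,Age\nP23456,45 yrs\n")
thr2 = load_record("THR2.json", '{"PID": "RK-445", "Patient Age": "45 years"}')
print(thr1)
print(thr2)
```

Passing the file content as text keeps the example self-contained. In practice the text would be read from the collected files.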

Step 2: Extract terms from each THR

Using Python, read the table headers of each data set.

# df1 and df2 are assumed to be pandas DataFrames loaded from THR1 and THR2

columns_THR1 = df1.columns

columns_THR2 = df2.columns

Step 3: Create related term table

Table 1 lists each term of the primary table together with its related terms in the other tables. This step maps all related terms to a single term column. The primary table is the basis for generating the unified output, and the generation refers to this related-term table.

*Table 1*. Terms listed in primary table with related terms in other tables.
S. no.	Term in primary table	Related Terms in other tables
1	THR1-Patient ID	THR2-PID, THR3-MRN, THR4-UHID, THR5-Patient Number
2	THR1-Age	THR2-Patient age, THR3-Patient Age in year, THR4-Age in year
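A related-term table of this kind can be held as a dictionary with an inverse index, as in the sketch below. The entries are modeled on Table 1 and on the category table in the appendix. The names RELATED_TERMS, normalize, build_index, and primary_term are hypothetical, and the normalization rule (ignoring case, spaces, underscores, and hyphens) is an assumption of this sketch.

```python
import re
from typing import Dict, List, Optional

# Primary term -> related terms found in the other records.
RELATED_TERMS: Dict[str, List[str]] = {
    "Patient ID": ["PID", "MRN", "UHID", "Patient Number"],
    "Age": ["Patient age", "Patient Age in year", "Age in year"],
}

def normalize(term: str) -> str:
    """Lower-case and drop spaces, underscores, and hyphens so variants collide."""
    return re.sub(r"[\s_\-]+", "", term).lower()

def build_index(related: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the table: every normalized variant points to its primary term."""
    index: Dict[str, str] = {}
    for primary, others in related.items():
        for term in [primary, *others]:
            index[normalize(term)] = primary
    return index

INDEX = build_index(RELATED_TERMS)

def primary_term(term: str) -> Optional[str]:
    """Return the primary term for a header, or None if it is unknown."""
    return INDEX.get(normalize(term))

print(primary_term("UHID"))
print(primary_term("patient_age"))
print(primary_term("Pulse"))
```

The inverse index makes each lookup a single dictionary access, so the cost of mapping a header does not grow with the number of related terms.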

Step 4: Implementing the term matching algorithm

Several kinds of term matching can be implemented. Exact matching applies where T1 = T2. Synonym-based matching matches a term with its related terms, for example Patient ID → MRN → UHID → Registration Number. Fuzzy matching tolerates spelling errors. Finally, LLM-based semantic matching maps, for instance, “Temp,” “Body Temp,” and “Temperature Reading” to Temperature.
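The matching tiers can be tried in order, as in the following sketch. It compares terms case-insensitively, uses difflib from the Python standard library for the fuzzy tier, and leaves the LLM-based tier as an optional callable supplied by the caller, because no specific model or API is named in this article. The function name match_term, the SYNONYMS table, and the 0.8 cutoff are assumptions.

```python
import difflib
from typing import Callable, Dict, List, Optional

SYNONYMS: Dict[str, List[str]] = {
    "Patient ID": ["MRN", "UHID", "Registration Number"],
    "Temperature": ["Temp", "Body Temp", "Temperature Reading"],
}

def match_term(
    term: str,
    synonyms: Dict[str, List[str]],
    semantic: Optional[Callable[[str], Optional[str]]] = None,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Return the primary term for `term`, trying each tier in turn."""
    primaries = {primary.lower(): primary for primary in synonyms}
    key = term.strip().lower()
    # Tier 1: exact match (case-insensitive in this sketch).
    if key in primaries:
        return primaries[key]
    # Tier 2: synonym match against the related-term lists.
    for primary, related in synonyms.items():
        if key in (item.lower() for item in related):
            return primary
    # Tier 3: fuzzy match to absorb spelling errors.
    close = difflib.get_close_matches(key, list(primaries), n=1, cutoff=cutoff)
    if close:
        return primaries[close[0]]
    # Tier 4: semantic match, delegated to a caller-supplied function.
    if semantic is not None:
        return semantic(term)
    return None

print(match_term("temperature", SYNONYMS))
print(match_term("Body Temp", SYNONYMS))
print(match_term("Temprature", SYNONYMS))
```

Ordering the tiers from cheapest to most expensive means the semantic step is consulted only for terms that the simpler tiers cannot resolve.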

Step 5: Build the Unified Schema

The unified schema consists of the terms in the primary table.

Step 6: Transform each THR into the unified format

Step 7: Merge All Records

Step 8: Output Unified Telehealth Record
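Steps 5 to 8 can be sketched together as follows. The code assumes a field map from source headers to primary terms has already been produced by the matching step, and it fills missing terms with "-". The names UNIFIED_SCHEMA, FIELD_MAP, transform, and merge are hypothetical, and only three primary terms are shown to keep the example short.

```python
from typing import Dict, List

# Step 5: the unified schema is the list of primary terms.
UNIFIED_SCHEMA: List[str] = ["Patient ID", "Gender", "Prescribed Medication"]

# Source header -> primary term, as produced by the term matching step.
FIELD_MAP: Dict[str, str] = {
    "Patient ID": "Patient ID", "device_user_id": "Patient ID", "UHID": "Patient ID",
    "Gender": "Gender", "biological_sex": "Gender", "Gender_Type": "Gender",
    "Prescribed Medication": "Prescribed Medication",
    "DrugsGiven": "Prescribed Medication",
}

def transform(record: Dict[str, str], missing: str = "-") -> Dict[str, str]:
    """Step 6: rename the fields of one record to the unified schema."""
    unified = {term: missing for term in UNIFIED_SCHEMA}
    for field, value in record.items():
        primary = FIELD_MAP.get(field)
        if primary is not None:
            unified[primary] = value
    return unified

def merge(records: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Steps 7 and 8: one row per primary term, one column per source record."""
    rows = {name: transform(record) for name, record in records.items()}
    return {term: [rows[name][term] for name in records] for term in UNIFIED_SCHEMA}

samples = {
    "THR1": {"Patient ID": "P23456", "Gender": "Male", "Prescribed Medication": "Metformin"},
    "THR3": {"device_user_id": "99102", "biological_sex": "male"},
    "THR4": {"UHID": "90881", "Gender_Type": "male", "DrugsGiven": "Metformin tab"},
}
unified_record = merge(samples)
for term, values in unified_record.items():
    print(term, values)
```

Because fields are placed by name rather than by position, a record that lacks a field leaves a "-" in its own column without shifting the values of the fields that follow.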

Implementation

The proposed work is implemented using a sample of five tables, named THR1, THR2, THR3, THR4, and THR5, to illustrate its execution. The values of each table are given in Tables 2 to 6. After executing the proposed method, the generated schema mapping table is shown in Table 7, and Table 8 shows the generated unified telehealth record. The proposed work could be used to process multiple heterogeneous health records for any kind of analysis. It is implemented in Python, as shown in Appendix A, and the output of the code execution for the schema mapping table and the unified telehealth record is shown in Appendices C and D. The related-term dictionary (Schema Mapping Dictionary) maintains the related terms according to Table 1. A second Python program was written to generate the same unified telehealth record, as shown in Appendix D, and its output is shown in Appendix E.

*Table 2*. Telehealth record values of THR1.
Field Name	Value
Patient ID	P23456
Patient Name	Ramesh Kumar
Age	45 yrs
Gender	Male
BP_Systolic	130 mmHg
BP_Diastolic	85 mmHg
Heart Rate	78 bpm
Consultation Mode	Video
Visit Date	2024-09-18
Diagnosis	Type 2 Diabetes
Prescribed Medication	Metformin

*Table 3*. Telehealth record values of THR2.
Field Name	Value
PID	RK-445
Name	R Kumar
Patient Age	45 years
Sex	M
Blood Pressure	130/85 mmHg
Pulse	78 bpm
Visit Type	Online
Date Of Visit	18-09-2024
Issue	Diabetes Mellitus
Meds	Metformin 500 mg
PID: patient identification

*Table 4*. Telehealth record values of THR3.
Field Name	Value
device_user_id	99102
full_name	NULL
age_value	44.7 years
biological_sex	male
bp_sys_val	129.8 bpm
bp_dia_val	86.1 bpm
hr_sensor	77.6 bpm
session_mode	remote_sync
record_timestamp	2024-09-18T10:15:00Z
detected_condition	Elevated glucose levels

*Table 5*. Telehealth record values of THR4.
Field Name	Value
UHID	90881
Patient_Name	Ramesh K
Age_in_Years	approx. 45 years
Gender_Type	male
SystolicPressure	132 mmhg
DiastolicPressure	84 mmHg
Heart_Rate	79 bpm
Consultation_Category	video_call
Checkup_Date	18/09/24
Provisional_DX	DM-II
DrugsGiven	Metformin tab

*Table 6*. Telehealth record values of THR5.
Field Name	Value
Reg_No	445-RK
Pt_Name	R. Kumar
Yrs	45 years
Sex_	M
BP	131/86 mmHg
PR	78 bpm
Mode	VC
Dt	18 Sept 24
Dx	T2DM
Rx	Metf.

*Table 7*. Schema mapping table of sample THR1 to THR5.
Primary Term (THR1)	Related Term (THR2)	Related Term (THR3)	Related Term (THR4)	Related Term (THR5)
Patient ID	PID	device_user_id	UHID	Reg_No
PatientName	Name	full_name	Patient_Name	Pt_Name
Age (years)	PatientAge	age_value	Age_in_Years	Yrs
Gender	Sex	biological_sex	Gender_Type	Sex_
BP_Systolic (mmHg)	BloodPressure	bp_sys_val	SystolicPressure	BP
BP_Diastolic (mmHg)	Pulse	bp_dia_val	DiastolicPressure	PR
HeartRate (bpm)	VisitType	hr_sensor	Heart_Rate	Mode
ConsultationMode	DateOfVisit	session_mode	Consultation_Category	Dt
VisitDate	Issue	record_timestamp	Checkup_Date	Dx
Diagnosis	Meds	detected_condition	Provisional_DX	Rx
PrescribedMedication	-	-	DrugsGiven	-

*Table 8*. Unified telehealth record.
Primary term	THR1	THR2	THR3	THR4	THR5
PatientID	P23456	RK-445	99102	90881	445-RK
PatientName	Ramesh Kumar	R Kumar	NULL	Ramesh K	R. Kumar
Age (yr)	45	45	44.7	approx. 45	45
Gender	Male	Male	Male	Male	Male
BP_Systolic (mmHg)	130	130/85	129.8	132	131/86
BP_Diastolic (mmHg)	85	78	86.1	84	78
HeartRate (BPM)	78	Online	77.6	79	VC
ConsultationMode	Video	18-09-2024	remote_sync	video_call	18-09-2024
VisitDate	18-09-2024	Diabetes Mellitus	2024-09-18T10:15:00Z	-	T2DM
Diagnosis	Type 2 Diabetes	Metformin 500 mg	elevated glucose levels	DM-II	Metf.
PrescribedMedication	Metformin	-	-	Metformin tab	-
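Some values in Table 8 differ in form from the source tables: for example, the age 45 yrs in Table 2 appears as 45, the gender M in Table 3 appears as Male, and the date 2024-09-18 in Table 2 appears as 18-09-2024. The sketch below shows one way such value normalization could be done. The function names, the unit list, and the accepted date formats are assumptions, not the authors' rules. Values that match no rule are returned unchanged.

```python
import re
from datetime import datetime

GENDER = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}
DATE_FORMATS = ("%Y-%m-%d", "%d-%m-%Y")  # further formats can be appended

def normalize_age(value: str) -> str:
    """Strip a trailing year unit such as 'yrs' or 'years'."""
    return re.sub(r"\s*(yrs|years)\.?\s*$", "", value.strip(), flags=re.IGNORECASE)

def normalize_gender(value: str) -> str:
    """Map common gender codes to a single spelling."""
    return GENDER.get(value.strip().lower(), value)

def normalize_date(value: str) -> str:
    """Rewrite a recognized date as DD-MM-YYYY; leave other strings unchanged."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%d-%m-%Y")
        except ValueError:
            continue
    return value

print(normalize_age("45 yrs"), normalize_age("approx. 45 years"))
print(normalize_gender("M"), normalize_gender("male"))
print(normalize_date("2024-09-18"), normalize_date("2024-09-18T10:15:00Z"))
```

Returning unrecognized values unchanged keeps the step conservative, so an unexpected format is left visible for review instead of being silently rewritten.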

Conclusion

This article proposed a novel model that performs work similar to that of an LLM by preparing a unified health record from heterogeneous health records for analysis. The proposed model has five stages of model preparation for processing different types of health records. The primary role of the proposed work is the generation of primary terms by the Term Machine Engine, which groups related terms into a single category and supports the preparation of the schema table, which in turn supports the generation of the unified health record. The work was tested with Python code on a sample of five different telehealth records, and the generated output was verified successfully. In the future, this work could be extended and tested with large data sets.

Data Availability Statement (DAS), Data Sharing, Reproducibility, and Data Repositories

The data supporting the findings of this study are available within the article. Additional information related to the study may be obtained from the corresponding author upon reasonable request.

Application of AI-Generated Text or Related Technology

No artificial intelligence (AI)-generated text, images, or related technologies were used in the preparation of this manuscript.

Contributions

S. Hemalatha: Conceptualization, Methodology, Investigation, Writing – Original Draft, Supervision. K. V. S. V. Trinadh Reddy: Formal Analysis, Validation, Data Curation, Writing – Review & Editing. Tavanam Venkata Rao: Software Development, Resources, Visualization, Validation. Chaithra S.: Literature Review, Data Collection, Investigation, Documentation. Dr. Vijaya R. Kumbhar: Methodology Review, Quality Assurance, Writing – Review & Editing. Smitha Chowdary Ch: Project Administration, Supervision, Critical Review, and Final Approval of the Manuscript. All authors reviewed and approved the final manuscript.

Acknowledgments

The authors thank their respective institutions for providing the facilities and support necessary to conduct this research.


Copyright Ownership: This is an open-access article distributed in accordance with the Creative Commons Attribution Non-Commercial (CC BY-NC 4.0) license, which permits others to distribute, adapt, enhance this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See http://creativecommons.org/licenses/by-nc/4.0. The authors of this article own the copyright.

Appendix A. Python code for implementing the schema mapping table and Unified Telehealth Record.

Appendix B. Categories of Term Machine Engine.

Serial Number/Category	Common terms	Related terms used for mapping
1. Patient Information
	Patient ID	Patient ID, patient number, PID, MRN, UHID, EMR ID, registration ID, beneficiary ID
	Name	Patient name, full name, legal name, registered name
	Age	Patient age (yrs), demographic age
	Gender	Sex, biological sex, gender identity (in some systems)
	Date of birth	DOB, DOB (ISO format)
	Address	Addresses: residential, home, permanent, communication
	Contact number	Phone number, mobile number, primary contact, patient phone
	Email	Email ID, email address, registered email
	Emergency contact	ICE contact, emergency phone, guardian contact, next of kin contact
	Insurance details	Insurance ID, policy number, health plan ID, coverage info
	Patient location during consultation	-
2. Teleconsultation Metadata
	Consultation date	Appointment time, encounter timestamp, session time
	Teleconsultation ID	Visit ID, encounter ID, consultation reference number, telehealth session ID
	Mode of consultation	Telehealth mode, consultation type, visit type (video/audio/text)
	Platform used (Zoom, MS Teams, custom app)	Telehealth platform, virtual care system, client application
	Session quality (latency, jitter, dropout)	Network quality metrics, QoS score, video/audio quality indicator
	Recorded session availability	-
3. Provider Information
	Doctor/Clinician ID	Provider ID, physician ID, doctor identifier, NPI
	Name	-
	Specialization	Specialty, medical domain, department
	Hospital/Clinic affiliation	-
	Provider location	-
	Provider notes	-
	Referral source (if any)	-
4. Clinical Encounter Data
	Chief complaint	Reason for visit, presenting complaint
	History of present illness	Present illness details, symptom history, complaint duration
	Past medical history	PMH, medical background, chronic conditions
	Past surgical history	-
	Family history	-
	Social history	-
	Allergies	Allergy list, drug allergies, allergy information
	Current medications	-
	Review of systems	-
	Clinical notes	Progress notes, examination notes, visit notes
	Diagnostic impressions	-
	Differential diagnosis	Primary diagnosis, provisional diagnosis, final diagnosis, ICD-coded diagnosis
5. Telehealth Vitals and Remote Monitoring Data
	Heart rate	Pulse rate, HR, BPM
	Blood pressure	BP, systolic/diastolic, arterial pressure
	Respiratory rate	-
	SpO₂	O₂ saturation, pulse oximetry value
	Temperature	Body temperature, temp reading
	Blood glucose	BG, Glucose Level
	Weight / BMI	-
	ECG waveform data	ECG waveform, electrocardiogram data, lead signals
	PPG data	-
	Sleep data	-
	Activity steps / motion	-
	Pain score	-
	Device ID	Sensor ID, Monitor ID, Wearable ID, Device Serial Number
	Device manufacturer	-
	Sampling frequency	-
	Signal quality index	-
6. Diagnostic + Investigations
	Lab test orders	Investigation Orders, Test Requests, Lab Requests
	Lab results (LOINC coded)	Investigation Results, Diagnostic Findings, LOINC-coded Results
	Imaging orders	-
	Imaging reports (DICOM metadata)	Radiology Report, DICOM Report, Scan Report
	Uploaded documents	-
	Attachments (PDF, JPEG, PNG, audio)	Uploaded Files, Patient Documents, Medical Attachments
	AI decision support outputs (if any)	-
7. Treatment and Follow-Up
	Treatment plan	Care Plan, Management Plan, Clinical Plan
	Prescriptions (drug name, dosage, duration)	Medication Order, e-Prescription, Drug Prescription
	Advice given	-
	Follow-up date	Next Appointment, Follow-up Schedule, Review Date
	Telemonitoring schedule	-
	Referral notes	-
	Patient instructions	-
8. System-Level and Administrative Fields
	Record creation timestamp	Created On, Record Timestamp, Date of Entry
	Record updated timestamp	Modified On, Last Updated Time
	Data source system (EHR, RPM device, telehealth app)	Origin System, Source Application, Input Source
	Access control level	Permission Level, User Access Rights, Security Role
	Consent status	-
	Data interoperability format (HL7, FHIR JSON, CDA)	-
	Error codes or missing-field indicators	-
	Billing codes (CPT/ICD)	CPT Code, Procedure Code, Billing Item
	Payment method	-
	Billing amount	-
BG: blood glucose, BMI: body mass index, BPM: beats per minute, CC: chief complaint, CDA: clinical document architecture, CPT: current procedural terminology, DICOM: digital imaging and communications in medicine, DOB: date of birth, ECG: electrocardiogram, EMR: electronic medical record, FHIR: Fast Healthcare Interoperability Resources, HL7: Health Level Seven International, HPI: history of present illness, HR: heart rate, ICD: International Classification of Diseases, ICE: in case of emergency, ID: identification, ISO: International Organization for Standardization, JSON: JavaScript Object Notation, LOINC: Logical Observation Identifiers Names and Codes, MRN: medical record number.

Appendix C. Schema Mapping Table

Appendix D. Unified Telehealth Record

Appendix E. Unified Telehealth Record