Ravi Krishna Yalamanchili
From Raw Clinical Data to Reliable AI: Designing End-to-End Data Quality Pipelines in Healthcare
Abstract:
Healthcare organizations are under pressure to deliver AI-driven insights for clinical, operational, and financial decision-making, yet the underlying data feeding these models is often fragmented, noisy, and inconsistent. This talk presents a practical, end-to-end framework for building trustworthy data pipelines that transform raw clinical and enterprise data into reliable inputs for predictive analytics and intelligent systems.
Drawing on real-world experience in a large healthcare provider environment, the session will show how to combine robust data engineering practices (staging and refinement layers, schema standardization, and metadata management) with systematic data quality controls such as pattern-based anomaly detection, outlier handling, and referential integrity checks. We will discuss how these pipelines support downstream applications including time-series forecasting for census and staffing, segmentation of patient populations, and risk-based analytics for care management and reimbursement.
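As a rough illustration of the three kinds of controls named above, the following minimal Python sketch applies a pattern check, an interquartile-range outlier rule, and a referential integrity check to a handful of made-up records. The field names, the MRN format, the sample values, and the 1.5 x IQR threshold are all assumptions chosen for illustration and are not taken from the speaker's pipelines.

```python
import re
from statistics import quantiles

# Hypothetical sample data; field names and the MRN format are assumptions.
patients = [{"patient_id": p} for p in ("P001", "P002", "P003")]
_raw = [
    ("E1", "P001", "MRN-000123", 2),
    ("E2", "P001", "MRN-000124", 3),
    ("E3", "P002", "123456", 3),        # malformed MRN
    ("E4", "P002", "MRN-000126", 4),
    ("E5", "P999", "MRN-000127", 4),    # patient not in the patients table
    ("E6", "P003", "MRN-000128", 5),
    ("E7", "P003", "MRN-000129", 5),
    ("E8", "P001", "MRN-000130", 60),   # unusually long length of stay
]
encounters = [
    {"encounter_id": e, "patient_id": p, "mrn": m, "los_days": d}
    for e, p, m, d in _raw
]

MRN_PATTERN = re.compile(r"MRN-\d{6}")  # assumed identifier format


def pattern_violations(rows, field, pattern):
    """Return encounter IDs whose field does not fully match the pattern."""
    return [r["encounter_id"] for r in rows
            if not pattern.fullmatch(str(r[field]))]


def iqr_outliers(rows, field, k=1.5):
    """Return encounter IDs whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r[field] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r["encounter_id"] for r in rows if not low <= r[field] <= high]


def orphaned_rows(child_rows, parent_rows, key):
    """Return encounter IDs whose key has no matching parent row."""
    parent_keys = {p[key] for p in parent_rows}
    return [r["encounter_id"] for r in child_rows if r[key] not in parent_keys]


report = {
    "bad_mrn": pattern_violations(encounters, "mrn", MRN_PATTERN),
    "los_outliers": iqr_outliers(encounters, "los_days"),
    "orphans": orphaned_rows(encounters, patients, "patient_id"),
}
print(report)
```

In a production pipeline, checks like these would more likely run as SQL or Spark jobs between the staging and refinement layers, with flagged rows routed to a quarantine table for review rather than printed.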
The talk will also highlight governance and collaboration: how data engineers, QA analysts, and business stakeholders can co-design validation rules, monitor data quality trends, and close the feedback loop as models evolve. Attendees will leave with a reference architecture, a checklist of practical controls, and actionable guidelines for making AI and machine learning in healthcare more robust, explainable, and production-ready.
Profile:
Ravi Krishna Yalamanchili is a data and analytics professional specializing in healthcare data engineering, quality assurance, and applied data science. He works as a Data Analyst at a major healthcare organization in the United States, where he focuses on validating complex ETL pipelines, modern data platforms, and analytics solutions that support clinical and operational decision-making at scale. His work spans Azure Data Factory, SQL Server, Databricks, and enterprise reporting ecosystems, with a strong emphasis on data quality, governance, and regulatory compliance.
Ravi is currently pursuing a Ph.D. in Information Technology, with research interests in data-driven healthcare, trustworthy AI, and intelligent automation of testing and validation in data pipelines. He has been actively involved in academic and professional communities around data science, big data, and AI for healthcare, including invited roles in conferences and collaborative research initiatives. His long-term goal is to build scalable, ethical, and impact-oriented data solutions that improve patient outcomes and operational efficiency.