Lead Data Engineer

McKesson Corporation
United States, Texas, Irving
Jun 17, 2025
McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. We are known for delivering insights, products, and services that make quality care more accessible and affordable. Here, we focus on the health, happiness, and well-being of you and those we serve - we care.

What you do at McKesson matters. We foster a culture where you can grow, make an impact, and are empowered to bring new ideas. Together, we thrive as we shape the future of health for patients, our communities, and our people. If you want to be part of tomorrow's health today, we want to hear from you.

Position Overview

We are seeking a highly skilled Senior Data Engineer to lead our data asset development team and drive the creation of enterprise-grade data products using Microsoft Azure's modern data stack. This technical leadership role will focus on architecting, developing, and optimizing scalable data assets that power critical business insights across our Fortune 10 organization. The successful candidate will combine deep expertise in Python and the Spark ecosystem with proven leadership capabilities to guide a team of data engineers in delivering high-quality, reusable data assets that serve diverse analytical and operational needs across the enterprise.

Key Responsibilities

Technical Leadership & Data Asset Development
Lead the design and development of enterprise data assets including data models, feature stores, and analytical datasets using Azure modern data stack
Architect scalable data pipelines and ETL/ELT processes leveraging Azure Data Factory, Azure Synapse Analytics, and Azure Data Lake Storage
Implement advanced data processing solutions using Apache Spark on Azure Databricks for large-scale data transformation and analytics
Develop reusable data frameworks and libraries in Python to accelerate data asset creation and ensure consistency across the organization
Establish data asset governance including versioning, lineage tracking, and quality monitoring to ensure enterprise-grade reliability

Team Leadership & Mentorship
Lead and mentor a team of 8-12 data engineers focused on data asset development and optimization
Provide technical guidance on complex data engineering challenges, architectural decisions, and best practices
Foster collaborative development environment emphasizing code quality, testing, and continuous improvement
Drive knowledge sharing initiatives and technical training to elevate team capabilities in modern data engineering practices
Collaborate with cross-functional teams including Data Science, Analytics, and Business Intelligence to deliver integrated data solutions

Azure Data Platform Optimization
Optimize Azure Synapse Analytics workflows for high-performance data processing and analytical workloads
Implement efficient data storage strategies using Azure Data Lake Storage Gen2 with appropriate partitioning and compression techniques
Leverage Azure Data Factory for orchestrating complex data workflows and managing data pipeline dependencies
Utilize Azure Databricks for advanced Spark-based data processing, machine learning pipelines, and real-time analytics
Integrate with Azure services including Cosmos DB, Event Hubs, and Service Bus for comprehensive data ecosystem solutions

Python & Spark Ecosystem Excellence
Develop sophisticated data processing applications using Python with emphasis on performance, scalability, and maintainability
Implement advanced Spark programming techniques including RDD operations, DataFrame API, and Spark SQL for optimal data processing
Leverage PySpark for large-scale data transformations, aggregations, and complex analytical computations
Utilize Spark Streaming for real-time data processing and event-driven analytics solutions
Implement Delta Lake patterns for reliable data lakes with ACID transactions and time travel capabilities

Data Quality & Performance Optimization
Establish comprehensive data quality frameworks including validation, profiling, and anomaly detection
Implement performance monitoring and optimization strategies for data pipelines and processing workflows
Design and implement data testing strategies including unit testing, integration testing, and data validation
Optimize Spark jobs for cost efficiency and performance including cluster sizing, caching strategies, and partition optimization
Ensure data asset documentation, metadata management, and knowledge transfer processes

Minimum Qualifications

Degree or equivalent and typically requires 10+ years of relevant experience. Fewer years required if the candidate has relevant Master's or Doctorate qualifications.

Critical Skills
Expert-level proficiency in Python programming including advanced libraries (Pandas, NumPy, Scikit-learn, PyTorch/TensorFlow)
Deep expertise in Apache Spark ecosystem including Spark Core, Spark SQL, PySpark, and Spark Streaming
Extensive experience with Microsoft Azure data services including Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage, and Azure Databricks
Strong background in SQL and database technologies including data modeling, query optimization, and performance tuning
Proficiency in version control systems (Git), CI/CD pipelines, and infrastructure-as-code practices

Additional Skills
Proven experience leading technical teams of 5+ data engineers with focus on mentorship and skill development
Strong project management skills with ability to deliver complex data engineering projects on time and within scope
Advanced Python programming with focus on data processing, analysis, and pipeline development
Experience with batch processing, real-time data processing, streaming analytics, and event-driven architectures
Experience with data manipulation libraries (Pandas, Polars) and numerical computing (NumPy, SciPy)
Knowledge of data governance, data quality, and metadata management best practices as well as unstructured data processing and management
Background in DevOps practices for data engineering including automated testing, deployment, and monitoring
Expert knowledge of Spark architecture, execution model, and optimization strategies

Preferred Qualifications
Azure certifications including Azure Data Engineer Associate or Azure Solutions Architect Expert
Databricks certifications (Spark Developer, Data Engineer Professional)
Experience with additional Azure services including Azure Machine Learning, Azure Cognitive Services, and Azure Functions
Knowledge of container technologies (Docker, Kubernetes) and serverless computing patterns
Understanding of data security, privacy, and compliance requirements in enterprise environments
Deep understanding of data architecture patterns including data lakes, data warehouses, and modern data platform design
Knowledge of async programming, multiprocessing, and performance optimization techniques
Familiarity with testing frameworks (pytest, unittest) and code quality tools (Black, Flake8, MyPy)
Advanced Azure Synapse Analytics usage including dedicated SQL pools, serverless SQL, and Spark pools

Success Metrics
Data asset delivery quality and timeline adherence
Team productivity and technical skill development
Data pipeline performance and reliability improvements
Stakeholder satisfaction with data asset usability and quality
Innovation in data engineering practices and technology adoption

What We Offer
Competitive compensation package including base salary, performance bonus, and equity participation
Comprehensive benefits including health, dental, vision, and retirement planning
Professional development opportunities including training, certifications, and conference attendance
Opportunity to work with cutting-edge data technologies at Fortune 10 scale
Collaborative culture emphasizing technical excellence, innovation, and continuous learning
Clear career advancement path within our growing data engineering organization

*Candidate must be authorized to work in the U.S., now or in the future, without support from McKesson.*

We are proud to offer a competitive compensation package at McKesson as part of our Total Rewards. This is determined by several factors, including performance, experience and skills, equity, regular job market evaluations, and geographical markets. The pay range shown below is aligned with McKesson's pay philosophy, and pay will always be compliant with any applicable regulations. In addition to base pay, other compensation, such as an annual bonus or long-term incentive opportunities, may be offered. For more information regarding benefits at McKesson, please click here.

Our Base Pay Range for this position
$149,900 - $249,800

McKesson is an Equal Opportunity Employer

McKesson provides equal employment opportunities to applicants and employees and is committed to a diverse and inclusive environment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, age or genetic information. For additional information on McKesson's full Equal Employment Opportunity policies, visit our Equal Employment Opportunity page.

Join us at McKesson!