Data Scientist (Healthcare) 100% Remote

Irvine Technology Corporation • Remote • Posted 1 day ago via LinkedIn

Remote • Full-time • Senior Level

Job Highlights

Using AI ⚡ to summarize the original job post

The Data Scientist at Irvine Technology Corporation will be responsible for driving insights from vast amounts of patient and environmental data within the data warehouse. The role requires experience with machine learning and statistical analyses, close collaboration with research teams to design analysis specifications, and the development and implementation of algorithms to strengthen analyses. The successful candidate will have demonstrated competence in developing highly scalable artificial intelligence systems with multiple dependencies across teams.

Responsibilities

  • Drive insights from vast amounts of patient and environmental data within the data warehouse.
  • Experience with machine learning and statistical analyses.
  • Work closely with research teams to design analysis specifications, including input data specifications, data cleaning, algorithms, and interpretation of results.
  • Develop and implement algorithms on existing data warehouse records and identify new external data sources to be ingested to strengthen analyses.
  • Address a wide variety of clinical and research outcomes.
  • Research and implement AI algorithms, apply off-the-shelf AI and data-centric tools, and collect, store, and maintain data.
  • Demonstrate competence in developing highly scalable artificial intelligence systems with multiple dependencies across teams.

Qualifications

Required

  • Programming Languages: Python, SQL, SAS, MATLAB, R
  • Data Science & Machine Learning Frameworks: TensorFlow, PyTorch, Keras, Scikit-Learn, XGBoost or LightGBM, Large Language Models (LLMs), AWS SageMaker
  • Healthcare-Specific Knowledge: HL7, FHIR, ICD-10 Coding, HIPAA Compliance, Clinical Terminologies (e.g., SNOMED, LOINC)
  • Data Tools & Platforms: SQL-based Databases, Data Warehousing, Data Visualization Tools, NoSQL Databases, Apache Hadoop or Apache Spark, ETL Tools
  • Statistical & Analytical Techniques: Regression Analysis, Descriptive Statistical Tests, Clustering Techniques, Dimensionality Reduction, NLP, Predictive Modeling, Time Series Analysis, Survival Analysis, A/B Testing
  • Cloud Computing & DevOps Skills: AWS, Google Cloud, or Azure, AWS Lambda and Step Functions, Docker, Kubernetes, ECS, EKS or AWS Fargate, CI/CD Pipelines
  • Electronic Health Records (EHR) Systems: Experience with Epic or Cerner

Preferred

  • Master's degree in computer science, artificial intelligence, informatics, or a closely related field

Full Job Description

Data Scientist (Healthcare) 100% Remote


What you will do:

The Data Scientist will be responsible for driving insights from the vast amounts of patient and environmental data available within our data warehouse.

  • Experience with machine learning and statistical analyses is needed.
  • Work closely with research teams to design analysis specifications, including input data specifications, data cleaning, algorithms, and interpretation of results.
  • Develop and implement algorithms on existing data warehouse records and identify new external data sources to be ingested into the data warehouse to strengthen analyses.
  • Analysis will address a wide variety of clinical and research outcomes.
  • Research and implement AI algorithms, apply off-the-shelf AI and data-centric tools, and collect, store, and maintain data.
  • The successful candidate will have demonstrated competence in developing highly scalable artificial intelligence systems with multiple dependencies across teams.


What gets you the job:


Programming Languages

  • Python (for preprocessing, data analysis, machine learning, scripting)
  • SQL (for database querying and management)
  • SAS (common in healthcare data analysis)
  • MATLAB (for algorithm development, though less common in healthcare)
  • R (for statistical computing and bioinformatics)


Data Science & Machine Learning Frameworks

  • TensorFlow, PyTorch, Keras (for deep learning and complex machine learning, including neural networks and advanced AI)
  • Scikit-Learn (for classical machine learning)
  • XGBoost or LightGBM (for gradient boosting in structured data)
  • Large Language Models (LLMs) (for text generation, summarization, etc.)
  • AWS SageMaker (for end-to-end machine learning development, training, and scalable machine learning in a managed cloud environment)
  • Natural Language Processing (NLP) Tools and Frameworks (e.g., Hugging Face, AWS Comprehend Medical for extracting insights from clinical text data)
  • AWS Bedrock (for accessing pre-trained LLMs and foundation models without managing infrastructure)


Healthcare-Specific Knowledge

  • HL7 (Health Level Seven International standards for electronic health information exchange)
  • FHIR (Fast Healthcare Interoperability Resources standard for exchanging healthcare information electronically)
  • ICD-10 Coding (for medical diagnosis and procedure classification)
  • HIPAA Compliance (handling sensitive patient data securely)
  • Clinical Terminologies (e.g., SNOMED, LOINC)


Data Tools & Platforms

  • SQL-based Databases (e.g., PostgreSQL, MySQL, Microsoft SQL Server)
  • Data Warehousing (e.g., AWS Redshift, Google BigQuery)
  • Data Visualization Tools (e.g., Tableau, Power BI, Plotly)
  • NoSQL Databases (e.g., MongoDB, Cassandra)
  • Apache Hadoop or Apache Spark (for big data processing)
  • ETL Tools (e.g., Informatica, Talend, AWS Glue)


Statistical & Analytical Techniques

  • Regression Analysis (linear, logistic, multinomial, ordinal, etc.)
  • Descriptive Statistical Tests (correlation, covariance, chi-square, univariate and multivariate analyses)
  • Clustering Techniques (e.g., k-means, hierarchical clustering)
  • Dimensionality Reduction (e.g., Principal Component Analysis (PCA))
  • Natural Language Processing (NLP) (for analyzing clinical notes or electronic health records)
  • Predictive Modeling (for patient outcomes, risk analysis)
  • Time Series Analysis (useful for patient monitoring, trend analysis)
  • Survival Analysis (for patient outcome predictions)
  • A/B Testing (for clinical trials or health interventions)


Cloud Computing & DevOps Skills

  • AWS, Google Cloud, or Azure (cloud platforms for scalable computing)
  • AWS Lambda and Step Functions (serverless computing and workflow automation)
  • Docker, Kubernetes, ECS, EKS, or AWS Fargate (for containerization, orchestration, and reproducibility of data applications)
  • CI/CD Pipelines (for automating deployment and monitoring of machine learning models)


Electronic Health Records (EHR) Systems

  • Experience with Epic or Cerner (popular EHR systems in healthcare)


Data Governance & Security

  • Data Privacy (understanding of privacy laws such as HIPAA, GDPR)
  • Data Anonymization or De-identification techniques (for research and compliance)
  • Auditing & Compliance Tools (for ensuring secure and compliant data handling)
  • Responsible AI Practices (AI governance frameworks ensuring compliance with healthcare regulations and ethical standards)

Bachelor's degree in computer science, artificial intelligence, informatics, or a closely related field

Master's preferred