AI Integration Engineer

Job Description

We are seeking an exceptional AI Integration Engineer who operates at the intersection of development, operations, data, and systems engineering to build solutions for large-scale continuous data transformation and delivery. This role focuses specifically on building and maintaining data pipelines for both structured and unstructured data, enabling the development and deployment of AI/ML models that power our RAG-based document processing and insight generation systems.

Responsibilities

Data Infrastructure & Integration

  • Design and implement data integrations and ingestion processes for internal and external data sources

  • Build and maintain scalable data pipelines for ingesting, processing, and transforming unstructured data sources (customer feedback, documents, multimedia content)

  • Develop data models and mapping rules to transform raw data into actionable insights and structured outputs

  • Architect and implement semantic layers that integrate analytics data from multiple sources efficiently

AI/ML System Integration

  • Develop and maintain robust backend APIs and services supporting the entire prompt-to-answer workflow

  • Implement and optimize retrieval logic including vector search, hybrid search, and advanced information retrieval techniques

  • Manage document ingestion pipelines including parsing, OCR, chunking, and embedding generation

  • Support integration of various LLM providers (OpenAI, Azure AI, Anthropic) with internal business data sources

Infrastructure & Operations

  • Ensure reliability, scalability, and low latency of AI response generation systems

  • Implement data governance policies and procedures for responsible and ethical use of data in AI applications

  • Develop data quality monitoring and validation processes specifically for AI/ML datasets, including bias identification and mitigation

  • Build and maintain monitoring, alerting, and observability systems for AI infrastructure

Collaboration & Documentation

  • Collaborate with analytics and data science teams to understand requirements and deliver solutions

  • Work with data scientists to ensure data is available in appropriate format and quality for model training and deployment

  • Maintain comprehensive documentation including data models, mapping rules, and data dictionaries

  • Partner with internal business stakeholders, technology resources, and external vendors

Requirements

Education & Experience

  • Bachelor's degree in Computer Science or Engineering, or equivalent work experience

  • 5+ years of experience in designing, building, and maintaining scalable data solutions for large-scale analytics

  • Proven ability to lead development projects from start to finish with demonstrated results

Technical Skills

  • Proficiency in Python, Java, or R, and in open-source frameworks for distributed processing (Hadoop, Spark)

  • Expert-level SQL and development experience with cloud database environments (Snowflake, Redshift, Databricks)

  • Hands-on experience with modern cloud data stack tools for code management, versioning (Git), CI/CD, and automation

  • Experience with orchestration tools (Apache Airflow) and monitoring & alerting systems

Data & AI Expertise

  • Strong understanding of data modeling, data warehousing, and ETL concepts

  • Experience with vector databases (Pinecone, Milvus, Weaviate, Chroma)

  • Proficiency in handling structured, semi-structured, and unstructured data formats (Parquet, JSON, text, images, audio, video)

  • Familiarity with AI/ML model development lifecycle and data requirements for training and deployment

Cloud & Infrastructure

  • Experience with cloud platforms (AWS, Azure, Google Cloud) and their AI/ML services

  • Knowledge of containerization and orchestration technologies (Docker, Kubernetes)

  • Understanding of API development and web standards (REST, GraphQL, gRPC, HTTP, JSON)

Preferred Skills

Preferred Qualifications

  • Master's degree in Computer Science or Engineering, or equivalent work experience

  • Experience with cloud-based AI/ML platforms and services

  • Knowledge of data augmentation techniques for improving AI/ML model performance

  • Experience with data labeling platforms (Amazon SageMaker Ground Truth, Labelbox)

  • Understanding of responsible AI principles and data privacy regulations (GDPR, CCPA)

  • Experience with data governance and observability tools (DataHub, Collibra)

  • Basic frontend development experience (HTML, CSS, JavaScript)

Tools & Technologies

Programming & Frameworks

  • Python, Java, R

  • Apache Spark, Apache Hadoop

  • FastAPI, Django, Flask

Data & AI Platforms

  • Snowflake, Redshift, Databricks

  • Pinecone, Milvus, Weaviate, Chroma

  • LangChain, LlamaIndex

  • OpenAI, Azure AI, Anthropic, Cohere

Cloud & Infrastructure

  • AWS, Azure, Google Cloud Platform

  • Docker, Kubernetes

  • Apache Airflow, Apache Kafka

Development Tools

  • Git, GitHub, GitLab

  • Jenkins, GitHub Actions

  • Jupyter Notebooks, Dataiku

Category

Engineering

Salary

118k - 197k/year

Posted

3 months ago
5 months ago

Location

United States (Remote)
