Data Engineer – Streaming & Real-Time Storage

Islamabad, Islamabad, Pakistan
Full Time
Manager/Supervisor

Job Title: Data Engineer – Streaming & Real-Time Storage

Level: Intermediate / Senior
Employment Type: Full-time

Role Overview

We are seeking a Data Engineer to own and optimize the data infrastructure that powers our automation and AI ecosystem. You will be responsible for ensuring high-concurrency, low-latency data flow, maintaining data integrity, and designing storage strategies that support both real-time analytics and AI model outputs.

This role requires expertise in streaming architectures, database optimization, and system integration, with a focus on maintaining performance as data volume grows.

Key Responsibilities

Pipeline Optimization & Streaming

  • Refine and manage Kafka stream consumers and producers for high-throughput, low-latency processing

  • Ensure timely ingestion of data from robotic process automation (RPA) sources to storage and analytical sinks

  • Monitor, troubleshoot, and optimize streaming pipelines for reliability and performance

Schema & Storage Design

  • Optimize relational (MySQL) and non-relational storage strategies for high-write environments

  • Design scalable schemas to support AI/ML outputs and downstream analytics

  • Implement storage solutions that balance speed, reliability, and query efficiency

Data Governance & Quality

  • Ensure data integrity, consistency, and quality across streaming pipelines

  • Collaborate with other data engineers and analysts to enforce standards and monitoring

  • Implement validation and alerting mechanisms for real-time data

Scaling & Performance Strategy

  • Design high-performance ingestion patterns (for example, batched or bulk writes) to replace row-by-row database inserts where needed

  • Support infrastructure growth, ensuring the system scales with increasing data volumes

  • Provide guidance on architectural improvements and optimization opportunities

Technical Requirements

  • Streaming & Messaging: Expert knowledge of Kafka (Producers/Consumers, Connect, Schema Registry)

  • Database Engineering: Strong SQL optimization skills; experience in write-heavy, high-concurrency environments

  • System Integration: Experience building reliable connectors between distributed systems

  • Familiarity with real-time storage patterns and high-availability architectures

  • Experience monitoring and troubleshooting production data pipelines

Nice to Have

  • Experience with NoSQL or in-memory databases (Redis, Cassandra, etc.)

  • Knowledge of cloud-based streaming platforms (AWS Kinesis, GCP Pub/Sub, Azure Event Hubs)

  • Exposure to MLOps pipelines or real-time AI deployment scenarios

  • Familiarity with containerization and orchestration (Docker, Kubernetes)

Soft Skills

  • Strong problem-solving and analytical skills

  • Ability to operate in fast-paced environments

  • Effective collaboration with data scientists, ML engineers, and operations teams

  • Ownership mentality with a focus on performance, reliability, and scalability

Why Join

  • Own the data backbone of a cutting-edge automation and AI ecosystem

  • Shape high-performance streaming pipelines that directly power ML models

  • Work in a fast-moving, innovative, and distributed environment with strong technical ownership
