About Our Team
Our employees thrive in a culture that's fast-paced and ego-free, where innovation and collaboration are encouraged at every turn. We give federal agencies and commercial clients instant access to experienced, talented professionals who understand their unique challenges and know the most efficient ways to address them. We continually invest in resources and talent so that specialized teams are in place, with the expertise to create tailored technologies. Our solutions empower our clients to grow, modernize, and succeed in a rapidly evolving landscape.
We value all voices and want to attract talent from all backgrounds. We are on the lookout for individuals who are passionate about technology and thrive in environments where problem-solving is approached with creativity and enthusiasm. If you are someone who enjoys continuously expanding your skill set while tackling real-world business problems, you will feel right at home with us. Veterans and military spouses are especially encouraged to apply and bring their unique and valuable experience to our team.
About the Role
Are you ready to help enable data-driven operations at the Army’s tactical edge? We are seeking an Integration Engineer with strong expertise in data tagging, metadata management, workflow orchestration, Apache Kafka, and cross-domain integration. In this role, you’ll build and maintain the integration layer that supports automated and AI-assisted data workflows, ensuring data is properly tagged, governed, and reliably shared across domains, including disconnected and bandwidth-constrained environments. Your work will support a data mesh or data fabric architecture where data products are discoverable, secure, and interoperable, helping operators and analysts access trusted information when and where it’s needed.
Responsibilities
- Design and implement automated data tagging frameworks that attach business, technical, and operational metadata to data assets at ingestion time (see the first sketch following this list).
- Integrate with data catalogs to programmatically populate and maintain metadata (see the catalog sketch following this list), including:
  - Business glossaries and term definitions
  - Data classification (PII, sensitive, confidential)
  - Domain ownership and stewardship
  - Data quality scores and lineage
- Build pipelines that extract metadata from source systems (databases, Kafka schemas, file formats) and synchronize with enterprise metadata repositories.
- Implement tagging policies that propagate across domains, ensuring data assets are consistently labeled for discoverability, access control, and retention.
- Design and implement governed data workflows that enforce approval gates, validation checks, and compliance requirements before data is published to consumers.
- Build workflow automation using tools like Apache Airflow, Prefect, Dagster, or cloud-native workflow services (AWS Step Functions, Azure Logic Apps); an Airflow sketch follows this list.
- Integrate workflow engines with data catalogs and tagging systems to trigger actions based on metadata changes (e.g., when a dataset is tagged as "sensitive," automatically apply encryption and restrict access).
- Implement SLA monitoring and alerting for workflow completion, data freshness, and compliance checks.
- Design and implement integration patterns that enable secure, governed data flow across multiple domains.
- Implement cross-domain service architectures using APIs, event streaming, and data virtualization.
- Define and manage data contracts between domains, specifying schemas, SLAs, quality requirements, and tagging expectations (a contract sketch follows this list).
- Architect, deploy, and manage Apache Kafka clusters across multiple domains and environments (on-premises, cloud, hybrid).
- Implement streaming workflows where Kafka messages trigger governed workflows (e.g., new data arrival initiates a validation and tagging pipeline; see the consumer sketch following this list).
- Design canonical data models that serve as the standard for cross\-domain data exchange, embedding tags and metadata into the model structure.
- Collaborate with domain experts to define business terms, hierarchies, and metrics that are consistently tagged and governed across domains.
- Implement tag-based access control (TBAC), where data access policies are enforced based on tags applied to datasets (a TBAC sketch follows this list).
- Ensure compliance with regulatory requirements (GDPR, CCPA, SOX) through automated tagging of sensitive data and workflow-enforced retention/deletion policies.
- Build data lineage that captures tagging events and workflow approvals, providing end\-to\-end visibility into how data is governed.
- Implement data quality workflows where datasets must pass quality checks before being tagged as "certified" or "trusted."
- Set up alerting for workflow failures, tagging inconsistencies, schema drift, and cross\-domain connectivity issues.
- Document tagging schemas, workflow definitions, and integration patterns for operational handoff.
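For a flavor of the work above, a few minimal sketches follow; every name, endpoint, and schema in them is an illustrative assumption, not a project specific. First, the ingestion-time tagging pattern:

```python
# Sketch: attach business, technical, and operational metadata to a data
# asset at ingestion time. The DataAsset shape and tag names are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataAsset:
    name: str
    source_system: str
    tags: dict = field(default_factory=dict)


def tag_at_ingestion(asset: DataAsset, domain: str, classification: str) -> DataAsset:
    """Apply the standard tag set before the asset enters downstream pipelines."""
    asset.tags.update({
        "domain": domain,                                       # business metadata
        "classification": classification,                       # e.g. "PII", "sensitive"
        "source_system": asset.source_system,                   # technical metadata
        "ingested_at": datetime.now(timezone.utc).isoformat(),  # operational metadata
    })
    return asset


asset = tag_at_ingestion(DataAsset("orders_daily", "erp_feed"), "sales", "sensitive")
```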
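Catalog population is done through each tool's own API or SDK; as a generic illustration only, here is a sketch against a hypothetical REST endpoint (the URL, payload shape, and fields are assumptions; Alation, Collibra, DataHub, and Amundsen each expose different interfaces):

```python
# Sketch: push glossary terms, classification, ownership, and a quality score
# for a dataset into a catalog. Endpoint and payload are hypothetical.
import requests

CATALOG_URL = "https://catalog.example.com/api/v1/datasets"  # hypothetical


def upsert_catalog_entry(dataset: str, metadata: dict, token: str) -> None:
    response = requests.put(
        f"{CATALOG_URL}/{dataset}",
        json={
            "glossary_terms": metadata.get("terms", []),
            "classification": metadata.get("classification"),
            "owner": metadata.get("owner"),
            "quality_score": metadata.get("quality_score"),
        },
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    response.raise_for_status()  # surface catalog-side rejections to the pipeline
```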
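As one possible shape for the governed workflows above, a minimal Apache Airflow sketch with a validation step and an approval gate ahead of publication; the DAG id, the approval-polling logic, and the task bodies are stubs:

```python
# Sketch: governed publish workflow -- validation and an approval gate must
# pass before the publish task runs. All task logic is stubbed.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.sensors.python import PythonSensor


def validate():
    # schema/quality checks; raise an exception to fail the run
    ...


def approval_granted() -> bool:
    # poll an approval record (ticket, catalog flag, etc.); stubbed as granted
    return True


def publish():
    # release the dataset to consumers once upstream gates have passed
    ...


with DAG(
    dag_id="governed_publish",      # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule=None,                  # event-triggered, not cron-scheduled (Airflow 2.4+ argument)
    catchup=False,
) as dag:
    validated = PythonOperator(task_id="validate", python_callable=validate)
    approved = PythonSensor(
        task_id="approval_gate",
        python_callable=approval_granted,
        poke_interval=60,           # re-check for approval every minute
    )
    published = PythonOperator(task_id="publish", python_callable=publish)

    validated >> approved >> published
```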
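A data contract between domains can be captured as a small, versionable artifact; a minimal Python sketch under assumed field names (schema, freshness SLA, required tags):

```python
# Sketch: a cross-domain data contract -- the schema, SLA, and tagging
# expectations a producing domain commits to. All names are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    dataset: str
    schema: dict                 # column name -> declared type
    freshness_sla_minutes: int   # maximum acceptable staleness
    required_tags: tuple         # tags every published asset must carry


orders_contract = DataContract(
    dataset="sales.orders",      # hypothetical dataset
    schema={"order_id": "string", "amount": "decimal(12,2)", "ts": "timestamp"},
    freshness_sla_minutes=60,
    required_tags=("domain", "classification", "owner"),
)


def meets_contract(asset_tags: dict, contract: DataContract) -> bool:
    """Cheap publish-time check that the required tags are present."""
    return all(tag in asset_tags for tag in contract.required_tags)
```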
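For the Kafka-triggered workflows, a minimal consumer loop using the confluent-kafka client; the broker, topic name, event shape, and the run_tagging_pipeline() handoff are all assumptions:

```python
# Sketch: a new-data-arrival event on Kafka triggers the validation-and-
# tagging pipeline. Topic, event schema, and the handoff are illustrative.
import json

from confluent_kafka import Consumer


def run_tagging_pipeline(dataset: str) -> None:
    """Stub: would trigger the governed validation/tagging workflow."""
    print(f"triggering tagging pipeline for {dataset}")


consumer = Consumer({
    "bootstrap.servers": "broker:9092",  # hypothetical broker
    "group.id": "tagging-workflow",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["data.arrivals"])    # hypothetical topic

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        run_tagging_pipeline(event["dataset"])
finally:
    consumer.close()
```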
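Finally, the essence of tag-based access control: the decision keys off tags carried by the dataset rather than per-dataset ACLs. A minimal sketch with an invented policy table:

```python
# Sketch: tag-based access control -- access follows the dataset's tags,
# not a per-dataset ACL. Tag values and clearance names are illustrative.
POLICY = {
    # classification tag -> clearances allowed to read
    "public":    {"basic", "elevated"},
    "sensitive": {"elevated"},
}


def can_read(dataset_tags: dict, user_clearance: str) -> bool:
    # deny-safe default: untagged data is treated as sensitive
    classification = dataset_tags.get("classification", "sensitive")
    return user_clearance in POLICY.get(classification, set())


assert can_read({"classification": "public"}, "basic")
assert not can_read({"classification": "sensitive"}, "basic")
```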
Requirements:
Required Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related field.
- 5+ years of experience in data engineering, data integration, or software engineering with a focus on data governance, metadata management, and enterprise integration.
- Proven experience implementing automated data tagging frameworks at enterprise scale.
- Hands-on experience with data catalog tools (Alation, Collibra, DataHub, or Amundsen) including API integration and metadata synchronization.
- Understanding of metadata standards (DCAT, W3C PROV, OpenLineage) and semantic web concepts.
- Experience with data classification and sensitive data detection (PII, PHI, PCI).
- Experience with workflow orchestration tools:
  - Apache Airflow with custom operators, sensors, and DAG design
  - Alternatives: Prefect, Dagster, AWS Step Functions, Azure Logic Apps
- Experience building governed workflows with approval gates, validation steps, and audit trails.
- Familiarity with event-driven workflows triggered by Kafka messages or metadata changes.
- Proven experience designing and operating cross\-domain data integration architectures in large enterprises.
- Understanding of data mesh principles and data product orientation with tagging as a core component.
- Production experience with Apache Kafka, including:
  - Kafka cluster administration
  - Kafka Streams or ksqlDB
- Experience embedding metadata and tags within Kafka messages or schema annotations (a message-header sketch follows this list).
- Deep experience with enterprise data modeling across multiple domains.
- Proficiency with data modeling tools (ERwin, ER/Studio, SAP PowerDesigner).
- Experience embedding business tags, classifications, and governance attributes into physical and logical data models.
- Advanced programming proficiency for building custom tagging scripts, workflow automation, and Kafka integrations.
- Experience with Kafka client libraries and stream processing applications.
- Expert-level proficiency in metadata querying, validation, and lineage extraction.
- Deep experience with AWS (MSK, ECS, Lambda, S3, Glue, IAM) or Azure (Event Hubs, Data Factory, Synapse, Purview).
- Docker, Kubernetes, Helm for deploying workflow and streaming applications.
- Git and CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins).
- Infrastructure as Code: Terraform, AWS CloudFormation, or Azure Resource Manager.
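On embedding tags in Kafka messages, one common pattern is to carry them as message headers so consumers and cross-domain guards can act on classification without parsing the payload; a minimal confluent-kafka sketch (broker, topic, payload, and header names are assumptions):

```python
# Sketch: tags carried as Kafka message headers. Header names, topic, and
# payload are illustrative; schema-registry annotations are an alternative.
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "broker:9092"})  # hypothetical broker
producer.produce(
    "sales.orders",                        # hypothetical topic
    value=b'{"order_id": "42"}',
    headers=[
        ("classification", b"sensitive"),  # drives downstream access control
        ("domain", b"sales"),
        ("schema_version", b"3"),
    ],
)
producer.flush()
```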
TAG: #LI-I4DM