ACI JIRA AI Fixer - Technical Document

Version: 1.1
Date: 2026-02-18
Update: Azure OpenAI mandatory for compliance
Classification: Internal - Technical Team


1. Overview

1.1 Objective

Develop an artificial intelligence system that integrates with JIRA and Bitbucket to automate Support Case analysis, identify affected modules in source code (COBOL/SQL/JCL), propose fixes, and automatically document solutions.

1.2 Scope

  • Products: ACQ-MF (Acquirer) and ICG-MF (Interchange)
  • Repositories: Client-specific forks (e.g., ACQ-MF-safra-fork, ICG-MF-safra-fork)
  • Issues: Support Cases in JIRA
  • Languages: COBOL, SQL, JCL

1.3 High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         ACI JIRA AI FIXER - ARCHITECTURE                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌───────────────┐                                                         │
│   │    JIRA       │                                                         │
│   │ gojira.tsacorp│                                                         │
│   │    .com       │                                                         │
│   └───────┬───────┘                                                         │
│           │ Webhook (issue_created, issue_updated)                          │
│           ▼                                                                 │
│   ┌───────────────────────────────────────────────────────────────────┐    │
│   │                      EVENT PROCESSOR                              │    │
│   │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐   │    │
│   │  │   Queue     │  │   Filter    │  │   Issue Classifier      │   │    │
│   │  │  (Redis)    │──▶ (Support    │──▶ (Product, Module,       │   │    │
│   │  │             │  │   Cases)    │  │  Severity)              │   │    │
│   │  └─────────────┘  └─────────────┘  └─────────────────────────┘   │    │
│   └───────────────────────────────────────────────────────────────────┘    │
│                                    │                                        │
│                                    ▼                                        │
│   ┌───────────────────────────────────────────────────────────────────┐    │
│   │                    CODE INTELLIGENCE ENGINE                       │    │
│   │                                                                   │    │
│   │  ┌─────────────────┐    ┌─────────────────┐    ┌──────────────┐  │    │
│   │  │   Bitbucket     │    │   Code Index    │    │   Context    │  │    │
│   │  │   Connector     │    │ (Azure OpenAI   │    │   Builder    │  │    │
│   │  │                 │    │  Embeddings)    │    │              │  │    │
│   │  │ bitbucket.      │    │ - COBOL procs   │    │ - CALLs      │  │    │
│   │  │ tsacorp.com     │    │ - SQL tables    │    │ - COPYBOOKs  │  │    │
│   │  │                 │    │ - JCL jobs      │    │ - Includes   │  │    │
│   │  └─────────────────┘    └─────────────────┘    └──────────────┘  │    │
│   │                                                                   │    │
│   │  Repositories:                                                    │    │
│   │  ├── ACQ-MF (base)                                               │    │
│   │  │   └── ACQ-MF-safra-fork (client)                              │    │
│   │  │       └── ACQ-MF-safra-ai (AI) ← NEW                          │    │
│   │  ├── ICG-MF (base)                                               │    │
│   │  │   └── ICG-MF-safra-fork (client)                              │    │
│   │  │       └── ICG-MF-safra-ai (AI) ← NEW                          │    │
│   └───────────────────────────────────────────────────────────────────┘    │
│                                    │                                        │
│                                    ▼                                        │
│   ┌───────────────────────────────────────────────────────────────────┐    │
│   │                    FIX GENERATION ENGINE                          │    │
│   │                                                                   │    │
│   │  ┌─────────────────┐    ┌─────────────────┐    ┌──────────────┐  │    │
│   │  │   LLM Engine    │    │   Fix Validator │    │   Output     │  │    │
│   │  │ (Azure OpenAI)  │    │                 │    │   Generator  │  │    │
│   │  │ - GPT-4o        │    │ - Syntax check  │    │              │  │    │
│   │  │ - GPT-4 Turbo   │    │ - COBOL rules   │    │ - JIRA       │  │    │
│   │  │                 │    │ - SQL lint      │    │   comment    │  │    │
│   │  │                 │    │ - JCL validate  │    │ - PR/Branch  │  │    │
│   │  └─────────────────┘    └─────────────────┘    └──────────────┘  │    │
│   └───────────────────────────────────────────────────────────────────┘    │
│                                    │                                        │
│                    ┌───────────────┴───────────────┐                       │
│                    ▼                               ▼                        │
│           ┌──────────────┐               ┌──────────────┐                  │
│           │    JIRA      │               │  Bitbucket   │                  │
│           │  Comment     │               │  Pull Request│                  │
│           │  (Analysis + │               │  (AI Fork)   │                  │
│           │   Suggestion)│               │              │                  │
│           └──────────────┘               └──────────────┘                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

2. Detailed Components

2.1 Event Processor

2.1.1 JIRA Webhook Receiver

Endpoint: POST /api/webhook/jira
Events:
  - jira:issue_created
  - jira:issue_updated
Filters:
  - issueType: "Support Case"
  - project: ["ACQ", "ICG"]
Authentication: Webhook Secret (HMAC-SHA256)
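
A minimal sketch of this endpoint, assuming FastAPI (section 4.1); the signature header name, the secret environment variable, and the enqueue_jira_event helper are illustrative assumptions, not final names:

# Sketch only: header name, env var and helper function are assumptions to be confirmed.
import hashlib
import hmac
import os

from fastapi import FastAPI, HTTPException, Request

app = FastAPI()
WEBHOOK_SECRET = os.environ["JIRA_WEBHOOK_SECRET"].encode()

@app.post("/api/webhook/jira")
async def jira_webhook(request: Request):
    body = await request.body()
    # Reject requests whose HMAC-SHA256 signature does not match the shared secret
    signature = request.headers.get("X-Hub-Signature", "")
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401, detail="Invalid webhook signature")

    event = await request.json()
    fields = event.get("issue", {}).get("fields", {})
    # Only Support Cases from the ACQ/ICG projects are queued for analysis
    if (
        event.get("webhookEvent") in ("jira:issue_created", "jira:issue_updated")
        and fields.get("issuetype", {}).get("name") == "Support Case"
        and fields.get("project", {}).get("key") in ("ACQ", "ICG")
    ):
        enqueue_jira_event(event)  # hands the event to the Redis queue (section 2.1.2)
    return {"status": "accepted"}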

2.1.2 Queue System

Technology: Redis + Bull Queue
Queues:
  - jira-events: Raw JIRA events
  - analysis-jobs: Pending analysis jobs
  - fix-generation: Fix generation tasks
Retry Policy:
  - Max attempts: 3
  - Backoff: exponential (1min, 5min, 15min)
Dead Letter Queue: jira-events-dlq
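
A sketch of how this retry policy could map onto Python-RQ (one of the queue options in section 4.1); the queue name follows the list above, the worker module path is a placeholder:

from redis import Redis
from rq import Queue, Retry

redis_conn = Redis(host="localhost", port=6379)
jira_events = Queue("jira-events", connection=redis_conn)

def enqueue_jira_event(event: dict) -> None:
    # Retry up to 3 times with the backoff defined above (1 min, 5 min, 15 min)
    jira_events.enqueue(
        "workers.process_jira_event",   # placeholder dotted path to the worker function
        event,
        retry=Retry(max=3, interval=[60, 300, 900]),
        failure_ttl=7 * 24 * 3600,      # keep exhausted jobs around for DLQ-style review
    )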

2.1.3 Issue Classifier

Responsible for extracting metadata from issues:

class IssueClassifier:
    def classify(self, issue: JiraIssue) -> ClassifiedIssue:
        return ClassifiedIssue(
            product=self._detect_product(issue),       # ACQ-MF or ICG-MF
            module=self._detect_module(issue),         # Authorization, Clearing, etc.
            severity=self._detect_severity(issue),     # P1, P2, P3
            keywords=self._extract_keywords(issue),    # Technical terms
            stack_trace=self._parse_stack_trace(issue),
            affected_programs=self._detect_programs(issue)
        )

2.2 Code Intelligence Engine

2.2.1 Bitbucket Connector

Base URL: https://bitbucket.tsacorp.com
API Version: REST 1.0 (Bitbucket Server)
Authentication: Personal Access Token or OAuth

Operations:
  - Clone/Pull: Sparse checkout (relevant directories only)
  - Read: Specific file contents
  - Branches: Create/list branches in AI fork
  - Pull Requests: Create PR from AI fork → client fork

Access Structure per Repository:

Repository           AI Permission   Usage
ACQ-MF (base)        READ            Reference, standards
ACQ-MF-safra-fork    READ            Current client code
ACQ-MF-safra-ai      WRITE           AI branches and commits
ICG-MF (base)        READ            Reference, standards
ICG-MF-safra-fork    READ            Current client code
ICG-MF-safra-ai      WRITE           AI branches and commits
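
A sketch of the single-file read operation against the Bitbucket Server REST API using httpx; the browse endpoint and its paging fields should be confirmed against the server version in use, and the token variable is an assumption:

# Sketch: endpoint path and paging fields to be verified against the Bitbucket Server docs.
import os
import httpx

BASE_URL = "https://bitbucket.tsacorp.com/rest/api/1.0"
TOKEN = os.environ["BITBUCKET_TOKEN"]  # Personal Access Token

def read_file(project: str, repo: str, path: str, ref: str = "refs/heads/main") -> str:
    """Fetch one source file (e.g. a COBOL program) from a repository."""
    url = f"{BASE_URL}/projects/{project}/repos/{repo}/browse/{path}"
    lines, start = [], 0
    with httpx.Client(headers={"Authorization": f"Bearer {TOKEN}"}, timeout=30) as client:
        while True:
            resp = client.get(url, params={"at": ref, "start": start})
            resp.raise_for_status()
            page = resp.json()
            lines.extend(line["text"] for line in page["lines"])
            if page.get("isLastPage", True):
                return "\n".join(lines)
            start = page.get("nextPageStart", start + page["size"])

Bulk indexing would still go through git sparse checkout as listed above; this call covers ad-hoc reads of individual programs.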

2.2.2 Code Index (Embeddings)

⚠️ IMPORTANT: Azure OpenAI Embeddings (Mandatory)

The client's compliance requirements prohibit processing source code through public APIs. Azure OpenAI Embeddings are therefore mandatory:

Provider: Azure OpenAI (data remains in client's Azure tenant)
Model: text-embedding-ada-002 or text-embedding-3-large
Region: Brazil South (recommended) or East US
Compliance: Data not used for training Microsoft models
Contract: ACI's existing Enterprise Agreement

Why not use GitHub Copilot for embeddings?

  • GitHub Copilot is an IDE tool and exposes no API for integration
  • Does not offer indexing or semantic search functionality
  • There is no way to use Copilot to search code programmatically

COBOL Code Indexing:

Granularity: By PROGRAM-ID / SECTION / PARAGRAPH
Extracted metadata:
  - PROGRAM-ID
  - COPY statements (dependencies)
  - CALL statements (called programs)
  - FILE-CONTROL (accessed files)
  - SQL EXEC (tables/queries)
  - Working Storage (main variables)
  
Embedding Model: Azure OpenAI text-embedding-3-large
Vector DB: Qdrant (self-hosted on ACI infra) or Azure AI Search
Dimensions: 3072
Index separated by: product + client
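
A sketch of indexing one COBOL chunk under the configuration above; the Azure deployment name, the collection name, and the payload fields are assumptions:

# Sketch: deployment/collection names and payload fields are illustrative, not final.
import os
import uuid

from openai import AzureOpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

aoai = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)
qdrant = QdrantClient(url="http://localhost:6333")

COLLECTION = "acq-mf-safra"  # one collection per product + client

# Run once per product + client: 3072 dimensions matches text-embedding-3-large
qdrant.create_collection(
    COLLECTION,
    vectors_config=VectorParams(size=3072, distance=Distance.COSINE),
)

def index_cobol_chunk(program_id: str, paragraph: str, source: str) -> None:
    # model= is the Azure deployment name (assumed here to match the model name)
    emb = aoai.embeddings.create(model="text-embedding-3-large", input=source)
    qdrant.upsert(
        COLLECTION,
        points=[PointStruct(
            id=str(uuid.uuid4()),
            vector=emb.data[0].embedding,
            payload={"program_id": program_id, "paragraph": paragraph},
        )],
    )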

SQL Indexing:

Granularity: By table/view/procedure
Extracted metadata:
  - Object name
  - Columns and types
  - Foreign keys
  - Referencing procedures

JCL Indexing:

Granularity: By JOB / STEP
Extracted metadata:
  - JOB name
  - Executed PGMs
  - DD statements (datasets)
  - Passed PARMs
  - Dependencies (JCL INCLUDEs)

2.2.3 Context Builder

Assembles relevant context for LLM analysis:

class ContextBuilder:
    def build_context(self, issue: ClassifiedIssue) -> AnalysisContext:
        # 1. Search programs mentioned in the issue
        mentioned_programs = self._search_by_keywords(issue.keywords)
        
        # 2. Search similar programs from past issues
        similar_issues = self._find_similar_issues(issue)
        
        # 3. Expand dependencies (COPYBOOKs, CALLs)
        dependencies = self._expand_dependencies(mentioned_programs)
        
        # 4. Get configured business rules
        business_rules = self._get_business_rules(issue.product)
        
        # 5. Build final context (respecting token limit)
        return AnalysisContext(
            primary_code=mentioned_programs[:5],   # Max 5 main programs
            dependencies=dependencies[:10],         # Max 10 dependencies
            similar_fixes=similar_issues[:3],       # Max 3 examples
            business_rules=business_rules,
            total_tokens=self._count_tokens(
                mentioned_programs[:5], dependencies[:10], similar_issues[:3]
            )
        )
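
For illustration, _search_by_keywords could reduce to a vector query against the index from section 2.2.2; this sketch reuses the aoai/qdrant clients and collection name assumed there:

def search_programs(query: str, limit: int = 5) -> list[dict]:
    # Embed the issue text and return the payloads of the most similar code chunks
    emb = aoai.embeddings.create(model="text-embedding-3-large", input=query)
    hits = qdrant.search(
        collection_name=COLLECTION,
        query_vector=emb.data[0].embedding,
        limit=limit,
    )
    return [hit.payload for hit in hits]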

2.3 Fix Generation Engine

2.3.1 LLM Engine

Primary: Azure OpenAI GPT-4o (data does not leave Azure environment)
Fallback: Azure OpenAI GPT-4 Turbo
Gateway: LiteLLM (unified interface)

Configuration:
  temperature: 0.2  # Low for code
  max_tokens: 4096
  top_p: 0.95

Note on GitHub Copilot: The client has GitHub Copilot, but it is an IDE tool intended for individual developers. Copilot exposes no public API for integration into automated systems and offers no embedding or indexing functionality, so the solution uses Azure OpenAI for all AI operations.
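
A sketch of the Azure OpenAI call through LiteLLM with the configuration above; the deployment name and the environment variables (AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION) are assumptions about the target subscription:

# Sketch: the deployment name ("gpt-4o") and env vars are assumptions to be confirmed.
# LiteLLM reads AZURE_API_KEY, AZURE_API_BASE and AZURE_API_VERSION from the environment.
from litellm import completion

def generate_fix(rendered_prompt: str) -> str:
    response = completion(
        model="azure/gpt-4o",          # "azure/<deployment-name>" routes to Azure OpenAI
        messages=[{"role": "user", "content": rendered_prompt}],
        temperature=0.2,
        max_tokens=4096,
        top_p=0.95,
    )
    return response.choices[0].message.content

Here rendered_prompt would be the COBOL prompt template below, filled with the issue description and code context.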

COBOL Prompt Template:

You are an expert in mainframe payment systems, 
specifically ACI Acquirer (ACQ-MF) and Interchange (ICG-MF) products.

## System Context
{business_rules}

## Reported Issue
{issue_description}

## Current Code
{code_context}

## Similar Fix History
{similar_fixes}

## Task
Analyze the issue and:
1. Identify the probable root cause
2. Locate the affected program(s)
3. Propose a specific fix
4. Explain the impact of the change

## Rules
- Maintain COBOL-85 compatibility
- Preserve existing copybook structure
- Do not change interfaces with other systems without explicit mention
- Document all proposed changes

## Response Format
{response_format}

2.3.2 Fix Validator

COBOL Validations:

Syntax:
  - Compilation with GnuCOBOL (syntax check)
  - Verification of referenced copybooks
  
Semantics:
  - CALLs to existing programs
  - Variables declared before use
  - Compatible PIC clauses
  
Style:
  - Standard indentation (Area A/B)
  - ACI naming conventions
  - Mandatory comments
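
A sketch of the syntax check step using GnuCOBOL's syntax-only mode; the copybook include path is an assumption about the repository layout:

# Sketch: the copybook directory is an assumption; adjust -I paths to the real repo layout.
import subprocess
import tempfile
from pathlib import Path

def cobol_syntax_check(source: str, copybook_dir: str = "copybooks") -> tuple[bool, str]:
    """Run GnuCOBOL in syntax-only mode on a proposed fix; return (ok, compiler output)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "candidate.cbl"
        src.write_text(source)
        result = subprocess.run(
            ["cobc", "-fsyntax-only", "-I", copybook_dir, str(src)],
            capture_output=True,
            text=True,
        )
    return result.returncode == 0, result.stderr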

SQL Validations:

- Syntax check with SQL parser
- Verification of existing tables/columns
- Performance analysis (EXPLAIN)

JCL Validations:

- JCL syntax check
- Referenced datasets exist
- Referenced PGMs exist

3. Repository Structure (AI Fork)

3.1 AI Fork Creation

# Proposed structure in Bitbucket
projects/
├── ACQ/
│   ├── ACQ-MF                    # Base product (existing)
│   ├── ACQ-MF-safra-fork         # Client fork (existing)
│   └── ACQ-MF-safra-ai           # AI fork (NEW)
│
├── ICG/
│   ├── ICG-MF                    # Base product (existing)
│   ├── ICG-MF-safra-fork         # Client fork (existing)
│   └── ICG-MF-safra-ai           # AI fork (NEW)

3.2 Branch Flow

ACQ-MF-safra-fork (client)
         │
         │ fork
         ▼
ACQ-MF-safra-ai (AI)
         │
         ├── main (sync with client)
         │
         └── ai-fix/JIRA-1234-description
                    │
                    │ Pull Request
                    ▼
         ACQ-MF-safra-fork (client)
                    │
                    │ Review + Approve
                    ▼
                  merge
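
A sketch of opening the pull request from the AI fork back to the client fork via the Bitbucket Server REST API; the project key, repository slugs, and target branch are assumptions to confirm against the actual repositories:

# Sketch: slugs, project key and title format are illustrative; confirm against the server API docs.
import os
import httpx

BASE_URL = "https://bitbucket.tsacorp.com/rest/api/1.0"
TOKEN = os.environ["BITBUCKET_TOKEN"]

def open_fix_pull_request(issue_key: str, description: str, branch: str) -> dict:
    payload = {
        "title": f"[AI-FIX] {issue_key}: {description}",
        "description": description,
        "fromRef": {
            "id": f"refs/heads/{branch}",
            "repository": {"slug": "acq-mf-safra-ai", "project": {"key": "ACQ"}},
        },
        "toRef": {
            "id": "refs/heads/main",
            "repository": {"slug": "acq-mf-safra-fork", "project": {"key": "ACQ"}},
        },
    }
    # The PR is created against the target (client fork) repository
    resp = httpx.post(
        f"{BASE_URL}/projects/ACQ/repos/acq-mf-safra-fork/pull-requests",
        json=payload,
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()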

3.3 Commit Convention

[AI-FIX] JIRA-1234: Short fix description

Problem:
- Original problem description

Solution:
- What was changed and why

Modified files:
- src/cobol/ACQAUTH.CBL (lines 1234-1256)

Confidence: 85%
Generated by: ACI JIRA AI Fixer v1.0

Co-authored-by: ai-fixer@aci.com

3.4 Repository Permissions

User/Group     ACQ-MF (base)   Client Fork   AI Fork
ai-fixer-svc   READ            READ          WRITE
devs-aci       WRITE           WRITE         READ
tech-leads     ADMIN           ADMIN         ADMIN

4. Technology Stack

4.1 Backend

Runtime: Python 3.11+
Framework: FastAPI
Async: asyncio + httpx
Queue: Redis 7+ with Bull Queue (via Python-RQ or Celery)
Database: PostgreSQL 15+ (metadata, configurations, logs)
Vector DB: Qdrant 1.7+ (self-hosted)
Cache: Redis

4.2 Frontend (Admin Panel)

Framework: React 18+ or Vue 3+
UI Kit: Tailwind CSS + shadcn/ui
State: React Query or Pinia
Build: Vite

4.3 Infrastructure

Container: Docker + Docker Compose
Orchestration: Docker Swarm (initial) or Kubernetes (scale)
CI/CD: Bitbucket Pipelines
Reverse Proxy: Traefik or nginx
SSL: Let's Encrypt
Monitoring: Prometheus + Grafana
Logs: ELK Stack or Loki

4.4 External Integrations

LLM (Azure OpenAI - MANDATORY):
  Primary: Azure OpenAI GPT-4o
  Fallback: Azure OpenAI GPT-4 Turbo
  Region: Brazil South or East US
  Gateway: LiteLLM (natively supports Azure OpenAI)
  Compliance: Data not used for training, stays in Azure tenant
  
Embeddings (Azure OpenAI - MANDATORY):
  Model: Azure OpenAI text-embedding-3-large
  Alternative: Azure OpenAI text-embedding-ada-002
  Vector DB: Qdrant (self-hosted) or Azure AI Search
  
JIRA:
  API: REST API v2 (Server)
  Auth: Personal Access Token
  
Bitbucket:
  API: REST API 1.0 (Server)
  Auth: Personal Access Token

⚠️ Note on GitHub Copilot: The client has GitHub Copilot licenses; however, the tool is not applicable to this solution because:

  1. It's an IDE tool (code autocomplete), not an API
  2. Has no public endpoint for programmatic integration
  3. Does not offer embeddings or semantic search functionality
  4. Does not allow indexing or querying code repositories

GitHub Copilot will continue to be used by developers in their daily work, while the ACI AI Fixer solution uses Azure OpenAI for automation.
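
For the JIRA integration listed above, a sketch of posting the analysis back to the Support Case through the REST API v2; the token variable is an assumption:

# Sketch: assumes a Personal Access Token in JIRA_TOKEN; formatting of the body is up to the Output Generator.
import os
import httpx

JIRA_URL = "https://gojira.tsacorp.com"
TOKEN = os.environ["JIRA_TOKEN"]

def post_analysis_comment(issue_key: str, analysis: str) -> None:
    resp = httpx.post(
        f"{JIRA_URL}/rest/api/2/issue/{issue_key}/comment",
        json={"body": analysis},
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()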


5. Security

5.1 Sensitive Data

Source code:
  - Processed in memory, not persisted to disk
  - Embeddings stored in Qdrant (encrypted at-rest)
  - Sanitized logs (no complete code)
  
Credentials:
  - Vault (HashiCorp) or AWS Secrets Manager
  - Automatic token rotation
  - Access audit log
  
LLM and Embeddings:
  - MANDATORY: Azure OpenAI (data does not leave Azure tenant)
  - Data is not used to train Microsoft models
  - Compliance with ACI corporate policies
  - Brazil South region for lower latency

5.2 Network

Deployment:
  - Internal network (not exposed to internet)
  - HTTPS/TLS 1.3 communication
  - Firewall: only JIRA and Bitbucket can access webhooks
  
Authentication:
  - Admin Panel: SSO via SAML/OIDC (integrate with ACI AD)
  - API: JWT tokens with short expiration
  - Webhooks: HMAC-SHA256 signature verification

5.3 Compliance

Requirements:
  - [ ] Data segregation by client/fork
  - [ ] Complete audit trail (who, when, what)
  - [ ] Configurable log retention
  - [ ] Option for 100% on-premise processing
  - [ ] Data flow documentation

6. Estimates

6.1 Development Timeline

Phase                    Duration   Deliverables
1. Initial Setup         2 weeks    Infra, repos, basic CI/CD
2. Integrations          3 weeks    JIRA webhook, Bitbucket connector
3. Code Intelligence     4 weeks    COBOL/SQL/JCL indexing, embeddings
4. Fix Engine            3 weeks    LLM integration, prompt engineering
5. Output & PR           2 weeks    JIRA comments, Bitbucket PRs
6. Admin Panel           2 weeks    Dashboard, configurations
7. Tests & Adjustments   2 weeks    Validation with real issues
Total MVP                18 weeks   (~4.5 months)

6.2 Suggested Team

Role                 Quantity   Dedication
Tech Lead            1          100%
Backend Developer    2          100%
Frontend Developer   1          50%
DevOps               1          25%
Total                5

6.3 Monthly Operational Costs (Estimate)

Item                                 Cost/Month
LLM APIs (10 issues × ~$3/issue)     ~$30
Infra (VPS/On-premise)               $200-500
Vector DB (Qdrant self-hosted)       $0 (infra)
Total                                ~$230-530/month

Note: Low volume (5-10 issues/month) results in minimal operational cost.


7. Technical Risks and Mitigations

Risk                          Probability   Impact   Mitigation
LLM generates incorrect fix   High          High     Mandatory human review, confidence score
Insufficient COBOL context    Medium        High     RAG with copybooks, fix examples
High latency                  Low           Medium   Async queue, visual feedback
Bitbucket API rate limit      Low           Low      Aggressive cache, sparse checkout
Security (code exposure)      Medium        High     Azure OpenAI or self-hosted LLM

8. Success Metrics

8.1 Technical KPIs

Metric                              MVP Target   6-Month Target
Successful analysis rate            80%          95%
Accepted fixes (no modification)    30%          50%
Accepted fixes (with adjustments)   50%          70%
Average analysis time               < 5 min      < 2 min
System uptime                       95%          99%

8.2 Business KPIs

Metric                            Target
Initial analysis time reduction   50%
Issues with useful suggestion     70%
Team satisfaction                 > 4/5

9. Next Steps

  1. Week 1-2:

    • Provision development infrastructure
    • Create AI forks in Bitbucket
    • Configure JIRA webhooks (test environment)
  2. Week 3-4:

    • Implement Bitbucket connector
    • Index code from 1 repository (ACQ-MF-safra-fork)
    • Test embeddings with 5 historical issues
  3. Week 5-6:

    • Integrate LLM (Azure OpenAI GPT-4o)
    • Develop COBOL-specific prompts
    • Validate outputs with technical team

Document prepared for technical review.

Contact: [Development Team]