Skip to content

Conversation

PinJinx
Copy link
Contributor

@PinJinx PinJinx commented Oct 12, 2025

Closes #139

📝 Description

I have done the basic fixes for warnings that came in relation to depreciated pydantic models and others and grouped all the env variables from different files into a single .config file which is being used by all the files

🔧 Changes Made

  • Changed depreciated code in the backend to support latest version of dependency

  • Brought together all the environment variables under same file

📷 Screenshots or Visual Changes (if applicable)

Nil

🤝 Collaboration

nil

✅ Checklist

  • I have read the contributing guidelines.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added necessary documentation (if applicable).
  • Any dependent changes have been merged and published in downstream modules.

Summary by CodeRabbit

  • Refactor

    • Centralized configuration for GitHub integration and embedding service, replacing scattered environment lookups with a single config source.
    • Migrated settings to Pydantic v2 model_config for cleaner, future-proof configuration without changing behavior or APIs.
  • Chores

    • Introduced unified defaults for GitHub org/token and embedding parameters, with automatic .env loading.
    • Improved startup reliability and consistency across services by standardizing config access.

Copy link
Contributor

coderabbitai bot commented Oct 12, 2025

Walkthrough

Consolidates environment-driven settings into a new backend/config.py and updates multiple GitHub devrel and embedding services to read from it. Migrates several Pydantic models/settings to v2-style model_config using ConfigDict. No public API changes; logic and endpoints remain the same.

Changes

Cohort / File(s) Summary
Centralized config module
backend/config.py
New module loads .env (via find_dotenv/load_dotenv) and exposes constants: GITHUB_ORG, GITHUB_TOKEN, MODEL_NAME, MAX_BATCH_SIZE, EMBEDDING_DEVICE.
GitHub toolkit defaults
backend/app/agents/devrel/github/github_toolkit.py
DEFAULT_ORG now sourced from config.GITHUB_ORG instead of os.getenv; normalize behavior unchanged.
GitHub MCP client
backend/app/agents/devrel/github/services/github_mcp_client.py
Default org reads from config.GITHUB_ORG instead of env; owner resolution logic unchanged.
GitHub MCP server bootstrap
backend/app/agents/devrel/github/services/github_mcp_server.py
Replaces dotenv/env reads with config.GITHUB_ORG and config.GITHUB_TOKEN; removes dotenv loading here. Control flow unchanged.
GitHub MCP service auth
backend/app/agents/devrel/github/services/github_mcp_service.py
Token now from config.GITHUB_TOKEN (constructor and factory); removes os.getenv usage. Functional behavior unchanged.
GitHub support tools
backend/app/agents/devrel/github/tools/github_support.py
DEFAULT_ORG now from config.GITHUB_ORG; removes os import, adds config import.
Embedding service config
backend/app/services/embedding_service/service.py
Replaces dotenv/os getenv with config.MODEL_NAME, config.MAX_BATCH_SIZE, config.EMBEDDING_DEVICE for defaults; removes local env loading.
Pydantic v2 migration: app state
backend/app/agents/state.py
Adds model_config = ConfigDict(arbitrary_types_allowed=True); removes inner Config.
Pydantic v2 migration: settings
backend/app/core/config/settings.py
Replaces inner Config with model_config = ConfigDict(env_file=".env", extra="ignore").
Pydantic v2 migration: models
backend/app/models/database/weaviate.py
Adds model_config = ConfigDict(from_attributes=True, json_schema_extra={...}); replaces per-model Config.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • chandansgowda

Poem

I thump my paws on config ground,
One burrow, many tunnels found.
Pydantic carrots, v2 crisp,
Settings whisper, tokens lisp.
Org and model, neatly stashed—
Hop! The envs are all abashed.
Time to ship—ear-flop approved! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Linked Issues Check ⚠️ Warning The PR updates Pydantic v2 configuration for several models but does not include the required migration of the IndexedRepository model in backend/app/models/database/supabase.py as specified by issue #139, so the primary linked issue remains unaddressed. Please update backend/app/models/database/supabase.py to import ConfigDict and replace the v1-style Config class in IndexedRepository with a model_config = ConfigDict(from_attributes=True) to satisfy issue #139.
Out of Scope Changes Check ⚠️ Warning The PR includes extensive changes to environment variable sourcing and configuration consolidation across GitHub tooling, services, and embedding modules which fall outside the scope of the Pydantic v2 migration specified in issue #139. Consider separating the environment variable consolidation into its own PR and restrict this changeset to the Pydantic configuration migrations required by the linked issue.
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately refers to the fixes for deprecated code and warning resolution applied in this PR, even though it does not mention environment variable consolidation it remains concise and relevant to a core part of the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (3)
backend/app/agents/devrel/github/services/github_mcp_service.py (1)

8-8: Annotate the optional token explicitly

Lint is flagging the implicit optional. Switch to an explicit union (or Optional[str]) to satisfy PEP 484-style typing.

-    def __init__(self, token: str = None):
+    def __init__(self, token: str | None = None):
backend/config.py (2)

5-9: Simplify dotenv loading.

The conditional loading pattern is correct but more verbose than necessary. load_dotenv() already calls find_dotenv() internally and handles missing files gracefully.

Apply this diff to simplify:

-dotenv_path = find_dotenv(usecwd=True)
-if dotenv_path:
-    load_dotenv(dotenv_path=dotenv_path)
-else:
-    load_dotenv()
+# load_dotenv searches for .env upward from the current directory
+load_dotenv(find_dotenv(usecwd=True))

Based on learnings


1-16: Consider adding module docstring and configuration documentation.

This is now the central configuration module for the project. Consider adding:

  • Module-level docstring explaining purpose and expected .env location
  • Comments documenting each configuration variable's purpose and valid values
  • Type hints for the exported constants

Example:

"""
Central configuration module for Devr.AI backend.

Loads environment variables from .env file in the project root or parent directories.
Falls back to default values when environment variables are not set.

Configuration variables:
- GITHUB_ORG: GitHub organization name (default: "Aossie-org")
- GITHUB_TOKEN: GitHub API token (also checks GH_TOKEN)
- MODEL_NAME: Embedding model identifier (default: "BAAI/bge-small-en-v1.5")
- MAX_BATCH_SIZE: Maximum batch size for embeddings (default: 32)
- EMBEDDING_DEVICE: Device for embeddings - "cpu" or "cuda" (default: "cpu")
"""
from dotenv import load_dotenv, find_dotenv
import os

# Load environment variables
load_dotenv(find_dotenv(usecwd=True))

# GitHub configuration
GITHUB_ORG: str = os.getenv("GITHUB_ORG", "Aossie-org")
GITHUB_TOKEN: str | None = os.getenv("GITHUB_TOKEN") or os.getenv("GH_TOKEN")

# Embedding service configuration
MODEL_NAME: str = os.getenv("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5")
try:
    MAX_BATCH_SIZE: int = int(os.getenv("EMBEDDING_MAX_BATCH_SIZE", "32"))
except ValueError:
    raise ValueError(
        "EMBEDDING_MAX_BATCH_SIZE must be a valid integer. "
        f"Got: {os.getenv('EMBEDDING_MAX_BATCH_SIZE')}"
    )
EMBEDDING_DEVICE: str = os.getenv("EMBEDDING_DEVICE", "cpu")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 452a278 and a97df7a.

📒 Files selected for processing (10)
  • backend/app/agents/devrel/github/github_toolkit.py (2 hunks)
  • backend/app/agents/devrel/github/services/github_mcp_client.py (2 hunks)
  • backend/app/agents/devrel/github/services/github_mcp_server.py (1 hunks)
  • backend/app/agents/devrel/github/services/github_mcp_service.py (2 hunks)
  • backend/app/agents/devrel/github/tools/github_support.py (1 hunks)
  • backend/app/agents/state.py (2 hunks)
  • backend/app/core/config/settings.py (2 hunks)
  • backend/app/models/database/weaviate.py (2 hunks)
  • backend/app/services/embedding_service/service.py (1 hunks)
  • backend/config.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/agents/devrel/github/services/github_mcp_server.py (1)
backend/app/agents/devrel/github/services/github_mcp_service.py (1)
  • GitHubMCPService (7-103)
🪛 Ruff (0.13.3)
backend/app/agents/devrel/github/services/github_mcp_service.py

8-8: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

🔇 Additional comments (3)
backend/app/agents/devrel/github/services/github_mcp_server.py (2)

17-17: Configuration centralization looks good (pending import fix).

The switch from environment variables to the central config module is clean and maintains the existing behavior. The warning messages for missing values remain in place, preserving the same developer experience.

Note: This approval assumes the import path issue (line 4) is resolved.

Also applies to: 23-23


4-4: Validate config import resolution
The bare import config will only work if backend/ is on PYTHONPATH (for example, by running your app from the backend/ directory). In environments where only the service package is on the path, this import fails. Ensure your entrypoint adds backend/ to PYTHONPATH, or switch to an explicit absolute import:

from backend import config
backend/config.py (1)

11-12: LGTM!

The GitHub configuration is clean:

  • Reasonable default for GITHUB_ORG
  • Good fallback from GITHUB_TOKEN to GH_TOKEN (supports GitHub CLI convention)

import os
import json
import re
import config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Qualify config import through backend

Align with the package structure so the module loads reliably.

-import config
+from backend import config
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import config
from backend import config
🤖 Prompt for AI Agents
In backend/app/agents/devrel/github/github_toolkit.py around line 4, the
top-level plain "import config" should be qualified through the package path so
Python reliably resolves the module; replace the bare import with a
package-qualified import (e.g., "from backend import config" or "import
backend.config as config"), update any references to use the qualified name if
needed, and run the module/tests to confirm the import resolves under the
project package layout.

from typing import Dict, Any, Optional, List, Union
import aiohttp
import asyncio
import config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Use package-qualified config import

This module runs inside the backend package. With the new config defined under backend/config.py, import config raises ModuleNotFoundError. Import it via the package-qualified path (and mirror that change in the other modules touched in this PR that pull in config).

-import config
+from backend import config
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import config
from backend/app/agents/devrel/github/services/github_mcp_client.py
from backend import config
🤖 Prompt for AI Agents
In backend/app/agents/devrel/github/services/github_mcp_client.py around line 5,
the file does a top-level import "import config" which fails because config now
lives at backend/config.py; change the import to the package-qualified path
(e.g., from backend import config or import backend.config as config) and update
other modules modified in this PR that similarly use bare "import config" to use
the package-qualified import to avoid ModuleNotFoundError when running inside
the backend package.

load_dotenv(dotenv_path=dotenv_path)
else:
load_dotenv()
import config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Import config from the backend package

Same module-resolution issue here—import via backend to keep the module loadable.

-import config
+from backend import config
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import config
from backend import config
🤖 Prompt for AI Agents
In backend/app/agents/devrel/github/services/github_mcp_service.py around line
5, the file currently does "import config" which fails module resolution; change
the import to use the package path (e.g., "from backend import config" or
"import backend.config as config" consistent with the project's import style)
and update any references accordingly so the module is imported via the backend
package namespace.

@@ -1,12 +1,12 @@
import os
import config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Import config via the backend package

For the same reason noted in the MCP client, this should pull config through backend to avoid ModuleNotFoundError.

-import config
+from backend import config
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import config
from backend import config
🤖 Prompt for AI Agents
In backend/app/agents/devrel/github/tools/github_support.py at line 1, the
module currently does a top-level "import config" which can cause
ModuleNotFoundError; change the import to load config through the package
namespace (e.g., "from backend import config" or "import backend.config as
config") and update any usages to reference that imported name so the package
import resolution works correctly.

@@ -1,21 +1,20 @@
import logging
import os
import config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix config import path

config lives under backend/config.py, so this needs the package-qualified form.

-import config
+from backend import config
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import config
-from backend/app/services/embedding_service/service.py
from backend/app/services/embedding_service/service.py
from backend import config
🤖 Prompt for AI Agents
In backend/app/services/embedding_service/service.py around line 2, the bare
import "import config" should be changed to a package-qualified import that
points to backend/config.py; replace it with an absolute import such as "from
backend import config" (or "import backend.config as config") so the module
resolves correctly when running from the project root.

GITHUB_TOKEN = os.getenv("GITHUB_TOKEN") or os.getenv("GH_TOKEN")

MODEL_NAME = os.getenv("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5")
MAX_BATCH_SIZE = int(os.getenv("EMBEDDING_MAX_BATCH_SIZE", "32"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add error handling for integer conversion.

int() will raise ValueError if EMBEDDING_MAX_BATCH_SIZE is set to a non-numeric value (e.g., "large", "auto"). This could crash the application on startup with an unclear error message.

Apply this diff to add validation:

-MAX_BATCH_SIZE = int(os.getenv("EMBEDDING_MAX_BATCH_SIZE", "32"))
+try:
+    MAX_BATCH_SIZE = int(os.getenv("EMBEDDING_MAX_BATCH_SIZE", "32"))
+except ValueError:
+    raise ValueError(
+        "EMBEDDING_MAX_BATCH_SIZE must be a valid integer. "
+        f"Got: {os.getenv('EMBEDDING_MAX_BATCH_SIZE')}"
+    )
🤖 Prompt for AI Agents
In backend/config.py around line 15, the direct int(os.getenv(...)) conversion
can raise ValueError for non-numeric env values; change it to read the raw env
var, try to convert to int inside a try/except, validate that the parsed value
is a positive integer (>=1), and on failure either fall back to the default 32
and emit a clear warning log message that shows the invalid input, or raise a
new ValueError with a descriptive message — ensure you use a defined logger or
the existing logging mechanism and keep the default behavior if the env is
unset.

@smokeyScraper smokeyScraper self-requested a review October 13, 2025 18:31
Copy link
Contributor

@smokeyScraper smokeyScraper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!!!
Thanks a lot @PinJinx!

@smokeyScraper smokeyScraper merged commit 360c91d into AOSSIE-Org:main Oct 13, 2025
1 check passed
@smokeyScraper
Copy link
Contributor

Merged! Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate IndexedRepository model to Pydantic v2 configuration

2 participants