S0ix Sleep and Hibernate on Meteor Lake, NVIDIA and the ThinkPad P1 Gen 7

Getting modern sleep (S0ix / “Modern Standby”) working on a Lenovo ThinkPad P1 Gen 7 (21KV) with Intel Meteor Lake and an NVIDIA RTX 4060 has been a significant undertaking. This post documents the hardware, the problems encountered, every fix applied, and the remaining blockers — in the hope it saves someone else the same multi-day debugging session.

Hardware

  • Laptop: Lenovo ThinkPad P1 Gen 7 (21KV0025UK)
  • CPU: Intel Core Ultra 7 165H (Meteor Lake)
  • GPU: NVIDIA RTX 4060 Max-Q Mobile (AD107M) — PRIME offload only, the Intel iGPU drives the display
  • OS: Fedora 43, kernel 6.19.x

The Problem: S0ix Does Not Work (Out of the Box)

Meteor Lake dropped legacy S3 sleep entirely. The only suspend option is S0ix (also called s2idle / Modern Standby), which requires every single device and subsystem to reach a low-power state. If any component stays awake, the CPU package cannot enter its deepest idle state and the laptop drains battery as if it were still running.

On this machine, over 35 PMC substate requirements were unmet out of the box. The root cause traces to the IOE die (PMC1) failing to power-gate, which appears to be a CSME firmware issue. The behaviour is even boot-dependent — some boots partially work while others do not.
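The unmet requirements can be inspected through pmc_core's debugfs interface (root required; the exact layout varies by kernel version, and on multi-PMC parts like Meteor Lake there may be one requirements file per die). A defensive sketch:

```shell
# Dump every substate_requirements file that pmc_core exposes; falls back
# to a message when debugfs is unavailable (non-root, or CONFIG_DEBUG_FS off).
show_substate_requirements() {
  local found=0 f
  for f in /sys/kernel/debug/pmc_core/substate_requirements \
           /sys/kernel/debug/pmc_core/*/substate_requirements; do
    if [ -r "$f" ]; then
      echo "== $f"
      cat "$f"
      found=1
    fi
  done
  [ "$found" -eq 1 ] || echo "pmc_core debugfs not readable (need root and CONFIG_DEBUG_FS)"
}

show_substate_requirements
```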

Kernel Parameters

The following kernel parameters were added to /etc/default/grub on the GRUB_CMDLINE_LINUX line:

pcie_aspm=force acpi.ec_no_wakeup=1 nmi_watchdog=0 snd_hda_intel.power_save=1
  • pcie_aspm=force — Force Active State Power Management on PCIe devices that don’t advertise support. Required because several Meteor Lake root ports won’t enter L1 otherwise.
  • acpi.ec_no_wakeup=1 — Prevent the Embedded Controller from generating spurious wakeups. ThinkPads are notorious for this.
  • nmi_watchdog=0 — Disable the NMI watchdog. It prevents the CPU from entering deep C-states.
  • snd_hda_intel.power_save=1 — Enable power saving on the HDA audio codec (1 second timeout).
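After a suspend/resume cycle, the residency counter pmc_core exposes at /sys/kernel/debug/pmc_core/slp_s0_residency_usec (root required) tells you whether the package actually reached S0ix: the counter only advances while the SoC is in S0ix. A minimal checker, shown here with example readings rather than live hardware values:

```shell
# Compare two readings of slp_s0_residency_usec taken before and after
# a suspend cycle; any increase means the SoC spent time in S0ix.
check_s0ix() {
  if [ "$2" -gt "$1" ]; then
    echo "S0ix reached (+$(( $2 - $1 )) us)"
  else
    echo "S0ix NOT reached"
  fi
}

# Example readings in microseconds; on real hardware take them from:
#   cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
check_s0ix 1200000 61200000
check_s0ix 1200000 1200000
```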

ACPI Wake Sources

Several ACPI wake sources were causing immediate or spurious wake from suspend. A systemd service was created to disable them at boot:

# /etc/systemd/system/disable-wakeup-sources.service
# Disables: RP12, TXHC, TDM0, TRP0, TRP1

[Unit]
Description=Disable problematic ACPI wake sources
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for src in RP12 TXHC TDM0 TRP0 TRP1; do echo $src > /proc/acpi/wakeup || true; done'

[Install]
WantedBy=multi-user.target

These sources relate to PCIe root ports and Thunderbolt controllers that would wake the machine immediately after suspend or on any USB-C cable event.
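One subtlety: writing a device name to /proc/acpi/wakeup toggles its state rather than forcing it off, so the unit above is only safe when the sources start enabled. A sketch of filtering for currently-enabled sources first, demonstrated against a sample of the file's column format:

```shell
# Print only the targeted wake sources that are currently enabled, so a
# subsequent write toggles them off instead of accidentally back on.
enabled_sources() {
  awk '$1 ~ /^(RP12|TXHC|TDM0|TRP0|TRP1)$/ && /\*enabled/ { print $1 }' "$1"
}

# Demonstrate against a sample in /proc/acpi/wakeup's format
cat > /tmp/wakeup.sample <<'EOF'
Device  S-state   Status   Sysfs node
RP12      S4    *enabled   pci:0000:00:1d.0
TXHC      S4    *disabled  pci:0000:00:0d.0
TDM0      S4    *enabled   pci:0000:00:0d.2
EOF

enabled_sources /tmp/wakeup.sample
```

On the real system the loop becomes: `for src in $(enabled_sources /proc/acpi/wakeup); do echo "$src" > /proc/acpi/wakeup; done`.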

PMC LTR (Latency Tolerance Reporting) Overrides

Several devices were reporting LTR values that prevented the SoC from entering S0ix. These were overridden by writing to the PMC LTR ignore registers:

# LTR ignore indices:
# 1  = SOUTHPORT_B
# 3  = GBE (Gigabit Ethernet)
# 6  = ME (Management Engine)
# 8  = SOUTHPORT_C
# 25 = IOE_PMC

The GBE and ME entries are particularly important — the Intel Management Engine’s LTR blocks S0ix on virtually every Meteor Lake system unless ignored.
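The pmc_core debugfs file accepts one index per write; a dry-run wrapper over the indices listed above (set APPLY=1 to actually write, which needs root):

```shell
# Sketch: feed the LTR ignore indices to pmc_core. Dry-run by default so
# the list can be previewed without root or debugfs access.
apply_ltr_ignore() {
  local target=/sys/kernel/debug/pmc_core/ltr_ignore idx
  for idx in "$@"; do
    if [ "${APPLY:-0}" = "1" ] && [ -w "$target" ]; then
      echo "$idx" > "$target"
    else
      echo "would write $idx to $target"
    fi
  done
}

apply_ltr_ignore 1 3 6 8 25  # SOUTHPORT_B, GBE, ME, SOUTHPORT_C, IOE_PMC
```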

USB Subsystem

Two USB devices were holding the XHCI controller in D0, blocking package C-state entry:

  • The fingerprint reader
  • The Bluetooth adapter

Both were resolved by enabling USB autosuspend for all devices. Runtime PM was also explicitly enabled on the Thunderbolt host controller.
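The exact mechanism isn't shown above; a udev rule is one common way to enable autosuspend for every USB device as it appears (filename and rule are a sketch):

```
# /etc/udev/rules.d/50-usb-autosuspend.rules (hypothetical path)
ACTION=="add", SUBSYSTEM=="usb", TEST=="power/control", ATTR{power/control}="auto"
```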

BIOS: Disable Intel AMT

Intel Active Management Technology (AMT) was enabled by default in the BIOS. This caused three PMC blockers:

  • KVMCC — KVM Controller for remote management
  • USBR0 — USB Redirection
  • SUSRAM — Suspend RAM used by AMT

Disabling AMT in the BIOS (Config → Network → Intel AMT) immediately cleared all three.

NVIDIA GPU Power Management

The NVIDIA GPU required extensive configuration to stop it from blocking low-power states:

Services Disabled

  • nvidia-powerd — Was causing the GPU to cycle between D3cold and D0 repeatedly
  • nvidia-persistenced — Kept a handle on the GPU, preventing it from entering D3cold

D3cold Enabled

Runtime power management and D3cold (full power off) were enabled for the NVIDIA GPU via /etc/modprobe.d/nvidia-pm.conf.
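The file's contents aren't reproduced here; on the proprietary driver the documented module option for runtime D3 is NVreg_DynamicPowerManagement (0x02 selects fine-grained power control), so a plausible sketch is:

```
# /etc/modprobe.d/nvidia-pm.conf (sketch; actual contents not shown above)
options nvidia NVreg_DynamicPowerManagement=0x02
```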

Proprietary Driver with GSP Firmware Disabled

The open-source nvidia-open kernel module was replaced with the proprietary NVIDIA driver, and GSP (GPU System Processor) firmware was disabled:

options nvidia NVreg_EnableGpuFirmware=0

This was necessary because the GSP firmware interfered with proper suspend/resume cycling on this GPU generation.

CUDA Isolation for Background Services

Any background service that might poke the GPU (e.g., monitoring tools, bots) was configured with:

Environment=CUDA_VISIBLE_DEVICES=

This prevents them from initialising the CUDA runtime and keeping the GPU in D0.
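As a systemd drop-in for a hypothetical background unit, that looks like:

```ini
# /etc/systemd/system/gpu-monitor.service.d/no-cuda.conf (unit name hypothetical)
[Service]
Environment=CUDA_VISIBLE_DEVICES=
```

With the variable set to an empty value, the CUDA runtime reports no usable devices, so the service never wakes the GPU out of D3cold.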

Other Fixes

  • intel-lpmd — Intel’s Low Power Mode Daemon was installed to help manage CPU power states on Meteor Lake.
  • SELinux — A custom /data partition had incorrect SELinux contexts, which was blocking systemd-logind from writing hibernate state. Relabelling the partition fixed it.

Hibernate: Broken

Hibernate (suspend-to-disk) does not work on this hardware combination and is unlikely to work any time soon.

The kernel panics during the hibernate image write phase: the NVIDIA driver's freeze callback (invoked via pci_pm_freeze) either fails outright or panics the machine. This was tested with:

  • The open-source nvidia-open driver
  • The proprietary NVIDIA driver
  • GSP firmware on and off
  • Writing via /sys/power/state (sysfs) directly
  • Skipping VT switch (no chvt)

All combinations result in a panic. This is a known upstream bug tracked at NVIDIA/open-gpu-kernel-modules Issue #922.

Do not attempt hibernate on this platform with NVIDIA drivers loaded.

What Still Blocks S0ix

Even after all the above fixes, S0ix is not fully achieved. The remaining blockers are in the PMC and relate to PLLs and fabric that cannot be controlled from userspace:

PMC1 (IOE Die)

  • SBR0–4 (Sideband Router instances)
  • FABRIC_PLL, TMU_PLL, BCLK_PLL, D2D_PLL, G5FPW PLLs, REF_PLL
  • VNN_SOC_REQ_STS

PMC0 (SoC Die)

  • LPSS — I2C4 stuck in D0 (likely the touchpad or touchscreen controller)
  • FABRIC_PLL, IOE_COND_MET
  • ITSS_CLK_SRC_REQ_STS
  • LSX_Wake4–7

These are firmware-level blockers. The IOE die (PMC1) issue in particular appears to require a CSME firmware update from Lenovo.

Things to Avoid

  • Do not restart systemd-logind — It will kill your entire GNOME session.
  • Do not use sleep hooks that run modprobe -r nvidia — gnome-shell holds nvidia_drm open; the unload will hang.
  • Do not use rtcwake -m mem — It bypasses the NVIDIA systemd suspend services and will likely result in a broken resume.
  • Do not attempt hibernate — Kernel panic (see above).

Configuration Files Modified

For reference, the following files were created or modified during this process:

  • /etc/default/grub — Kernel parameters
  • /etc/modprobe.d/nvidia-pm.conf — NVIDIA power management options
  • /etc/modprobe.d/nvidia-installer-disable-nouveau.conf — Nouveau blacklist
  • /etc/systemd/logind.conf.d/lid-suspend-then-hibernate.conf — Lid action configuration
  • /etc/systemd/sleep.conf — Sleep/hibernate configuration
  • /etc/systemd/system/disable-wakeup-sources.service — ACPI wake source disabling
  • /usr/lib/systemd/system-sleep/ollama-nvidia-hibernate.sh — Pre/post sleep hooks

Conclusion

Modern Standby on Meteor Lake with NVIDIA discrete graphics is a minefield. While many of the userspace-controllable blockers can be resolved, the core IOE die power-gating issue and the NVIDIA hibernate panic are firmware bugs that no amount of kernel tuning can fix. If you’re buying a Meteor Lake workstation for Linux, check the NVIDIA hibernate bug status and your vendor’s CSME firmware changelog before committing.



Successfully Redeploying the Feeditout Service with Ansible

After a long journey of iteration, troubleshooting, and learning, I’m excited to share that I’ve successfully redeployed the Feeditout service using Ansible.

This wasn’t just a redeployment — it was a full re-architecture of how the system is provisioned, secured, monitored, and maintained. I went deep into infrastructure-as-code territory and came out the other side with a more robust, modular, and maintainable setup than ever before.

Lessons from My Ansible Journey

At the heart of this process was Ansible — and it’s fair to say I’ve come a long way in mastering it. What began as a handful of playbooks quickly evolved into a library of roles, reusable tasks, and templated configuration files.

I focused heavily on idempotency, readability, and separation of concerns. Along the way, I developed a strong preference for minimal inline logic and clean, descriptive variable names. I also became comfortable enforcing good practices, like avoiding the default item as a loop variable and steering clear of block statements unless genuinely needed.
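For instance, Ansible's loop_control lets you replace the default item variable with a descriptive name; a hypothetical task (package names illustrative):

```yaml
# Hypothetical task: a named loop variable instead of the default `item`
- name: Ensure monitoring packages are present
  ansible.builtin.package:
    name: "{{ monitoring_pkg }}"
    state: present
  loop:
    - prometheus
    - grafana
  loop_control:
    loop_var: monitoring_pkg
```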

Roles I Wrote

Here’s a snapshot of the roles I built and used during this process — each one crafted with purpose:

  • aide
  • alert_manager
  • ansible_pull
  • apache2
  • apparmor
  • apt
  • auditd
  • base_packages
  • certbot
  • chkrootkit
  • chuckbot
  • clamav
  • clean
  • cockpit
  • cron
  • dns
  • entropy
  • fail2ban
  • fail2counter
  • grafana
  • grub
  • hostname
  • iptables
  • kernel
  • keyboard
  • locale
  • logrotate
  • logwatch
  • memcached
  • motd
  • mysql
  • network_manager
  • node_exporter
  • ntp
  • opendkim
  • opendmarc
  • pam
  • passwd
  • php_fpm
  • postfix
  • postsrsd
  • prometheus
  • rclone
  • redis
  • root_password
  • rsyslog
  • saslauthd
  • services
  • spamassassin
  • sshd
  • sudo
  • swap
  • wayland

From security hardening (auditd, chkrootkit, aide, fail2ban) to service monitoring (grafana, prometheus, alert_manager), mail stack configuration (postfix, opendkim, opendmarc, postsrsd, saslauthd), and even custom integrations like chuckbot, every role played a part.

Each role encapsulates everything needed to configure a specific service — packages, configuration files, systemd services, and sensible defaults — while remaining fully overrideable via host_vars.
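The pattern is that roles ship defaults and hosts override only what differs; the variable names below are hypothetical, but the shape of a host_vars override file is:

```yaml
# host_vars/feeditout.yml (variable names are illustrative)
sshd_permit_root_login: "no"
fail2ban_bantime: 3600
certbot_email: admin@feeditout.example
```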

The Payoff

Feeditout is now:

  • Secure by default with automated auditing, logging, and spam controls.
  • Monitored with a complete Prometheus + Grafana setup and alert routing.
  • Configured from scratch using a fully automated Ansible repo.
  • Easier to maintain, extend, and recover from disaster.

Most importantly, I now have confidence in my infrastructure, because it’s reproducible and self-documented through code.

What’s Next?

Now that the foundation is solid, I’ll be iterating on:

  • Self-healing features (auto-restart, watchdogs),
  • Zero-downtime deployments,
  • Better observability dashboards,
  • Maybe even a public Git repo or guide for others to use and learn from.

If you’re thinking about doing something similar — take the plunge. It’s a challenge, but you’ll learn more about your systems and tools than you ever could from reading docs alone.


Exciting Updates in Jira Creator – Version 1.0.3 Released!

Hey there, fellow developers! I’m thrilled to share the latest updates from our project, Jira Creator, with the release of version 1.0.3. This update brings a variety of changes, from minor tweaks to significant improvements, all aimed at enhancing your experience with our tool.

What’s Changed?

In this release, we made several changes across various files, focusing on improving the overall structure and functionality of the tool. Here’s a quick rundown of what’s new:

  • Updated the environment variable names for clarity and consistency.
  • Refactored the code to improve readability and maintainability.
  • Fixed bugs related to AI provider integration and issue management.
  • Bumped the version number from 0.0.43 to 1.0.3, marking a significant milestone.

How Does This Change the Project?

This update is more than just a version bump. By renaming environment variables to include the prefix “JIRA_”, we’ve made it easier for users to understand what each variable relates to. This small change can help reduce confusion and improve the setup process.
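For example, the tool's configuration now groups under one prefix. JIRA_AI_API_KEY is one of the renamed variables (it appears in the project's companion scripts); the second name below is illustrative:

```shell
# Set the prefixed variables; JIRA_AI_MODEL here is a hypothetical example.
export JIRA_AI_API_KEY="example-key"
export JIRA_AI_MODEL="gpt-4o-mini"

# Everything the tool reads is now discoverable in one grep:
env | grep '^JIRA_' | sort
```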

Additionally, the refactoring efforts have led to cleaner code, making it easier for us to maintain and for contributors to understand. This means fewer bugs and a smoother experience for everyone!

Bug Fixes, Refactoring, and Feature Enhancements

We’ve tackled some pesky bugs in this release. For instance, the AI provider integration has been streamlined, ensuring that your AI-related commands work more reliably. The refactoring efforts also mean that the code is now more modular, which is a win for future development.

While there are no new features in this release, the improvements we’ve made are crucial for the stability and usability of the tool. We believe that these changes will significantly enhance your workflow when creating Jira issues.

What About Dependencies or Configurations?

No major changes were made to dependencies or configurations in this release. We’ve kept everything stable, ensuring that you can upgrade without worrying about breaking changes.

Release Info and Links

Here’s a quick summary of the release:

We hope you enjoy the improvements in this release! As always, your feedback is invaluable to us, so feel free to reach out with any thoughts or suggestions. Happy coding!



Generating Python Docstrings Automatically

Welcome to my blog! Today, I’m excited to share a neat Python script that automates the generation of docstrings for your Python files. This tool is especially useful for developers who want to maintain clean and well-documented code without spending too much time writing documentation manually.

The script utilizes OpenAI’s API to generate meaningful docstrings based on the code provided. It analyzes the structure of your Python classes and functions, and then creates concise and informative docstrings that follow standard conventions. Let’s dive into the code!


#!/usr/bin/python
import argparse
import hashlib
import json
import os
import re
import statistics
import subprocess
import sys
import tempfile
import time

import requests

system_prompt_general = """
You will be provided with a Python code.
Based on this, your task is to generate a Python docstring for it.
This is the docstring that goes at the top of the file.
The file may include multiple classes or other code.

Ensure the docstring follows Python's standard docstring conventions and provides
just enough detail to make the file understandable and usable without overwhelming the reader.

Please only return the docstring, enclosed in triple quotes, without any other explanation
or additional text. The format should be:

\"\"\"

\"\"\"

Make sure to follow the format precisely and provide only the docstring content.
"""

system_prompt_class = """
You will be provided with a Python class, including its code.
Based on this, your task is to generate a Python docstring for it.
Use the class signature and body to infer the purpose of the class,
the attributes it has, and any methods it includes.

Follow these guidelines to create the docstring:

1. Summary: Provide a concise summary of the class's purpose.
    Focus on what the class does and its main goal.
2. Attributes: List the attributes, their types, and a brief description
    of what each one represents.

Ensure the docstring follows Python's standard docstring conventions and provides
just enough detail to make the class understandable and usable without overwhelming the reader.

Please only return the docstring, enclosed in triple quotes, without any other
explanation or additional text. The format should be:

\"\"\"

\"\"\"

Make sure to follow the format precisely and provide only the docstring content.
"""

system_prompt_def = """
You will be provided with a Python function, including its code.
Based on this, your task is to generate a Python docstring for it.
Use the function signature and body to infer the purpose of the function,
the arguments it takes, the return value, and any exceptions it may raise.

Follow these guidelines to create the docstring:

1. Summary: Provide a concise summary of the function's purpose.
    Focus on what the function does and its main goal.
2. Arguments: List the parameters, their types, and a brief description
    of what each one represents.
3. Return: If the function has a return value, describe the return type
    and what it represents. If there's no return, OMIT THE SECTION.
4. Exceptions: If the function raises any exceptions, list them with descriptions.
    If no exceptions are raised, OMIT THE SECTION.
5. Side Effects (if applicable): If the function has side effects
    (e.g., modifies global state, interacts with external services), mention them.
    OMIT THE SECTION if it is not clear in the code.
6. Algorithm or Key Logic (optional): If the function is complex,
    provide a high-level outline of the logic or algorithm involved.
    OMIT THE SECTION if it is not clear in the code.

Ensure the docstring follows Python's standard docstring conventions and provides
just enough detail to make the function understandable and usable without overwhelming the reader.

Please only return the docstring, enclosed in triple quotes, without any other
explanation or additional text. The format should be:

\"\"\"

\"\"\"

Make sure to follow the format precisely and provide only the docstring content.
"""


class OpenAICost:
    # Static member to track the total cost
    cost = 0
    costs = []  # A list to store the individual request costs

    @staticmethod
    def send_cost(tokens, model):
        # The cost calculation can vary based on the model and tokens
        model_costs = {
            "gpt-3.5-turbo": 0.002,  # Example cost per 1k tokens
            "gpt-4o-mini": 0.003,  # Example cost per 1k tokens
            "gpt-4o": 0.005,  # Example cost per 1k tokens
        }

        cost_per_token = model_costs.get(model, 0)
        cost = (tokens / 1000) * cost_per_token  # Cost is proportional to tokens

        OpenAICost.cost += cost
        OpenAICost.costs.append(cost)

    @staticmethod
    def print_cost_metrics():
        print(f"\nTotal Cost: ${OpenAICost.cost:.4f}")
        if OpenAICost.costs:
            print(
                f"    Average Cost per Request: ${statistics.mean(OpenAICost.costs):.4f}"
            )
            print(f"    Max Cost for a Request: ${max(OpenAICost.costs):.4f}")
            print(f"    Min Cost for a Request: ${min(OpenAICost.costs):.4f}")
            if len(OpenAICost.costs) > 1:
                print(
                    f"    Standard Deviation of Cost: ${statistics.stdev(OpenAICost.costs):.4f}"
                )
        else:
            print("    No costs recorded.")


class OpenAIProvider:
    def __init__(self):
        self.api_key = os.getenv("JIRA_AI_API_KEY")
        if not self.api_key:
            raise EnvironmentError("JIRA_AI_API_KEY not set in environment.")
        self.endpoint = "https://api.openai.com/v1/chat/completions"
        self.model = os.getenv("OPENJIRA_AI_MODEL", "gpt-4")

    def estimate_tokens(self, text: str) -> int:
        tokens = len(text) // 3  # rough estimate: ~3 characters per token
        return tokens

    def select_model(self, input):
        tokens = self.estimate_tokens(input)

        if tokens < 1000:  # For small files (under ~1000 tokens)
            model = "gpt-3.5-turbo"
        elif tokens < 10000:  # For medium files (under ~10000 tokens)
            model = "gpt-4o-mini"
        else:  # For large files (over ~10000 tokens)
            model = "gpt-4o"

        OpenAICost.send_cost(tokens, model)

        return model

    def improve_text(self, prompt: str, text: str) -> str:
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

        body = {
            "model": self.select_model(text),
            "messages": [
                {"role": "system", "content": prompt},
                {"role": "user", "content": text},
            ],
            "temperature": 0.5,
        }

        response = requests.post(self.endpoint, json=body, headers=headers, timeout=300)
        if response.status_code == 200:
            res = response.json()["choices"][0]["message"]["content"].strip()
            result = res
            # Count the occurrences of '"""'
            occurrences = result.count('"""')
            occurrences_backtick = result.count("```")

            # Check if there are more than two occurrences
            if occurrences > 2:
                # Find the positions of the first and second occurrences
                first_pos = result.find('"""')
                second_pos = result.find('"""', first_pos + 1)

                # Get everything from the first '"""' to the second '"""', inclusive
                result = result[first_pos : second_pos + 3]  # Include the second '"""'

            if occurrences_backtick > 2:
                # Find the positions of the first and second occurrences
                first_pos = result.find("```")
                second_pos = result.find("```", first_pos + 1)

                # Get everything from the first '```' to the second '```', inclusive
                result = result[first_pos : second_pos + 3]  # Include the second '```'

            return result
        raise Exception(
            f"OpenAI API call failed: {response.status_code} - {response.text}"
        )


class Docstring:
    def __init__(self, file_path, debug=False, exit=False):
        self.file_path = file_path
        self.lines = []
        self.ai = OpenAIProvider()
        self.line_index = 0
        self.multiline_index = 0
        self.cache_file = "docstring.cache"
        self.debug = debug
        self.exit = exit

        with open(self.file_path, "r") as file:
            self.lines = file.readlines()

        # Ensure cache file exists, create if necessary
        if not os.path.exists(self.cache_file):
            with open(self.cache_file, "w") as cache:
                json.dump({}, cache)

        self._load_cache()

        print(" -> " + self.file_path)

    def print_debug(self, title, out):
        if not self.debug:
            return

        print("=====================================================")
        print("=====================================================")
        print(" > > " + title)
        print("=====================================================")
        out = "".join(out) if isinstance(out, list) else out
        print(out)
        print("=====================================================")
        print("=====================================================")

    def _load_cache(self):
        with open(self.cache_file, "r") as cache:
            self.cache = json.load(cache)

    def _save_cache(self):
        with open(self.cache_file, "w") as cache_file:
            json.dump(self.cache, cache_file, indent=4)

    def _generate_sha1(self, user_prompt):
        return hashlib.sha1(user_prompt.encode("utf-8")).hexdigest()

    def _get_current_timestamp(self):
        return int(time.time())

    def get_ai_docstring(self, sys_prompt, user_prompt, signature):
        sha1_hash = self._generate_sha1(user_prompt)

        # Check if file is in cache
        if self.file_path in self.cache:
            # Check if the user prompt's SHA1 is in the self.cache for this file
            for entry in self.cache[self.file_path]:
                if entry["sha1"] == sha1_hash:
                    # Update last_accessed timestamp
                    entry["last_accessed"] = self._get_current_timestamp()
                    # Return cached docstring if found
                    return entry["docstring"]

        print("    Requesting AI for: " + signature)
        # If no self.cache hit, call the AI and get the docstring
        res = self.ai.improve_text(sys_prompt, user_prompt)

        # Create a new self.cache entry with last_accessed timestamp
        new_entry = {
            "sha1": sha1_hash,
            "docstring": res,
            "last_accessed": self._get_current_timestamp(),
        }

        # Add new entry to the self.cache for the current file
        if self.file_path not in self.cache:
            self.cache[self.file_path] = []

        self.cache[self.file_path].append(new_entry)

        # Return the new docstring from AI
        return res

    def remove_old_entries(self, minutes):
        current_timestamp = self._get_current_timestamp()
        threshold_timestamp = current_timestamp - (minutes * 60)

        # Remove old entries for each file in the self.cache
        for file_path, entries in self.cache.items():
            self.cache[file_path] = [
                entry
                for entry in entries
                if "last_accessed" in entry
                and entry["last_accessed"] >= threshold_timestamp
            ]

    def wrap_text(self, text: str, max_length=120, indent=0):
        wrapped_lines = []
        lines = text.strip().splitlines()

        # Ensure indent is an integer
        try:
            indent = int(indent)
        except ValueError:
            indent = 0  # Default to 0 if it's an invalid value

        spacer = ""
        if indent == 0:
            spacer = ""
        else:
            spacer = " " * (indent * 4)

        for line in lines:
            line = spacer + line.strip()
            while len(line) > max_length:
                # Find last space to split at
                split_at = line.rfind(" ", 0, max_length)
                if split_at == -1:
                    split_at = max_length  # no space found, split at max_length
                wrapped_lines.append(line[:split_at].rstrip())
                line = spacer + line[split_at:].strip()
            wrapped_lines.append(line)
        return wrapped_lines

    def count_and_divide_whitespace(self, line):
        leading_whitespace = len(line) - len(line.lstrip())
        if leading_whitespace == 0:
            return 0
        return leading_whitespace // 4

    def complete(self):
        # Create a temporary file
        with tempfile.NamedTemporaryFile(delete=False) as temp_file:
            temp_file_path = temp_file.name
            temp_file.write(
                "".join(self.lines).encode()
            )  # Write the content to the temporary file

        try:
            # Attempt to compile the temporary file
            result = subprocess.run(
                ["python", "-m", "py_compile", temp_file_path],
                capture_output=True,
                text=True,
            )

            # If there is no compilation error (i.e., result.returncode == 0), move the file to the destination
            if result.returncode == 0:
                with open(self.file_path, "w") as file:
                    print(f"    Wrote: {self.file_path}")
                    file.write("".join(self.lines))  # Write to the destination file
            else:
                print(f"    Error compiling file: {result.stderr}")
                if self.debug:
                    name = "/tmp/" + os.path.basename(self.file_path) + ".failed"
                    with open(name, "w") as file:
                        file.write("".join(self.lines))
                        print(f"    Copied here: {name}")
                        if self.exit:
                            sys.exit(1)

        finally:
            # Clean up the temporary file
            os.remove(temp_file_path)

        self.remove_old_entries(1440 * 14)
        self._save_cache()

    def generate_class_docstring(self):
        line = self.lines[self.line_index]
        class_definition = line
        output = re.sub("\\s+", " ", class_definition.rstrip().replace("\n", " "))
        print("   -> " + output)
        prompt_class_code = [class_definition]

        if self.count_and_divide_whitespace(class_definition) > 0:
            self.line_index = self.line_index + 1
            return None

        t = self.line_index + 1
        # Collect all lines that belong to the class, including "pass" or single-line classes
        while (
            t < len(self.lines)
            and not self.lines[t].startswith("def")
            and not self.lines[t].startswith("class")
        ):
            prompt_class_code.append(self.lines[t].rstrip())
            t += 1

        class_docstring = self.get_ai_docstring(
            system_prompt_class, "\n".join(prompt_class_code), output
        )
        class_docstring = self.wrap_text(class_docstring, max_length=120, indent=1)
        class_docstring[len(class_docstring) - 1] = (
            class_docstring[len(class_docstring) - 1] + "\n"
        )
        class_docstring = [line + "\n" for line in class_docstring]
        class_docstring[len(class_docstring) - 1] = (
            class_docstring[len(class_docstring) - 1].rstrip() + "\n"
        )

        # Check for existing docstring and replace it
        docstring_start_index = None
        docstring_end_index = None

        # Look for the class docstring (the second line should start with """ if it's there)
        if self.lines and self.lines[self.line_index + 1].strip().startswith('"""'):
            # Docstring exists, find the end of it
            docstring_start_index = (
                self.line_index + 1
            )  # The docstring starts from line after the class definition
            for i, line in enumerate(
                self.lines[self.line_index + 2 :], start=self.line_index + 2
            ):
                if line.strip().startswith('"""'):
                    docstring_end_index = i  # End of the docstring
                    break

        # If a docstring exists, replace it
        if docstring_start_index is not None and docstring_end_index is not None:
            self.lines = (
                self.lines[:docstring_start_index]
                + self.lines[docstring_end_index + 1 :]
            )

        # Insert the new docstring after the class definition
        self.lines = (
            self.lines[: self.line_index + 1]
            + class_docstring
            + self.lines[self.line_index + 1 :]
        )

        # self.print_debug("class docstring: " + class_definition.strip(), self.lines)

        return True

    def generate_function_docstring(self):
        line = self.lines[self.line_index]
        mutliline_line = ""

        if not (
            line.strip().endswith("):")
            and not re.search(r"\)\s*->\s*(.*)\s*:.*", line.strip())
        ):
            self.multiline_index = 0
            # multiline def signiture
            while self.line_index < len(self.lines):
                mutliline_line += self.lines[self.line_index]
                if re.match(
                    r".*\):$", self.lines[self.line_index].strip()
                ) or re.search(
                    r".*\)\s*->\s*(.*)\s*:.*", self.lines[self.line_index].strip()
                ):
                    break
                self.line_index += 1
                self.multiline_index += 1

        def_definition = line if mutliline_line == "" else mutliline_line
        output = re.sub("\\s+", " ", def_definition.rstrip().replace("\n", " "))
        print("     -> " + output)
        prompt_def_code = (
            mutliline_line.split("\n") if mutliline_line != "" else [def_definition]
        )

        indent_line = self.count_and_divide_whitespace(
            def_definition if mutliline_line == "" else mutliline_line.splitlines()[0]
        )
        spacer_line = "" if indent_line == 0 else " " * (indent_line * 4)
        spacer_line_minus = "" if indent_line < 2 else " " * ((indent_line - 1) * 4)
        spacer_line_plus = "" if indent_line == 0 else " " * ((indent_line + 1) * 4)

        t = self.line_index + 1
        # Collect all self.lines that belong to the function
        while t < len(self.lines):
            starts_with_def = self.lines[t].strip().startswith("def")

            # same indent
            if starts_with_def and self.lines[t].startswith(spacer_line):
                break

            # outside
            if starts_with_def and self.lines[t].startswith(spacer_line_minus):
                break

            # nested
            if starts_with_def and self.lines[t].startswith(spacer_line_plus):
                pass

            if self.lines[t].rstrip() != def_definition.strip():
                prompt_def_code.append(self.lines[t].rstrip())
            t += 1

        # Now that we have the full function signature, we generate the docstring
        indent = (
            self.count_and_divide_whitespace(
                def_definition
                if mutliline_line == ""
                else mutliline_line.splitlines()[0]
            )
            + 1
        )
        def_docstring = self.get_ai_docstring(
            system_prompt_def, "\n".join(prompt_def_code), output
        )
        def_docstring = self.wrap_text(def_docstring, max_length=120, indent=indent)
        def_docstring[-1] = def_docstring[-1] + "\n"
        if def_definition.strip() == def_docstring[0].strip():
            def_docstring = def_docstring[
                1 if mutliline_line == "" else self.multiline_index :
            ]
        def_docstring = [line + "\n" for line in def_docstring]
        def_docstring[-1] = def_docstring[-1].rstrip() + "\n"

        # Handle one-liner docstring or multi-line docstring
        if '"""' in self.lines[self.line_index + 1]:
            stripped_line = self.lines[self.line_index + 1].strip()
            if re.match(r'"""[\s\S]+?"""', stripped_line):
                # One-liner docstring: replace it with the generated multi-line docstring
                self.lines = (
                    self.lines[: self.line_index + 1]
                    + def_docstring
                    + self.lines[self.line_index + 2 :]
                )
            else:
                # Replace the entire docstring if it's multi-line
                end_index = self.line_index + 2
                while end_index < len(self.lines) and not self.lines[
                    end_index
                ].strip().startswith('"""'):
                    end_index += 1

                if (
                    end_index < len(self.lines)
                    and self.lines[end_index].strip() == '"""'
                ):
                    # Found the end of the docstring, now replace the entire docstring block
                    self.lines = (
                        self.lines[: self.line_index + 1]
                        + def_docstring
                        + self.lines[end_index + 1 :]
                    )
        else:
            # If no docstring exists, simply insert the generated docstring
            self.lines = (
                self.lines[: self.line_index + 1]
                + def_docstring
                + self.lines[self.line_index + 1 :]
            )

        # self.print_debug("def docstring: " + def_definition.strip(), self.lines)

        self.line_index = self.line_index + len(def_docstring)
        return True

    def generate_file_docstring(self):
        # Check if we should add a file-level docstring
        # if not self.should_add_file_docstring():
        #     return 0  # Skip generating file-level docstring if not needed

        shebang = ""

        # If the first line isn't a shebang (e.g. "#!/usr/bin/env python"), insert one
        if self.lines and not self.lines[0].startswith("#!"):
            self.lines = ["#!/usr/bin/env python\n"] + self.lines
            shebang = "#!/usr/bin/env python\n"
        else:
            shebang = self.lines[0]

        # Check if there's already an existing file-level docstring or comment block
        # We assume the file-level docstring starts with triple quotes (""" or ''') and is at the top
        docstring_start_index = None
        docstring_end_index = None

        if len(self.lines) > 1 and self.lines[1].strip().startswith('"""'):
            # If the second line starts with triple quotes, it may be a docstring
            docstring_start_index = 1  # The docstring starts from line 2
            for i, line in enumerate(self.lines[2:], start=2):
                if line.strip().startswith('"""'):
                    docstring_end_index = i  # End of the docstring
                    break

        # Generate new file-level docstring
        general_description = self.get_ai_docstring(
            system_prompt_general, "".join(self.lines), self.file_path
        )
        general_description = self.wrap_text(
            general_description, max_length=120, indent=0
        )
        docstring = [line + "\n" for line in general_description]

        # self.print_debug("docstring_end_index", str(docstring_end_index))
        # self.print_debug("self.lines[:docstring_start_index]", self.lines[:docstring_start_index])
        # self.print_debug("docstring", docstring)
        # self.print_debug("self.lines[docstring_end_index + 1 :]", self.lines[docstring_end_index + 1 :])

        # If a docstring exists, replace it with the new one
        if docstring_start_index is not None and docstring_end_index is not None:
            self.lines = (
                self.lines[:docstring_start_index]
                + docstring
                + self.lines[docstring_end_index + 1 :]
            )
        else:
            # Insert the generated docstring directly after the shebang (no extra newline)
            self.lines = [shebang] + docstring + self.lines[1:]

        # self.print_debug("file docstring: " + self.file_path, self.lines)

        return len(docstring)

    def generate_docstrings(self):
        if len(self.lines) == 0:
            return

        self.line_index = self.generate_file_docstring()

        while self.line_index < len(self.lines):
            line = self.lines[self.line_index]

            # For classes, generate class docstring
            if line.strip().startswith("class "):
                if not self.generate_class_docstring():
                    continue  # Skip to the next line

            # For functions, generate function docstring
            elif line.strip().startswith("def "):
                if not self.generate_function_docstring():
                    continue  # Skip to the next line

            self.line_index = self.line_index + 1

        self.complete()


def process_file(file_path, debug=False, exit=False):
    """Process a single file by generating docstrings."""
    Docstring(file_path, debug=debug, exit=exit).generate_docstrings()


def process_directory(directory_path, recursive=False, debug=False, exit=False):
    """Process all Python files in the directory with progress tracking."""

    # List all python files: walk the whole tree when recursive,
    # otherwise only look at the top level of the directory
    if recursive:
        python_files = [
            os.path.join(root, file)
            for root, dirs, files in os.walk(directory_path)
            for file in files
            if file.endswith(".py")
        ]
    else:
        python_files = [
            os.path.join(directory_path, file)
            for file in sorted(os.listdir(directory_path))
            if file.endswith(".py")
            and os.path.isfile(os.path.join(directory_path, file))
        ]

    total_files = len(python_files)
    processed_files = 0

    print(f"Processing {total_files} Python files...")

    for file_path in python_files:
        print(f"\nProcessing file {processed_files + 1}/{total_files}: {file_path}")
        process_file(file_path, debug=debug, exit=exit)
        processed_files += 1


def main():
    # Set up the argument parser
    parser = argparse.ArgumentParser(
        description="Generate file-level docstrings for Python files."
    )
    parser.add_argument("path", help="Path to a Python file or directory.")
    parser.add_argument(
        "-r",
        "--recursive",
        action="store_true",
        help="Recursively process all Python files in the directory.",
    )
    parser.add_argument(
        "-d",
        "--debug",
        action="store_true",
        help="Copies failed updates to /tmp/",
    )
    parser.add_argument(
        "-e",
        "--exit",
        action="store_true",
        help="Exits on failure",
    )
    # Parse the arguments
    args = parser.parse_args()

    # Check if the path is a file or directory
    if os.path.isfile(args.path):
        # If it's a file, process it directly
        process_file(args.path, debug=args.debug, exit=args.exit)
    elif os.path.isdir(args.path):
        # If it's a directory, process all Python files
        process_directory(
            args.path, recursive=args.recursive, debug=args.debug, exit=args.exit
        )
    else:
        print(f"Error: {args.path} is neither a valid file nor a directory.")

    OpenAICost.print_cost_metrics()


if __name__ == "__main__":
    main()

Now, let's break down how this script works:

  • OpenAICost Class: This class tracks the cost of using the OpenAI API based on the number of tokens used. It provides methods to send cost data and print cost metrics.
  • OpenAIProvider Class: This class handles the interaction with the OpenAI API. It estimates the number of tokens in the input text, selects the appropriate model based on the token count, and sends requests to improve the text.
  • Docstring Class: This is the heart of the script. It reads the Python file, generates docstrings for classes and functions using the OpenAI API, and saves the updated file. It also manages a cache for previously generated docstrings to optimize performance.
  • Main Functionality: The script can process a single file or an entire directory of Python files, generating docstrings for each. It includes command-line arguments for flexibility.
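At its core, the Docstring class works on the file as a plain list of lines and splices each generated docstring in with list slicing, directly after the `class` or `def` line. Here is a minimal standalone sketch of that insertion step, with a hand-written docstring standing in for the AI-generated one:

```python
lines = [
    "class Greeter:\n",
    "    def hello(self):\n",
    "        return 'hi'\n",
]
docstring = ['    """A friendly greeting helper."""\n']
line_index = 0  # index of the "class Greeter:" line

# Insert the new docstring directly after the class definition
lines = lines[: line_index + 1] + docstring + lines[line_index + 1 :]

print("".join(lines))
```

The same slicing pattern, with different start and end indices, is what replaces an existing docstring block in place.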

This tool can significantly enhance your coding workflow by ensuring that your code is well-documented and easy to understand. I hope you find it as useful as I do! Happy coding!

Enhance Your Git Commits with AI

Hey there, fellow developers! Today, I’m excited to share a nifty little Python script that can help you generate better commit messages for your Git repositories. This script uses the OpenAI API to analyze your code changes and suggest a meaningful commit message. Let’s dive into how it works!

Summary of the Code

This script, named fancy-git-commit.py, leverages the OpenAI API to create insightful commit messages based on the differences in your code. It reads the diff output from Git, cleans it up by removing comments, and then sends it to OpenAI’s model to generate a concise commit message. You can choose to either print the message or directly commit it to your repository.

Here’s the Code


#!/usr/bin/env python
import argparse
import os
import re
import subprocess
import sys

import requests


class OpenAIProvider:
    def __init__(self):
        self.api_key = os.getenv("AI_API_KEY")
        if not self.api_key:
            raise EnvironmentError("AI_API_KEY not set in environment.")
        self.endpoint = "https://api.openai.com/v1/chat/completions"
        self.model = os.getenv("OPENAI_MODEL", "gpt-4o-mini")

    def improve_text(self, prompt: str, text: str) -> str:
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

        body = {
            "model": self.model,
            "messages": [
                {"role": "system", "content": prompt},
                {"role": "user", "content": text},
            ],
            "temperature": 0.3,
        }

        response = requests.post(self.endpoint, json=body, headers=headers, timeout=30)
        if response.status_code == 200:
            return response.json()["choices"][0]["message"]["content"].strip()

        raise Exception(
            f"OpenAI API call failed: {response.status_code} - {response.text}"
        )


def remove_comments_from_diff(diff: str) -> str:
    # Remove single-line comments
    diff = re.sub(r"#.*", "", diff)

    # Remove multi-line comments (triple quotes)
    diff = re.sub(r'("""[\s\S]*?"""|\'\'\'[\s\S]*?\'\'\')', "", diff)

    # Strip git diff file headers (lines starting with '---' or '+++')
    diff = re.sub(r"^\s*(---|\+\+\+)\s.*", "", diff, flags=re.MULTILINE)

    return diff.strip()


def get_diff_from_stdin():
    """Reads the diff from stdin (useful for piped input)."""
    diff = sys.stdin.read()
    return remove_comments_from_diff(diff)


def get_diff_from_git():
    """Runs 'git diff HEAD~1' and captures the output."""
    diff = subprocess.check_output(
        "GIT_PAGER=cat git diff HEAD~1", shell=True, text=True
    )
    return remove_comments_from_diff(diff)


def main():
    parser = argparse.ArgumentParser(
        description="Generate and commit a git commit message."
    )
    parser.add_argument(
        "--print",
        action="store_true",
        help="Print the commit message without creating the commit",
    )
    args = parser.parse_args()

    cwd = os.getcwd()

    diff = None

    if not sys.stdin.isatty():
        print("Reading git diff from stdin...")
        diff = get_diff_from_stdin()
    else:
        print("Getting git diff from HEAD~1...")
        diff = get_diff_from_git()

    if not diff:
        print("No diff found, nothing to commit.")
        return

    prompt = "Take the diff and give me a good commit message, give me the commit message and nothing else"

    openai_provider = OpenAIProvider()

    print("Generating commit message from OpenAI...")
    commit_message = openai_provider.improve_text(prompt, diff)

    if not commit_message:
        print("No commit message generated. Aborting commit.")
        return

    if args.print:
        print(f"Commit message (not committing): {commit_message}")
        return

    print(f"Committing with message: {commit_message}")

    result = subprocess.run(
        ["git", "commit", "-m", commit_message],
        check=True,
        cwd=cwd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    print(f"Commit successful! {result.stdout.decode('utf-8')}")


if __name__ == "__main__":
    main()

Code Explanation

Let’s break down the main components of the script:

  • OpenAIProvider Class: This class handles interaction with the OpenAI API. It initializes with the API key and model, and has a method improve_text that sends a prompt along with the text (the diff) to the API and retrieves the generated commit message.
  • remove_comments_from_diff Function: This function takes the raw diff output and removes comments, ensuring that only the relevant changes are sent to the AI for processing. It handles both single-line and multi-line comments, as well as Git-specific diff headers.
  • get_diff_from_stdin and get_diff_from_git Functions: These functions are responsible for obtaining the diff. The first reads from standard input (which is useful for piping), while the second runs a Git command to get the latest changes directly from the repository.
  • main Function: This is where the script starts executing. It sets up argument parsing, determines how to get the diff, and then uses the OpenAIProvider to generate a commit message. Depending on the user’s input, it can print the message or commit the changes directly.
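To see the cleanup step in action, here is the same remove_comments_from_diff function run against a small hand-made diff (the sample input is invented for illustration):

```python
import re


def remove_comments_from_diff(diff: str) -> str:
    # Remove single-line comments
    diff = re.sub(r"#.*", "", diff)

    # Remove multi-line comments (triple quotes)
    diff = re.sub(r'("""[\s\S]*?"""|\'\'\'[\s\S]*?\'\'\')', "", diff)

    # Strip git diff file headers (lines starting with '---' or '+++')
    diff = re.sub(r"^\s*(---|\+\+\+)\s.*", "", diff, flags=re.MULTILINE)

    return diff.strip()


sample = """--- a/app.py
+++ b/app.py
+x = 1  # set x
+y = 2
"""

cleaned = remove_comments_from_diff(sample)
print(cleaned)
```

The file headers and the inline comment are gone, and only the substantive changed lines are passed on to the model, which keeps the prompt small and focused.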

The script is designed to be user-friendly and efficient, making it easier for developers to maintain a clear commit history without the hassle of crafting messages manually.

So, if you’re tired of writing commit messages or just want to add a bit of flair to your Git workflow, give this script a try! It’s a great way to leverage AI in your development process.

Final Thoughts

As we continue to integrate more tools into our workflows, scripts like this can save time and enhance our productivity. If you have any questions or suggestions, feel free to drop a comment below. Happy coding!
