Core Modules
Core framework modules for pipeline execution, configuration, and CLI.
Pipeline Executor
Pipeline orchestration engine for ideal_genom.
- class ideal_genom.core.pipeline.PipelineExecutor(config: Dict[str, Any], dry_run: bool = False)
Bases: object
Orchestrates execution of sub-pipeline classes based on configuration.
Configuration
Configuration loading and validation for ideal_genom pipelines.
- exception ideal_genom.core.config.ConfigurationError
Bases: Exception
Raised when configuration is invalid.
- ideal_genom.core.config.load_config(config_path: str) → Dict[str, Any]
Load pipeline configuration from YAML file.
- Parameters:
config_path (str) – Path to YAML configuration file
- Returns:
Parsed configuration dictionary
- Return type:
Dict[str, Any]
- Raises:
ConfigurationError – If configuration is invalid or file not found
- ideal_genom.core.config.validate_config(config: Dict[str, Any]) → None
Validate pipeline configuration structure.
- Parameters:
config (dict) – Configuration dictionary to validate
- Raises:
ConfigurationError – If configuration structure is invalid
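The kind of structural check described above can be sketched as follows. This is only an illustration: the required keys (`pipeline`, `steps`) and the individual checks are hypothetical, and the schema that `validate_config` actually enforces is not documented on this page.

```python
from typing import Any, Dict


class ConfigurationError(Exception):
    """Stand-in for ideal_genom.core.config.ConfigurationError."""


# Hypothetical required top-level keys; the real schema may differ.
REQUIRED_KEYS = ("pipeline", "steps")


def validate_config_sketch(config: Dict[str, Any]) -> None:
    """Raise ConfigurationError if the dictionary lacks the assumed structure."""
    if not isinstance(config, dict):
        raise ConfigurationError("configuration must be a mapping")
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ConfigurationError(f"missing required keys: {', '.join(missing)}")
    if not isinstance(config["steps"], list):
        raise ConfigurationError("'steps' must be a list")
```

Returning `None` on success and raising on failure mirrors the documented signature, so callers only need a single `try`/`except ConfigurationError` around the call.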
Command Line Interface
Command-line interface for the IDEAL-GENOM-QC pipeline.
This module provides the main CLI entry point for running genomic quality control pipelines using YAML configuration files.
- ideal_genom.core.cli.setup_logging(level: str = 'INFO') → None
Setup basic logging configuration.
- Parameters:
level (str) – Logging level (DEBUG, INFO, WARNING, ERROR)
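A minimal sketch of how a level name can be mapped onto the standard `logging` module. The function name, the log format, and the use of `force=True` are assumptions for illustration and may not match what `setup_logging` does internally.

```python
import logging


def setup_logging_sketch(level: str = "INFO") -> None:
    """Configure root logging from a level name such as 'DEBUG' or 'INFO'."""
    numeric_level = getattr(logging, level.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f"invalid logging level: {level!r}")
    logging.basicConfig(
        level=numeric_level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers configured earlier (Python 3.8+)
    )
```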
- ideal_genom.core.cli.validate_config_file(config_path: str) → Path
Validate that the configuration file exists and is readable.
- Parameters:
config_path (str) – Path to the configuration file
- Returns:
Validated configuration file path
- Return type:
Path
- Raises:
FileNotFoundError – If configuration file doesn’t exist
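The existence and readability check can be sketched with `pathlib`. This is an assumed implementation, not the library's code; in particular, raising `PermissionError` for an unreadable file and resolving the path before returning it are choices made here for illustration.

```python
import os
from pathlib import Path


def validate_config_file_sketch(config_path: str) -> Path:
    """Return the path as a Path object if it points to a readable file."""
    path = Path(config_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {path}")
    # Assumption: unreadable files are reported with PermissionError.
    if not os.access(path, os.R_OK):
        raise PermissionError(f"Configuration file is not readable: {path}")
    return path.resolve()
```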
- ideal_genom.core.cli.cmd_run(args: Namespace) → int
Execute the run command.
- Parameters:
args (argparse.Namespace) – Parsed command line arguments
- Returns:
Exit code (0 for success, 1 for failure)
- Return type:
int
- ideal_genom.core.cli.cmd_validate(args: Namespace) → int
Execute the validate command.
- Parameters:
args (argparse.Namespace) – Parsed command line arguments
- Returns:
Exit code (0 for success, 1 for failure)
- Return type:
int
- ideal_genom.core.cli.cmd_template(args: Namespace) → int
Execute the template command.
- Parameters:
args (argparse.Namespace) – Parsed command line arguments
- Returns:
Exit code (0 for success, 1 for failure)
- Return type:
int
- ideal_genom.core.cli.create_parser() → ArgumentParser
Create the command line argument parser.
- Returns:
Configured argument parser
- Return type:
ArgumentParser
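One common way to wire `run`, `validate`, and `template` subcommands to handlers that return exit codes is sketched below. Only the three subcommand names come from this page; the program name, the positional `config` argument, and the `--dry-run` and `--output` options are hypothetical, and the handlers are placeholders.

```python
import argparse
from typing import List, Optional


def cmd_run_sketch(args: argparse.Namespace) -> int:
    """Placeholder handler: report what would run and return success."""
    print(f"would run pipeline from {args.config} (dry_run={args.dry_run})")
    return 0


def cmd_validate_sketch(args: argparse.Namespace) -> int:
    """Placeholder handler for the validate subcommand."""
    print(f"would validate {args.config}")
    return 0


def cmd_template_sketch(args: argparse.Namespace) -> int:
    """Placeholder handler for the template subcommand."""
    print(f"would write a template to {args.output}")
    return 0


def create_parser_sketch() -> argparse.ArgumentParser:
    """Build a parser with one subparser per command (arguments are assumed)."""
    parser = argparse.ArgumentParser(prog="ideal-genom-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)

    run = subparsers.add_parser("run", help="Execute a pipeline")
    run.add_argument("config", help="Path to YAML configuration file")
    run.add_argument("--dry-run", action="store_true")
    run.set_defaults(func=cmd_run_sketch)

    validate = subparsers.add_parser("validate", help="Validate a configuration")
    validate.add_argument("config", help="Path to YAML configuration file")
    validate.set_defaults(func=cmd_validate_sketch)

    template = subparsers.add_parser("template", help="Write a template")
    template.add_argument("--output", default="config.yaml")
    template.set_defaults(func=cmd_template_sketch)
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Parse arguments and dispatch to the handler selected by the subcommand."""
    args = create_parser_sketch().parse_args(argv)
    return args.func(args)
```

Because each handler returns an integer, an entry point can end with `sys.exit(main())` to propagate 0 for success or 1 for failure to the shell.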
Executor
Command execution utilities for external genomic tools.
- exception ideal_genom.core.executor.CommandExecutionError
Bases: Exception
Raised when a shell command fails.
- ideal_genom.core.executor.shell_do(command: str | List[str], cwd: str | None = None, log_file: str | None = None, capture_output: bool = False, check: bool = True) → CompletedProcess
Execute a shell command for genomic analysis tools.
This is a wrapper around subprocess.run with logging and error handling tailored for genomic analysis pipelines (PLINK, GCTA, bcftools, etc.).
- Parameters:
command (str or list of str) – Command to execute. Can be a string or list of arguments.
cwd (str, optional) – Working directory for command execution
log_file (str, optional) – Path to file where stdout/stderr should be logged
capture_output (bool, default=False) – If True, capture stdout and stderr in returned object
check (bool, default=True) – If True, raise CommandExecutionError on non-zero exit code
- Returns:
Completed process with returncode, stdout, stderr
- Return type:
CompletedProcess
- Raises:
CommandExecutionError – If command fails and check=True
Examples
>>> # Execute PLINK command
>>> shell_do("plink --bfile input --maf 0.01 --make-bed --out output")

>>> # Execute with working directory
>>> shell_do(
...     ["bcftools", "view", "-Oz", "input.vcf"],
...     cwd="/data/work",
...     log_file="/data/logs/bcftools.log"
... )
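To make the documented behaviour concrete, here is a rough sketch of a `subprocess.run` wrapper with the same parameters. It is not the library's implementation. Splitting string commands with `shlex.split` instead of using a shell, appending to the log file, and the wording of the error message are all assumptions.

```python
import shlex
import subprocess
from typing import List, Optional, Union


class CommandExecutionError(Exception):
    """Stand-in for ideal_genom.core.executor.CommandExecutionError."""


def shell_do_sketch(
    command: Union[str, List[str]],
    cwd: Optional[str] = None,
    log_file: Optional[str] = None,
    capture_output: bool = False,
    check: bool = True,
) -> subprocess.CompletedProcess:
    """Run a command, optionally log its output, and raise on failure."""
    # Assumption: string commands are split into arguments, not run via a shell.
    argv = shlex.split(command) if isinstance(command, str) else list(command)
    want_output = capture_output or log_file is not None
    result = subprocess.run(
        argv,
        cwd=cwd,
        text=True,
        stdout=subprocess.PIPE if want_output else None,
        stderr=subprocess.PIPE if want_output else None,
    )
    if log_file is not None:
        # Assumption: output is appended so repeated steps share one log.
        with open(log_file, "a", encoding="utf-8") as handle:
            handle.write(result.stdout or "")
            handle.write(result.stderr or "")
    if check and result.returncode != 0:
        raise CommandExecutionError(
            f"command failed with exit code {result.returncode}: {' '.join(argv)}"
        )
    return result
```

Passing `check=False` in this sketch returns the completed process even on a non-zero exit code, which lets the caller inspect `returncode` instead of catching an exception.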
- ideal_genom.core.executor.run_plink(args: List[str], log_file: str | None = None, cwd: str | None = None) → CompletedProcess
Execute PLINK command.
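- Parameters:
args (list of str) – PLINK command-line arguments
log_file (str, optional) – Path to file where stdout/stderr should be logged
cwd (str, optional) – Working directory for command execution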
- Returns:
Completed process
- Return type:
CompletedProcess
Examples
>>> run_plink([
...     '--bfile', 'input',
...     '--maf', '0.01',
...     '--make-bed',
...     '--out', 'output'
... ])
- ideal_genom.core.executor.run_plink2(args: List[str], log_file: str | None = None, cwd: str | None = None) → CompletedProcess
Execute PLINK2 command.
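- Parameters:
args (list of str) – PLINK2 command-line arguments
log_file (str, optional) – Path to file where stdout/stderr should be logged
cwd (str, optional) – Working directory for command execution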
- Returns:
Completed process
- Return type:
CompletedProcess
- ideal_genom.core.executor.run_gcta(args: List[str], log_file: str | None = None, cwd: str | None = None) → CompletedProcess
Execute GCTA command.
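- Parameters:
args (list of str) – GCTA command-line arguments
log_file (str, optional) – Path to file where stdout/stderr should be logged
cwd (str, optional) – Working directory for command execution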
- Returns:
Completed process
- Return type:
CompletedProcess
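The three tool wrappers share one signature, which suggests a single generic runner with the executable name fixed per tool. The sketch below shows that pattern under stated assumptions: the helper names are hypothetical, the executable names (`plink`, `plink2`, `gcta64`) are guesses about what is on `PATH`, and the real wrappers may delegate to `shell_do` instead of calling `subprocess.run` directly.

```python
import subprocess
from functools import partial
from typing import List, Optional


def build_tool_command(executable: str, args: List[str]) -> List[str]:
    """Prepend the tool's executable name to its argument list."""
    return [executable, *args]


def run_tool_sketch(
    executable: str,
    args: List[str],
    log_file: Optional[str] = None,
    cwd: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run a tool, optionally redirecting stdout and stderr to a log file."""
    command = build_tool_command(executable, args)
    if log_file is None:
        return subprocess.run(command, cwd=cwd, check=True)
    with open(log_file, "w", encoding="utf-8") as handle:
        return subprocess.run(
            command, cwd=cwd, check=True, stdout=handle, stderr=subprocess.STDOUT
        )


# Assumed executable names; adjust to whatever is installed on PATH.
run_plink_sketch = partial(run_tool_sketch, "plink")
run_plink2_sketch = partial(run_tool_sketch, "plink2")
run_gcta_sketch = partial(run_tool_sketch, "gcta64")
```

With this layout, `run_plink_sketch(['--bfile', 'input', '--make-bed', '--out', 'output'])` would execute the command `plink --bfile input --make-bed --out output`.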