keggtools

keggtools#

Submodules#

Attributes#

Classes#

ColorGradient

Create color gradient.

Component

Component model.

Enrichment

KEGG pathway enrichment analysis.

EnrichmentResult

Results of KEGG pathway enrichment analysis.

Entry

Entry model class.

Graphics

Graphics information for rendering.

Pathway

KEGG Pathway object.

Relation

Relation model class.

Renderer

Renderer for KEGG Pathway.

Resolver

KEGG pathway resolver class.

Storage

Storage handler class.

Subtype

Subtype model class.

Functions#

msig_to_kegg_id()

Load dataframe to map canonical pathway id of MSigDB to KEGG pathway id.

plot_enrichment_result(enrichment[, ax, figsize, ...])

Plot enrichment results.

Package Contents#

keggtools.AMINO_ACID_METABOLISM: dict[str, str]#
keggtools.BIOSYNTHESIS_OF_OTHER_SECONDARY_METABOLITES: dict[str, str]#
keggtools.CARBOHYDRATE_METABOLISM: dict[str, str]#
keggtools.CHEMICAL_STRUCTURE_TRANSFORMATION_MAPS: dict[str, str]#
class keggtools.ColorGradient(start, stop, steps=100)#

Create color gradient.

Parameters:
  • start (tuple)

  • stop (tuple)

  • steps (int)

get_list()#

Get gradient color as list.

Returns:

Returns list of hexadecimal color strings with a gradient.

Return type:

List[str]

start: tuple#
steps: int = 100#
stop: tuple#
static to_css(color)#

Convert color tuple to CSS rgb color string.

Parameters:

color (tuple) – RGB color tuple containing 3 integers

Returns:

Color as CSS string (e.g. “rgb(0, 0, 0)”).

Return type:

str

static to_hex(color)#

Convert color tuple to hex color string.

Parameters:

color (tuple) – RGB color tuple containing 3 integers.

Returns:

Hexadecimal color string (e.g. “#000000”).

Return type:

str
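The three ColorGradient helpers described above can be illustrated with a standalone sketch. This is not the library's implementation: the function names mirror the documented methods, but the linear per-channel interpolation in `gradient` is an assumption, since the documentation only says that a list of hexadecimal color strings is returned.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_hex(color: RGB) -> str:
    """Format an RGB tuple as a hexadecimal color string."""
    return "#{:02x}{:02x}{:02x}".format(*color)


def to_css(color: RGB) -> str:
    """Format an RGB tuple as a CSS rgb() string."""
    return "rgb({}, {}, {})".format(*color)


def gradient(start: RGB, stop: RGB, steps: int = 100) -> List[str]:
    """Interpolate each channel linearly from start to stop (assumed scheme)."""
    if steps < 2:
        return [to_hex(start)]
    colors = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the first color, 1.0 at the last
        channels = tuple(round(a + (b - a) * t) for a, b in zip(start, stop))
        colors.append(to_hex(channels))
    return colors


white_to_red = gradient((255, 255, 255), (255, 0, 0), steps=5)
```

With these definitions the first and last entries of `white_to_red` are the hex forms of the start and stop colors.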

class keggtools.Component(/, **data)#

Bases: pydantic_xml.BaseXmlModel

Component model.

Parameters:

data (Any)

id: str#
keggtools.ENERGY_METABOLISM: dict[str, str]#
class keggtools.Enrichment(pathways)#

KEGG pathway enrichment analysis.

Parameters:

pathways (list[keggtools.models.Pathway])

all_pathways: list[keggtools.models.Pathway]#
get_subset(subset, inplace=False)#

Create subset of analysis result by list of pathway ids.

Parameters:
  • subset (List[str]) – List of pathway identifiers to filter the enrichment result by.

  • inplace (bool) – Update instance variable of enrichment result list and overwrite with generated subset.

Returns:

Subset of enrichment results.

Return type:

List[EnrichmentResult]
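The effect of the inplace flag can be sketched with a small stand-in class. `Row` and `Results` are hypothetical names used only for illustration, and the pathway ids are made-up sample values; the sketch assumes that the subset is returned in both cases and that inplace=True additionally overwrites the stored result list, as the parameter description suggests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Row:
    """Hypothetical stand-in for one enrichment result."""
    pathway_id: str
    study_count: int


class Results:
    """Hypothetical container that mimics the documented get_subset behavior."""

    def __init__(self, rows: List[Row]) -> None:
        self.result = rows

    def get_subset(self, subset: List[str], inplace: bool = False) -> List[Row]:
        wanted = set(subset)
        picked = [row for row in self.result if row.pathway_id in wanted]
        if inplace:
            # Overwrite the stored results with the filtered list.
            self.result = picked
        return picked


results = Results([Row("04010", 3), Row("04110", 1)])
copy_subset = results.get_subset(["04010"])              # stored list untouched
count_after_copy = len(results.result)
inplace_subset = results.get_subset(["04010"], inplace=True)
count_after_inplace = len(results.result)
```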

result: list[EnrichmentResult] = []#
run_analysis(gene_list)#

Run the enrichment analysis for a list of gene ids. Returns a list of EnrichmentResult instances.

Parameters:

gene_list (List[str]) – List of genes to analyse.

Returns:

List of enrichment result instances.

Return type:

List[EnrichmentResult]
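The core of an enrichment analysis is counting how many study genes fall into each pathway. The sketch below shows that overlap step plus one common significance test. The documentation does not state how the pvalue attribute is computed, so the hypergeometric tail used here is an assumption and a swapped-in technique, not the library's method; `overlap` and `hypergeom_tail` are hypothetical helper names.

```python
from math import comb
from typing import List


def overlap(gene_list: List[str], pathway_genes: List[str]) -> List[str]:
    """Return the study genes that also occur in the pathway, sorted."""
    study = set(gene_list)
    return sorted(gene for gene in set(pathway_genes) if gene in study)


def hypergeom_tail(k: int, n_study: int, n_pathway: int, n_background: int) -> float:
    """P(X >= k) when n_study genes are drawn from n_background genes,
    of which n_pathway belong to the pathway."""
    total = comb(n_background, n_study)
    upper = min(n_study, n_pathway)
    hits = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_study - i)
        for i in range(k, upper + 1)
    )
    return hits / total


found = overlap(["1", "2", "9"], ["2", "1", "3"])
p_value = hypergeom_tail(k=2, n_study=2, n_pathway=3, n_background=10)
```

The size of the background gene set strongly affects the result, so any real analysis needs an explicit choice for `n_background`.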

to_csv(file_obj, delimiter='\t', overwrite=False)#

Save result summary as file.

Parameters:
  • file_obj (str | io.IOBase | Any) – Path to the file as a string, or an IOBase object.

  • delimiter (str) – Delimiter used for the CSV file.

  • overwrite (bool) – Set to True to overwrite the file if it already exists.

Return type:

None
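A function that accepts either a path or an open file object, as to_csv does, can be sketched with the standard csv module. `write_summary` is a hypothetical helper, the column names are made up, and raising FileExistsError when overwrite is False is an assumption: the documentation does not say what happens when the file already exists.

```python
import csv
import io
import os
from typing import Iterable, Sequence


def write_summary(file_obj, header: Sequence[str], rows: Iterable[Sequence],
                  delimiter: str = "\t", overwrite: bool = False) -> None:
    """Write a header and rows to a path or to an open text file object."""
    if isinstance(file_obj, str):
        if os.path.exists(file_obj) and not overwrite:
            # Assumed behavior; the real method may react differently.
            raise FileExistsError(file_obj)
        with open(file_obj, "w", newline="", encoding="utf-8") as handle:
            write_summary(handle, header, rows, delimiter)
        return
    writer = csv.writer(file_obj, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


buffer = io.StringIO()
write_summary(buffer, ["pathway_id", "study_count"], [["04010", 3], ["04110", 1]])
csv_text = buffer.getvalue()
```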

to_dataframe()#

Return analysis result as pandas DataFrame. Requires the pandas dependency.

Returns:

Export enrichment results as pandas dataframe.

Return type:

pandas.DataFrame

to_json()#

Export enrichment results as a JSON-serializable list of dicts.

Return type:

List[Dict[str, Any]]

Returns:

List of dicts, one per enrichment result.

class keggtools.EnrichmentResult(org, pathway_id, pathway_name, found_genes, pathway_genes, pathway_title=None)#

Results of KEGG pathway enrichment analysis.

Parameters:
  • org (str)

  • pathway_id (str)

  • pathway_name (str)

  • found_genes (list)

  • pathway_genes (list)

  • pathway_title (str | None)

__str__()#

Build string summary of KEGG path analysis result instance.

Return type:

str

Returns:

Returns string that describes the enrichment result instance.

found_genes: list#
static get_header()#

Build default header for enrichment analysis.

Return type:

List[str]

Returns:

List of header names as string.

json_summary(gene_delimiter=',')#

Build json summary for enrichment analysis.

Parameters:

gene_delimiter (str) – Delimiter to separate genes in gene list.

Return type:

Dict[str, Any]

Returns:

Summary of enrichment result instance as dict.

organism: str#
pathway_genes: list#
property pathway_genes_count: int#

Count of pathway genes.

Return type:

int

Returns:

Number of genes in pathway.

pathway_id: str#
pathway_name: str#
pathway_title: str | None = None#
pvalue: float | None = None#
property study_count: int#

Count of study genes.

Return type:

int

Returns:

Number of genes found in analysis of pathway.

class keggtools.Entry(/, **data)#

Bases: pydantic_xml.BaseXmlModel

Entry model class.

Parameters:

data (Any)

components: list[Component] = []#
get_gene_id()#

Parse variable ‘name’ of Entry into KEGG id.

Returns:

List of KEGG identifiers.

Return type:

list[str]

graphics: Graphics | None = None#
has_multiple_names()#

Checks if the entry has multiple names that are space separated.

Returns:

Returns True if the entry has multiple names.

Return type:

bool
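In KGML, the name attribute of an entry can hold several space-separated KEGG identifiers such as "hsa:5594 hsa:5595". The sketch below shows how such a string could be split. Both functions are hypothetical helpers, and stripping the organism prefix before the colon is an assumption about what get_gene_id returns.

```python
from typing import List


def has_multiple_names(name: str) -> bool:
    """True if the name holds more than one space-separated identifier."""
    return len(name.split()) > 1


def gene_ids(name: str) -> List[str]:
    """Drop the '<org>:' prefix from each space-separated identifier."""
    return [part.split(":", 1)[-1] for part in name.split()]


ids = gene_ids("hsa:5594 hsa:5595")
single = has_multiple_names("hsa:5594")
multiple = has_multiple_names("hsa:5594 hsa:5595")
```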

id: str#
name: str#
reaction: str | None#
type: keggtools._types.EntryTypeAlias#
keggtools.GLOBAL_AND_OVERVIEW_MAPS: dict[str, str]#
keggtools.GLYCAN_BIOSYNTHESIS_AND_METABOLISM: dict[str, str]#
class keggtools.Graphics(/, **data)#

Bases: pydantic_xml.BaseXmlModel

Graphics information for rendering.

Parameters:

data (Any)

bgcolor: str | None#
coords: str | None#
fgcolor: str | None#
height: int | None#
name: str | None#
type: keggtools._types.GraphicTypeAlias | None#
width: int | None#
x: int | None#
y: int | None#
keggtools.IMMUNE_SYSTEM_PATHWAYS: dict[str, str]#
keggtools.LIPID_METABOLISM: dict[str, str]#
keggtools.METABOLISM_OF_COFACTORS_AND_VITAMINS: dict[str, str]#
keggtools.METABOLISM_OF_OTHER_AMINO_ACIDS: dict[str, str]#
keggtools.METABOLISM_OF_TERPENOIDS_AND_POLYKETIDES: dict[str, str]#
keggtools.NUCLEOTIDE_METABOLISM: dict[str, str]#
class keggtools.Pathway(/, **data)#

Bases: pydantic_xml.BaseXmlModel

KEGG Pathway object.

The KEGG pathway object stores graphics information and related objects.

Parameters:

data (Any)

entries: list[Entry] = []#
get_entry_by_id(entry_id)#

Get pathway Entry object by id.

Parameters:

entry_id (str) – Id of Entry.

Returns:

Returns Entry instance if id is found in Pathway. Otherwise returns None.

Return type:

Optional[Entry]

get_genes()#

List all genes from pathway.

Returns:

List of entry ids with type gene.

Return type:

List[str]
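The two lookups above can be illustrated on a hand-written KGML fragment using only the standard library. The fragment is a made-up minimal example, and the helper functions are hypothetical; the library itself parses KGML into pydantic_xml models rather than ElementTree elements. Whether get_genes returns entry ids or gene ids is not settled by the description, so the sketch returns the gene entries and leaves both values accessible.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Minimal, hand-written KGML fragment for illustration only.
KGML = """<pathway name="path:hsa04010" org="hsa" number="04010" title="MAPK signaling pathway">
  <entry id="1" name="hsa:5594 hsa:5595" type="gene"/>
  <entry id="2" name="cpd:C00076" type="compound"/>
</pathway>"""

root = ET.fromstring(KGML)


def get_entry_by_id(entry_id: str) -> Optional[ET.Element]:
    """Return the entry element with the given id, or None if absent."""
    for entry in root.iter("entry"):
        if entry.get("id") == entry_id:
            return entry
    return None


def gene_entries() -> List[ET.Element]:
    """Return all entry elements whose type attribute is 'gene'."""
    return [entry for entry in root.iter("entry") if entry.get("type") == "gene"]


compound = get_entry_by_id("2")
missing = get_entry_by_id("99")
gene_entry_ids = [entry.get("id") for entry in gene_entries()]
gene_names = [entry.get("name") for entry in gene_entries()]
```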

image: str | None#
name: str#
number: str#
org: str#
reactions: list[Reaction] = []#
relations: list[Relation] = []#
title: str | None#
class keggtools.Relation(/, **data)#

Bases: pydantic_xml.BaseXmlModel

Relation model class.

Parameters:

data (Any)

entry1: str#
entry2: str#
subtypes: list[Subtype] = []#
type: keggtools._types.RelationTypeAlias#
class keggtools.Renderer(kegg_pathway, gene_dict=None, cache_or_resolver=None, upper_color=(255, 0, 0), lower_color=(0, 0, 255))#

Renderer for KEGG Pathway.

property cmap_downreg: list[str]#

Generated color map as list of hexadecimal strings for downregulated genes in gene dict.

Return type:

list[str]

property cmap_upreg: list[str]#

Generated color map as list of hexadecimal strings for upregulated genes in gene dict.

Return type:

list[str]

get_gene_color(gene_id, default_color=(255, 255, 255))#

Get overlay color for given gene.

Parameters:
  • gene_id (str) – Identifier of gene.

  • default_color (tuple[int, int, int]) – Default color to return if gene is not found in gene_dict. Formatted as an RGB tuple.

Returns:

Color of gene by expression level specified in gene_dict.

Return type:

str
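One way to turn an expression value into an overlay color is to blend from the default color toward the upper color for positive values and toward the lower color for negative values. The sketch below does this; `blend` and `gene_color` are hypothetical helpers, and the scaling by the largest absolute value in the overlay is an assumption, since the documentation does not describe how values map onto the color maps.

```python
from typing import Dict, Tuple

RGB = Tuple[int, int, int]


def blend(start: RGB, stop: RGB, t: float) -> str:
    """Mix two RGB colors (t=0 gives start, t=1 gives stop) as a hex string."""
    channels = [round(a + (b - a) * t) for a, b in zip(start, stop)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


def gene_color(gene_id: str, overlay: Dict[str, float],
               upper: RGB = (255, 0, 0), lower: RGB = (0, 0, 255),
               default: RGB = (255, 255, 255)) -> str:
    """Pick an overlay color for a gene from its expression value."""
    if gene_id not in overlay:
        return blend(default, default, 0.0)
    value = overlay[gene_id]
    # Assumed scaling: the largest absolute value gets the full color.
    peak = max(abs(v) for v in overlay.values()) or 1.0
    target = upper if value >= 0 else lower
    return blend(default, target, abs(value) / peak)


overlay = {"5594": 2.0, "5595": -1.0}
up_color = gene_color("5594", overlay)
down_color = gene_color("5595", overlay)
unknown_color = gene_color("0000", overlay)
```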

graph: pydot.Dot#
lower_color: tuple = (0, 0, 255)#
overlay: dict[str, float]#
pathway: keggtools.models.Pathway#
render(display_unlabeled_genes=True)#

Render KEGG pathway.

Parameters:

display_unlabeled_genes (bool) – Entries in the KGML format can have space-separated entry names. Set this parameter to False to hide such entries.

Return type:

None

resolver: keggtools.resolver.Resolver = None#
to_binary(extension)#

Export pydot graph to binary data.

Parameters:

extension (str) – Extension of file to export. Use format string like “png”, “svg”, “pdf” or “jpeg”.

Returns:

File content as bytes object.

Return type:

bytes

Raises:

TypeError – If variable with generated dot graph is not type bytes.

to_file(filename, extension)#

Export pydot graph to file.

Parameters:
  • filename (str) – Filename to save file at.

  • extension (str) – Extension of file to export. Use format string like “png”, “svg”, “pdf” or “jpeg”.

Return type:

None

to_string()#

Convert pydot graph instance to dot string.

Returns:

Generated dot string of pathway.

Return type:

str

upper_color: tuple = (255, 0, 0)#
class keggtools.Resolver(cache=None)#

KEGG pathway resolver class.

Request interface for KEGG API endpoint.

Parameters:

cache (keggtools.storage.Storage | str | None)

check_organism(organism)#

Check if organism code exists.

Parameters:

organism (str) – 3 letter organism code used by KEGG database.

Returns:

Returns True if organism code is found in list of valid organisms.

Return type:

bool

get_compounds(**kwargs)#

Get dict of compounds. Request from KEGG API if not in cache.

Parameters:

kwargs (Any) – other arguments to requests.get.

Returns:

Dict of compound identifier to compound name.

Return type:

Dict[str, str]

get_organism_list(**kwargs)#

Get organism codes from file or KEGG API.

Parameters:

kwargs (Any) – other arguments to requests.get.

Returns:

Dict with format {<org>: <org-name>}

Return type:

Dict[str, str]
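The organism list and the check_organism lookup can be sketched without any network access by parsing text in the tab-separated layout that the KEGG REST organism listing uses (T number, organism code, name, lineage). The sample rows are illustrative, and `parse_organisms` and `is_known_organism` are hypothetical helpers, not part of the Resolver API.

```python
from typing import Dict

# Illustrative rows in the four-column, tab-separated organism list layout.
SAMPLE = (
    "T01001\thsa\tHomo sapiens (human)\tEukaryotes;Animals;Vertebrates;Mammals\n"
    "T01002\tmmu\tMus musculus (house mouse)\tEukaryotes;Animals;Vertebrates;Mammals\n"
)


def parse_organisms(text: str) -> Dict[str, str]:
    """Build {<org>: <org-name>} from the tab-separated organism list."""
    organisms: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3:  # skip blank or malformed lines
            organisms[fields[1]] = fields[2]
    return organisms


def is_known_organism(code: str, organisms: Dict[str, str]) -> bool:
    """True if the organism code occurs in the parsed list."""
    return code in organisms


organisms = parse_organisms(SAMPLE)
```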

get_pathway(organism, code, **kwargs)#

Load and parse KGML pathway by identifier.

Parameters:
  • organism (str) – 3 letter organism code used by KEGG database.

  • code (str) – Pathway identifier used by KEGG database.

  • kwargs (Any) – other arguments to requests.get.

Returns:

Returns parsed Pathway instance.

Return type:

Pathway

get_pathway_list(organism, **kwargs)#

Request list of pathways linked to organism.

Parameters:
  • organism (str) – 3 letter organism code used by KEGG database.

  • kwargs (Any) – other arguments to requests.get.

Returns:

Dict in format {<pathway-id>: <name>}.

Return type:

Dict[str, str]
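A pathway listing arrives as tab-separated text with one pathway per line. The sketch below turns such text into the documented {<pathway-id>: <name>} dict. The sample lines are illustrative, `parse_pathways` is a hypothetical helper, and removing an optional "path:" prefix is an assumption made so that both prefixed and unprefixed ids are handled.

```python
from typing import Dict

# Illustrative response body: "<pathway-id>\t<name>" per line.
SAMPLE = (
    "hsa04010\tMAPK signaling pathway - Homo sapiens (human)\n"
    "path:hsa04110\tCell cycle - Homo sapiens (human)\n"
)


def parse_pathways(text: str) -> Dict[str, str]:
    """Build {<pathway-id>: <name>} from tab-separated lines."""
    pathways: Dict[str, str] = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        key, name = line.split("\t", 1)
        pathways[key.split(":", 1)[-1]] = name  # tolerate a "path:" prefix
    return pathways


pathways = parse_pathways(SAMPLE)
```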

storage: keggtools.storage.Storage = None#
class keggtools.Storage(cachedir=None)#

Storage handler class.

Parameters:

cachedir (str | None)

build_cache_path(filename)#

Build absolute filename for caching directory.

Parameters:

filename (str) – Name of file (is used as suffix to cache directory).

Returns:

Full filename, which is inside the cache folder.

Return type:

str

cachedir = None#
check_cache_dir()#

Checks if the cache dir exists. Raises “NotADirectoryError” if the caching folder is not found.

Raises:

NotADirectoryError – Error if cache folder does not exist.

Return type:

None

exist(filename)#

Check if filename exists in caching dir.

Parameters:

filename (str) – Filename to check.

Returns:

Returns True if a file with the given name exists in cachedir.

Return type:

bool

load(filename)#

Load string from file.

Parameters:

filename (str) – Filename of file to load from cache folder.

Returns:

File content string.

Return type:

str

load_dump(filename)#

Load binary dump from file.

Parameters:

filename (str) – Filename of file to load from cache folder.

Returns:

Object from file.

Return type:

Any

save(filename, data)#

Save string as file in local storage. Returns absolute filename of the saved file.

Parameters:
  • filename (str) – Filename to store the file at.

  • data (str) – String data to save to cache file.

Returns:

Full filename to cached file.

Return type:

str

save_dump(filename, data)#

Save binary dump as file in local storage. Returns absolute filename of the saved file.

Parameters:
  • filename (str) – Filename to store the file at.

  • data (Any) – Data to store to cache file. Can be any object.

Returns:

Full filename to cached file.

Return type:

str
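The storage methods above amount to a small file cache. The following sketch mimics that interface with the standard library. `FileCache` is a hypothetical class, not the library's Storage, and using pickle for the binary dump is an assumption: the documentation only says that any object can be stored.

```python
import os
import pickle
import tempfile
from typing import Any


class FileCache:
    """Hypothetical file cache with the same method names as described above."""

    def __init__(self, cachedir: str) -> None:
        self.cachedir = cachedir

    def build_cache_path(self, filename: str) -> str:
        return os.path.join(self.cachedir, filename)

    def exist(self, filename: str) -> bool:
        return os.path.isfile(self.build_cache_path(filename))

    def save(self, filename: str, data: str) -> str:
        path = self.build_cache_path(filename)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(data)
        return path

    def load(self, filename: str) -> str:
        with open(self.build_cache_path(filename), "r", encoding="utf-8") as handle:
            return handle.read()

    def save_dump(self, filename: str, data: Any) -> str:
        path = self.build_cache_path(filename)
        with open(path, "wb") as handle:
            pickle.dump(data, handle)  # assumed serialization format
        return path

    def load_dump(self, filename: str) -> Any:
        with open(self.build_cache_path(filename), "rb") as handle:
            return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    cache = FileCache(tmp)
    saved_path = cache.save("organisms.txt", "hsa\tHomo sapiens")
    text_roundtrip = cache.load("organisms.txt")
    cache.save_dump("pathways.dump", {"hsa04010": "MAPK signaling pathway"})
    dump_roundtrip = cache.load_dump("pathways.dump")
    absent = cache.exist("absent.txt")
```

Pickle files should only be loaded from a cache directory you trust, because unpickling can execute arbitrary code.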

class keggtools.Subtype(/, **data)#

Bases: pydantic_xml.BaseXmlModel

Subtype model class.

Parameters:

data (Any)

name: keggtools._types.RelationSubtypeAlias#
value: str#
keggtools.XENOBIOTICS_BIODEGRADATION_AND_METABOLISM: dict[str, str]#
keggtools.__version__: str = '1.0.4'#
keggtools.msig_to_kegg_id()#

Load dataframe to map canonical pathway id of MSigDB to KEGG pathway id.

Returns:

Dataframe containing MSigDB id and KEGG pathway id.

Return type:

pandas.DataFrame

Raises:

AssertionError – If mapping file does not exist.

keggtools.plot_enrichment_result(enrichment, ax=None, figsize=(7, 7), cmap='coolwarm', min_study_count=1, max_pval=None, use_percent_study_count=True)#

Plot enrichment results.

Return type:

matplotlib.axes.Axes