keggtools.utils

Basic utils for HTTP requests, parsing and rendering.

Classes

ColorGradient

Create color gradient.

Functions

is_valid_gene_name(value)

Check if gene identifier is valid. String must match "<org>:<number>".

is_valid_hex_color(value)

Check if string is a valid hex color.

is_valid_pathway_name(value)

Check if combined pathway identifier is valid. String must match "path:<org><number>".

is_valid_pathway_number(value)

Check if pathway number has correct 5 digit format.

is_valid_pathway_org(value)

Check if organism identifier is valid.

merge_entrez_geneid(diffexp[, gene_column, ...])

Use pybiomart to merge entrez gene id to differential expression dataframe.

msig_to_kegg_id()

Load dataframe to map canonical pathway id of MSigDB to KEGG pathway id.

parse_tsv(data)

Parse .tsv file from string.

parse_tsv_to_dict(data[, col_keys, col_values])

Parse .tsv file from string and build dict from first two columns. Other columns are ignored.

parse_xml(xml_object_or_string)

Returns XML Element object from string or XML Element.

Module Contents

class keggtools.utils.ColorGradient(start, stop, steps=100)

Create color gradient.

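Parameters:
  • start (tuple) – RGB color tuple at the start of the gradient.

  • stop (tuple) – RGB color tuple at the end of the gradient.

  • steps (int) – Number of steps in the gradient. Defaults to 100.
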
get_list()

Get gradient color as list.

Returns:

Returns list of hexadecimal color strings with a gradient.

Return type:

List[str]

start: tuple
steps: int = 100
stop: tuple
static to_css(color)

Convert color tuple to CSS rgb color string.

Parameters:

color (tuple) – RGB color tuple containing 3 integers.

Returns:

Color as CSS string (e.g. “rgb(0, 0, 0)”).

Return type:

str

static to_hex(color)

Convert color tuple to hex color string.

Parameters:

color (tuple) – RGB color tuple containing 3 integers.

Returns:

Hexadecimal color string (e.g. “#000000”).

Return type:

str
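
The conversions described above can be illustrated with a minimal, standalone sketch. It does not use keggtools itself: the function names below are hypothetical, and the linear per-channel interpolation in `linear_gradient` is an assumption about how a gradient might be built, not the library's confirmed implementation.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_hex(color: RGB) -> str:
    """Format an RGB tuple as a '#rrggbb' string."""
    return "#{:02x}{:02x}{:02x}".format(*color)


def to_css(color: RGB) -> str:
    """Format an RGB tuple as a CSS 'rgb(r, g, b)' string."""
    return "rgb({}, {}, {})".format(*color)


def linear_gradient(start: RGB, stop: RGB, steps: int = 100) -> List[str]:
    """Hypothetical helper: interpolate each channel linearly (assumption)."""
    if steps < 2:
        return [to_hex(start)]
    colors = []
    for i in range(steps):
        t = i / (steps - 1)  # position in the gradient, from 0.0 to 1.0
        channels = tuple(round(a + (b - a) * t) for a, b in zip(start, stop))
        colors.append(to_hex(channels))
    return colors


print(to_hex((0, 0, 0)))
print(to_css((0, 0, 0)))
print(linear_gradient((0, 0, 0), (200, 100, 0), steps=3))
```

With the real class, the equivalent calls would presumably be `ColorGradient.to_hex(color)`, `ColorGradient.to_css(color)` and `ColorGradient(start, stop, steps).get_list()`, following the signatures listed above.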

keggtools.utils.is_valid_gene_name(value)

Check if gene identifier is valid. String must match “<org>:<number>”.

Parameters:

value (str) – String value to check.

Returns:

Returns True if value matches format of gene name.

Return type:

bool

keggtools.utils.is_valid_hex_color(value)

Check if string is a valid hex color.

Parameters:

value (str) – String value to check.

Returns:

Returns True if value is valid hexadecimal color string.

Return type:

bool
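
As a rough illustration of this kind of check, the sketch below validates a string against a six-digit hexadecimal pattern. The pattern and the name `looks_like_hex_color` are assumptions made for the example; the exact rule used by `is_valid_hex_color` (for instance, whether three-digit shorthand is accepted) is not stated here.

```python
import re

# Assumed pattern: '#' followed by exactly six hexadecimal digits.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def looks_like_hex_color(value: str) -> bool:
    """Return True if the whole string matches the assumed pattern."""
    return HEX_COLOR.fullmatch(value) is not None


for candidate in ("#000000", "#ff00aa", "000000", "#fff"):
    print(candidate, looks_like_hex_color(candidate))
```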

keggtools.utils.is_valid_pathway_name(value)

Check if combined pathway identifier is valid. String must match “path:<org><number>”.

Parameters:

value (str) – String value to check.

Returns:

Returns True if value matches format of pathway name.

Return type:

bool

keggtools.utils.is_valid_pathway_number(value)

Check if pathway number has correct 5 digit format.

Parameters:

value (str) – String value to check.

Returns:

Returns True if value has the correct format of pathway number.

Return type:

bool

keggtools.utils.is_valid_pathway_org(value)

Check if organism identifier is valid.

Parameters:

value (str) – String value to check.

Returns:

Returns True if value is a valid organism code.

Return type:

bool
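
The identifier formats documented above ("<org>:<number>", "path:<org><number>", and the 5-digit pathway number) can be sketched with regular expressions. This is an illustration only: the organism-code pattern (three or four lowercase letters) and all function names are assumptions, and the patterns used by keggtools may differ.

```python
import re

ORG = r"[a-z]{3,4}"   # assumption: organism codes such as "hsa" or "mmu"
NUMBER = r"[0-9]{5}"  # pathway numbers have exactly five digits


def looks_like_pathway_org(value: str) -> bool:
    return re.fullmatch(ORG, value) is not None


def looks_like_pathway_number(value: str) -> bool:
    return re.fullmatch(NUMBER, value) is not None


def looks_like_pathway_name(value: str) -> bool:
    # Combined identifier: "path:<org><number>"
    return re.fullmatch(rf"path:{ORG}{NUMBER}", value) is not None


def looks_like_gene_name(value: str) -> bool:
    # Gene identifier: "<org>:<number>"
    return re.fullmatch(rf"{ORG}:[0-9]+", value) is not None


print(looks_like_pathway_name("path:hsa04060"))
print(looks_like_pathway_number("4060"))
print(looks_like_gene_name("hsa:1234"))
```

Using `re.fullmatch` rather than `re.match` ensures that trailing characters cause the check to fail.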

keggtools.utils.merge_entrez_geneid(diffexp, gene_column='names', dataset_name='hsapiens_gene_ensembl', symbol_source='hgnc_symbol', entrez_source='entrezgene_id', use_cache=True)

Use pybiomart to merge entrez gene id to differential expression dataframe.

Parameters:
  • diffexp (pandas.DataFrame) – Pandas dataframe containing the differential expression data.

  • gene_column (str) – Name of the column in the differential expression dataframe that contains the gene symbol.

  • dataset_name (str) – Biomart dataset to use for conversion.

  • symbol_source (str) – Biomart source dataset for gene symbol.

  • entrez_source (str) – Biomart source dataset for entrez id.

  • use_cache (bool) – Use cache for pybiomart requests. Defaults to True.

Returns:

Returns differential expression dataframe with merged column for entrez id.

Return type:

pandas.DataFrame

keggtools.utils.msig_to_kegg_id()

Load dataframe to map canonical pathway id of MSigDB to KEGG pathway id.

Returns:

Dataframe containing MSigDB id and KEGG pathway id.

Return type:

pandas.DataFrame

Raises:

AssertionError – If mapping file does not exist.

keggtools.utils.parse_tsv(data)

Parse .tsv file from string.

Parameters:

data (str) – TSV string to parse into a list.

Returns:

List of items.

Return type:

list

keggtools.utils.parse_tsv_to_dict(data, col_keys=0, col_values=1)

Parse .tsv file from string and build dict from first two columns. Other columns are ignored.

Parameters:
  • data (str) – TSV string to parse.

  • col_keys (int) – Index of the column to use as dict keys (0-indexed).

  • col_values (int) – Index of the column to use as dict values (0-indexed).

Returns:

Dict of two tsv columns.

Return type:

Dict[str, str]
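
The column selection can be sketched as follows. This standalone example mirrors the documented parameters `col_keys` and `col_values`, but the function name `tsv_to_dict` and the handling of blank lines are assumptions rather than the library's behavior.

```python
from typing import Dict


def tsv_to_dict(data: str, col_keys: int = 0, col_values: int = 1) -> Dict[str, str]:
    """Build a dict from two columns of a TSV string; other columns are ignored."""
    result: Dict[str, str] = {}
    for line in data.splitlines():
        if not line.strip():
            continue  # assumption: blank lines are skipped
        fields = line.split("\t")
        result[fields[col_keys]] = fields[col_values]
    return result


text = "a\t1\tx\nb\t2\ty"
print(tsv_to_dict(text))
print(tsv_to_dict(text, col_keys=2, col_values=0))
```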

keggtools.utils.parse_xml(xml_object_or_string)

Returns XML Element object from string or XML Element.

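Parameters:

xml_object_or_string (str or xml.etree.ElementTree.Element) – XML string to parse or existing XML Element.
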
Returns:

XML element instance.

Return type:

xml.etree.ElementTree.Element
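
Accepting either a string or an already parsed element can be sketched with the standard library as below. The name `as_element` is hypothetical, and passing an existing element through unchanged is an assumption about the behavior described above.

```python
import xml.etree.ElementTree as ET
from typing import Union


def as_element(xml_object_or_string: Union[str, ET.Element]) -> ET.Element:
    """Return an Element, parsing the input only if it is a string."""
    if isinstance(xml_object_or_string, ET.Element):
        return xml_object_or_string  # assumption: elements are returned as-is
    return ET.fromstring(xml_object_or_string)


root = as_element('<pathway name="path:hsa04060"><entry id="1"/></pathway>')
print(root.tag, root.attrib["name"])
```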