Using custom resolvers to provide credentials in catalog.yml

I am trying to use custom resolvers to provide credentials in catalog.yml

document_classification:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}


CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "custom_resolvers": {
        "oc.env": oc.env
    }
}

Is this the right way to do it?


CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "custom_resolvers" : {
        "oc.env" : oc.env
    }
} 
This part needs to go into settings.py.

Also, don't forget the import:

from omegaconf.resolvers import oc

Yeah, that's done.

I have also created a .env file:

BACKEND=""
HOST=""
PORT=""
DATABASE=""
USER="vishalp"
PASSWORD=""
I am using python-dotenv to load these env variables. But when I print the USER env variable on the console, it is skipping the last character for some reason, which is very weird.

[09/17/24 14:58:58] INFO Using 'conf/logging.yml' as logging configuration. You can change this by setting the KEDRO_LOGGING_CONFIG environment variable accordingly. __init__.py:249
INFO .env file loaded successfully env_loader.py:13
DEBUG BACKEND : postgres env_loader.py:29
DEBUG HOST : ******* env_loader.py:29
DEBUG PORT : **** env_loader.py:29
DEBUG DATABASE : ****** env_loader.py:29
DEBUG USER : vishal env_loader.py:29
DEBUG PASSWORD : ***** env_loader.py:29
INFO All Env Variables loaded Successfully

Just check the DEBUG message for USER: you will find the trailing "p" is skipped.

Please ignore the above; it is resolved. For some reason, dotenv was not loading the updated variables.

Is there a better way to declare all these items in the catalog? There is too much redundancy:


document_classification:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

case_master:
  type: ibis.TableDataset
  table_name: case_master
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

user_master:
  type: ibis.TableDataset
  table_name: user_master
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

You probably don't even need a dataset factory yet; use variable interpolation (basically a template value):

https://docs.kedro.org/en/stable/configuration/advanced_configuration.html

I was just trying a code snippet from the official Kedro docs, shown below, but it looks like the catalog is not resolved properly when we use the oc.env resolver:


from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog
from pathlib import Path

project_root = "/home/vishal/Documents/workspace/mlops/data-pipelines/"
conf_path = str(Path(project_root) / settings.CONF_SOURCE)

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_loader = OmegaConfigLoader(
    conf_source=conf_path, base_env="base", default_run_env="local"
)

# These lines show how to access the catalog and credentials configurations.
conf_catalog = conf_loader["catalog"]
conf_credentials = conf_loader["credentials"]

# # Fetch the catalog with resolved credentials from the configuration.
# catalog = DataCatalog.from_config(catalog=conf_catalog, credentials=conf_credentials)

Error

in <module>:15                                                                                   │
│                                                                                                  │
│   12 )                                                                                           │
│   13                                                                                             │
│   14 # These lines show how to access the catalog and credentials configurations.                │
│ ❱ 15 conf_catalog = conf_loader["catalog"]                                                       │
│   16 conf_credentials = conf_loader["credentials"]                                               │
│   17                                                                                             │
│   18 # # Fetch the catalog with resolved credentials from the configuration.                     │
│                                                                                                  │

UnsupportedInterpolationType: Unsupported interpolation type oc.env
    full_key: document_classification.connection.backend
    object_type=dict

The section explains this in detail, but in short you need to turn on this setting, because oc.env is enabled by default for credentials only.

For context, this is a bit of legacy: Kedro introduced credentials years ago, and resolvers came later. In the future we are thinking of introducing a credentials resolver.

What does it mean that oc.env is enabled for credentials only? Can you explain this a bit more?

OmegaConf also comes with some built-in resolvers that you can use with the OmegaConfigLoader in Kedro. All built-in resolvers except for oc.env are enabled by default. oc.env is only turned on for loading credentials. You can, however, turn this on for all configurations through your project’s src/<package_name>/settings.py in a similar way:

Hi, team. I have a similar error, which I get when running the kedro viz command:

error on omegaconf/base.py

custom_resolver
    raise UnsupportedInterpolationType(
omegaconf.errors.UnsupportedInterpolationType: Unsupported interpolation type path
    full_key: _path
    object_type=dict

I'm not really sure if that's related to it, but I have:

  • In conf/base/catalog_globals.yml:
      _path: ${path:}/${run_folder:}/

  • In conf/base/environ.yml:
      local:
        path: data/runs/

Could someone help troubleshoot this?

is this related to the thread? I am a bit confused.

Sorry for the confusion... actually, the error is the same (UnsupportedInterpolationType), but it's coming from running another command:
kedro viz

do you get the same error when you just run any kedro command?

no..I only get this error when running kedro viz

can you try _path: ${path}/${run_folder}/ ? the ${path:} is telling OmegaConf that there's a resolver called path, hence the UnsupportedInterpolationType

Maybe you are trying to use variable interpolation (template value)? Can you give an example what is the expected value? is environ.yml a catalog?

Hello, Juan and Nok. Thank you for your reply.
I tried replacing _path: ${path:}/${run_folder:}/ with _path: ${path}/${run_folder}/ and it still showed the error above.
environ.yml is in conf/base and contains the following:

# Pass the path to your Databricks volume here.
databricks:
  path: /Volumes/prod_us_cpibaws_5edb792/${env:CATALOG_ENV,'default'}/temp
local:
  path: data/runs/

Hi, could you paste the full traceback after you changed the _path definition?

Hi, Juan. Thanks again for your reply, and apologies for the delay.

Here is the full traceback:

Starting Kedro Viz ...
Process Process-1:
Traceback (most recent call last):
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/server.py", line 112, in run_server
    load_and_populate_data(path, env, include_hooks, extra_params, pipeline_name)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/server.py", line 62, in load_and_populate_data
    catalog, pipelines, session_store, stats_dict = kedro_data_loader.load_data(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 105, in load_data
    catalog = context.catalog
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/framework/context/context.py", line 187, in catalog
    return self._get_catalog()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/framework/context/context.py", line 223, in _get_catalog
    conf_catalog = self.config_loader["catalog"]
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/config/omegaconf_config.py", line 199, in __getitem__
    base_config = self.load_and_merge_dir_config(  # type: ignore[no-untyped-call]
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/config/omegaconf_config.py", line 339, in load_and_merge_dir_config
    for k, v in OmegaConf.to_container(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/omegaconf.py", line 573, in to_container
    return BaseContainer._to_content(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 292, in _to_content
    value = get_node_value(key)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 244, in get_node_value
    conf._format_and_raise(key=key, value=None, cause=e)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 242, in get_node_value
    node = node._dereference_node()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 246, in _dereference_node
    node = self._dereference_node_impl(throw_on_resolution_failure=True)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 277, in _dereference_node_impl
    return parent._resolve_interpolation_from_parse_tree(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 584, in _resolve_interpolation_from_parse_tree
    resolved = self.resolve_parse_tree(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 765, in resolve_parse_tree
    return visitor.visit(parse_tree)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 206, in accept
    return visitor.visitConfigValue(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 101, in visitConfigValue
    return self.visit(ctx.getChild(0))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 342, in accept
    return visitor.visitText(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 301, in visitText
    return self._unescape(list(ctx.getChildren()))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 389, in _unescape
    text = str(self.visitInterpolation(node))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 125, in visitInterpolation
    return self.visit(ctx.getChild(0))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 921, in accept
    return visitor.visitInterpolationNode(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 158, in visitInterpolationNode
    return self.node_interpolation_callback(inter_key, self.memo)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 746, in node_interpolation_callback
    return self._resolve_node_interpolation(inter_key=inter_key, memo=memo)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 676, in _resolve_node_interpolation
    raise InterpolationKeyError(f"Interpolation key '{inter_key}' not found")
omegaconf.errors.InterpolationKeyError: Interpolation key 'path' not found
    full_key: _path
    object_type=dict

If you look at the traceback, Kedro-Viz tried to access the catalog (catalog = context.catalog), and that's when the error is thrown.

Can you confirm this runs fine when you do kedro run? This helps us narrow down the scope of the issue, as kedro-viz mostly just gets this data from kedro; if kedro run works, we normally don't expect issues on the kedro-viz side.

Hi. Thanks again for your reply.
It works fine with kedro run. We are running kedro run -p <pipeline_name> --env <environment_name> successfully.

I'm not sure if it helps, but our settings.py is the following:

"""Project settings. There is no need to edit this file unless you want to change values
from the Kedro defaults. For further information, including these default values, see
https://docs.kedro.org/en/stable/kedro_project_setup/settings.html."""

# from kedro_mlflow.framework.hooks import MlflowHook

# Instantiated project hooks.
from cpib_models.hooks import ConfEnvironHooks, MLFlowRunHook, SparkHooks  # noqa: E402

from .settings_utils.resolvers import set_resolvers

# Hooks are executed in a Last-In-First-Out (LIFO) order.
HOOKS = (SparkHooks(), ConfEnvironHooks(), MLFlowRunHook())

# Installed plugins for which to disable hook auto-registration.
# DISABLE_HOOKS_FOR_PLUGINS = ("kedro-viz",)

from pathlib import Path  # noqa: E402

from kedro_viz.integrations.kedro.sqlite_store import SQLiteStore  # noqa: E402

# Class that manages storing KedroSession data.

SESSION_STORE_CLASS = SQLiteStore
# Keyword arguments to pass to the `SESSION_STORE_CLASS` constructor.
SESSION_STORE_ARGS = {"path": str(Path(__file__).parents[2])}

# Directory that holds configuration.
# CONF_SOURCE = "conf"

# Class that manages how configuration is loaded.
from kedro.config import OmegaConfigLoader  # noqa: E402

CONFIG_LOADER_CLASS = OmegaConfigLoader
# Keyword arguments to pass to the `CONFIG_LOADER_CLASS` constructor.
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "config_patterns": {
        "spark": ["spark*", "spark*/**"],
    },
}

set_resolvers()

resolvers.py is the following:

import os

import mlflow
from omegaconf import OmegaConf


def get_model_id():
    """
    Set run_folder
    """

    model_id = os.getenv("MODEL_ID")

    return model_id


def get_run_id(tracking_uri="databricks"):
    """
    Logic to get the run_id from running environment
    """
    run_id = os.getenv("RUN_ID")
    if not run_id:
        model_id = get_model_id()
        if model_id:
            if os.getenv("MODEL_VERSION"):
                run_id = (
                    mlflow.MlflowClient(tracking_uri=tracking_uri)
                    .get_model_version(
                        name=model_id, version=os.getenv("MODEL_VERSION")
                    )
                    .run_id
                )
            else:
                run_id = (
                    mlflow.MlflowClient(tracking_uri=tracking_uri)
                    .get_latest_versions(
                        name=model_id, stages=[os.getenv("STAGE", "Production")]
                    )[0]
                    .run_id
                )

    return run_id


def get_run_folder(tracking_uri="databricks"):
    """
    Set run_folder
    """
    from uuid import uuid4

    run_folder = None
    run_id = get_run_id()
    active_run = mlflow.active_run()

    if run_id:
        run_folder = (
            mlflow.MlflowClient(tracking_uri=tracking_uri)
            .get_run(run_id)
            .data.params.get("run_folder")
        )
    elif active_run:
        run_folder = active_run.data.params.get("run_folder")
    if run_folder is None:
        run_folder = os.getenv("RUN_FOLDER")
    if run_folder is None:
        run_folder = str(uuid4())

    return run_folder


def shift_current_month(months: int) -> str:
    """Generate string for the current month shifted 'months' months

    Args:
        months (int): Number of months to go backwards

    Returns:
        str: Shifted date
    """
    from datetime import datetime

    from dateutil.relativedelta import relativedelta

    date = (datetime.now().replace(day=1) - relativedelta(months=months)).strftime(
        "%Y-%m-%d"
    )
    return date


def get_packege_version(pkg="") -> str:
    """Get package version"""
    from pip._internal.commands.show import search_packages_info

    version = next(search_packages_info([pkg])).version
    return version


# %%
def set_resolvers():
    """
    Set the resolvers for OmegaConf
    """
    if not OmegaConf.has_resolver("env"):
        OmegaConf.register_new_resolver(
            "env",
            lambda key, default=None: os.getenv(key, default),
        )
    if not OmegaConf.has_resolver("shift_current_month"):
        OmegaConf.register_new_resolver(
            "shift_current_month",
            shift_current_month,
        )
    if not OmegaConf.has_resolver("package_version"):
        OmegaConf.register_new_resolver(
            "package_version",
            get_packege_version,
        )

Thank you, we will have to investigate this further. I will create a GitHub issue for this and we can look at it soon.

Was an issue ever opened?

Hey, I created an issue now: https://github.com/kedro-org/kedro-viz/issues/2142. Thanks for checking in.

We will look at it this week; we might need to jump on a quick call if we are not able to reproduce the error on our end. Will reach out 🙂
