I am trying to use custom resolvers to provide credentials in catalog.yml
document_classification:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "custom_resolvers": {
        "oc.env": oc.env
    }
}
Is this the right way to do it?
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "custom_resolvers": {
        "oc.env": oc.env
    }
}
This part needs to go into settings.py
Yeah, that's done.
I have also created a .env file:
BACKEND=""
HOST=""
PORT=""
DATABASE=""
USER="vishalp"
PASSWORD=""
I am using python-dotenv to load these env variables. But when I print the USER env variable on the console, it is missing the last character for some reason, which is very weird.
[09/17/24 14:58:58] INFO Using 'conf/logging.yml' as logging configuration. You can change this by setting the KEDRO_LOGGING_CONFIG environment variable accordingly. __init__.py:249
INFO .env file loaded successfully env_loader.py:13
DEBUG BACKEND : postgres env_loader.py:29
DEBUG HOST : ******* env_loader.py:29
DEBUG PORT : **** env_loader.py:29
DEBUG DATABASE : ****** env_loader.py:29
DEBUG USER : vishal env_loader.py:29
DEBUG PASSWORD : ***** env_loader.py:29
INFO All Env Variables loaded Successfully
Please ignore the above. It is resolved. For some reason, dotenv was not loading the updated variables.
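The thread does not say why dotenv ignored the updated values. One possible explanation (an assumption, not confirmed above) is that python-dotenv's load_dotenv leaves variables that already exist in the environment untouched unless override=True is passed, and USER is commonly exported by the shell already. The sketch below mimics that behaviour with the standard library only; apply_env_lines is a hypothetical helper, not part of python-dotenv.

```python
import os


def apply_env_lines(lines, override=False):
    """Hypothetical stand-in for a .env loader: apply KEY="value" lines to os.environ."""
    for line in lines:
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        # Without override, a variable that already exists is left untouched.
        if override or key not in os.environ:
            os.environ[key] = value


# Stand-in for a USER value already exported by the shell (assumed for illustration).
os.environ["USER"] = "vishal"

apply_env_lines(['USER="vishalp"'])
without_override = os.environ["USER"]

apply_env_lines(['USER="vishalp"'], override=True)
with_override = os.environ["USER"]

print(without_override, with_override)  # vishal vishalp
```

If this is indeed the cause, calling load_dotenv(override=True) or choosing a variable name that the shell does not already set (for example DB_USER, a hypothetical name) would avoid the clash.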
Is there a better way to declare all these items in the catalog? There is too much redundancy:
document_classification:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

case_master:
  type: ibis.TableDataset
  table_name: case_master
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

user_master:
  type: ibis.TableDataset
  table_name: user_master
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}
Yes! You're now looking for dataset factories
https://docs.kedro.org/en/stable/data/kedro_dataset_factories.html
You probably don't even need a factory yet; use interpolation (basically a template value)
https://docs.kedro.org/en/stable/configuration/advanced_configuration.html
I was just trying a code snippet from the official Kedro docs, shown below, but it looks like the catalog is not resolved properly when we use the oc.env resolver:
from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog
from pathlib import Path

project_root = "/home/vishal/Documents/workspace/mlops/data-pipelines/"
conf_path = str(Path(project_root) / settings.CONF_SOURCE)

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_loader = OmegaConfigLoader(
    conf_source=conf_path, base_env="base", default_run_env="local"
)

# These lines show how to access the catalog and credentials configurations.
conf_catalog = conf_loader["catalog"]
conf_credentials = conf_loader["credentials"]

# # Fetch the catalog with resolved credentials from the configuration.
# catalog = DataCatalog.from_config(catalog=conf_catalog, credentials=conf_credentials)
Error
in <module>:15

   12 )
   13
   14 # These lines show how to access the catalog and credentials configurations.
❱  15 conf_catalog = conf_loader["catalog"]
   16 conf_credentials = conf_loader["credentials"]
   17
   18 # # Fetch the catalog with resolved credentials from the configuration.

UnsupportedInterpolationType: Unsupported interpolation type oc.env
    full_key: document_classification.connection.backend
    object_type=dict
The section explains this in detail, but in short you need to turn this setting on because oc.env is enabled only for credentials by default.
For context, this is a bit of legacy: Kedro introduced credentials years ago and resolvers came later. In the future we are thinking of introducing a credentials resolver.
What does it mean that oc.env is enabled for credentials only? Can you explain this a bit more?
OmegaConf also comes with some built-in resolvers that you can use with the OmegaConfigLoader in Kedro. All built-in resolvers except for oc.env are enabled by default. oc.env is only turned on for loading credentials. You can, however, turn this on for all configurations through your project's src/<package_name>/settings.py in a similar way:
Hi, team. I have a similar error, obtained when running kedro viz command:
error on omegaconf/base.py
custom_resolver
    raise UnsupportedInterpolationType(
omegaconf.errors.UnsupportedInterpolationType: Unsupported interpolation type path
    full_key: _path
    object_type=dict
_path: ${path:}/${run_folder:}/
local:
  path: data/runs/
Sorry for the confusion... actually, the error is the same (UnsupportedInterpolationType), but it's coming from running another command: kedro viz
Can you try _path: ${path}/${run_folder}/? The ${path:} is telling OmegaConf that there's a resolver called path, hence the UnsupportedInterpolationType.
Maybe you are trying to use variable interpolation (template value)? Can you give an example of the expected value? Is environ.yml a catalog?
Hello, Juan and Nok. Thank you for your reply.
I tried replacing _path: ${path:}/${run_folder:}/ with _path: ${path}/${run_folder}/ and it still showed the error above.
environ.yml is contained in conf/base and has the following:
# Pass the path to your Databricks volume here.
databricks:
  path: /Volumes/prod_us_cpibaws_5edb792/${env:CATALOG_ENV,'default'}/temp
local:
  path: data/runs/
Hi, could you paste the full traceback after you changed the _path definition?
Hi, Juan. Thanks again for your reply, and apologies for the delay.
Here is the full traceback:
Starting Kedro Viz ...
Process Process-1:
Traceback (most recent call last):
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/server.py", line 112, in run_server
    load_and_populate_data(path, env, include_hooks, extra_params, pipeline_name)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/server.py", line 62, in load_and_populate_data
    catalog, pipelines, session_store, stats_dict = kedro_data_loader.load_data(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 105, in load_data
    catalog = context.catalog
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/framework/context/context.py", line 187, in catalog
    return self._get_catalog()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/framework/context/context.py", line 223, in _get_catalog
    conf_catalog = self.config_loader["catalog"]
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/config/omegaconf_config.py", line 199, in __getitem__
    base_config = self.load_and_merge_dir_config(  # type: ignore[no-untyped-call]
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/config/omegaconf_config.py", line 339, in load_and_merge_dir_config
    for k, v in OmegaConf.to_container(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/omegaconf.py", line 573, in to_container
    return BaseContainer._to_content(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 292, in _to_content
    value = get_node_value(key)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 244, in get_node_value
    conf._format_and_raise(key=key, value=None, cause=e)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 242, in get_node_value
    node = node._dereference_node()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 246, in _dereference_node
    node = self._dereference_node_impl(throw_on_resolution_failure=True)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 277, in _dereference_node_impl
    return parent._resolve_interpolation_from_parse_tree(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 584, in _resolve_interpolation_from_parse_tree
    resolved = self.resolve_parse_tree(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 765, in resolve_parse_tree
    return visitor.visit(parse_tree)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 206, in accept
    return visitor.visitConfigValue(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 101, in visitConfigValue
    return self.visit(ctx.getChild(0))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 342, in accept
    return visitor.visitText(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 301, in visitText
    return self._unescape(list(ctx.getChildren()))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 389, in _unescape
    text = str(self.visitInterpolation(node))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 125, in visitInterpolation
    return self.visit(ctx.getChild(0))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 921, in accept
    return visitor.visitInterpolationNode(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 158, in visitInterpolationNode
    return self.node_interpolation_callback(inter_key, self.memo)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 746, in node_interpolation_callback
    return self._resolve_node_interpolation(inter_key=inter_key, memo=memo)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 676, in _resolve_node_interpolation
    raise InterpolationKeyError(f"Interpolation key '{inter_key}' not found")
omegaconf.errors.InterpolationKeyError: Interpolation key 'path' not found
    full_key: _path
    object_type=dict
If you look at the traceback, Kedro-viz tried to access catalog = context.catalog and that's when the error is thrown.
Can you confirm this runs fine when you do kedro run? This helps us narrow down the scope of the issue, as kedro-viz mostly just gets this data from kedro. If kedro run works, we normally don't expect an issue on the kedro-viz side.
Hi. Thanks again for your reply.
It works fine with kedro run. We are running kedro run -p <pipeline_name> --env <environment_name> successfully.
I'm not sure if it helps, but our settings.py is the following:
"""Project settings. There is no need to edit this file unless you want to change values
from the Kedro defaults. For further information, including these default values, see
https://docs.kedro.org/en/stable/kedro_project_setup/settings.html."""

# from kedro_mlflow.framework.hooks import MlflowHook

# Instantiated project hooks.
from cpib_models.hooks import ConfEnvironHooks, MLFlowRunHook, SparkHooks  # noqa: E402

from .settings_utils.resolvers import set_resolvers

# Hooks are executed in a Last-In-First-Out (LIFO) order.
HOOKS = (SparkHooks(), ConfEnvironHooks(), MLFlowRunHook())

# Installed plugins for which to disable hook auto-registration.
# DISABLE_HOOKS_FOR_PLUGINS = ("kedro-viz",)

from pathlib import Path  # noqa: E402

from kedro_viz.integrations.kedro.sqlite_store import SQLiteStore  # noqa: E402

# Class that manages storing KedroSession data.
SESSION_STORE_CLASS = SQLiteStore
# Keyword arguments to pass to the `SESSION_STORE_CLASS` constructor.
SESSION_STORE_ARGS = {"path": str(Path(__file__).parents[2])}

# Directory that holds configuration.
# CONF_SOURCE = "conf"

# Class that manages how configuration is loaded.
from kedro.config import OmegaConfigLoader  # noqa: E402

CONFIG_LOADER_CLASS = OmegaConfigLoader
# Keyword arguments to pass to the `CONFIG_LOADER_CLASS` constructor.
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "config_patterns": {
        "spark": ["spark*", "spark*/**"],
    },
}

set_resolvers()
resolvers.py is the following:
import os

import mlflow
from omegaconf import OmegaConf


def get_model_id():
    """
    Set run_folder
    """
    model_id = os.getenv("MODEL_ID")
    return model_id


def get_run_id(tracking_uri="databricks"):
    """
    Logic to get the run_id from running environment
    """
    run_id = os.getenv("RUN_ID")
    if not run_id:
        model_id = get_model_id()
        if model_id:
            if os.getenv("MODEL_VERSION"):
                run_id = (
                    mlflow.MlflowClient(tracking_uri=tracking_uri)
                    .get_model_version(
                        name=model_id, version=os.getenv("MODEL_VERSION")
                    )
                    .run_id
                )
            else:
                run_id = (
                    mlflow.MlflowClient(tracking_uri=tracking_uri)
                    .get_latest_versions(
                        name=model_id, stages=[os.getenv("STAGE", "Production")]
                    )[0]
                    .run_id
                )
    return run_id


def get_run_folder(tracking_uri="databricks"):
    """
    Set run_folder
    """
    from uuid import uuid4

    run_folder = None
    run_id = get_run_id()
    active_run = mlflow.active_run()
    if run_id:
        run_folder = (
            mlflow.MlflowClient(tracking_uri=tracking_uri)
            .get_run(run_id)
            .data.params.get("run_folder")
        )
    elif active_run:
        run_folder = active_run.data.params.get("run_folder")
    if run_folder is None:
        run_folder = os.getenv("RUN_FOLDER")
    if run_folder is None:
        run_folder = str(uuid4())
    return run_folder


def shift_current_month(months: int) -> str:
    """Generate string for the current month shifted 'monhts' month

    Args:
        months (int): Number of months to go backwards

    Returns:
        str: Shifted date
    """
    from datetime import datetime

    from dateutil.relativedelta import relativedelta

    date = (datetime.now().replace(day=1) - relativedelta(months=months)).strftime(
        "%Y-%m-%d"
    )
    return date


def get_packege_version(pkg="") -> str:
    """Get package version"""
    from pip._internal.commands.show import search_packages_info

    version = next(search_packages_info([pkg])).version
    return version


# %%
def set_resolvers():
    """
    Set the resolvers for OmegaConf
    """
    if not OmegaConf.has_resolver("env"):
        OmegaConf.register_new_resolver(
            "env",
            lambda key, default=None: os.getenv(key, default),
        )
    if not OmegaConf.has_resolver("shift_current_month"):
        OmegaConf.register_new_resolver(
            "shift_current_month",
            shift_current_month,
        )
    if not OmegaConf.has_resolver("package_version"):
        OmegaConf.register_new_resolver(
            "package_version",
            get_packege_version,
        )
Thank you, we will have to investigate this further. I will create a GitHub issue for this and we can look at it soon.
Hey, I created an issue now: https://github.com/kedro-org/kedro-viz/issues/2142. Thanks for checking in.
We will look at it this week; we might need to jump on a quick call if we are not able to reproduce the error at our end. Will reach out 🙂
Giving an update: the issue was solved when running kedro viz run --include-hooks
Thank you, everyone 🙂 🙌