Passing credentials to a node in kedro

Hey guys,

Do we know how to pass credentials to a node in Kedro? Are they only meant to be accessed by a dataset loader? I have code that makes API calls (to LLMs), so either I set the keys as environment variables or I pass them in from my local credentials. I can load them manually for sure, but I was looking for a better way. Maybe something similar to parameters, e.g. credentials: openai?


I've been down this road before! There are lots of ways to handle this.

One easy way is to just use a hook that loads from a .env file with the dotenv package.

Langchain won't ever make it to my codebase, so it's not applicable, but the dotenv approach makes sense. Will use that one. Thanks for the links.

Weird, my hooks don't get invoked:

import logging
import os

from dotenv import load_dotenv
from kedro.framework.hooks import hook_impl

logger = logging.getLogger(__name__)


class EnvHook:
    @hook_impl
    def after_catalog_created(self, *args, **kwargs) -> None:
        """Use the credentials.yml file over .env. If not found, fall back to .env."""
        creds = kwargs.get("conf_creds")

        if creds is None:
            load_dotenv()
        else:
            env = creds["envs"]
            for k, v in env.items():
                k = k.upper()  # Convert envs to upper case if they are not already.

                if k in os.environ:
                    # Don't log the value itself: it is a secret.
                    logger.warning("Environment variable %s is already set; skipping", k)
                    continue  # In case env is already set don't touch it
                if v is not None:
                    os.environ[k] = str(v)
I have set it up in settings:

from projx.hooks import EnvHook
# Hooks are executed in a Last-In-First-Out (LIFO) order.
HOOKS = (EnvHook(),)
Do we know why my hook doesn't get executed?

According to this line: https://github.com/kedro-org/kedro/blob/main/kedro/framework/context/context.py#L244

It should be called after the catalog is loaded, but when I invoke context.catalog.list(), my hook is not executed.

Why not create a "CredentialsDataset" and feed it to the node? That way it won't expose the credentials in kedro-viz.

Can you attach a debugger or put a breakpoint inside the hook? I suspect **kwargs is just empty, though, since hook arguments are matched by the names in the hook spec.

Put conf_creds explicitly as an argument.
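
The suggestion above can be sketched as follows. Kedro's hooks are built on pluggy, which passes a hook implementation only the arguments it names in its signature, so a bare **kwargs receives nothing. This is a minimal sketch, assuming credentials.yml has the `envs` key used earlier in the thread; `export_credentials` is a hypothetical helper, not part of Kedro, and the fallback `hook_impl` exists only so the snippet runs where Kedro is not installed.

```python
import logging
import os

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch still runs where Kedro is not installed.
    def hook_impl(func):
        return func

logger = logging.getLogger(__name__)


def export_credentials(values, environ=os.environ):
    """Copy entries into environ (hypothetical helper); never overwrite existing keys."""
    exported = []
    for key, value in values.items():
        name = key.upper()
        if value is None:
            continue
        if name in environ:
            # Log only the name: the value is a secret.
            logger.warning("%s is already set; keeping the existing value", name)
            continue
        environ[name] = str(value)
        exported.append(name)
    return exported


class EnvHook:
    @hook_impl
    def after_catalog_created(self, conf_creds):
        # conf_creds is named explicitly, so it is actually passed in.
        # "envs" is the credentials.yml key assumed from this thread.
        export_credentials((conf_creds or {}).get("envs") or {})
```

Keeping the environment logic in a plain function means it can be checked without a Kedro session, for example by passing an ordinary dict as `environ`.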

I put a breakpoint but it never hits for some reason. That's why I can't debug further.

I didn't use a dataset cuz I don't wanna pass it to nodes; this way each node can reach it when it needs it.

I see you put the hook correctly so the hook should get registered.

HOOKS = (EnvHook(),)

https://docs.kedro.org/en/stable/hooks/introduction.html#hook-implementation

Try copying one of these dummy hooks and see if it gets run at all. Most likely there is some tiny mistake somewhere.

Will give it a try. Otherwise I couldn't make sense of it either.

So I still can't make it work. Full setup as follows:

conftest.py

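from kedro.framework.context import KedroContext
from kedro.framework.hooks.manager import _create_hook_manager
from pytest import fixture

# PROJECT_PATH and the config_loader fixture are not shown here.
@fixture(scope='session')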
def kedro_context(config_loader):
    return KedroContext(
        package_name="projx",
        project_path=PROJECT_PATH,
        config_loader=config_loader,
        hook_manager=_create_hook_manager(),
        env="test"
    )
test_run.py
def test_catalog_hook(kedro_context):
    # Invoke catalog loading
    catalog = kedro_context.catalog.list()

    # By this time we should be running the catalog hook
    print("stop")
hooks.py
class EnvHook:
    @hook_impl
    def after_catalog_created(self, *args, **kwargs) -> None:
        """ User credentials.yml file over .env. If not found, fall back to .env """
        print("Hello")
I'm trying this and the debugger doesn't stop at the hook code. I don't see the print statements when I run it either.

I also tried different hooks; none of them gets executed. What I found is that if I use a session, I can reach the hook code, but using only the Kedro context isn't possible somehow.

def test_project_catalog(kedro_context, kedro_session):
    # Invoke catalog parsing to see if there is a problem in the catalog
    print(kedro_context.catalog.list())
    kedro_session.load_context()  # -> This invokes hooks

I believe I found the cause:

In session we do:

hook_manager = _create_hook_manager()
_register_hooks(hook_manager, settings.HOOKS)
_register_hooks_entry_points(hook_manager, settings.DISABLE_HOOKS_FOR_PLUGINS)
while during context creation I don't call the register-hooks functions, which I assume means the hooks are all ignored. Is there a reason why these functions are private?
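
To illustrate the diagnosis above: a hook manager only calls the hooks that were registered on it, so a freshly created manager calls nothing. The class below is a deliberately simplified stand-in written for this illustration; `ToyHookManager` and its `call` method are hypothetical and are not Kedro's or pluggy's actual implementation.

```python
class ToyHookManager:
    """Simplified stand-in for a plugin manager; not Kedro's or pluggy's code."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin):
        self._plugins.append(plugin)

    def call(self, hook_name, **kwargs):
        """Call hook_name on every registered plugin that implements it."""
        results = []
        for plugin in self._plugins:
            method = getattr(plugin, hook_name, None)
            if callable(method):
                results.append(method(**kwargs))
        return results


class EnvHook:
    def after_catalog_created(self, conf_creds):
        return f"saw {len(conf_creds)} credential entries"


# A context built with a fresh, empty manager: nothing registered, nothing runs.
bare_manager = ToyHookManager()
bare_result = bare_manager.call("after_catalog_created", conf_creds={"openai": "..."})

# What the session does first: register the project's hooks on the manager.
session_manager = ToyHookManager()
session_manager.register(EnvHook())
session_result = session_manager.call("after_catalog_created", conf_creds={"openai": "..."})

print(bare_result)     # []
print(session_result)  # ['saw 1 credential entries']
```

In the test fixture, the analogous step would presumably be to call `_register_hooks(hook_manager, settings.HOOKS)` on the manager before passing it to `KedroContext`, as the session code quoted above does. Those helpers are private, so their behavior may change between Kedro versions.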

Going down the rabbit hole, I can't figure out why, but hooks work if I create a session. Without a session, using only the context, I can't seem to register them, so there is no way of running them without a session 🤷

Ah, alright, so you're doing it in tests instead of doing a run.

Yes, I wanted to test the hook itself 😄

Would you mind opening an issue about this? I have an idea to generate some more useful scaffolding tests. I don't think our documentation covers how to test hooks properly; internally, we had a team that worked on a testing plugin before.

^ I am trying to get help from the team as well. If no one picks this up, I will come back to it later!
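
One way to test a hook in isolation, independent of how hooks get registered, is to call its method directly with fake inputs. This is a minimal sketch, assuming a hook shaped like the EnvHook in this thread; the class below is a hypothetical stand-in without the @hook_impl decorator, and the test name and credential key are made up for illustration.

```python
import os
from unittest import mock


class EnvHook:
    """Hypothetical stand-in; the real class would carry @hook_impl."""

    def after_catalog_created(self, conf_creds):
        for key, value in conf_creds.get("envs", {}).items():
            # setdefault keeps any value that is already present.
            os.environ.setdefault(key.upper(), str(value))


def test_env_hook_exports_credentials():
    fake_creds = {"envs": {"openai_api_key": "dummy"}}
    # patch.dict restores os.environ on exit, so the test leaves no trace.
    with mock.patch.dict(os.environ, {}, clear=True):
        EnvHook().after_catalog_created(conf_creds=fake_creds)
        assert os.environ["OPENAI_API_KEY"] == "dummy"
```

This checks the hook's logic only; whether Kedro actually invokes the hook still depends on it being registered on the hook manager.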

Sure, will do that now
