Hi folks,
We have our own MLFlow server on internal S3.
Below are the settings I use locally:
os.environ["MLFLOW_TRACKING_URI"] = "https://xxx.com/mlflow/"
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://s3xxx.com"
os.environ["S3_BUCKET_PATH"] = "s3://xxx/mlflow"
os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ['MLFLOW_TRACKING_USERNAME'] = 'xxx'
os.environ['MLFLOW_TRACKING_PASSWORD'] = 'xxx'
os.environ["MLFLOW_TRACKING_SERVER_CERT_PATH"] = "C:\\xxx\\ca-bundle.crt"
EXPERIMENT_NAME = "ZeMC012"
To use this in the Kedro framework, I created a mlflow.yml file in the conf/local folder with the following content:
server:
  mlflow_tracking_uri: https://xxx.com/mlflow/
  MLFLOW_S3_ENDPOINT_URL: http://s3xxx.com
  S3_BUCKET_PATH: s3://xxx/mlflow
  AWS_ACCESS_KEY_ID: xxx
  AWS_SECRET_ACCESS_KEY: xxx
  MLFLOW_TRACKING_USERNAME: xxx
  MLFLOW_TRACKING_PASSWORD: xxx
  MLFLOW_EXPERIMENT_NAME: ZeMC012
  MLFLOW_TRACKING_SERVER_CERT_PATH: C:/xxx/ca-bundle.crt
But I got this error:
ValidationError: 8 validation errors for KedroMlflowConfig
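One possible reading of the count in that error, offered as a sketch and not as the plugin's actual validation code: if the config model rejects unknown keys (an assumption), every key under server that the generated template does not define would produce one error. The allowed key names below are copied from the template written by kedro mlflow init, as quoted later in this thread; the helper name unexpected_server_keys is hypothetical.

```python
# Keys that appear under `server` in the template generated by
# `kedro mlflow init` (as quoted later in this thread).
TEMPLATE_SERVER_KEYS = {
    "mlflow_tracking_uri",
    "mlflow_registry_uri",
    "credentials",
    "request_header_provider",
}

# The hand-written `server` section from the question (values redacted as posted).
handwritten_server = {
    "mlflow_tracking_uri": "https://xxx.com/mlflow/",
    "MLFLOW_S3_ENDPOINT_URL": "http://s3xxx.com",
    "S3_BUCKET_PATH": "s3://xxx/mlflow",
    "AWS_ACCESS_KEY_ID": "xxx",
    "AWS_SECRET_ACCESS_KEY": "xxx",
    "MLFLOW_TRACKING_USERNAME": "xxx",
    "MLFLOW_TRACKING_PASSWORD": "xxx",
    "MLFLOW_EXPERIMENT_NAME": "ZeMC012",
    "MLFLOW_TRACKING_SERVER_CERT_PATH": "C:/xxx/ca-bundle.crt",
}


def unexpected_server_keys(server_section):
    """Return the keys that the template does not define, sorted by name."""
    return sorted(k for k in server_section if k not in TEMPLATE_SERVER_KEYS)


extra = unexpected_server_keys(handwritten_server)
print(len(extra))
for key in extra:
    print(key)
```

Eight keys fall outside the template, which would be consistent with the 8 validation errors reported, although the full error message is not shown in the thread.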
Hi Shu-Chun,
Have you tried using the Kedro-MLflow plugin? Here's the link for more details: Kedro-MLflow Setup. It helps generate a correct mlflow.yml file and, as I understand it, the file should contain several sections.
After I used kedro mlflow init to generate mlflow.yml, I don't see these parameters in the template:
MLFLOW_S3_ENDPOINT_URL: http://s3xxx.com
S3_BUCKET_PATH: s3://xxx/mlflow
MLFLOW_TRACKING_USERNAME: xxx
MLFLOW_TRACKING_PASSWORD: xxx
MLFLOW_TRACKING_SERVER_CERT_PATH: C:/xxx/ca-bundle.crt
Where and how should I put those parameters?
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)
MaxRetryError: HTTPSConnectionPool(host='xxx.com', port=443): Max retries exceeded with url: /mlflow/api/2.0/mlflow/experiments/get-by-name?experiment_name=ZeMC012 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))
SSLError: HTTPSConnectionPool(host='xxx.com', port=443): Max retries exceeded with url: /mlflow/api/2.0/mlflow/experiments/get-by-name?experiment_name=ZeMC012 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))
MlflowException: API request to https://xxx.com/mlflow/api/2.0/mlflow/experiments/get-by-name failed with exception HTTPSConnectionPool(host='dad-rbg.icp.infineon.com', port=443): Max retries exceeded with url: /mlflow/api/2.0/mlflow/experiments/get-by-name?experiment_name=ZeMC012 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))
You can add those settings manually under the tracking section. It seems the errors occur because the connection to the MLflow server was not established properly, likely due to a missing MLFLOW_TRACKING_SERVER_CERT_PATH.
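One way to test that hypothesis is to check, in the same Python process that runs Kedro, whether the variable is set and points to an existing file. This is a minimal sketch; the helper name describe_cert_setting is hypothetical, and it only inspects the path without validating the certificate chain itself.

```python
import os
from pathlib import Path


def describe_cert_setting(env):
    """Report whether MLFLOW_TRACKING_SERVER_CERT_PATH looks usable in `env`."""
    value = env.get("MLFLOW_TRACKING_SERVER_CERT_PATH")
    if not value:
        # The variable is absent or empty in this process.
        return "unset"
    if not Path(value).is_file():
        # The variable is set, but nothing exists at that path.
        return "file not found"
    return "ok"


# Inspect the environment of the current process.
print(describe_cert_setting(os.environ))
```

If this prints "unset" when run from inside the Kedro session, the CA bundle path never reached the process, which would match the CERTIFICATE_VERIFY_FAILED errors above.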
What do you mean by the tracking section? Which file should I add MLFLOW_TRACKING_SERVER_CERT_PATH to?
It looks like you should split them into two groups. Some variables, like MLFLOW_S3_ENDPOINT_URL, S3_BUCKET_PATH, and MLFLOW_TRACKING_SERVER_CERT_PATH, should remain OS environment variables, as they were originally. The credentials for MLflow tracking (username and password) should be specified in mlflow.yml under the credentials section (as shown in the manual: Kedro Data Catalog - Dataset Access Credentials). Alternatively, you could try specifying them as environment variables as well.
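The first group could be exported from Python before any MLflow call is made. The sketch below assumes the redacted values from the original post; the helper name export_env_settings is hypothetical, and where exactly to call it in a Kedro project is left open here.

```python
import os

# Non-credential settings from the original post (values redacted as posted).
ENV_SETTINGS = {
    "MLFLOW_S3_ENDPOINT_URL": "http://s3xxx.com",
    "S3_BUCKET_PATH": "s3://xxx/mlflow",
    "MLFLOW_TRACKING_SERVER_CERT_PATH": "C:\\xxx\\ca-bundle.crt",
}


def export_env_settings(settings, environ=os.environ):
    """Copy settings into `environ` without overwriting existing values.

    Returns the subset of settings that was actually written.
    """
    applied = {}
    for name, value in settings.items():
        if name not in environ:
            environ[name] = value
            applied[name] = value
    return applied


# Demonstrate on a plain dict so the real environment is left untouched.
demo_env = {"S3_BUCKET_PATH": "already-set"}
print(sorted(export_env_settings(ENV_SETTINGS, demo_env)))
```

Calling export_env_settings(ENV_SETTINGS) with no second argument writes to the real process environment. Values that are already set are kept, so a variable exported in the shell takes precedence over the defaults in the dict.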
But after I run kedro mlflow init, the generated mlflow.yml file contains:
# All credentials needed for mlflow must be stored in credentials .yml as a dict
# they will be exported as environment variable
# If you want to set some credentials, e.g. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# > in `credentials.yml`:
# your_mlflow_credentials:
#   AWS_ACCESS_KEY_ID: 132456
#   AWS_SECRET_ACCESS_KEY: 132456
# > in this file `mlflow.yml`:
# credentials: mlflow_credentials
This mixes up AWS credentials and MLflow credentials, which is not clear to me. Do I need both?
In mlflow.yml, I have:
server:
  mlflow_tracking_uri: https://xxx.com/mlflow/
  mlflow_registry_uri: null
  credentials: mlflow_credentials
  request_header_provider:
    type: null
    pass_context: False
    init_kwargs: {}
And in credentials.yml, I have:
mlflow_credentials:
  MLFLOW_TRACKING_USERNAME: xxx
  MLFLOW_TRACKING_PASSWORD: xxx
Both mlflow.yml and credentials.yml are in the conf/local folder.
MLFLOW_TRACKING_SERVER_CERT_PATH: C:/xxx/ca-bundle.crt