superset helmchart enable cache

Question


I have Superset installed via the Helm chart in my Kubernetes environment. I took everything from the official documentation and repository: https://github.com/apache/superset

I'm trying to achieve an automatic data refresh of the dashboard every 12 hours via the Helm chart rather than via the UI. I read that this can be done by enabling the Superset cache, so that data is cached for 12 hours and then dynamically refreshed, and everyone who accesses the Superset UI sees the same values.

My problem is this: I can see the cache configuration in the superset/config.py file:

# Default cache for Superset objects
CACHE_CONFIG: CacheConfig = {"CACHE_TYPE": "NullCache"}

# Cache for datasource metadata and query results
DATA_CACHE_CONFIG: CacheConfig = {"CACHE_TYPE": "NullCache"}

# Cache for dashboard filter state (`CACHE_TYPE` defaults to `SimpleCache` when
#  running in debug mode unless overridden)
FILTER_STATE_CACHE_CONFIG: CacheConfig = {
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()),
    # should the timeout be reset when retrieving a cached value
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
}

# Cache for explore form data state (`CACHE_TYPE` defaults to `SimpleCache` when
#  running in debug mode unless overridden)
EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=7).total_seconds()),
    # should the timeout be reset when retrieving a cached value
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
}

As per the documentation, I'm using the configOverrides section of the Helm chart to overwrite the default values and enable the config, data, filter, and explore caches, but I can't find any example of how to do it, and everything I try fails in the HelmRelease.

I tried reading the Helm chart, but it looks like it takes the whole configOverrides section, and I was not able to find where it overwrites those specific values.

Here is an example of what I tried to overwrite. Enabling some feature flags, for instance, works without problems:

configOverrides:
  enable_flags: |
    FEATURE_FLAGS = {
        "DASHBOARD_NATIVE_FILTERS": True,
        "ENABLE_TEMPLATE_PROCESSING": True,
        "DASHBOARD_CROSS_FILTERS": True,
        "DYNAMIC_PLUGINS": True,
        "VERSIONED_EXPORT": True,
        "DASHBOARD_RBAC": True,
    }

But if I try to overwrite one or more cache values, it fails (config.py: https://github.com/apache/superset/blob/master/superset/config.py). This is one of the ways I tried to overwrite them, after checking the Helm values file, the template, and the Superset config.py (and checking other articles):

configOverrides:
  cache_config: |
    CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_cache_'
      }
  data_cache_config: |
    DATA_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_data_'
    }
  filter_cache_config: |
    FILTER_STATE_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_filter_'
    }
  explore_cache_config: |
    EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_explore_'
    }

Any help, please? Or a pointer to some good documentation with examples! P.S. The Redis installation I have is the default one created by the Helm chart; I didn't change anything on it.

Answer 1

Score: 2

TL;DR: your configOverrides should look like this:

configOverrides:
  cache_config: |
    from datetime import timedelta
    from superset.superset_typing import CacheConfig

    CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_cache_'
    }

    DATA_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_data_'
    }

    FILTER_STATE_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_filter_'
    }

    EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_explore_'
    }
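The question asks for a 12-hour cache, while the block above keeps the six hours from the original attempt. Assuming CACHE_DEFAULT_TIMEOUT is expressed in seconds (as the `total_seconds()` calls in config.py suggest), the value for 12 hours can be worked out as in this small sketch. The helper name `cache_timeout_seconds` is made up for illustration and is not part of Superset:

```python
from datetime import timedelta


def cache_timeout_seconds(hours: int) -> int:
    """Hypothetical helper: convert a number of hours into whole seconds."""
    return int(timedelta(hours=hours).total_seconds())


SIX_HOURS = cache_timeout_seconds(6)      # value used in the answer
TWELVE_HOURS = cache_timeout_seconds(12)  # value the question asks for

print(SIX_HOURS, TWELVE_HOURS)
```

Replacing `hours=6` with `hours=12` in the overrides should therefore be enough to get the 12-hour expiry. Note that a cache timeout only controls when an entry expires; it is probably not the same thing as a scheduled refresh, since an expired entry is typically recomputed the next time someone requests it.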

Details:

After running a helm install with your settings, your config file will look a bit like this:

import os
from cachelib.redis import RedisCache
...
CACHE_CONFIG = {
      'CACHE_TYPE': 'redis',
      'CACHE_DEFAULT_TIMEOUT': 300,
      'CACHE_KEY_PREFIX': 'superset_',
      'CACHE_REDIS_HOST': env('REDIS_HOST'),
      'CACHE_REDIS_PORT': env('REDIS_PORT'),
      'CACHE_REDIS_PASSWORD': env('REDIS_PASSWORD'),
      'CACHE_REDIS_DB': env('REDIS_DB', 1),
}
DATA_CACHE_CONFIG = CACHE_CONFIG
...
# Overrides
# cache_config
CACHE_CONFIG: CacheConfig = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
    'CACHE_KEY_PREFIX': 'superset_cache_'
  }

# data_cache_config
DATA_CACHE_CONFIG: CacheConfig = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
    'CACHE_KEY_PREFIX': 'superset_data_'
}

# enable_flags
FEATURE_FLAGS = {
    "DASHBOARD_NATIVE_FILTERS": True,
    "ENABLE_TEMPLATE_PROCESSING": True,
    "DASHBOARD_CROSS_FILTERS": True,
    "DYNAMIC_PLUGINS": True,
    "VERSIONED_EXPORT": True,
    "DASHBOARD_RBAC": True,
}

# explore_cache_config
EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
    'CACHE_KEY_PREFIX': 'superset_explore_'
}
# filter_cache_config
FILTER_STATE_CACHE_CONFIG: CacheConfig = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
    'CACHE_KEY_PREFIX': 'superset_filter_'
}

When I looked at the pod logs, there were a lot of errors because the name timedelta was not defined. Here is a sample of the logs I can see:

File "/app/pythonpath/superset_config.py", line 42, in <module>
    'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
NameError: name 'timedelta' is not defined

The file in question, /app/pythonpath/superset_config.py, is loaded via an import, as mentioned in the comment at the top of the file.

Notice that you're writing a brand-new .py file, which means you need to add from datetime import timedelta at the top of your configOverrides section.
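The failure mode can be reproduced outside Superset. The sketch below is only an illustration: it runs two one-line "config files" with `exec` in an empty namespace, standing in for a freshly written superset_config.py. The names `run_config`, `OVERRIDE_WITHOUT_IMPORT`, and `OVERRIDE_WITH_IMPORT` are made up for this example.

```python
# A config snippet that uses timedelta without importing it, as in the question.
OVERRIDE_WITHOUT_IMPORT = "TIMEOUT = int(timedelta(hours=6).total_seconds())"

# The same snippet with the import placed at the top of the block.
OVERRIDE_WITH_IMPORT = "from datetime import timedelta\n" + OVERRIDE_WITHOUT_IMPORT


def run_config(source: str) -> dict:
    """Execute config source in a fresh namespace, like a newly written module."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace


try:
    run_config(OVERRIDE_WITHOUT_IMPORT)
    error = None
except NameError as exc:
    # Same kind of error as in the pod logs.
    error = str(exc)

fixed = run_config(OVERRIDE_WITH_IMPORT)
print(error)
print(fixed["TIMEOUT"])
```

Nothing from superset/config.py is in scope here, which mirrors why names like timedelta have to be imported again inside the override.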

However, the documentation in the Helm chart carries the following warning: "WARNING: the order is not guaranteed. Files can be passed as helm --set-file configOverrides.my-override=my-file.py". Since the order of the blocks is not guaranteed and you clearly want to use the function timedelta in each of them, we must combine all of the cache blocks under the same section, like this:

configOverrides:
  cache_config: |
    from datetime import timedelta

    CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_cache_'
    }

    DATA_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_data_'
    }

    FILTER_STATE_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_filter_'
    }

    EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': int(timedelta(hours=6).total_seconds()),
        'CACHE_KEY_PREFIX': 'superset_explore_'
    }
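To see why a separate "imports" block is risky, here is a rough simulation of how the override blocks might end up in the generated file. It assumes the blocks are concatenated in sorted key order, each under a `# key` comment, which matches the rendered file shown earlier but is not something the chart promises. `render_overrides` and `loads_cleanly` are hypothetical helpers written for this sketch.

```python
def render_overrides(overrides: dict) -> str:
    """Join override blocks in sorted key order, each under a '# key' comment."""
    return "\n".join(f"# {key}\n{overrides[key]}" for key in sorted(overrides))


def loads_cleanly(source: str) -> bool:
    """Return True if the rendered config executes without a NameError."""
    try:
        exec(source, {})
        return True
    except NameError:
        return False


# The import lives in its own block, which sorts AFTER 'cache_config'.
split_blocks = {
    "imports": "from datetime import timedelta",
    "cache_config": "TIMEOUT = int(timedelta(hours=6).total_seconds())",
}

# The import and its use live in the same block, so order cannot separate them.
combined_block = {
    "cache_config": (
        "from datetime import timedelta\n"
        "TIMEOUT = int(timedelta(hours=6).total_seconds())"
    ),
}

split_ok = loads_cleanly(render_overrides(split_blocks))
combined_ok = loads_cleanly(render_overrides(combined_block))
print(split_ok, combined_ok)
```

Under this assumption the split version fails because `cache_config` is emitted before `imports`, while the combined version does not depend on block order at all.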

Furthermore, you wanted to use the type CacheConfig, so we should also include an import for it at the top (from superset.superset_typing import CacheConfig), which gives the block shown in the TL;DR.
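One thing worth checking, as a hedged side note: the chart-generated CACHE_CONFIG shown earlier carries CACHE_REDIS_HOST and CACHE_REDIS_PORT, and the overrides replace that dictionary entirely. If the connection keys are dropped, the cache backend will presumably fall back to its own defaults rather than the chart's Redis service. The sketch below shows one way to keep them, assuming the REDIS_HOST and REDIS_PORT environment variables referenced in the generated file are available. The helper `redis_cache` and the fallback values are illustrative, and the CacheConfig annotation is left out so the snippet runs without Superset installed.

```python
import os
from datetime import timedelta

# Assumption: these environment variables are set in the pod, as the
# chart-generated config reads them through env('REDIS_HOST') and env('REDIS_PORT').
REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))

SIX_HOURS = int(timedelta(hours=6).total_seconds())


def redis_cache(prefix: str, timeout: int = SIX_HOURS) -> dict:
    """Hypothetical helper: build one Redis-backed cache config with its own key prefix."""
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_DEFAULT_TIMEOUT": timeout,
        "CACHE_KEY_PREFIX": prefix,
        "CACHE_REDIS_HOST": REDIS_HOST,
        "CACHE_REDIS_PORT": REDIS_PORT,
    }


CACHE_CONFIG = redis_cache("superset_cache_")
DATA_CACHE_CONFIG = redis_cache("superset_data_")
FILTER_STATE_CACHE_CONFIG = redis_cache("superset_filter_")
EXPLORE_FORM_DATA_CACHE_CONFIG = redis_cache("superset_explore_")
```

Building the four dictionaries from one helper keeps the connection settings in a single place; the body could be pasted into the same configOverrides block as the rest of the cache settings.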

huangapple
  • Posted on March 7, 2023, 21:31:01
  • Source link: https://go.coder-hub.com/75662620.html