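Adding a ClickHouse type (TYPE) to Atlas and syncing its metadata

Preface: the Type System

Atlas lets users define a model for the metadata objects they want to manage. The model is made up of definitions called "types". Instances of types, called "entities", represent the actual metadata objects being managed. The Type System is the component that lets users define and manage types and entities. Every metadata object that Atlas manages out of the box (for example, a Hive table) is modeled with types and represented as an entity. To store a new kind of metadata in Atlas, you need to understand the concepts of the Type System component.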
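
To make the type/entity distinction concrete, here is a minimal sketch in Python. The type name `demo_table`, its attribute `engine`, and the helper `undeclared_attributes` are hypothetical and exist only for illustration; the field names (`attributeDefs`, `superTypes`, `typeName`, `attributes`) follow the payloads used later in this post. The helper assumes `qualifiedName` and `name` are inherited from the super type.

```python
# A "type" is a definition: a name, its super types, and attribute definitions.
# "demo_table" and "engine" are hypothetical names used only for illustration.
demo_type = {
    "name": "demo_table",
    "superTypes": ["DataSet"],
    "attributeDefs": [
        {"name": "engine", "typeName": "string", "isOptional": True, "cardinality": "SINGLE"},
    ],
}

# An "entity" is an instance of a type: it names its type and supplies values.
demo_entity = {
    "typeName": demo_type["name"],
    "attributes": {
        "qualifiedName": "demo_db.demo_table@primary",
        "name": "demo_table",
        "engine": "MergeTree",
    },
}


def undeclared_attributes(type_def, entity, inherited=("qualifiedName", "name")):
    """Return attribute names set on the entity that the type does not declare.

    `inherited` lists attributes assumed to come from the super types.
    """
    declared = {attr["name"] for attr in type_def["attributeDefs"]} | set(inherited)
    return sorted(set(entity["attributes"]) - declared)


print(undeclared_attributes(demo_type, demo_entity))
```

An empty result means every value on the entity maps to an attribute that the type (or, by assumption, its super type) declares.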

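I. Defining the entity types

  • To sync ClickHouse metadata into Atlas, first define the ClickHouse-related types. (These are modeled on the Spark types; adjust the attributes to fit your own company's needs, since not every attribute is necessarily useful.)

    • clickhouse_db type: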

    • curl -i -X POST -H "Content-Type: application/json" -d '{
          "enumDefs": [],
          "structDefs": [],
          "classificationDefs": [],
          "entityDefs": [
              {
            "category": "ENTITY",
            "version": 1,
            "name": "clickhouse_db",
            "description": "clickhouse_db",
            "typeVersion": "1.0",
            "serviceType": "clickhouse",
            "attributeDefs": [
              {
                "name": "location",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 5
              },
              {
                "name": "clusterName",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 8
              },
              {
                "name": "parameters",
                "typeName": "map<string,string>",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "ownerType",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              }
            ],
            "superTypes": [
              "DataSet"
            ],
            "subTypes": [],
            "relationshipAttributeDefs": [
              {
                "name": "inputToProcesses",
                "typeName": "array<Process>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "dataset_process_inputs",
                "isLegacyAttribute": false
              },
              {
                "name": "schema",
                "typeName": "array<avro_schema>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "avro_schema_associatedEntities",
                "isLegacyAttribute": false
              },
              {
                "name": "tables",
                "typeName": "array<clickhouse_table>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "clickhouse_table_db",
                "isLegacyAttribute": false
              },
              {
                "name": "meanings",
                "typeName": "array<AtlasGlossaryTerm>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "AtlasGlossarySemanticAssignment",
                "isLegacyAttribute": false
              },
              {
                "name": "outputFromProcesses",
                "typeName": "array<Process>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "process_dataset_outputs",
                "isLegacyAttribute": false
              }
            ],
            "businessAttributeDefs": {}
          }
          ],
          "relationshipDefs": []
      }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
    • clickhouse_table type:

    • curl -i -X POST -H "Content-Type: application/json" -d '{
          "enumDefs": [],
          "structDefs": [],
          "classificationDefs": [],
          "entityDefs": [
              {
            "category": "ENTITY",
            "version": 1,
            "name": "clickhouse_table",
            "description": "clickhouse_table",
            "typeVersion": "1.0",
            "serviceType": "clickhouse",
            "attributeDefs": [
              {
                "name": "tableType",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "provider",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 5
              },
              {
                "name": "partitionColumnNames",
                "typeName": "array<string>",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "bucketSpec",
                "typeName": "map<string,string>",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "ownerType",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "createTime",
                "typeName": "date",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "parameters",
                "typeName": "map<string,string>",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "comment",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 9
              },
              {
                "name": "unsupportedFeatures",
                "typeName": "array<string>",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "viewOriginalText",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 9
              },
              {
                "name": "schemaDesc",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 5
              },
              {
                "name": "partitionProvider",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              }
            ],
            "superTypes": [
              "DataSet"
            ],
            "subTypes": [],
            "relationshipAttributeDefs": [
              {
                "name": "inputToProcesses",
                "typeName": "array<Process>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "dataset_process_inputs",
                "isLegacyAttribute": false
              },
              {
                "name": "schema",
                "typeName": "array<avro_schema>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "avro_schema_associatedEntities",
                "isLegacyAttribute": false
              },
              {
                "name": "sd",
                "typeName": "clickhouse_storagedesc",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "clickhouse_table_storagedesc",
                "isLegacyAttribute": false
              },
              {
                "name": "columns",
                "typeName": "array<clickhouse_column>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "constraints": [
                  {
                    "type": "ownedRef"
                  }
                ],
                "relationshipTypeName": "clickhouse_table_columns",
                "isLegacyAttribute": false
              },
              {
                "name": "meanings",
                "typeName": "array<AtlasGlossaryTerm>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "AtlasGlossarySemanticAssignment",
                "isLegacyAttribute": false
              },
              {
                "name": "db",
                "typeName": "clickhouse_db",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "clickhouse_table_db",
                "isLegacyAttribute": false
              },
              {
                "name": "outputFromProcesses",
                "typeName": "array<Process>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "process_dataset_outputs",
                "isLegacyAttribute": false
              }
            ]
          }
          ],
          "relationshipDefs": []
      }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
    • clickhouse_column type:

    • curl -i -X POST -H "Content-Type: application/json" -d '{
          "enumDefs": [],
          "structDefs": [],
          "classificationDefs": [],
          "entityDefs": [
              {
            "category": "ENTITY",
            "version": 1,
            "name": "clickhouse_column",
            "description": "clickhouse_column",
            "typeVersion": "1.0",
            "serviceType": "clickhouse",
            "attributeDefs": [
              {
                "name": "type",
                "typeName": "string",
                "isOptional": false,
                "cardinality": "SINGLE",
                "valuesMinCount": 1,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": true,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "nullable",
                "typeName": "boolean",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "metadata",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "comment",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 9
              }
            ],
            "superTypes": [
              "DataSet"
            ],
            "subTypes": [],
            "relationshipAttributeDefs": [
              {
                "name": "inputToProcesses",
                "typeName": "array<Process>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "dataset_process_inputs",
                "isLegacyAttribute": false
              },
              {
                "name": "schema",
                "typeName": "array<avro_schema>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "avro_schema_associatedEntities",
                "isLegacyAttribute": false
              },
              {
                "name": "meanings",
                "typeName": "array<AtlasGlossaryTerm>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "AtlasGlossarySemanticAssignment",
                "isLegacyAttribute": false
              },
              {
                "name": "table",
                "typeName": "clickhouse_table",
                "isOptional": false,
                "cardinality": "SINGLE",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "clickhouse_table_columns",
                "isLegacyAttribute": false
              },
              {
                "name": "outputFromProcesses",
                "typeName": "array<Process>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "process_dataset_outputs",
                "isLegacyAttribute": false
              }
            ],
            "businessAttributeDefs": {
              "Description": [
                {
                  "name": "index_desc",
                  "typeName": "string",
                  "isOptional": true,
                  "cardinality": "SINGLE",
                  "valuesMinCount": 0,
                  "valuesMaxCount": 1,
                  "isUnique": false,
                  "isIndexable": true,
                  "includeInNotification": false,
                  "searchWeight": 5,
                  "options": {
                    "applicableEntityTypes": "[\"hive_column\",\"clickhouse_column\"]",
                    "maxStrLength": "10000"
                  }
                }
              ]
            }
          }
          ],
          "relationshipDefs": []
      }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
    • clickhouse_storagedesc type:

    • curl -i -X POST -H "Content-Type: application/json" -d '{
          "enumDefs": [],
          "structDefs": [],
          "classificationDefs": [],
          "entityDefs": [
              {
            "category": "ENTITY",
            "version": 1,
            "name": "clickhouse_storagedesc",
            "description": "clickhouse_storagedesc",
            "typeVersion": "1.0",
            "serviceType": "clickhouse",
            "attributeDefs": [
              {
                "name": "location",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": 10
              },
              {
                "name": "inputFormat",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "outputFormat",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "serde",
                "typeName": "string",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "compressed",
                "typeName": "boolean",
                "isOptional": false,
                "cardinality": "SINGLE",
                "valuesMinCount": 1,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": true,
                "includeInNotification": false,
                "searchWeight": -1
              },
              {
                "name": "parameters",
                "typeName": "map<string,string>",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1
              }
            ],
            "superTypes": [
              "Referenceable"
            ],
            "subTypes": [],
            "relationshipAttributeDefs": [
              {
                "name": "meanings",
                "typeName": "array<AtlasGlossaryTerm>",
                "isOptional": true,
                "cardinality": "SET",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "AtlasGlossarySemanticAssignment",
                "isLegacyAttribute": false
              },
              {
                "name": "table",
                "typeName": "clickhouse_table",
                "isOptional": true,
                "cardinality": "SINGLE",
                "valuesMinCount": -1,
                "valuesMaxCount": -1,
                "isUnique": false,
                "isIndexable": false,
                "includeInNotification": false,
                "searchWeight": -1,
                "relationshipTypeName": "clickhouse_table_storagedesc",
                "isLegacyAttribute": false
              }
            ],
            "businessAttributeDefs": {}
          }
          ],
          "relationshipDefs": []
      }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
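
The four payloads above repeat the same attribute boilerplate many times. As a rough sketch, the Python below generates an equivalent `entityDefs` entry from a compact description. The helper names `attr_def`, `entity_def`, and `typedefs_payload` are hypothetical, and the top-level keys `enumDefs` and `structDefs` are the ones used in the relationship payload of the next section. The read-only `relationshipAttributeDefs` are left out on the assumption that Atlas derives them from the relationship definitions.

```python
import json


def attr_def(name, type_name="string", optional=True, indexable=False, search_weight=-1):
    """Build one attributeDefs entry with the defaults used in the payloads above."""
    return {
        "name": name,
        "typeName": type_name,
        "isOptional": optional,
        "cardinality": "SINGLE",
        "valuesMinCount": 0 if optional else 1,  # mandatory attributes need at least one value
        "valuesMaxCount": 1,
        "isUnique": False,
        "isIndexable": indexable,
        "includeInNotification": False,
        "searchWeight": search_weight,
    }


def entity_def(name, super_types, attrs):
    """Build one entityDefs entry for the clickhouse service type."""
    return {
        "category": "ENTITY",
        "version": 1,
        "name": name,
        "description": name,
        "typeVersion": "1.0",
        "serviceType": "clickhouse",
        "attributeDefs": attrs,
        "superTypes": super_types,
    }


def typedefs_payload(entity_defs):
    """Wrap entity definitions in the typedefs envelope."""
    return {
        "enumDefs": [],
        "structDefs": [],
        "classificationDefs": [],
        "entityDefs": entity_defs,
        "relationshipDefs": [],
    }


# Same attributes as the clickhouse_column definition above.
column_def = entity_def("clickhouse_column", ["DataSet"], [
    attr_def("type", optional=False, indexable=True),
    attr_def("nullable", "boolean"),
    attr_def("metadata"),
    attr_def("comment", search_weight=9),
])

print(json.dumps(typedefs_payload([column_def]), indent=2))
```

The printed JSON can be saved to a file and passed to the curl command with `-d @file.json` instead of an inline body.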

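II. Defining the relationship types

This step is essential: without it, the entities you create cannot be linked to one another, so, for example, you cannot navigate from a table to its columns.

# POST this body to /api/atlas/v2/types/typedefs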
{
  "entityDefs": [],
  "classificationDefs": [],
  "structDefs": [],
  "enumDefs": [],
  "relationshipDefs": [
    {
      "category": "RELATIONSHIP",
      "version": 1,
      "name": "clickhouse_table_db",
      "description": "clickhouse_table_db",
      "typeVersion": "1.0",
      "serviceType": "clickhouse",
      "attributeDefs": [],
      "relationshipCategory": "AGGREGATION",
      "propagateTags": "NONE",
      "endDef1": {
        "type": "clickhouse_table",
        "name": "db",
        "isContainer": false,
        "cardinality": "SINGLE",
        "isLegacyAttribute": false
      },
      "endDef2": {
        "type": "clickhouse_db",
        "name": "tables",
        "isContainer": true,
        "cardinality": "SET",
        "isLegacyAttribute": false
      }
    },
    {
      "category": "RELATIONSHIP",
      "version": 1,
      "name": "clickhouse_table_columns",
      "description": "clickhouse_table_columns",
      "typeVersion": "1.0",
      "serviceType": "clickhouse",
      "attributeDefs": [],
      "relationshipCategory": "COMPOSITION",
      "propagateTags": "NONE",
      "endDef1": {
        "type": "clickhouse_table",
        "name": "columns",
        "isContainer": true,
        "cardinality": "SET",
        "isLegacyAttribute": false
      },
      "endDef2": {
        "type": "clickhouse_column",
        "name": "table",
        "isContainer": false,
        "cardinality": "SINGLE",
        "isLegacyAttribute": false
      }
    },
    {
      "category": "RELATIONSHIP",
      "version": 1,
      "name": "clickhouse_table_storagedesc",
      "description": "clickhouse_table_storagedesc",
      "typeVersion": "1.0",
      "serviceType": "clickhouse",
      "attributeDefs": [],
      "relationshipCategory": "ASSOCIATION",
      "propagateTags": "NONE",
      "endDef1": {
        "type": "clickhouse_table",
        "name": "sd",
        "isContainer": false,
        "cardinality": "SINGLE",
        "isLegacyAttribute": false
      },
      "endDef2": {
        "type": "clickhouse_storagedesc",
        "name": "table",
        "isContainer": false,
        "cardinality": "SINGLE",
        "isLegacyAttribute": false
      }
    }
  ]
}
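
The relationship body above is shown without the call that submits it. The sketch below builds the same three relationship definitions in Python and wraps them in a POST request, reusing the endpoint and the `admin:admin` credentials from the curl commands earlier in this post. The helpers `end_def`, `relationship_def`, and `build_request` are hypothetical names; the request is only constructed here, not sent.

```python
import base64
import json
import urllib.request

# Same endpoint as the curl commands earlier in this post.
ATLAS_URL = "http://localhost:21000/api/atlas/v2/types/typedefs"


def end_def(type_name, attr_name, is_container, cardinality):
    """One end of a relationship: the entity type and the attribute it exposes."""
    return {
        "type": type_name,
        "name": attr_name,
        "isContainer": is_container,
        "cardinality": cardinality,
        "isLegacyAttribute": False,
    }


def relationship_def(name, category, end1, end2):
    """One relationshipDefs entry with the fixed fields used above."""
    return {
        "category": "RELATIONSHIP",
        "version": 1,
        "name": name,
        "description": name,
        "typeVersion": "1.0",
        "serviceType": "clickhouse",
        "attributeDefs": [],
        "relationshipCategory": category,
        "propagateTags": "NONE",
        "endDef1": end1,
        "endDef2": end2,
    }


payload = {
    "entityDefs": [], "classificationDefs": [], "structDefs": [], "enumDefs": [],
    "relationshipDefs": [
        relationship_def("clickhouse_table_db", "AGGREGATION",
                         end_def("clickhouse_table", "db", False, "SINGLE"),
                         end_def("clickhouse_db", "tables", True, "SET")),
        relationship_def("clickhouse_table_columns", "COMPOSITION",
                         end_def("clickhouse_table", "columns", True, "SET"),
                         end_def("clickhouse_column", "table", False, "SINGLE")),
        relationship_def("clickhouse_table_storagedesc", "ASSOCIATION",
                         end_def("clickhouse_table", "sd", False, "SINGLE"),
                         end_def("clickhouse_storagedesc", "table", False, "SINGLE")),
    ],
}


def build_request(body, user="admin", password="admin"):
    """Build (but do not send) a POST request with HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        ATLAS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        method="POST",
    )


request = build_request(payload)
print(request.get_method(), request.full_url)
# To actually send it (requires a reachable Atlas instance):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Note that each `name` in an end definition matches the relationship attribute name on the corresponding entity type (`db`, `tables`, `columns`, `table`, `sd`), which is what lets the UI jump between the two ends.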

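III. Collecting the metadata

This part is not covered in detail here, because there are many ways to implement it. ClickHouse metadata involves little lineage, so we simply built a table from ClickHouse's own system metadata. It holds the main information, such as the database name, table name, column name, and column comment. Below is a simple SQL query; adjust it to your own situation, or write your own hook tool instead.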

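-- One row per column, read from ClickHouse's built-in system.columns table
select
        database `db_name`,
        table `tbl_name`,
        name `column_name`,
        type `column_type`,
        -- nullability is part of the type (e.g. Nullable(String)); default_expression does not carry it
        type like '%Nullable(%' `is_nullable`,
        is_in_partition_key `partition_key`,
        is_in_primary_key `column_key`,
        comment `column_comment`,
        position `column_position`
    from system.columns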
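
The query returns one flat row per column, while Atlas expects a table entity with its column entities. A minimal sketch of that regrouping in Python follows. The sample rows are hypothetical, the helper names `group_by_table` and `to_entities` are made up for illustration, and the `db.table@cluster` naming with a `primary` cluster mirrors the `qualifiedName` values in the sync payload shown later in this post. Only attributes are built here; GUIDs and relationship wiring are left to that payload.

```python
from collections import defaultdict

# Hypothetical sample rows, shaped like the result of the query above.
rows = [
    {"db_name": "bi_app", "tbl_name": "wuxl_0316_rr", "column_name": "column2",
     "column_type": "String", "column_comment": "ziduan2", "column_position": 2},
    {"db_name": "bi_app", "tbl_name": "wuxl_0316_rr", "column_name": "column1",
     "column_type": "String", "column_comment": "ziduan1", "column_position": 1},
]


def group_by_table(flat_rows):
    """Group flat rows into {(db, table): [rows ordered by column position]}."""
    tables = defaultdict(list)
    for row in flat_rows:
        tables[(row["db_name"], row["tbl_name"])].append(row)
    return {key: sorted(cols, key=lambda r: r["column_position"])
            for key, cols in tables.items()}


def to_entities(db, table, cols, cluster="primary"):
    """Build a table entity followed by its column entities (attributes only)."""
    entities = [{
        "typeName": "clickhouse_table",
        "attributes": {"qualifiedName": f"{db}.{table}@{cluster}", "name": table},
    }]
    for col in cols:
        entities.append({
            "typeName": "clickhouse_column",
            "attributes": {
                "qualifiedName": f"{db}.{table}.{col['column_name']}@{cluster}",
                "name": col["column_name"],
                "type": col["column_type"],
                "comment": col["column_comment"],
            },
        })
    return entities


for (db, table), cols in group_by_table(rows).items():
    for entity in to_entities(db, table, cols):
        print(entity["typeName"], entity["attributes"]["qualifiedName"])
```

Keeping `qualifiedName` deterministic matters because Atlas uses it to recognize an existing entity, so re-running the sync updates entities rather than duplicating them.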

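IV. Syncing the metadata

There are two ways to sync metadata: the REST API that ships with Atlas, or writing messages to the Kafka instance that Atlas uses. Both are described below:

1. Built-in API

The API documentation can be found at the following path:

Once you have found this API, click "try it out" and enter the following JSON: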

  {"entities": [
                {
                    "typeName": "clickhouse_table",
                    "attributes": {
                        "owner": "bi",
                        "ownerType": "USER",
                        "sd": {
                            "typeName": "clickhouse_storagedesc",
                            "attributes": {
                                "qualifiedName": "bi_app.wuxl_0316_rr@primary_storage",
                                "name": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                                "location": "hdfs://HDFS80727/bi/bi_app.db/wuxl_0316_rr",
                                "compressed": false,
                                "inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                                "outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                                "parameters": {
                                    "serialization.format": "1"
                                }
                            },
                            "guid": "-28224574948884002",
                            "version": 0,
                            "proxy": false
                        },
                        "tableType": "MANAGED",
                        "createTime": 1709003223000,
                        "qualifiedName": "bi_app.wuxl_0316_rr@primary",
                        "columns": [
                            {
                                "typeName": "clickhouse_column",
                                "attributes": {
                                    "qualifiedName": "bi_app.wuxl_0316_rr.column1@primary",
                                    "name": "column1",
                                    "comment": "ziduan1",
                                    "type": "string",
                                    "table": {
                                        "typeName": "clickhouse_table",
                                        "attributes": {
                                            "qualifiedName": "bi_app.wuxl_0316_rr@primary"
                                        },
                                        "guid": "-28224574948884003",
                                        "version": 0,
                                        "proxy": false
                                    }
                                },
                                "guid": "-28224574948884005",
                                "version": 0,
                                "proxy": false
                            },
                            {
                                "typeName": "clickhouse_column",
                                "attributes": {
                                    "qualifiedName": "bi_app.wuxl_0316_rr.column2@primary",
                                    "name": "column2",
                                    "comment": "ziduan2",
                                    "type": "string",
                                    "table": {
                                        "typeName": "clickhouse_table",
                                        "attributes": {
                                            "qualifiedName": "bi_app.wuxl_0316_rr@primary"
                                        },
                                        "guid": "-28224574948884003",
                                        "version": 0,
                                        "proxy": false
                                    }
                                },
                                "guid": "-28224574948884006",
                                "version": 0,
                                "proxy": false
                            }
                        ],
                        "name": "wuxl_0316_rr",
                        "comment": "测试表",
                        "parameters": {
                            "transient_lastDdlTime": "1709003223"
                        },
                        "db": {
                            "typeName": "clickhouse_db",
                            "attributes": {
                                "owner": "bi",
                                "ownerType": "USER",
                                "qualifiedName": "bi_app@primary",
                                "clusterName": "primary",
                                "name": "bi_app",
                                "description": "",
                                "location": "hdfs://HDFS80727/bi/bi_app.db",
                                "parameters": {}
                            },
                            "guid": "-28224574948884001",
                            "version": 0,
                            "proxy": false
                        }
                    },
                    "guid": "-28224574948884003",
                    "version": 0,
                    "proxy": false,
                    "relationships": {
                        "typeName": "clickhouse_table_db",
                        "db": ["-28224574948884001"],
                        "end1": {
                            "typeName": "clickhouse_table",
                            "guid": "-28224574948884003"
                        },
                        "end2": {
                            "typeName": "clickhouse_db",
                            "guid": "-28224574948884001"
                        }
                    }
                },
                {
                    "typeName": "clickhouse_db",
                    "attributes": {
                        "owner": "bi",
                        "ownerType": "USER",
                        "qualifiedName": "bi_app@primary",
                        "clusterName": "primary",
                        "name": "bi_app",
                        "description": "",
                        "location": "hdfs://HDFS80727/bi/bi_app.db",
                        "parameters": {}
                    },
                    "guid": "-28224574948884001",
                    "version": 0,
                    "proxy": false,
                    "relationships": {}
                },
                {
                    "typeName": "clickhouse_storagedesc",
                    "attributes": {
                        "qualifiedName": "bi_app.wuxl_0316_rr@primary_storage",
                        "name": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                        "location": "hdfs://HDFS80727/bi/bi_app.db/wuxl_0316_rr",
                        "compressed": false,
                        "inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                        "outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                        "parameters": {
                            "serialization.format": "1"
                        }
                    },
                    "guid": "-28224574948884002",
                    "version": 0,
                    "proxy": false,
                    "relationships": {
                        "typeName": "clickhouse_table_storagedesc",
                        "table": ["-28224574948884003"],
                        "end1": {
                            "typeName": "clickhouse_storagedesc",
                            "guid": "-28224574948884002"
                        },
                        "end2": {
                            "typeName": "clickhouse_table",
                            "guid": "-28224574948884003"
                        }
                    }
                },
                {
                    "typeName": "clickhouse_column",
                    "attributes": {
                        "qualifiedName": "bi_app.wuxl_0316_rr.column1@primary",
                        "name": "column1",
                        "comment": "ziduan1",
                        "type": "string",
                        "table": {
                            "typeName": "clickhouse_table",
                            "attributes": {
                                "qualifiedName": "bi_app.wuxl_0316_rr@primary"
                            },
                            "guid": "-28224574948884003",
                            "version": 0,
                            "proxy": false
                        }
                    },
                    "guid": "-28224574948884005",
                    "version": 0,
                    "proxy": false,
                    "relationships": {
                        "typeName": "clickhouse_table_columns",
                        "table": ["-28224574948884003"],
                        "end1": {
                            "typeName": "clickhouse_column",
                            "guid": "-28224574948884005"
                        },
                        "end2": {
                            "typeName": "clickhouse_table",
                            "guid": "-28224574948884003"
                        }
                    }
                },
                {
                    "typeName": "clickhouse_column",
                    "attributes": {
                        "qualifiedName": "bi_app.wuxl_0316_rr.column2@primary",
                        "name": "column2",
                        "comment": "ziduan2",
                        "type": "string",
                        "table": {
                            "typeName": "clickhouse_table",
                            "attributes": {
                                "qualifiedName": "bi_app.wuxl_0316_rr@primary"
                            },
                            "guid": "-28224574948884003",
                            "version": 0,
                            "proxy": false
                        }
                    },
                    "guid": "-28224574948884006",
                    "version": 0,
                    "proxy": false,
                    "relationships": {
                        "typeName": "clickhouse_table_columns",
                        "table": ["-28224574948884003"],
                        "end1": {
                            "typeName": "clickhouse_column",
                            "guid": "-28224574948884006"
                        },
                        "end2": {
                            "typeName": "clickhouse_table",
                            "guid": "-28224574948884003"
                        }
                    }
                }
            ]}

 

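2. Kafka

Write messages directly to Atlas's built-in "ATLAS_HOOK" topic. Atlas consumes each message, parses it, and creates the entities and the relationships between them.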

-- Use Flink SQL to write messages to Atlas's built-in topic
CREATE TABLE ads_zdm_offsite_platform_daren_rank_df_to_kafka (
  data STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'ATLAS_HOOK',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'raw'
);

insert into ads_zdm_offsite_platform_daren_rank_df_to_kafka
select '{"version":{"version":"1.0.0","versionParts":[1]},"msgCompressionKind":"NONE","msgSplitIdx":1,"msgSplitCount":1,"msgSourceIP":"10.45.1.116","msgCreatedBy":"bi","msgCreationTime":1710575827820,"message":{"type":"ENTITY_CREATE_V2","user":"bi","entities":{"entities":[{"typeName":"clickhouse_table","attributes":{"owner":"bi","ownerType":"USER","sd":{"typeName":"clickhouse_storagedesc","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary_storage","name":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","location":"hdfs://HDFS80727/bi/test.db/wuxl_0316_ss","compressed":false,"inputFormat":"org.apache.hadoop.mapred.TextInputFormat","outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","parameters":{"serialization.format":"1"}},"guid":"-861237351166887","version":0,"proxy":false},"tableType":"MANAGED","createTime":1710575827000,"qualifiedName":"test.wuxl_0316_ss@primary","columns":[{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_1@primary","name":"column_tt_1","comment":"测试字段1","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166890","version":0,"proxy":false},{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_2@primary","name":"column_tt_2","comment":"测试字段2","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166891","version":0,"proxy":false}],"name":"wuxl_0316_ss","comment":"测试表","parameters":{"transient_lastDdlTime":"1710575827"},"db":{"typeName":"clickhouse_db","attributes":{"owner":"bi","ownerType":"USER","qualifiedName":"test@primary","clusterName":"primary","name":"test","description":"","location":"hdfs://HDFS80727/bi/test.db","parameters":{}},"guid":"-861237351166886","version":0,
"proxy":false}},"guid":"-861237351166888","version":0,"proxy":false},{"typeName":"clickhouse_db","attributes":{"owner":"bi","ownerType":"USER","qualifiedName":"test@primary","clusterName":"primary","name":"test","description":"","location":"hdfs://HDFS80727/bi/test.db","parameters":{}},"guid":"-861237351166886","version":0,"proxy":false},{"typeName":"clickhouse_storagedesc","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary_storage","name":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","location":"hdfs://HDFS80727/bi/test.db/wuxl_0316_ss","compressed":false,"inputFormat":"org.apache.hadoop.mapred.TextInputFormat","outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","parameters":{"serialization.format":"1"}},"guid":"-861237351166887","version":0,"proxy":false},{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_1@primary","name":"column_tt_1","comment":"测试字段1","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166890","version":0,"proxy":false},{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_2@primary","name":"column_tt_2","comment":"测试字段2","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166891","version":0,"proxy":false}]}}}' as data
;
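
The message written above is easier to maintain when it is assembled programmatically rather than typed as one long string literal. The following is a minimal sketch in Python, assuming the envelope fields shown in the example message are the ones Atlas expects. The function name `build_hook_message` and the sample `db` entity are illustrative and not part of Atlas.

```python
import json
import time


def build_hook_message(entities, user, source_ip):
    """Wrap a list of entity dicts in an ENTITY_CREATE_V2 envelope.

    The field names mirror the example message in this article;
    the function itself is a hypothetical helper, not an Atlas API.
    """
    envelope = {
        "version": {"version": "1.0.0", "versionParts": [1]},
        "msgCompressionKind": "NONE",
        "msgSplitIdx": 1,
        "msgSplitCount": 1,
        "msgSourceIP": source_ip,
        "msgCreatedBy": user,
        "msgCreationTime": int(time.time() * 1000),  # epoch milliseconds
        "message": {
            "type": "ENTITY_CREATE_V2",
            "user": user,
            "entities": {"entities": entities},
        },
    }
    # Compact separators keep the payload on a single line;
    # ensure_ascii=False leaves non-ASCII comments readable.
    return json.dumps(envelope, ensure_ascii=False, separators=(",", ":"))


# Sample entity copied from the example payload (placeholder guid).
db = {
    "typeName": "clickhouse_db",
    "attributes": {
        "owner": "bi",
        "ownerType": "USER",
        "qualifiedName": "test@primary",
        "clusterName": "primary",
        "name": "test",
    },
    "guid": "-861237351166886",
    "version": 0,
    "proxy": False,
}

payload = build_hook_message([db], user="bi", source_ip="10.45.1.116")
print(payload)
```

The resulting string can be used as the `data` value in the INSERT statement above, or sent to the topic with any Kafka producer.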

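V. Additional Notes

Pay attention to the guid values in step four. Because the entities are new, Atlas has not yet assigned them a guid (globally unique ID), so you can generate a placeholder ID of the form "-xxxxxxxxx" yourself. The generated ID is not stored in Atlas; it only expresses the relationships between entities, such as which table a column belongs to. When you create a column entity, specify the placeholder guid of its table, and Atlas will create the corresponding relationship when it creates the entities.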
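
The placeholder guid idea can be sketched as follows. This is a minimal Python illustration, assuming any unique negative-number string works as a placeholder within a single request. The helpers `placeholder_guid` and `column_entity` are hypothetical names; the attribute layout and `qualifiedName` pattern follow the example payload in step four.

```python
import itertools
import time

# Start from the current time in nanoseconds so ids from separate runs
# are unlikely to collide; the counter keeps them unique within one run.
_next_id = itertools.count(time.time_ns())


def placeholder_guid():
    """Return a unique negative id string, e.g. "-1710575827820000000"."""
    return f"-{next(_next_id)}"


def column_entity(db, table, column, col_type, cluster, table_guid):
    """Build a clickhouse_column entity that points at its table
    through the table's placeholder guid."""
    return {
        "typeName": "clickhouse_column",
        "attributes": {
            "qualifiedName": f"{db}.{table}.{column}@{cluster}",
            "name": column,
            "type": col_type,
            "table": {
                "typeName": "clickhouse_table",
                "attributes": {"qualifiedName": f"{db}.{table}@{cluster}"},
                "guid": table_guid,  # same placeholder as the table entity
                "version": 0,
                "proxy": False,
            },
        },
        "guid": placeholder_guid(),  # each column gets its own placeholder
        "version": 0,
        "proxy": False,
    }


# One placeholder for the table, reused by every column that belongs to it.
table_guid = placeholder_guid()
columns = [
    column_entity("bi_app", "wuxl_0316_rr", name, "string", "primary", table_guid)
    for name in ("column1", "column2")
]
```

Every column carries the table's placeholder in its `table` reference, while its own `guid` is distinct; the table entity in the same request must use `table_guid` as its `guid` so the references resolve.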

 
