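Preface: the Type System
Atlas lets users define a model for the metadata objects they want to manage. The model is made up of definitions called "types". Instances of types, called "entities", represent the actual metadata objects being managed. The Type System is the component that lets users define and manage types and entities. Every metadata object that Atlas manages out of the box (Hive tables, for example) is modeled with types and represented as entities. To store a new kind of metadata in Atlas, you need to understand the concepts behind the Type System.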
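I. Entity type definitions
To sync ClickHouse metadata into Atlas, first define the ClickHouse types. (These definitions are modeled on the Spark types. Adjust the attributes to your own company's situation, since not every attribute will be useful.)
clickhouse_db type: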
curl -i -X POST -H "Content-Type: application/json" -d '{ "enumTypes": [], "structTypes": [], "classificationDefs": [], "entityDefs": [ { "category": "ENTITY", "version": 1, "name": "clickhouse_db", "description": "clickhouse_db", "typeVersion": "1.0", "serviceType": "clickhouse", "attributeDefs": [ { "name": "location", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 5 }, { "name": "clusterName", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 8 }, { "name": "parameters", "typeName": "map<string,string>", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "ownerType", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 } ], "superTypes": [ "DataSet" ], "subTypes": [], "relationshipAttributeDefs": [ { "name": "inputToProcesses", "typeName": "array<Process>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "dataset_process_inputs", "isLegacyAttribute": false }, { "name": "schema", "typeName": "array<avro_schema>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "avro_schema_associatedEntities", "isLegacyAttribute": false }, { "name": "tables", "typeName": "array<clickhouse_table>", 
"isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "clickhouse_table_db", "isLegacyAttribute": false }, { "name": "meanings", "typeName": "array<AtlasGlossaryTerm>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "AtlasGlossarySemanticAssignment", "isLegacyAttribute": false }, { "name": "outputFromProcesses", "typeName": "array<Process>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "process_dataset_outputs", "isLegacyAttribute": false } ], "businessAttributeDefs": {} } ], "relationshipDefs": [] }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
clickhouse_table type:
curl -i -X POST -H "Content-Type: application/json" -d '{ "enumTypes": [], "structTypes": [], "classificationDefs": [], "entityDefs": [ { "category": "ENTITY", "version": 1, "name": "clickhouse_table", "description": "clickhouse_table", "typeVersion": "1.0", "serviceType": "clickhouse", "attributeDefs": [ { "name": "tableType", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "provider", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 5 }, { "name": "partitionColumnNames", "typeName": "array<string>", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "bucketSpec", "typeName": "map<string,string>", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "ownerType", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "createTime", "typeName": "date", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "parameters", "typeName": "map<string,string>", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "comment", "typeName": "string", "isOptional": true, 
"cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 9 }, { "name": "unsupportedFeatures", "typeName": "array<string>", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "viewOriginalText", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 9 }, { "name": "schemaDesc", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 5 }, { "name": "partitionProvider", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 } ], "superTypes": [ "DataSet" ], "subTypes": [], "relationshipAttributeDefs": [ { "name": "inputToProcesses", "typeName": "array<Process>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "dataset_process_inputs", "isLegacyAttribute": false }, { "name": "schema", "typeName": "array<avro_schema>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "avro_schema_associatedEntities", "isLegacyAttribute": false }, { "name": "sd", "typeName": "clickhouse_storagedesc", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": 
false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "clickhouse_table_storagedesc", "isLegacyAttribute": false }, { "name": "columns", "typeName": "array<clickhouse_column>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "constraints": [ { "type": "ownedRef" } ], "relationshipTypeName": "clickhouse_table_columns", "isLegacyAttribute": false }, { "name": "meanings", "typeName": "array<AtlasGlossaryTerm>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "AtlasGlossarySemanticAssignment", "isLegacyAttribute": false }, { "name": "db", "typeName": "clickhouse_db", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "clickhouse_table_db", "isLegacyAttribute": false }, { "name": "outputFromProcesses", "typeName": "array<Process>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "process_dataset_outputs", "isLegacyAttribute": false } ] } ], "relationshipDefs": [] }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
clickhouse_column type:
curl -i -X POST -H "Content-Type: application/json" -d '{ "enumTypes": [], "structTypes": [], "classificationDefs": [], "entityDefs": [ { "category": "ENTITY", "version": 1, "name": "clickhouse_column", "description": "clickhouse_column", "typeVersion": "1.0", "serviceType": "clickhouse", "attributeDefs": [ { "name": "type", "typeName": "string", "isOptional": false, "cardinality": "SINGLE", "valuesMinCount": 1, "valuesMaxCount": 1, "isUnique": false, "isIndexable": true, "includeInNotification": false, "searchWeight": -1 }, { "name": "nullable", "typeName": "boolean", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "metadata", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "comment", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 9 } ], "superTypes": [ "DataSet" ], "subTypes": [], "relationshipAttributeDefs": [ { "name": "inputToProcesses", "typeName": "array<Process>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "dataset_process_inputs", "isLegacyAttribute": false }, { "name": "schema", "typeName": "array<avro_schema>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "avro_schema_associatedEntities", "isLegacyAttribute": false }, { "name": "meanings", "typeName": "array<AtlasGlossaryTerm>", 
"isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "AtlasGlossarySemanticAssignment", "isLegacyAttribute": false }, { "name": "table", "typeName": "clickhouse_table", "isOptional": false, "cardinality": "SINGLE", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "clickhouse_table_columns", "isLegacyAttribute": false }, { "name": "outputFromProcesses", "typeName": "array<Process>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "process_dataset_outputs", "isLegacyAttribute": false } ], "businessAttributeDefs": { "Description": [ { "name": "index_desc", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": true, "includeInNotification": false, "searchWeight": 5, "options": { "applicableEntityTypes": "[\"hive_column\",\"clickhouse_column\",\"clickhouse_column\"]", "maxStrLength": "10000" } } ] } } ], "relationshipDefs": [] }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
clickhouse_storagedesc type:
curl -i -X POST -H "Content-Type: application/json" -d '{ "enumTypes": [], "structTypes": [], "classificationDefs": [], "entityDefs": [ { "category": "ENTITY", "version": 1, "name": "clickhouse_storagedesc", "description": "clickhouse_storagedesc", "typeVersion": "1.0", "serviceType": "clickhouse", "attributeDefs": [ { "name": "location", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": 10 }, { "name": "inputFormat", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "outputFormat", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "serde", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 }, { "name": "compressed", "typeName": "boolean", "isOptional": false, "cardinality": "SINGLE", "valuesMinCount": 1, "valuesMaxCount": 1, "isUnique": false, "isIndexable": true, "includeInNotification": false, "searchWeight": -1 }, { "name": "parameters", "typeName": "map<string,string>", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1 } ], "superTypes": [ "Referenceable" ], "subTypes": [], "relationshipAttributeDefs": [ { "name": "meanings", "typeName": "array<AtlasGlossaryTerm>", "isOptional": true, "cardinality": "SET", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": 
false, "searchWeight": -1, "relationshipTypeName": "AtlasGlossarySemanticAssignment", "isLegacyAttribute": false }, { "name": "table", "typeName": "clickhouse_table", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": -1, "valuesMaxCount": -1, "isUnique": false, "isIndexable": false, "includeInNotification": false, "searchWeight": -1, "relationshipTypeName": "clickhouse_table_storagedesc", "isLegacyAttribute": false } ], "businessAttributeDefs": {} } ], "relationshipDefs": [] }' --user admin:admin "http://localhost:21000/api/atlas/v2/types/typedefs"
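The four curl payloads above repeat the same attribute boilerplate many times. As a rough sketch, the request body could be generated rather than written by hand. The helper names (`attribute_def`, `entity_def`, `typedefs_payload`) are hypothetical; the field names are copied from the payloads above, and the top-level keys use the `enumDefs`/`structDefs` spelling that appears in the relationship request body later in this post. The sketch leaves out `relationshipAttributeDefs` on the assumption that Atlas derives them from the relationship definitions; check that against your Atlas version.

```python
import json

def attribute_def(name, type_name, optional=True, search_weight=-1, indexable=False):
    """Build one attributeDefs entry with the same fields as the curl payloads."""
    return {
        "name": name,
        "typeName": type_name,
        "isOptional": optional,
        "cardinality": "SINGLE",
        "valuesMinCount": 0 if optional else 1,
        "valuesMaxCount": 1,
        "isUnique": False,
        "isIndexable": indexable,
        "includeInNotification": False,
        "searchWeight": search_weight,
    }

def entity_def(name, attributes, super_types=("DataSet",)):
    """Build one entityDefs entry for the clickhouse service type."""
    return {
        "category": "ENTITY",
        "version": 1,
        "name": name,
        "description": name,
        "typeVersion": "1.0",
        "serviceType": "clickhouse",
        "attributeDefs": list(attributes),
        "superTypes": list(super_types),
    }

def typedefs_payload(entity_defs):
    """Wrap entity definitions in the typedefs request envelope."""
    return {
        "enumDefs": [],
        "structDefs": [],
        "classificationDefs": [],
        "entityDefs": list(entity_defs),
        "relationshipDefs": [],
    }

# Same four attributes as the clickhouse_column definition above.
column_type = entity_def("clickhouse_column", [
    attribute_def("type", "string", optional=False, indexable=True),
    attribute_def("nullable", "boolean"),
    attribute_def("metadata", "string"),
    attribute_def("comment", "string", search_weight=9),
])
body = json.dumps(typedefs_payload([column_type]), indent=2)
print(body)
```

The printed JSON can be saved to a file and passed to curl with `-d @file.json` in place of the inline string.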
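II. Relationship type definitions
This step is critical: without it, the entities cannot be linked to one another once they are created, so you could not, for example, navigate from a table to its columns.
# Request body for /v2/types/typedefs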
{
"entityDefs": [],
"classificationDefs": [],
"structDefs": [],
"enumDefs": [],
"relationshipDefs": [
{
"category": "RELATIONSHIP",
"version": 1,
"name": "clickhouse_table_db",
"description": "clickhouse_table_db",
"typeVersion": "1.0",
"serviceType": "clickhouse",
"attributeDefs": [],
"relationshipCategory": "AGGREGATION",
"propagateTags": "NONE",
"endDef1": {
"type": "clickhouse_table",
"name": "db",
"isContainer": false,
"cardinality": "SINGLE",
"isLegacyAttribute": false
},
"endDef2": {
"type": "clickhouse_db",
"name": "tables",
"isContainer": true,
"cardinality": "SET",
"isLegacyAttribute": false
}
},
{
"category": "RELATIONSHIP",
"version": 1,
"name": "clickhouse_table_columns",
"description": "clickhouse_table_columns",
"typeVersion": "1.0",
"serviceType": "clickhouse",
"attributeDefs": [],
"relationshipCategory": "COMPOSITION",
"propagateTags": "NONE",
"endDef1": {
"type": "clickhouse_table",
"name": "columns",
"isContainer": true,
"cardinality": "SET",
"isLegacyAttribute": false
},
"endDef2": {
"type": "clickhouse_column",
"name": "table",
"isContainer": false,
"cardinality": "SINGLE",
"isLegacyAttribute": false
}
},
{
"category": "RELATIONSHIP",
"version": 1,
"name": "clickhouse_table_storagedesc",
"description": "clickhouse_table_storagedesc",
"typeVersion": "1.0",
"serviceType": "clickhouse",
"attributeDefs": [],
"relationshipCategory": "ASSOCIATION",
"propagateTags": "NONE",
"endDef1": {
"type": "clickhouse_table",
"name": "sd",
"isContainer": false,
"cardinality": "SINGLE",
"isLegacyAttribute": false
},
"endDef2": {
"type": "clickhouse_storagedesc",
"name": "table",
"isContainer": false,
"cardinality": "SINGLE",
"isLegacyAttribute": false
}
}
]
}
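Before posting relationship definitions like the ones above, it can help to sanity-check them locally. The sketch below encodes the container rules as I understand them (an ASSOCIATION has no container end, while AGGREGATION and COMPOSITION have exactly one, and the child end of a COMPOSITION is SINGLE). Treat these rules as assumptions and confirm them against your Atlas version; `check_relationship_def` is a hypothetical helper, not part of Atlas.

```python
def check_relationship_def(rel, known_types):
    """Return a list of problems found in one relationshipDefs entry."""
    problems = []
    category = rel["relationshipCategory"]
    ends = [rel["endDef1"], rel["endDef2"]]
    container_count = sum(1 for end in ends if end["isContainer"])

    for end in ends:
        if end["type"] not in known_types:
            problems.append(f"unknown entity type: {end['type']}")
    if category == "ASSOCIATION" and container_count != 0:
        problems.append("ASSOCIATION must not have a container end")
    if category in ("AGGREGATION", "COMPOSITION") and container_count != 1:
        problems.append(f"{category} needs exactly one container end")
    if category == "COMPOSITION":
        for end in ends:
            # Conservative local check: a composed child points at one parent.
            if not end["isContainer"] and end["cardinality"] != "SINGLE":
                problems.append("COMPOSITION child end should have SINGLE cardinality")
    return problems

known = {"clickhouse_db", "clickhouse_table", "clickhouse_column", "clickhouse_storagedesc"}

# The clickhouse_table_columns definition from above, reduced to the checked fields.
table_columns = {
    "name": "clickhouse_table_columns",
    "relationshipCategory": "COMPOSITION",
    "endDef1": {"type": "clickhouse_table", "name": "columns",
                "isContainer": True, "cardinality": "SET"},
    "endDef2": {"type": "clickhouse_column", "name": "table",
                "isContainer": False, "cardinality": "SINGLE"},
}
print(check_relationship_def(table_columns, known))
```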
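III. Collecting metadata
I won't go into detail here, since there are many ways to do this. ClickHouse metadata involves little lineage, so we simply built a table from ClickHouse's own system metadata that holds the main information: database name, table name, the column's English name, the column's Chinese name, and so on. Below is a simple SQL query; adjust it to your own situation, or write your own hook tool instead.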
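-- One row per column, read from ClickHouse's built-in system.columns table
select
    database as db_name,
    table as tbl_name,
    name as column_name,
    type as column_type,
    -- system.columns has no nullability flag, so derive it from the type name
    (startsWith(type, 'Nullable(') or startsWith(type, 'LowCardinality(Nullable(')) as is_nullable,
    is_in_partition_key as partition_key,
    is_in_primary_key as column_key,
    comment as column_comment,
    position as column_position
from system.columns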
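The query returns one flat row per column, while Atlas needs one table entity with its columns attached. A minimal sketch of that regrouping step follows; the row layout (database, table, column, type, comment, position) and the helper name `group_columns` are assumptions for illustration, and the sample rows are made up from the names used elsewhere in this post.

```python
from collections import defaultdict

def group_columns(rows):
    """Group flat (db, table, column, type, comment, position) rows by table."""
    tables = defaultdict(list)
    for db_name, tbl_name, column_name, column_type, column_comment, position in rows:
        tables[(db_name, tbl_name)].append({
            "name": column_name,
            "type": column_type,
            "comment": column_comment,
            "position": position,
        })
    # Keep columns in their declared order within each table.
    for columns in tables.values():
        columns.sort(key=lambda column: column["position"])
    return dict(tables)

# Made-up sample rows, deliberately out of order.
rows = [
    ("bi_app", "wuxl_0316_rr", "column2", "String", "ziduan2", 2),
    ("bi_app", "wuxl_0316_rr", "column1", "String", "ziduan1", 1),
]
grouped = group_columns(rows)
for (db_name, tbl_name), columns in grouped.items():
    print(db_name, tbl_name, [column["name"] for column in columns])
```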
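IV. Syncing metadata
There are two ways to sync metadata: through Atlas's built-in API, or by writing messages to the Kafka topic that Atlas consumes. Each is described below.
1. Built-in API
The API documentation can be found at the following path:
Once you have located the API, click "try it out" and enter the following JSON: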
{"entities": [
{
"typeName": "clickhouse_table",
"attributes": {
"owner": "bi",
"ownerType": "USER",
"sd": {
"typeName": "clickhouse_storagedesc",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr@primary_storage",
"name": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
"location": "hdfs://HDFS80727/bi/bi_app.db/wuxl_0316_rr",
"compressed": false,
"inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
"outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
"parameters": {
"serialization.format": "1"
}
},
"guid": "-28224574948884002",
"version": 0,
"proxy": false
},
"tableType": "MANAGED",
"createTime": 1709003223000,
"qualifiedName": "bi_app.wuxl_0316_rr@primary",
"columns": [
{
"typeName": "clickhouse_column",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr.column1@primary",
"name": "column1",
"comment": "ziduan1",
"type": "string",
"table": {
"typeName": "clickhouse_table",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr@primary"
},
"guid": "-28224574948884003",
"version": 0,
"proxy": false
}
},
"guid": "-28224574948884005",
"version": 0,
"proxy": false
},
{
"typeName": "clickhouse_column",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr.column2@primary",
"name": "column2",
"comment": "ziduan2",
"type": "string",
"table": {
"typeName": "clickhouse_table",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr@primary"
},
"guid": "-28224574948884003",
"version": 0,
"proxy": false
}
},
"guid": "-28224574948884006",
"version": 0,
"proxy": false
}
],
"name": "wuxl_0316_rr",
"comment": "测试表",
"parameters": {
"transient_lastDdlTime": "1709003223"
},
"db": {
"typeName": "clickhouse_db",
"attributes": {
"owner": "bi",
"ownerType": "USER",
"qualifiedName": "bi_app@primary",
"clusterName": "primary",
"name": "bi_app",
"description": "",
"location": "hdfs://HDFS80727/bi/bi_app.db",
"parameters": {
}
},
"guid": "-28224574948884001",
"version": 0,
"proxy": false
}
},
"guid": "-28224574948884003",
"version": 0,
"proxy": false,
"relationships":{
"typeName":"clickhouse_table_db",
"db":["-28224574948884001"],
"end1": {
"typeName": "clickhouse_table",
"guid": "-28224574948884003"
},
"end2": {
"typeName": "clickhouse_db",
"guid": "-28224574948884001"
}
}
},
{
"typeName": "clickhouse_db",
"attributes": {
"owner": "bi",
"ownerType": "USER",
"qualifiedName": "bi_app@primary",
"clusterName": "primary",
"name": "bi_app",
"description": "",
"location": "hdfs://HDFS80727/bi/bi_app.db",
"parameters": {
}
},
"guid": "-28224574948884001",
"version": 0,
"proxy": false,
"relationships":{}
},
{
"typeName": "clickhouse_storagedesc",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr@primary_storage",
"name": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
"location": "hdfs://HDFS80727/bi/bi_app.db/wuxl_0316_rr",
"compressed": false,
"inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
"outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
"parameters": {
"serialization.format": "1"
}
},
"guid": "-28224574948884002",
"version": 0,
"proxy": false,
"relationships":{
"typeName":"clickhouse_table_storagedesc",
"table":["-28224574948884003"],
"end1": {
"typeName": "clickhouse_storagedesc",
"guid": "-28224574948884002"
},
"end2": {
"typeName": "clickhouse_table",
"guid": "-28224574948884003"
}
}
},
{
"typeName": "clickhouse_column",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr.column1@primary",
"name": "column1",
"comment": "ziduan1",
"type": "string",
"table": {
"typeName": "clickhouse_table",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr@primary"
},
"guid": "-28224574948884003",
"version": 0,
"proxy": false
}
},
"guid": "-28224574948884005",
"version": 0,
"proxy": false,
"relationships":{
"typeName":"clickhouse_table_columns",
"table":["-28224574948884003"],
"end1": {
"typeName": "clickhouse_column",
"guid": "-28224574948884005"
},
"end2": {
"typeName": "clickhouse_table",
"guid": "-28224574948884003"
}
}
},
{
"typeName": "clickhouse_column",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr.column2@primary",
"name": "column2",
"comment": "ziduan2",
"type": "string",
"table": {
"typeName": "clickhouse_table",
"attributes": {
"qualifiedName": "bi_app.wuxl_0316_rr@primary"
},
"guid": "-28224574948884003",
"version": 0,
"proxy": false
}
},
"guid": "-28224574948884006",
"version": 0,
"proxy": false,
"relationships":{
"typeName":"clickhouse_table_columns",
"table":["-28224574948884003"],
"end1": {
"typeName": "clickhouse_column",
"guid": "-28224574948884006"
},
"end2": {
"typeName": "clickhouse_table",
"guid": "-28224574948884003"
}
}
}
]}
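Writing this JSON by hand for every table does not scale, so here is a hedged sketch of generating it. It follows the `qualifiedName` convention visible above (`db@cluster`, `db.table@cluster`, `db.table.column@cluster`) and links entities through negative temporary guids. The function names, the trimmed attribute set, and the `/api/atlas/v2/entity/bulk` endpoint are my assumptions; the post does not name the API it uses, so verify the endpoint against your own Atlas API documentation.

```python
import base64
import itertools
import json
import urllib.request

# Assumed endpoint for bulk entity creation; verify it in your Atlas API docs.
BULK_ENTITY_PATH = "/api/atlas/v2/entity/bulk"

def build_table_payload(db, table, columns, cluster="primary"):
    """Build an {"entities": [...]} body for one table and its columns."""
    temp_ids = (str(-n) for n in itertools.count(1))  # "-1", "-2", ...
    db_guid, table_guid = next(temp_ids), next(temp_ids)

    db_entity = {
        "typeName": "clickhouse_db",
        "guid": db_guid,
        "attributes": {"qualifiedName": f"{db}@{cluster}", "name": db,
                       "clusterName": cluster},
    }
    column_entities = [
        {
            "typeName": "clickhouse_column",
            "guid": next(temp_ids),
            "attributes": {
                "qualifiedName": f"{db}.{table}.{col['name']}@{cluster}",
                "name": col["name"],
                "type": col["type"],
                "comment": col.get("comment", ""),
                # Point back at the table through its temporary guid.
                "table": {"typeName": "clickhouse_table", "guid": table_guid},
            },
        }
        for col in columns
    ]
    table_entity = {
        "typeName": "clickhouse_table",
        "guid": table_guid,
        "attributes": {
            "qualifiedName": f"{db}.{table}@{cluster}",
            "name": table,
            "db": {"typeName": "clickhouse_db", "guid": db_guid},
            "columns": [{"typeName": "clickhouse_column", "guid": c["guid"]}
                        for c in column_entities],
        },
    }
    return {"entities": [db_entity, table_entity] + column_entities}

def post_entities(payload, base_url="http://localhost:21000",
                  user="admin", password="admin"):
    """POST the payload with HTTP basic auth (not called here; needs a live Atlas)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        base_url + BULK_ENTITY_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

payload = build_table_payload("bi_app", "wuxl_0316_rr", [
    {"name": "column1", "type": "string", "comment": "ziduan1"},
    {"name": "column2", "type": "string", "comment": "ziduan2"},
])
print(json.dumps(payload, indent=2))
```

The sketch omits the storage descriptor and the owner attributes to stay short; they can be added to the `attributes` dictionaries in the same way.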
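2. Kafka
Write messages directly to Atlas's built-in "ATLAS_HOOK" topic; Atlas parses them and creates the entities and the relationships between them.
-- Use Flink SQL to write messages to Atlas's built-in topic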
CREATE TABLE ads_zdm_offsite_platform_daren_rank_df_to_kafka (
data string
) WITH (
'connector' = 'kafka',
'topic' = 'ATLAS_HOOK',
'properties.bootstrap.servers' = 'localhost:9092',
'format' = 'raw'
);
insert into ads_zdm_offsite_platform_daren_rank_df_to_kafka
select '{"version":{"version":"1.0.0","versionParts":[1]},"msgCompressionKind":"NONE","msgSplitIdx":1,"msgSplitCount":1,"msgSourceIP":"10.45.1.116","msgCreatedBy":"bi","msgCreationTime":1710575827820,"message":{"type":"ENTITY_CREATE_V2","user":"bi","entities":{"entities":[{"typeName":"clickhouse_table","attributes":{"owner":"bi","ownerType":"USER","sd":{"typeName":"clickhouse_storagedesc","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary_storage","name":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","location":"hdfs://HDFS80727/bi/test.db/wuxl_0316_ss","compressed":false,"inputFormat":"org.apache.hadoop.mapred.TextInputFormat","outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","parameters":{"serialization.format":"1"}},"guid":"-861237351166887","version":0,"proxy":false},"tableType":"MANAGED","createTime":1710575827000,"qualifiedName":"test.wuxl_0316_ss@primary","columns":[{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_1@primary","name":"column_tt_1","comment":"测试字段1","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166890","version":0,"proxy":false},{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_2@primary","name":"column_tt_2","comment":"测试字段2","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166891","version":0,"proxy":false}],"name":"wuxl_0316_ss","comment":"测试表","parameters":{"transient_lastDdlTime":"1710575827"},"db":{"typeName":"clickhouse_db","attributes":{"owner":"bi","ownerType":"USER","qualifiedName":"test@primary","clusterName":"primary","name":"test","description":"","location":"hdfs://HDFS80727/bi/test.db","parameters":{}},"guid":"-861237351166886","version":0,
"proxy":false}},"guid":"-861237351166888","version":0,"proxy":false},{"typeName":"clickhouse_db","attributes":{"owner":"bi","ownerType":"USER","qualifiedName":"test@primary","clusterName":"primary","name":"test","description":"","location":"hdfs://HDFS80727/bi/test.db","parameters":{}},"guid":"-861237351166886","version":0,"proxy":false},{"typeName":"clickhouse_storagedesc","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary_storage","name":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","location":"hdfs://HDFS80727/bi/test.db/wuxl_0316_ss","compressed":false,"inputFormat":"org.apache.hadoop.mapred.TextInputFormat","outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","parameters":{"serialization.format":"1"}},"guid":"-861237351166887","version":0,"proxy":false},{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_1@primary","name":"column_tt_1","comment":"测试字段1","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166890","version":0,"proxy":false},{"typeName":"clickhouse_column","attributes":{"qualifiedName":"test.wuxl_0316_ss.column_tt_2@primary","name":"column_tt_2","comment":"测试字段2","type":"string","table":{"typeName":"clickhouse_table","attributes":{"qualifiedName":"test.wuxl_0316_ss@primary"},"guid":"-861237351166888","version":0,"proxy":false}},"guid":"-861237351166891","version":0,"proxy":false}]}}}' as data
;
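The message in the insert statement above is an envelope around the same `entities` structure used with the API. As a sketch, the envelope could be assembled as below. The field names and constant values are copied from that message; the function name `hook_message` and the sample entity are hypothetical, and actually sending the string to Kafka is left to whichever producer you already use.

```python
import json
import time

def hook_message(entities_payload, user, source_ip="127.0.0.1"):
    """Wrap an {"entities": [...]} payload in the ATLAS_HOOK message envelope."""
    return {
        "version": {"version": "1.0.0", "versionParts": [1]},
        "msgCompressionKind": "NONE",
        "msgSplitIdx": 1,
        "msgSplitCount": 1,
        "msgSourceIP": source_ip,
        "msgCreatedBy": user,
        "msgCreationTime": int(time.time() * 1000),  # epoch milliseconds
        "message": {
            "type": "ENTITY_CREATE_V2",
            "user": user,
            "entities": entities_payload,
        },
    }

# Hypothetical minimal payload: a single database entity with a temporary guid.
entities_payload = {"entities": [{
    "typeName": "clickhouse_db",
    "guid": "-1",
    "attributes": {"qualifiedName": "test@primary", "name": "test",
                   "clusterName": "primary"},
}]}

message = hook_message(entities_payload, user="bi")
# Compact single-line JSON, suitable as one Kafka record value.
record_value = json.dumps(message, ensure_ascii=False, separators=(",", ":"))
print(record_value)
```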
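V. Additional notes
Pay attention to the guid values in step four. Because the entities are new, Atlas has not yet assigned them a guid (globally unique ID), so you can generate an ID of the form "-xxxxxxxxx" yourself. This generated ID is never written to Atlas; its only purpose is to express the relationships between entities, such as which table a column belongs to. When you create a column entity, specify the temporary guid of its table, and Atlas will then create the corresponding relationship when it creates the entities.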
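Since the temporary guids only have meaning inside one request, a mistyped reference links to nothing. A small local check, sketched below, can catch that before sending. The helper `find_dangling_guids` is hypothetical and assumes the payload shape used in step four: entities under `"entities"`, with references nested under `"attributes"`.

```python
def find_dangling_guids(payload):
    """Return temporary guids that are referenced but not defined by any entity."""
    defined = {entity["guid"] for entity in payload["entities"]}
    referenced = set()

    def walk(value):
        # Collect every negative (temporary) guid found in nested references.
        if isinstance(value, dict):
            guid = value.get("guid")
            if isinstance(guid, str) and guid.startswith("-"):
                referenced.add(guid)
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    for entity in payload["entities"]:
        walk(entity.get("attributes", {}))
    return referenced - defined

# The column refers to table "-3", but the table entity is defined as "-2".
payload = {"entities": [
    {"typeName": "clickhouse_table", "guid": "-2",
     "attributes": {"qualifiedName": "test.t@primary", "name": "t"}},
    {"typeName": "clickhouse_column", "guid": "-4",
     "attributes": {"qualifiedName": "test.t.c@primary", "name": "c",
                    "table": {"typeName": "clickhouse_table", "guid": "-3"}}},
]}
print(find_dangling_guids(payload))
```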