一、动态分区的好处就不说了,随着时间的推移,不可能一个度量值组都放在一个分区中,处理速度非常慢,如何动态添加分区,如何动态处理分区,成为了很多新手BI工程师一个头痛的问题,废话不多说,分享一下我的经验。
二、首先讲一下大致的流程,主要是通过SSIS进行任务的处理,本文主要是按照月进行分区,当然分区的规则大家可以根据自己的需求制定。
该包用到的所有变量
三、对上面四个步骤分别讲解一下。
1、得到所有分区:
①、主要设置如下图
②、输出的结果集应该传给变量Partitions
③、SQLStatement为:(主要依据创建分区的语句中需要的参数的值)
SELECT 'RmyyHisDW' AS DataSoureID,--数据源
'RmyyMZ' AS CubeName,--分区来自哪一个cube
'RmyyMZ' AS CubeID,
'Fact Mz Visit Table' AS MeasureGroup,--指定是一个度量值组
'Fact Mz Visit Table' AS MeasureGroupID,
'Fact Mz Visit Table' + Cast(MonthInfo.YearMonth AS VARCHAR(6)) AS Partition,--分区名称=度量值组名称+年月
'SELECT [dbo].[fact_mz_visit_table].[patient_id],
[dbo].[fact_mz_visit_table].[times],
[dbo].[fact_mz_visit_table].[name],
[dbo].[fact_mz_visit_table].[age],
[dbo].[fact_mz_visit_table].[ampm],
[dbo].[fact_mz_visit_table].[charge_type],
[dbo].[fact_mz_visit_table].[clinic_type],
[dbo].[fact_mz_visit_table].[contract_code],
[dbo].[fact_mz_visit_table].[visit_dept],
[dbo].[fact_mz_visit_table].[doctor_code],
[dbo].[fact_mz_visit_table].[gh_date],
[dbo].[fact_mz_visit_table].[gh_date_time],
[dbo].[fact_mz_visit_table].[gh_opera],
[dbo].[fact_mz_visit_table].[haoming_code],
[dbo].[fact_mz_visit_table].[icd_code],
[dbo].[fact_mz_visit_table].[icd_code1],
[dbo].[fact_mz_visit_table].[icd_code2],
[dbo].[fact_mz_visit_table].[icd_code3],
[dbo].[fact_mz_visit_table].[response_type],
[dbo].[fact_mz_visit_table].[visit_date],
[dbo].[fact_mz_visit_table].[visit_date_time],
[dbo].[fact_mz_visit_table].[visit_flag]
FROM [dbo].[fact_mz_visit_table]
WHERE visit_flag <> 9 and where_clause' AS SQL,--要进行分区的SQL
cast(MinDateKey as varchar(8)) as MinDateKey,--最小datekey
cast(MaxDateKey as varchar(8)) as MaxDateKey--最大datekey
FROM (SELECT t1.YearMonth,
(SELECT Min(datekey)
FROM dim_date t2
WHERE CONVERT(VARCHAR(6), t2.Date, 112) = t1.YearMonth) AS MinDateKey,
(SELECT Max(datekey)
FROM dim_date t2
WHERE CONVERT(VARCHAR(6), t2.Date, 112) = t1.YearMonth) AS MaxDateKey
FROM (SELECT DISTINCT CONVERT(VARCHAR(6), Date, 112) AS YearMonth
FROM dim_date) AS t1) MonthInfo
WHERE EXISTS(SELECT *
FROM fact_mz_visit_table
WHERE visit_date BETWEEN MonthInfo.MinDateKey AND MonthInfo.MaxDateKey)
注意:SQL字段中最后面有个where_clause ,在“判断分区脚本任务”中的C#脚本中会替换成后面的where条件,也就是将MinDateKey和MaxDateKey加入条件限制,进行分区。
④、步骤③执行的结果为
2、Foreach 循环容器(主要循环执行上面的sql语句执行的结果)相关设置如下图
注意:变量映射按照sql语句中的字段名的顺序
3、判断分区是否存在,主要是通过步骤2中传出的参数判断cube中是否有该分区,有则不创建,无则通过Anaysis Services执行DDL任务来创建。
①、具体设置如下:
②、点击编辑脚本任务
需要引用AMO
③主要代码为
/*
Microsoft SQL Server Integration Services Script Task
Write scripts using Microsoft Visual C# 2008.
The ScriptMain is the entry point class of the script.
*/ using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using Microsoft.AnalysisServices; namespace ST_f33f263fa3864817a3291fc4715774d3.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{ #region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion /*
The execution engine calls this method when the task executes.
To access the object model, use the Dts property. Connections, variables, events,
and logging features are available as members of the Dts property as shown in the following examples. To reference a variable, call Dts.Variables["MyCaseSensitiveVariableName"].Value;
To post a log entry, call Dts.Log("This is my log text", 999, null);
To fire an event, call Dts.Events.FireInformation(99, "test", "hit the help message", "", 0, true); To use the connections collection use something like the following:
ConnectionManager cm = Dts.Connections.Add("OLEDB");
cm.ConnectionString = "Data Source=localhost;Initial Catalog=AdventureWorks;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;"; Before returning from this method, set the value of Dts.TaskResult to indicate success or failure. To open Help, press F1.
*/ public void Main()
{
// TODO: Add your code here
// Dts.TaskResult = (int)ScriptResults.Success;
//将参数赋给变量
String sPartition = (String)Dts.Variables["Partition"].Value;
String sCubeName = (String)Dts.Variables["CubeName"].Value;
String sMeasureGroup = (String)Dts.Variables["MeasureGroup"].Value;
String sServer = "localhost";
String sDataBaseID = (String)Dts.Variables["DatabaseID"].Value; String sCubeID = (String)Dts.Variables["CubeID"].Value;
String sMeasureGroupID = (String)Dts.Variables["MeasureGroupID"].Value;
String sDataSoureID = (String)Dts.Variables["DataSoureID"].Value;
String sSQL = (String)Dts.Variables["SQL"].Value;
String sMaxDateKey = (String)Dts.Variables["MaxDateKey"].Value;
String sMinDateKey = (String)Dts.Variables["MinDateKey"].Value;
string aSql = sSQL.Replace("where_clause", "visit_date >=" + sMinDateKey + " and visit_date <=" + sMaxDateKey); ConnectionManager cm = Dts.Connections.Add("MSOLAP100");
cm.ConnectionString = "Provider=MSOLAP.4;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=" + sDataBaseID; Microsoft.AnalysisServices.Server aServer = new Server();
aServer.Connect(sServer);
Microsoft.AnalysisServices.Database aDatabase = aServer.Databases.FindByName(sDataBaseID);
Microsoft.AnalysisServices.Cube aCube = aDatabase.Cubes.FindByName(sCubeName);
Microsoft.AnalysisServices.MeasureGroup aMeasureGroup = aCube.MeasureGroups.FindByName(sMeasureGroup);
//判断分区是否存在
if (aMeasureGroup.Partitions.Contains(sPartition))
{
Dts.Variables["IsNetePresent"].Value = false;
Dts.Variables["Xmla_Script"].Value = "";
Dts.TaskResult = (int)ScriptResults.Success;
}
else
{
Dts.Variables["IsNetePresent"].Value = true;
Dts.Variables["Xmla_Script"].Value =
"<Create xmlns=\"http://schemas.microsoft.com/analysisservices/2003/engine\">"
+ "<ParentObject>"
+ "<DatabaseID>" + sDataBaseID + "</DatabaseID>"
+ "<CubeID>" + sCubeID + "</CubeID>"
+ "<MeasureGroupID>" + sMeasureGroupID + "</MeasureGroupID>"
+ "</ParentObject>"
+ "<ObjectDefinition>"
+ "<Partition xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" "
+"xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:ddl2=\"http://schemas.microsoft.com/analysisservices/2003/engine/2\" xmlns:ddl2_2=\"http://schemas.microsoft.com/analysisservices/2003/engine/2/2\" xmlns:ddl100_100=\"http://schemas.microsoft.com/analysisservices/2008/engine/100/100\" xmlns:ddl200=\"http://schemas.microsoft.com/analysisservices/2010/engine/200\" xmlns:ddl200_200=\"http://schemas.microsoft.com/analysisservices/2010/engine/200/200\">"
+ "<ID>" + sPartition + "</ID>"
+ "<Name>" + sPartition + "</Name>"
+ "<Source xsi:type=\"QueryBinding\">"
+ "<DataSourceID>" + sDataSoureID + "</DataSourceID>"
+ "<QueryDefinition>" + aSql + "</QueryDefinition>"
+ "</Source>"
+ "<StorageMode>Molap</StorageMode> <ProcessingMode>Regular</ProcessingMode>"
+ "<ProactiveCaching> <SilenceInterval>-PT1S</SilenceInterval> <Latency>-PT1S</Latency> <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval> <ForceRebuildInterval>-PT1S</ForceRebuildInterval>"
+ "<Source xsi:type=\"ProactiveCachingInheritedBinding\" /> </ProactiveCaching>"
+ "</Partition>"
+ "</ObjectDefinition>"
+ "</Create>";
Dts.TaskResult = (int)ScriptResults.Success;
}
}
}
}
④、判断是否执行下一步
4、不存在创建分区(主要执行步骤3传过来的Xmla_Script),具体设置如下:
5、执行任务,查看结果: