The first thing I did is load the document and namespace where package path is the location of the package.
doc.Load(Variables.PackagePath); XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable); nsmgr.AddNamespace("DTS", "www.microsoft.com/SqlServer/Dts");
Then I retrieved each node with a lineageid and name using an xpath query.
foreach (XmlNode childnode in doc.SelectNodes("//*[@lineageId != '' and @name != '']"))
I then pulled out the TaskName, ColumnName, and LineageID. I included the TaskName because LineageIDs are not unique within a package.
XmlNode ExecutableNode = childnode.SelectSingleNode("ancestor::DTS:Executable[1]", nsmgr); TaskName = ExecutableNode.SelectSingleNode("DTS:Property[@DTS:Name='ObjectName']", nsmgr).InnerText; ColumnName = childnode.Attributes["name"].Value; LineageID = Convert.ToInt32(childnode.Attributes["lineageId"].Value);
I then further deduped within the task by only inserting unique taskname, columnname, and lineageid pairs using a hashtable
DistinctColumnKey =LineageID+ColumnName+TaskName;
if (!DistinctColumn.ContainsKey(DistinctColumnKey))
{
DistinctColumn.Add(DistinctColumnKey,DBNull.Value);
ColumnNamesBuffer.AddRow();
ColumnNamesBuffer.LineageID = LineageID;
ColumnNamesBuffer.ColumnName = ColumnName;
ColumnNamesBuffer.TaskName = TaskName;
}
I usually insert the column names into a cache connection manager and then use throughout the package. Here’s the entire script.
public override void CreateNewOutputRows() { Int32 LineageID; String ColumnName; String TaskName; XmlDocument doc = new XmlDocument(); Hashtable DistinctColumn = new Hashtable(); String DistinctColumnKey; doc.Load(Variables.PackagePath); XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable); nsmgr.AddNamespace("DTS", "www.microsoft.com/SqlServer/Dts"); foreach (XmlNode childnode in doc.SelectNodes("//*[@lineageId != '' and @name != '']")) { XmlNode ExecutableNode = childnode.SelectSingleNode("ancestor::DTS:Executable[1]", nsmgr); TaskName = ExecutableNode.SelectSingleNode("DTS:Property[@DTS:Name='ObjectName']", nsmgr).InnerText; ColumnName = childnode.Attributes["name"].Value; LineageID = Convert.ToInt32(childnode.Attributes["lineageId"].Value); DistinctColumnKey =LineageID+ColumnName+TaskName; if (!DistinctColumn.ContainsKey(DistinctColumnKey)) { DistinctColumn.Add(DistinctColumnKey,DBNull.Value); ColumnNamesBuffer.AddRow(); ColumnNamesBuffer.LineageID = LineageID; ColumnNamesBuffer.ColumnName = ColumnName; ColumnNamesBuffer.TaskName = TaskName; } } }
Hi, I think this is what I'm looking for, but too new to know! Is this a script task or script component? Whichever it is, where does it go? What is a cache connection manager? How/where is that set up? Once it is loaded with this data, how do I get it out? Thank you.
ReplyDeleteHi Angelia ,
ReplyDeleteCan you pls share the package example .
Regards,
Val
ColumnNamesBuffer doesnot contain in context...what is this error? plz tell
ReplyDelete