The Duplicate Detector Component is an SSIS data flow pipeline component that can be used to compare rows within a single data source and identify duplicate rows. Comparison filters can be added for fields in the source to determine how they will be compared. Each row is compared to other rows based on the specified filters, is determined to be either unique or a duplicate, and is directed to the applicable output. Comparisons can be either exact or approximate (fuzzy).

The component includes three pages to configure how you want to read data. The General page of the Duplicate Detector Component allows you to specify the general settings of the component. When a new Duplicate Detector component is added, it will not have any filters. Click the 'Add Filter' button to add one.

Each filter has three properties: column name, match type and similarity threshold. The column name is the name of the column that will be compared. Column types other than IMAGE and BYTES can be used in filters.

The match type of a filter can be set to either an Exact Match, the generic Fuzzy Match, or a more specific match type. Exact match columns must be matched exactly to be considered duplicates. Fuzzy match columns must match above a specified similarity threshold. If you select Fuzzy Match as the match type, the similarity threshold will be enabled. Only string types such as STR, NTEXT, TEXT and WSTR can be used with a fuzzy match filter; a warning will be shown if any other type is selected.

Exact Match: Fields must be exactly the same to be considered a match. The Similarity Threshold slider is disabled because matches must always have a similarity of 1.00.
Fuzzy Match: Strings are considered a match if their similarity is above the specified similarity threshold.
Address Match: Strings are treated as street addresses, and common street name prefixes and directions are treated as equivalent.
First Name Match: Strings are treated as first names, and common nicknames are treated as the same name. For example, Bill and William would be an exact match.
Company Name: Strings are treated as company names, and common company suffixes such as INC, CORP and LLC are ignored when comparing.
Phone Number Match: Strings are parsed to attempt to identify a 10 digit phone number which is used for comparison, ignoring any other characters.
US Zip Code Match: Strings are parsed to attempt to identify a 5 digit zip code which is used for comparison, ignoring other characters.
Country Match (since v7.1): Strings are treated as country names, and country codes are matched with English short country names. For example, Canada and CA would be an exact match.
US State Match (since v7.1): Strings are treated as states, and codes are matched with the subdivision names.
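To make the idea of comparing rows through filters more concrete, here is a minimal Python sketch of a few of these comparisons outside SSIS. It is not the component's implementation, which is not described here. The function names, the suffix list, the use of `difflib.SequenceMatcher` as the similarity measure, the inclusive reading of the threshold, and the rule that the first row of a matching group counts as unique are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix list; the text above names only INC, CORP and LLC as examples.
COMPANY_SUFFIXES = {"INC", "CORP", "LLC"}


def zip_key(text):
    """Return the first standalone 5-digit run, or None if there is none."""
    found = re.search(r"(?<!\d)\d{5}(?!\d)", text)
    return found.group(0) if found else None


def phone_key(text):
    """Keep digits only and return the last 10, or None if fewer than 10."""
    digits = re.sub(r"\D", "", text)
    return digits[-10:] if len(digits) >= 10 else None


def company_key(text):
    """Uppercase, drop punctuation, and drop the listed suffix words."""
    words = re.findall(r"[A-Z0-9]+", text.upper())
    return " ".join(w for w in words if w not in COMPANY_SUFFIXES)


def fuzzy_match(a, b, threshold):
    """Compare case-insensitively; 'above the threshold' is read as >= here."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def split_rows(rows, same):
    """Send each row to 'unique' or 'duplicates' using the predicate same(a, b).

    Assumption: the first row of a matching group counts as unique.
    """
    unique, duplicates = [], []
    for row in rows:
        is_dup = any(same(row, kept) for kept in unique)
        (duplicates if is_dup else unique).append(row)
    return unique, duplicates


# Two filters combined with AND (also an assumption): company name and phone.
def same_company_and_phone(a, b):
    return (company_key(a["company"]) == company_key(b["company"])
            and phone_key(a["phone"]) == phone_key(b["phone"]))


rows = [
    {"company": "Acme, Inc.", "phone": "(555) 123-4567"},
    {"company": "ACME Corp", "phone": "+1 555.123.4567"},
    {"company": "Globex LLC", "phone": "555-987-6543"},
]
unique, duplicates = split_rows(rows, same_company_and_phone)
```

In this sketch the second row lands in `duplicates` because both of its fields reduce to the same keys as the first row, while the third row stays in `unique`. The specialised match types behave like an exact match on a normalised key, which fits the note above that the similarity threshold only applies to the generic Fuzzy Match.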