Overview
Knowlix Data Quality Management maintains database integrity through automated and manual cleanup processes. Remove duplicate records, archive outdated information, standardize text formatting, and ensure consistent data quality across all modules.
Three core capabilities address common data quality challenges:
- Deduplication: Identify and merge duplicate contacts, leads, opportunities, products, or other records. Automated similarity matching finds near-duplicates with configurable matching rules. Merge duplicates while preserving related records and communication history.
- Record Recycling: Automatically archive or delete obsolete records based on age and inactivity. Configure rules to clean old archived leads, expired opportunities, inactive products, or historical data no longer needed for operations.
- Field Formatting: Standardize text data across the database. Trim extra spaces, enforce capitalization rules, format phone numbers to international standards, and convert HTML content to plain text. Maintain consistent formatting without manual editing.
💡 Pro Tip: Ask Your Knowlix for data quality tasks: “Find duplicate contacts” or “Show me archived opportunities older than 1 year”
Deduplication System
Identify and merge duplicate records to eliminate data redundancy.
Viewing Duplicates
The Duplicates dashboard groups similar records detected by deduplication rules.
To access duplicates: Navigate to Data Cleaning → Deduplication. The left sidebar shows active deduplication rules with the count of duplicates detected by each rule.
Duplicate groupings display:
- Similarity Rating: Percentage match between records (higher = more similar)
- Created On: Original record creation date
- Name: Record title or identifier
- Field Values: Values from fields used for duplicate detection
- Used In: Other models referencing this record
- ID: Unique database identifier
- Is Master: Designates which record becomes the merge target
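The Similarity Rating is a percentage, and a rule's thresholds decide what happens to a pair at a given rating. The sketch below is only an illustration: this page does not document how Knowlix computes similarity, so Python's `difflib.SequenceMatcher` stands in as an assumed measure, and the function names `similarity_rating` and `triage` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_rating(a: str, b: str) -> int:
    """Return a 0-100 percentage describing how alike two field values are.

    Assumption: a generic character-based ratio, not Knowlix's actual algorithm.
    """
    # Ignore case and surrounding whitespace before comparing.
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return round(ratio * 100)


def triage(rating: int, suggestion_threshold: int, merge_threshold: int,
           automatic: bool = True) -> str:
    """Decide what to do with a pair of records at a given rating."""
    if rating < suggestion_threshold:
        return "ignore"      # below the suggestion threshold: no suggestion
    if automatic and rating >= merge_threshold:
        return "auto-merge"  # Automatic mode only
    return "suggest"         # shown on the dashboard for manual review


rating = similarity_rating("Lorraine Douglas", "lorraine douglas ")
print(rating, triage(rating, suggestion_threshold=70, merge_threshold=90))
```

Keeping the rating and the decision in separate functions makes it easy to try different thresholds against the same data before switching a rule to Automatic.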
Merging Duplicate Records
To merge duplicates:
- Review the grouped similar records
- Select one record as the master (the record that will remain after merging)
- Click Merge at the top of the grouping
- Click OK to confirm
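Conceptually, a merge keeps the master and re-points everything that referenced the duplicates. The following is a minimal sketch of that idea, assuming a simplified in-memory `Record` type; the class, its fields, and `merge_group` are hypothetical and do not reflect how Knowlix stores or merges data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    id: int
    name: str
    active: bool = True
    # IDs of related items (messages, orders, ...) that point at this record.
    related_ids: list[int] = field(default_factory=list)


def merge_group(records: list[Record], master_id: int) -> Record:
    """Merge every record in a duplicate group into the chosen master."""
    master = next(r for r in records if r.id == master_id)
    for rec in records:
        if rec is master:
            continue
        # Re-point related items at the master so history is preserved.
        master.related_ids.extend(rec.related_ids)
        rec.related_ids = []
        # Archive the duplicate; a rule set to delete would remove it instead.
        rec.active = False
    return master


group = [Record(1, "Lorraine Douglas", related_ids=[10]),
         Record(2, "lorraine douglas", related_ids=[11, 12])]
master = merge_group(group, master_id=1)
print(master.related_ids)  # related items of both records, now on the master
```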
Deduplication Rules
Configure rules defining how duplicate detection works.
To manage deduplication rules: Navigate to Data Cleaning → Configuration → Deduplication.
Creating or editing rules:
- Click New or select an existing rule
- Configure rule settings:
  - Model: Which record type to scan (Contacts, Leads, Products, etc.)
  - Domain: (Optional) Filter which records within the model are eligible
  - Duplicate Removal: Whether merged duplicates are archived or deleted
  - Merge Mode: Manual (requires review) or Automatic (merges without approval)
  - Similarity Threshold: (Automatic mode only) Minimum similarity percentage for auto-merge
  - Active: Enable or disable the rule
- Add at least one deduplication rule line:
  - Unique ID Field: Which field to compare (Name, Email, Phone, etc.)
  - Match If: Matching criteria (Exact Match or Case/Accent Insensitive Match)
- Save the rule
Running rules:
- Rules run automatically daily via scheduled tasks
- Manually trigger any rule by opening it and clicking Deduplicate
- The Duplicates smart button shows how many duplicates were found
Additional settings:
- Suggestion Threshold: Minimum similarity to suggest duplicates (below this threshold, no suggestion)
- Cross-Company: (Multi-company databases) Allow duplicate detection across companies
Example rule:
- Model: Contact
- Domain: (empty - all contacts)
- Duplicate Removal: Archive
- Merge Mode: Manual
- Unique ID Field: Email
- Match If: Case/Accent Insensitive Match
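To make the difference between the two Match If options concrete, here is a small sketch that groups contacts by email. It assumes records are plain dictionaries and that "Case/Accent Insensitive Match" means comparing values after removing accents and folding case; `match_key` and `find_duplicates` are hypothetical helper names, not part of Knowlix.

```python
import unicodedata
from collections import defaultdict


def match_key(value: str, insensitive: bool) -> str:
    """Build the comparison key for one rule line."""
    if not insensitive:
        return value  # Exact Match: compare the raw value
    # Decompose accented characters, drop the combining marks, then fold case.
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def find_duplicates(records: list[dict], field_name: str,
                    insensitive: bool) -> list[list[dict]]:
    """Return groups of records that share the same key for field_name."""
    groups = defaultdict(list)
    for rec in records:
        value = rec.get(field_name)
        if value:  # records with an empty field are never grouped
            groups[match_key(value, insensitive)].append(rec)
    return [group for group in groups.values() if len(group) > 1]


contacts = [
    {"id": 1, "email": "José@Example.com"},
    {"id": 2, "email": "jose@example.com"},
    {"id": 3, "email": "ann@example.com"},
]
print(find_duplicates(contacts, "email", insensitive=True))   # ids 1 and 2 grouped
print(find_duplicates(contacts, "email", insensitive=False))  # no groups
```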
Record Recycling
Automatically identify and remove outdated records.
Viewing Recyclable Records
The Recycle Records dashboard lists records matching recycle rule criteria.
To access recyclable records: Navigate to Data Cleaning → Recycle Records. The left sidebar displays active recycle rules.
Record display columns:
- Record ID: Unique identifier
- Record Name: Record title
Recycling Records
To recycle a record: Click the Validate button on the record row. The record is archived or deleted based on rule configuration.
Discarding records: Click Discard if a record shouldn’t be recycled. Discarded records won’t appear in future scans by that rule.
Viewing discarded records: Use the “Discarded” search filter to see previously discarded records.
Recycle Rules
Configure rules for identifying outdated records.
To create recycle rules: Navigate to Data Cleaning → Configuration → Recycle Records. No default rules exist. Click New to create one.
Rule configuration:
- Select Model (record type to scan)
- (Optional) Set Filter to narrow eligible records
- Configure time criteria:
  - Time Field: Which date field determines record age (Last Updated, Created Date, etc.)
  - Delta: Time period value (e.g., 7, 30, 365)
  - Delta Unit: Time unit (Days, Weeks, Months, Years)
- Choose Recycle Mode:
  - Manual: Requires review before recycling
  - Automatic: Recycles without approval
- Select Recycle Action:
  - Archive: Move records to archived state
  - Delete: Permanently remove from database
- (Delete only) Include Archived: Also delete already-archived records matching criteria
- Save the rule
Example rule:
- Model: Lead/Opportunity
- Filter: Active is not set AND Lost Reason is “Too Expensive”
- Time Field: Last Updated on
- Delta: 1
- Delta Unit: Years
- Recycle Mode: Automatic
- Recycle Action: Delete
- Include Archived: Enabled
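The time criteria can be read as a cutoff date: a record is eligible once its Time Field value is older than today minus Delta units. The sketch below illustrates that arithmetic only. It assumes calendar-based months and years with the day clamped to the end of shorter months, and a strict "older than" comparison; Knowlix may handle these edge cases differently, and the function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def cutoff_date(today: date, delta: int, unit: str) -> date:
    """Return the date before which a record counts as outdated."""
    if unit == "days":
        return today - timedelta(days=delta)
    if unit == "weeks":
        return today - timedelta(weeks=delta)
    if unit in ("months", "years"):
        months = delta * (12 if unit == "years" else 1)
        # Step back whole calendar months, clamping the day (e.g. 31 -> 29).
        total = today.year * 12 + (today.month - 1) - months
        year, month_index = divmod(total, 12)
        month = month_index + 1
        day = min(today.day, calendar.monthrange(year, month)[1])
        return date(year, month, day)
    raise ValueError(f"unknown delta unit: {unit}")


def is_recyclable(time_field_value: date, today: date, delta: int, unit: str) -> bool:
    """True when the record's time field is strictly older than the cutoff."""
    return time_field_value < cutoff_date(today, delta, unit)


# A lead last updated in May 2022, checked on 1 January 2024 with Delta 1, Unit Years.
print(is_recyclable(date(2022, 5, 1), date(2024, 1, 1), 1, "years"))
```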
Field Cleaning and Formatting
Standardize text formatting across database fields.
Viewing Field Cleaning Suggestions
The Field Cleaning Records dashboard displays formatting suggestions.
To access field cleaning: Navigate to Data Cleaning → Field Cleaning. The left sidebar shows active cleaning rules.
Display columns:
- Record ID: Unique identifier
- Record Name: Record title
- Field: Which field contains the value to format
- Current: Existing field value
- Suggested: Proposed formatted value
Applying Formatting Changes
To apply a suggestion: Click the Validate button on the record row. The current value updates to the suggested format.
Discarding suggestions: Click Discard if the current format is correct. The record won’t appear in future scans by that rule.
Viewing discarded records: Use the “Discarded” search filter.
Field Cleaning Rules
Configure formatting standards for database fields.
To manage cleaning rules: Navigate to Data Cleaning → Configuration → Field Cleaning. A default “Contact” rule exists for formatting contact records. Edit it or create new rules.
Creating cleaning rules:
- Click New
- Select Model (record type)
- Click Add a line in the Rules section
- In the popup, configure:
  - Field To Clean: Which field to format
  - Action: Formatting operation to apply
- Choose Cleaning Mode:
  - Manual: Requires review before applying changes
  - Automatic: Applies changes without approval
- Save
Available actions:
- All Spaces: Removes every space character
  - Example: “Dr. John Doe” becomes “Dr.JohnDoe”
- Superfluous Spaces: Removes leading, trailing, and successive spaces
  - Example: “ Dr.  John  Doe ” becomes “Dr. John Doe”
- First Letters to Uppercase: Capitalizes first letter of each word
  - Example: “lumber inc, lorraine douglas” becomes “Lumber Inc, Lorraine Douglas”
- All Uppercase: Converts entire string to capitals
  - Example: “lumber inc” becomes “LUMBER INC”
- All Lowercase: Converts to all lowercase
  - Example: “LUMBER INC” becomes “lumber inc”
- Phone number formatting: Converts phone numbers to the international format
  - Example (Belgium): “061928374” becomes “+32 61 92 83 74”
  - Example (US): “800 555-0101” becomes “+1 800-555-0101”
- HTML to plain text: Converts HTML content to plain text
  - Example: <h1>John Doe</h1><p>Lorem ipsum</p> becomes “John Doe Lorem ipsum”
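The text actions above map onto simple string operations. The sketch below reproduces the documented examples using only the Python standard library; it is an illustration under stated assumptions, not Knowlix's implementation, and every function name is hypothetical. Phone formatting is left out because it depends on country-specific numbering rules.

```python
from html.parser import HTMLParser


def remove_all_spaces(text: str) -> str:
    return text.replace(" ", "")


def trim_superfluous_spaces(text: str) -> str:
    # split() with no argument drops leading/trailing whitespace and collapses runs.
    return " ".join(text.split())


def first_letters_to_uppercase(text: str) -> str:
    # Assumption: only the first letter of each word changes; the rest is untouched.
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


class _TextExtractor(HTMLParser):
    """Collects the text found between HTML tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.chunks)


print(remove_all_spaces("Dr. John Doe"))
print(trim_superfluous_spaces("  Dr.  John   Doe "))
print(first_letters_to_uppercase("lumber inc, lorraine douglas"))
print(html_to_text("<h1>John Doe</h1><p>Lorem ipsum</p>"))
```

Chaining the helpers, for example `first_letters_to_uppercase(trim_superfluous_spaces(name))`, mirrors the recommendation to trim spaces before applying capitalization.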
Merge Action Manager
Control which database models support manual merging.
To access merge settings: Enable developer mode, then navigate to Data Cleaning → Configuration → Merge Action Manager.
Display columns:
- Model: Technical database model name
- Model Description: User-friendly model name
- Type: Base Object (system models) or Custom Object (user-created models)
- Transient Model: Whether the model stores temporary data
- Can Be Merged: Whether the merge action is enabled for this model
Best Practices
Run Deduplication Rules Regularly
Don’t wait for duplicate problems to accumulate. Enable automatic deduplication rules or manually run them weekly to maintain clean data.
Use Manual Mode for Initial Testing
When creating new deduplication or recycle rules, start with Manual mode. Review suggested actions to ensure rules work as intended before switching to Automatic mode.
Choose Appropriate Similarity Thresholds
High thresholds (90%+) reduce false positives but may miss legitimate duplicates. Lower thresholds (70-80%) catch more duplicates but require more manual review. Adjust based on your data quality needs.
Configure Time-Based Recycle Rules Conservatively
Start with longer time periods (2-3 years for deletions) and shorten as you gain confidence. Accidentally deleting recent data is worse than keeping old data slightly longer.
Always Archive Before Deleting
Create two recycle rules for the same data: the first rule archives after 1 year, the second rule deletes after 3 years. This creates a safety buffer for data recovery.
Test Field Cleaning Rules on Small Samples
Before enabling automatic field cleaning, run rules manually and review suggestions. Ensure formatting changes match your expectations before automating.
Combine Multiple Formatting Rules
Apply trimming before capitalization formatting. For example, trim superfluous spaces first, then apply “First Letters to Uppercase” for clean, consistently formatted names.
Document Your Data Quality Policies
Maintain written policies about when duplicates are merged, how long records are retained, and what formatting standards apply. Train team members on these policies.
Review Discarded Records Periodically
Review discarded duplicates and recycled records quarterly. If you’re frequently discarding the same types of records, your rules may need adjustment.
Backup Before Major Cleanup Operations
Before running large-scale automatic deletions or merges, back up your database. This provides rollback capability if rules behave unexpectedly.
Use Cross-Company Deduplication Carefully
In multi-company databases, cross-company duplicate detection can be valuable but may merge records that should remain separate for business reasons. Review cross-company duplicates carefully.
Monitor Automated Cleaning Results
Even with automatic rules, periodically review what’s being cleaned. Spot-check merged records, deleted records, and formatted fields to ensure quality.
Related Documentation
- Contacts - Contact deduplication and merging
- CRM - Lead and opportunity cleanup
- Data Import/Export - Cleaning data before import
Need Help?
Ask Your Knowlix:
- “Find duplicate contacts in the database”
- “Show me archived opportunities older than 2 years”
- “Create a rule to format phone numbers”
- “Merge these duplicate leads”
- “Which recycle rules are running automatically?”
- “Format all contact names to title case”
