Link detection scans files and identifies the links they contain. Enable this option when setting the behaviors for the job during job creation. It runs for both simulation and transfer jobs. The Job reports display link information when it is available for the job.
To analyze content for link detection, DryvIQ needs a seekable stream. To obtain one, DryvIQ downloads the file into memory if it is small enough, or into a temp location on the processing node if the file is too large. DryvIQ analyzes that stream, resets it, and uploads the file to the destination. After the transfer is complete, DryvIQ removes the temp file if one was needed for file analysis.
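The buffer, analyze, reset, upload sequence described above can be sketched in Python. This is an illustrative sketch only, not DryvIQ's implementation: the names `buffer_download` and `analyze_then_upload`, the `analyze` and `upload` callbacks, and the 8 MB threshold are all assumptions. It relies on the standard library's `tempfile.SpooledTemporaryFile`, which holds data in memory up to a size limit and then rolls over to a temp file on disk.

```python
import tempfile

# Hypothetical threshold; the cutoff DryvIQ actually uses is not documented here.
MAX_IN_MEMORY_BYTES = 8 * 1024 * 1024


def buffer_download(chunks, max_in_memory=MAX_IN_MEMORY_BYTES):
    """Copy a non-seekable download (an iterable of bytes) into a seekable buffer.

    Data stays in memory until it exceeds max_in_memory, then the buffer
    transparently rolls over to a temp file on disk.
    """
    stream = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for chunk in chunks:
        stream.write(chunk)
    stream.seek(0)
    return stream


def analyze_then_upload(stream, analyze, upload):
    """Run analysis, rewind the stream, upload it, then clean up."""
    result = analyze(stream)  # analysis may read and seek freely
    stream.seek(0)            # reset so the upload starts at the first byte
    upload(stream)
    stream.close()            # closing also deletes the temp file if one was created
    return result
```

The reset step matters because the analysis leaves the read position somewhere inside the stream; without `seek(0)` the upload would send only the unread remainder of the file.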
Link detection only scans the latest version of each file and reports the links detected. It does not scan previous versions.
Supported File Types
Link detection currently only identifies links in files with the following extensions:
DOCX (available in Microsoft Word 2007 and newer)
PPTX (available in Microsoft PowerPoint 2007 and newer)
XLSX (available in Microsoft Excel 2007 and newer)
Hyperlinks: These are links to websites or documents. Hyperlinks can be http/https/ftp/ftps URLs or links to files. In Microsoft Word, Excel, and PowerPoint files, these links are created using the Link option on the Insert tab or by right-clicking on selected text/cell and selecting Link from the shortcut menu.
References to other Excel spreadsheets: In Microsoft Excel files, these are links to cells in other Microsoft Excel files. These links are made by creating a formula that references a cell or range of cells in another Microsoft Excel file. The cells are formatted similar to the following examples:
Linked documents/objects: In Microsoft PowerPoint files, this is content that has been imported into the presentation. This content is imported using the Object option on the Insert tab or using the Paste Special option to insert a link to a Microsoft Word Document Object.
Unformatted links: DryvIQ will not count unformatted links (URLs that are added as plain text in the document).
IncludeText fields: In Microsoft Word files, link detection does not support links added through IncludeText fields using the Insert Quick Parts option.
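To illustrate how inserted hyperlinks differ from plain-text URLs inside a file, here is a minimal Python sketch. It is not DryvIQ's detection logic. It relies on the Office Open XML packaging convention that DOCX, PPTX, and XLSX files are ZIP archives in which each inserted hyperlink is recorded as a relationship entry in a `.rels` part. The function name `hyperlink_targets` is hypothetical, and counting relationship entries may not match the counts DryvIQ reports.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace and relationship type defined by the Office Open XML packaging format.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def hyperlink_targets(source):
    """Count hyperlink relationship targets in an Office Open XML package.

    `source` is a path or a seekable binary file object. Only relationship
    entries are inspected; URLs typed as plain text are never seen.
    """
    counts = Counter()
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(package.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                if rel.get("Type") == HYPERLINK_TYPE:
                    counts[rel.get("Target")] += 1
    return counts
```

Calling `hyperlink_targets("report.docx")` (a hypothetical file name) returns a `Counter` that maps each link target to the number of relationship entries found for it. A URL that appears only as typed text has no relationship entry, so an approach like this would not count it either.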
Job filter exclusions take precedence over Link Detection. Therefore, if a job filter exclusion is set to ignore DOCX, PPTX, or XLSX files, Link Detection will also ignore these files.
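The precedence rule above can be expressed as a small Python predicate. This is a sketch of the rule as stated, not DryvIQ code; the function name `should_scan_for_links` and its parameters are hypothetical.

```python
from pathlib import PurePath

# Extensions the document lists as supported by link detection.
SUPPORTED_EXTENSIONS = {".docx", ".pptx", ".xlsx"}


def should_scan_for_links(filename, excluded_extensions, link_detection_enabled):
    """Return True only if the file survives the job filter and is eligible."""
    ext = PurePath(filename).suffix.lower()
    excluded = {e.lower() for e in excluded_extensions}
    if ext in excluded:
        # The job filter exclusion wins: the file is skipped entirely.
        return False
    return link_detection_enabled and ext in SUPPORTED_EXTENSIONS
```

The exclusion check comes first, so a file that matches a filter exclusion is never scanned, whether or not link detection is enabled.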
Enabling Link Detection in the UI
Link Detection is available as one of the Behavior options when creating a job. This feature is disabled by default. Select the Allow link detection on supported files toggle to enable the feature.
Viewing Link Information
When enabled, link detection will identify the links in files and make the information available for review on the individual Job reports and the roll-up reports. Information is available on the Content Insights, Items, and Log pages.
Link counts for spreadsheets depend on how the link was added to the cells, so they will not always match the number of linked cells. If a link is added to multiple cells at the same time, DryvIQ reads it as one link shared across those cells and counts it once. If links are added to multiple cells separately (one cell at a time), DryvIQ counts each cell's link individually.
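One way to picture this counting difference is to look at how a worksheet part can record hyperlinks. The Python sketch below assumes that a link applied to a selection of cells is stored as a single `hyperlink` element whose `ref` attribute is a range, while links added one cell at a time are stored as one element per cell. The sample XML is hand-written for illustration, and the function name `count_hyperlink_entries` is hypothetical. Whether DryvIQ counts links this way is not stated in this document.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

# Hand-written sample: one link applied to cells A1:A3 in a single action.
SHARED = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<hyperlinks><hyperlink ref="A1:A3"/></hyperlinks>'
    "</worksheet>"
)

# Hand-written sample: the same link added to A1, A2, and A3 one at a time.
SEPARATE = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<hyperlinks><hyperlink ref="A1"/><hyperlink ref="A2"/><hyperlink ref="A3"/></hyperlinks>'
    "</worksheet>"
)


def count_hyperlink_entries(sheet_xml):
    """Count hyperlink elements in a worksheet part, one per element regardless of range size."""
    root = ET.fromstring(sheet_xml)
    return len(list(root.iter(MAIN_NS + "hyperlink")))


print(count_hyperlink_entries(SHARED))    # 1
print(count_hyperlink_entries(SEPARATE))  # 3
```

Both samples link the same three cells, but the element counts differ. That is the kind of mismatch the paragraph above describes.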
The Content Insights page includes the link detection results. The information appears in the Number of links detected chart (under Content analysis). This chart lists the files with links and the number of links detected in each file.
This information can be exported to a CSV file for further review using the Export this report link. The export includes the following information:
The ID assigned to the file on the source platform.
The filename on the source platform. The source and destination filenames may not match if DryvIQ needed to sanitize the filename due to character or length restrictions for the destination platform.
The path where the file is located on the source platform.
The ID assigned to the file on the destination platform.
The filename on the destination platform. The source and destination filenames may not match if DryvIQ needed to sanitize the filename due to character or length restrictions for the destination platform.
The path where the file is located on the destination platform.
The URL for the link detected.
The number of times the link was found in the file.
The Items page also provides options for viewing link information. To show only items that have links, select Feature as the filter and then select Files with Links as the filter value. The page will display only files that have links.
You can also set a column on the Items page to display the number of links detected in each file. Select the arrow next to the column heading and select Number of links from the list that appears. The selected column will now display the link count for each file. A dash (-) will appear in this column if no links were found in the file.
An entry will be added to the audit log for each file scanned for link detection. The activity will be flagged as “Informational activity,” and the details for the file will read, “The ‘Link detection’ extension analyzed the content of the file.”
Enabling Link Detection Through the API
When creating the job through the REST API, you can enable Link Detection by setting “link_detection” to “true” in the transfer block.
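As a rough illustration, the request body might be assembled as in the Python sketch below. Only the `transfer` block and the `link_detection` setting come from this document. The helper name `build_job_payload`, the `name` field, and the rest of the job body are placeholders, and the endpoint and authentication are deliberately omitted. Consult the DryvIQ REST API reference for the actual job schema.

```python
import json


def build_job_payload(base_job):
    """Return a copy of a job definition with link detection enabled.

    `base_job` is whatever job body you already send; its contents are
    assumptions here. The input dictionary is left unmodified.
    """
    payload = dict(base_job)
    transfer = dict(payload.get("transfer", {}))
    transfer["link_detection"] = True  # serialized as JSON true
    payload["transfer"] = transfer
    return payload


# "name" is a placeholder field used only for this example.
body = build_job_payload({"name": "example-job", "transfer": {}})
print(json.dumps(body, indent=2))
```

The helper copies the job definition and its transfer block before changing them, so an existing job template can be reused for jobs that should not run link detection.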